Commit graph

91 commits

Author SHA1 Message Date
Benjamin Lee
1410b6f409
implement per-event filtering for per-room account_data on /sync 2024-06-05 00:07:06 -07:00
Benjamin Lee
7a0b8c986f
implement global account_data filtering in /sync
TODO docs on raw_event_allowed, and figure out how we want to organize
it with CompiledRoomEventFilter::raw_event_allowed
2024-06-05 00:07:06 -07:00
Benjamin Lee
c3cf97df7a
implement filter.room.include_leave for /sync 2024-06-05 00:07:06 -07:00
Benjamin Lee
d69b88566a
implement per-event filtering for ephemeral events in /sync
I've asked a few times for clarification on whether the `senders` field
in the filter applies to userids mentioned in the typing/receipt ephemeral
events, and never got a response. Synapse does not filter these userids by
sender, so we're gonna go with that.
2024-06-05 00:07:06 -07:00
Benjamin Lee
98d93da3a8
implement per-event state filtering for joined rooms in /sync 2024-06-05 00:07:06 -07:00
Benjamin Lee
f4f3be8c30
implement per-event state filtering for left rooms in /sync 2024-06-05 00:07:06 -07:00
Benjamin Lee
b85110a292
implement per-event state filtering for invited rooms on /sync
This one is a little weird, because the stripped invite state events are
not deserialized.
2024-06-05 00:07:06 -07:00
Benjamin Lee
4c9728cbad
implement per-event timeline filtering on /sync
This is the filter.room.timeline.{senders,types,contains_url} fields, and
their associated not_* pairs.

I decided not to change the `prev_batch` calculation for sliding-sync to
use the new `oldest_event_count` value, because I'm not confident in the
correct behavior. The current sliding-sync behavior gives `prev_batch
= oldest_event_count` except when there are no new events. In this
case, `oldest_event_count` is `None`, but the current sliding-sync
implementation uses `prev_batch = since`. This is definitely wrong,
because both `since` and `prev_batch` are exclusive bounds. If the
correct thing to do is to return the lower exclusive bound of the range
of events that may have been included in the timeline, then we would
want `since - 1`. The other option would be to return `prev_batch =
None`, like we have in sync v3. I don't know which of these is correct,
so I'm just gonna keep the current (definitely incorrect) behavior to
avoid making things worse.
2024-06-05 00:07:06 -07:00
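
As a rough illustration of the per-event timeline filtering described in 4c9728cbad above, the sketch below checks one event against the `senders`, `types`, and `contains_url` fields and their `not_*` counterparts. The `EventFilter` and `Event` types and the `allows` method are hypothetical names invented for this example, not grapevine's or Ruma's actual API, and the `*` wildcard that the Matrix spec permits in `types` is deliberately left out.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for a compiled event filter.
/// `None` in an allow-list means "no restriction".
#[derive(Default)]
struct EventFilter {
    senders: Option<HashSet<String>>,
    not_senders: HashSet<String>,
    types: Option<HashSet<String>>,
    not_types: HashSet<String>,
    contains_url: Option<bool>,
}

/// Hypothetical, minimal view of a timeline event.
struct Event {
    sender: String,
    event_type: String,
    has_url: bool,
}

impl EventFilter {
    /// Returns true if the event passes every part of the filter.
    /// A `not_*` entry wins over a matching allow-list entry.
    fn allows(&self, event: &Event) -> bool {
        let sender_ok = self
            .senders
            .as_ref()
            .map_or(true, |s| s.contains(&event.sender))
            && !self.not_senders.contains(&event.sender);

        // Assumption: exact string match only; wildcard types are not handled.
        let type_ok = self
            .types
            .as_ref()
            .map_or(true, |t| t.contains(&event.event_type))
            && !self.not_types.contains(&event.event_type);

        // `Some(true)` keeps only events with a URL, `Some(false)` drops them.
        let url_ok = self.contains_url.map_or(true, |want| want == event.has_url);

        sender_ok && type_ok && url_ok
    }
}
```

An empty (default) filter accepts every event, which matches the behavior of a request that supplies no filter at all.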
Benjamin Lee
745eaa9b48
implement room.account_data.(not_)rooms filter on /sync 2024-06-04 20:02:43 -07:00
Benjamin Lee
84f356e67b
implement room.ephemeral.(not_)rooms filter on /sync 2024-06-04 20:02:43 -07:00
Benjamin Lee
4e1d091bbc
skip left/invited rooms with no updates in /sync
Before this change we were just returning an empty object for left or
invited rooms that don't have any updates. This is valid according to
the spec, but it's nicer to debug if they are omitted, and it results in
a little less network traffic. For joined rooms, we are already skipping
empty updates.

With filtering support, it's much more common to have sync responses where
many rooms are empty, because all of the state/timeline events may be
filtered out.
2024-06-04 20:02:43 -07:00
Benjamin Lee
e6f2b6c9ad
implement room.state.(not_)rooms filter on /sync 2024-06-04 20:02:43 -07:00
Benjamin Lee
c48abf9f13
implement room.timeline.(not_)rooms filter on /sync
I asked in #matrix-spec:matrix.org and got clarification that we should be
omitting the timeline field completely for rooms that are filtered out
by the timeline.(not_)rooms filter. Ruma's skip_serializing_if attribute
on the timeline field will currently cause it to be omitted when events is
empty. If [this fix][1] is merged, it will be omitted only when events is
empty, prev_batch is None, and limited is false.

[1]: https://github.com/ruma/ruma/pull/1796

TODO: maybe do something about clippy::too_many_arguments
2024-06-04 20:02:43 -07:00
Benjamin Lee
458d6842fb
implement top-level (not_)rooms filter on /sync
These are the fields at filter.room.{rooms,not_rooms}, that apply to all
categories. The category-specific room filters are in
filter.room.{state,timeline,ephemeral}.{rooms,not_rooms}.
2024-06-04 20:02:42 -07:00
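
The relationship between the top-level and category-specific room filters in 458d6842fb above can be sketched as follows. This is a minimal example under the assumption that a room must pass both the top-level `filter.room.{rooms,not_rooms}` check and the category's own check; `RoomFilter` and `category_allows` are hypothetical names, not taken from the grapevine source.

```rust
use std::collections::HashSet;

/// Hypothetical pair of `rooms` / `not_rooms` lists.
/// `rooms == None` means every room is allowed unless excluded.
#[derive(Default)]
struct RoomFilter {
    rooms: Option<HashSet<String>>,
    not_rooms: HashSet<String>,
}

impl RoomFilter {
    /// A room listed in `not_rooms` is rejected even if `rooms` contains it.
    fn allows(&self, room_id: &str) -> bool {
        self.rooms.as_ref().map_or(true, |r| r.contains(room_id))
            && !self.not_rooms.contains(room_id)
    }
}

/// Assumption: the top-level filter applies first, so a category
/// (state, timeline, ephemeral, ...) sees a room only if both agree.
fn category_allows(top_level: &RoomFilter, category: &RoomFilter, room_id: &str) -> bool {
    top_level.allows(room_id) && category.allows(room_id)
}
```

With this shape, a room excluded at the top level never reaches any category, while a room excluded only by `timeline.not_rooms` can still contribute state or ephemeral events.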
Charles Hall
a5e7ce6c33
improve "Leave event has no state" event
Now it includes the user, room, and event ID. As a bonus, the sync
function is now slightly less gigantic.

TODO: put this in a separate MR, and include a similar change for
invited rooms
2024-06-04 20:02:42 -07:00
Lambda
f35cbfd89e
More tracing spans 2024-06-04 13:32:31 -07:00
Benjamin Lee
ec1b086a35
very minor cleanup in the sync endpoint
I meant to do this in 146465693e, but
looks like I forgot.
2024-05-30 10:19:24 -07:00
Charles Hall
8f0fdfb2f2
upgrade all cargo dependencies
Unfortunately we need to pull tracing-opentelemetry from git because
there hasn't been a release including the dependency bump on the other
opentelemetry crates.
2024-05-26 19:47:00 -07:00
Benjamin Lee
8d09a7e490
don't return extra member count or e2ee device updates from sync
Previously, we were returning redundant member count updates or encrypted
device updates from the /sync endpoint in some cases. The extra member
count updates are spec-compliant, but unnecessary, while the extra
encrypted device updates violate the spec.

The refactor necessary to fix this bug is also necessary to support
filtering on state events in sync.

Details:

Joined room incremental sync needs to examine state events for four
purposes:

 1. determining whether we need to return an update to room member counts
 2. determining the set of left/joined devices for encrypted rooms
    (returned in `device_lists`)
 3. returning state events to the client (in `rooms.joined.*.state`)
 4. tracking which member events we have sent to the client, so they can
    be omitted on future requests when lazy-loading is enabled.

The state events that we need to examine for the first two cases are the
member events in the delta between `since` and the end of `timeline`. For
the last two cases, we need the delta between `since` and the start of
`timeline`, plus contextual member events for any senders that occur in
`timeline`. The second list is subject to filtering, while the first is
not.

Before this change, we were using the same set of state events that we are
returning to the client (cases 3/4) to do the analysis for cases 1/2.
In a compliant implementation, this would result in us missing some
relevant member events in 1/2 in addition to seeing redundant member
events. In current grapevine this is not the case because the set of
events that we return to the client is always a superset of the set that
is needed for cases 1/2. This is because we don't support filtering, and
we have an existing bug[1] where we are returning the delta between
`since` and the end of `timeline` rather than the start.

[1]: https://gitlab.computer.surgery/matrix/grapevine-fork/-/issues/5

Fixing this is necessary to implement filtering because otherwise
we would start missing some member events for member count or encrypted
device updates if the relevant member events are rejected by the filter.
This would be much worse than our current behavior.
2024-05-20 21:13:13 +00:00
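
The separation argued for in 8d09a7e490 above can be sketched roughly as below: the member-count and device-list analysis (cases 1/2) reads an unfiltered delta, while the state returned to the client (cases 3/4) goes through the filter. `StateEvent`, `has_member_changes`, and `state_for_client` are hypothetical names for this example only, and the sketch glosses over the fact that the two deltas also end at different points in the timeline.

```rust
/// Hypothetical, minimal state event.
#[derive(Clone, Debug, PartialEq)]
struct StateEvent {
    event_type: String,
    state_key: String,
}

/// Cases 1/2: inspect the *unfiltered* delta, so that a client filter
/// cannot hide membership changes from the server-side analysis.
fn has_member_changes(unfiltered_delta: &[StateEvent]) -> bool {
    unfiltered_delta
        .iter()
        .any(|e| e.event_type == "m.room.member")
}

/// Cases 3/4: only events accepted by the client's filter are returned.
/// `allowed` stands in for whatever per-event filter check is in use.
fn state_for_client<'a>(
    delta: &'a [StateEvent],
    allowed: impl Fn(&StateEvent) -> bool,
) -> Vec<&'a StateEvent> {
    delta.iter().filter(|&e| allowed(e)).collect()
}
```

If `has_member_changes` were run on the output of `state_for_client` instead, a filter that rejects `m.room.member` would make every membership change invisible to the member-count and device-list logic.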
Charles Hall
f8961d5578
rename Ruma to Ar
This follows the pattern of the previous commit.
2024-05-19 19:04:20 -07:00
Charles Hall
7ea98dac72
rename RumaResponse to Ra
It's very commonly used so having a short name is worthwhile, I think.
2024-05-19 19:03:45 -07:00
Charles Hall
230ebd3884
don't automatically wrap in RumaResponse
This allows us to use the `ruma_route` convenience function even when we
need to add our own hacks into the responses, thus making us less
reliant on Ruma.
2024-05-18 18:31:36 -07:00
Benjamin Lee
146465693e
remove sync response cache
This cache can serve invalid responses, and has an extremely low hit
rate.

It serves invalid responses because it's only keyed off
the `since` parameter, but many of the other request parameters also
affect the response or its side effects. This will become worse once we
implement filtering, because there will be a wider space of parameters
with different responses. This problem is fixable, but not worth it
because of the low hit rate.

The low hit rate is because normal clients will always issue the next
sync request with `since` set to the `next_batch` value of the previous
response. The only time we expect to see multiple requests with the same
`since` is when the response is empty, but we don't cache empty
responses.

This was confirmed experimentally by logging cache hits and misses over
15 minutes with a wide variety of clients. This test was run on
matrix.computer.surgery, which has only a few active users, but a
large volume of sync traffic from many rooms. Over the test period, we
had 3 hits and 5309 misses. All hits occurred in the first minute, so I
suspect that they had something to do with client recovery from an
offline state. The clients that were connected during the test are:

 - element web
 - schildichat web
 - iamb
 - gomuks
 - nheko
 - fractal
 - fluffychat web
 - fluffychat android
 - cinny web
 - element android
 - element X android

Fixes: #2
2024-05-16 21:33:06 -07:00
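
To make the keying problem in 146465693e above concrete, here is a sketch of what a wider cache key might look like if the cache had been kept. The `SyncCacheKey` struct, its field list, and the `lookup` helper are assumptions made for this example; they are not the removed implementation, and the field list is not claimed to be complete (other request parameters can affect side effects as well).

```rust
use std::collections::HashMap;

/// Hypothetical cache key covering more than just `since`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct SyncCacheKey {
    user_id: String,
    device_id: String,
    since: Option<String>,
    /// Assumption: the filter is stored in some canonical string form.
    filter: Option<String>,
    full_state: bool,
}

/// Hypothetical cache from request parameters to a serialized response.
type SyncCache = HashMap<SyncCacheKey, String>;

/// Two requests share a cached response only if every keyed
/// parameter matches, not merely `since`.
fn lookup<'a>(cache: &'a SyncCache, key: &SyncCacheKey) -> Option<&'a String> {
    cache.get(key)
}
```

Even with a key like this, the hit rate measured in the commit message would stay low, which is why removal was the cheaper fix.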
Charles Hall
0afc1d2f50
change rustfmt configuration
This change is fully automated, except the `rustfmt.toml` changes and
a few clippy directives to allow specific functions with too many lines
because they are longer now.
2024-05-16 19:11:40 -07:00
Charles Hall
04184c6137
use gender-neutral pronouns 2024-05-16 16:17:40 -07:00
Charles Hall
05be778fbb
stop putting comments in the middle of call chains
`rustfmt` doesn't handle this very well.
2024-05-16 16:17:40 -07:00
Charles Hall
1911ad34d9
stop putting comments and code on the same line 2024-05-16 15:22:35 -07:00
Charles Hall
3efe3fb337
remove comments about filtering buggy items 2024-05-16 01:08:48 -07:00
Charles Hall
baab928281
enable too_many_lines lint
And just disable it everywhere it fires, I know.
2024-05-14 20:01:24 -07:00
Charles Hall
db4951c5fd
enable semicolon_if_nothing_returned lint 2024-05-14 20:01:24 -07:00
Charles Hall
96e1877639
enable redundant_closure_for_method_calls lint 2024-05-14 20:01:24 -07:00
Charles Hall
c4a9bca16f
enable match_wildcard_for_single_variants lint 2024-05-14 20:01:24 -07:00
Charles Hall
9606f59141
enable manual_let_else lint 2024-05-14 20:01:23 -07:00
Charles Hall
ebae8ceeb0
enable implicit_clone lint 2024-05-14 19:59:43 -07:00
Charles Hall
623824dc0c
enable if_not_else lint 2024-05-14 19:59:40 -07:00
Charles Hall
a32b7c1ac1
enable flat_map_option lint 2024-05-14 16:41:04 -07:00
Charles Hall
da440934bd
enable doc_markdown lint 2024-05-14 16:34:10 -07:00
Charles Hall
9abe4799db
enable string_add lint 2024-05-12 19:01:29 -07:00
Charles Hall
cc5977b4e4
enable same_name_method lint 2024-05-12 18:51:48 -07:00
Charles Hall
052f3088e9
enable let_underscore_must_use lint 2024-05-12 18:51:26 -07:00
Charles Hall
52c2893073
enable if_then_some_else_none lint 2024-05-12 18:51:26 -07:00
Charles Hall
885fc8428c
enable deref_by_slicing lint 2024-05-12 18:51:26 -07:00
Charles Hall
71c48f66c4
enable as_conversions lint
There were some very, uh, creative (and inconsistent) ways to convert
between numeric types in here...
2024-05-12 18:51:26 -07:00
Charles Hall
d748544f0e
enable unreachable_pub lint
This causes some other lints to start firing too (which is good), but
I'm going to fix them in follow-up commits to keep things organized.
2024-05-12 18:51:26 -07:00
Charles Hall
f27941d510
remove half-baked presence implementation
But I'm leaving behind the database state for now in case we want it
back later, so we won't need to do a migration or whatever.
2024-04-30 21:54:55 -07:00
Matthias Ahouansou
2c73c3adbb
fix(sync): send phoney leave event where room state is unknown on invite rejection 2024-04-06 14:12:18 +01:00
Timo Kösters
d2817679e5
refactor: remove previous typing implementation and add sync wakeup for new one 2024-03-22 08:24:17 +01:00
Timo Kösters
6bd7ff4917
improvement: do not save typing edus in db 2024-03-22 07:48:44 +01:00
Timo Kösters
879a8b969d
improvement: use simpler rocksdb config 2024-03-21 15:04:40 +01:00
Matthias Ahouansou
07bb369c5c
perf: remove unnecessary async 2024-03-05 20:20:19 +00:00