I've asked a few times for clarification on whether the `senders` field
in the filter applies to userids mentioned in the typing/receipt ephemeral
events, and never got a response. Synapse does not filter these userids by
sender, so we're gonna go with that.
This is the filter.room.timeline.{senders,types,contains_url} fields, and
their associated not_* pairs.
I decided not to change the `prev_batch` calculation for sliding-sync to
use the new `oldest_event_count` value, because I'm not confident in the
correct behavior. The current sliding-sync behavior is gives `prev_batch
= oldest_event_count` except when there are no new events. In this
case, `oldest_event_count` is `None`, but the current sliding-sync
implementation uses `prev_batch = since`. This is definitely wrong,
because both `since` and `prev_batch` are exclusive bounds. If the
correct thing to do is to return the lower exclusive bound of the range
of events that may have been included in the timeline, then we would
want `since - 1`. The other option would be to return `prev_batch =
None`, like we have in sync v3. I don't know which of these is correct,
so I'm just gonna keep the current (definitely incorrect) behavior to
avoid making things worse.
Before this change we were just returning an empty object for left or
invited rooms that don't have any updates. This is valid coloredding to
the spec, but it's a nicer to debug if they are omitted it and results in
a little less network traffic. For joined rooms, we are already skipping
empty updates.
With filtering support, it's much more common to have sync responses where
many rooms are empty, because all of the state/timeline events may be
filtered out.
I asked in #matrix-spec:matrix.org and go clarification that we should be
omitting the timeline field completely for rooms that are filtered out
by the timeline.(not_)rooms filter. Ruma's skip_serializing_if attribute
on the timeline field will currently cause it to be omitted when events is
empty. If [this fix][1] is merged, it will be omitted only when events is
empty, prev_batch is None, and limited is false.
[1]: https://github.com/ruma/ruma/pull/1796
TODO: maybe do something about clippy::too_many_arguments
These are the fields at filter.room.{rooms,not_rooms}, that apply to all
categories. The category-specific room filters are in
filter.room.{state,timeline,ephemeral}.{rooms,not_rooms}.
Now it includes the user, room, and event ID. As a bonus, the sync
function is now slightly less gigantic.
TODO: put this in a separate MR, and include a similar change for
invited rooms
Unfortunately we need to pull tracing-opentelemetry from git because
there hasn't been a release including the dependency bump on the other
opentelemetry crates.
Previously, we were returning redundant member count updates or encrypted
device updates from the /sync endpoint in some cases. The extra member
count updates are spec-compliant, but unnecessary, while the extra
encrypted device updates violate the spec.
The refactor necessary to fix this bug is also necessary to support
filtering on state events in sync.
Details:
Joined room incremental sync needs to examine state events for four
purposes:
1. determining whether we need to return an update to room member counts
2. determining the set of left/joined devices for encrypted rooms
(returned in `device_lists`)
3. returning state events to the client (in `rooms.joined.*.state`)
4. tracking which member events we have sent to the client, so they can
be omitted on future requests when lazy-loading is enabled.
The state events that we need to examine for the first two cases is member
events in the delta between `since` and the end of `timeline`. For the
second two cases, we need the delta between `since` and the start of
`timeline`, plus contextual member events for any senders that occur in
`timeline`. The second list is subject to filtering, while the first is
not.
Before this change, we were using the same set of state events that we are
returning to the client (cases 3/4) to do the analysis for cases 1/2.
In a compliant implementation, this would result in us missing some
relevant member events in 1/2 in addition to seeing redundant member
events. In current grapevine this is not the case because the set of
events that we return to the client is always a superset of the set that
is needed for cases 1/2. This is because we don't support filtering, and
we have an existing bug[1] where we are returning the delta between
`since` and the end of `timeline` rather than the start.
[1]: https://gitlab.computer.surgery/matrix/grapevine-fork/-/issues/5
Fixing this is necessary to implement filtering because otherwise
we would start missing some member events for member count or encrypted
device updates if the relevant member events are rejected by the filter.
This would be much worse than our current behavior.
This allows us to use the `ruma_route` convenience function even when we
need to add our own hacks into the responses, thus making us less
reliant on Ruma.
This cache can serve invalid responses, and has an extremely low hit
rate.
It serves invalid responses because because it's only keyed off
the `since` parameter, but many of the other request parameters also
affect the response or it's side effects. This will become worse once we
implement filtering, because there will be a wider space of parameters
with different responses. This problem is fixable, but not worth it
because of the low hit rate.
The low hit rate is because normal clients will always issue the next
sync request with `since` set to the `prev_batch` value of the previous
response. The only time we expect to see multiple requests with the same
`since` is when the response is empty, but we don't cache empty
responses.
This was confirmed experimentally by logging cache hits and misses over
15 minutes with a wide variety of clients. This test was run on
matrix.computer.surgery, which has only a few active users, but a
large volume of sync traffic from many rooms. Over the test period, we
had 3 hits and 5309 misses. All hits occurred in the first minute, so I
suspect that they had something to do with client recovery from an
offline state. The clients that were connected during the test are:
- element web
- schildichat web
- iamb
- gomuks
- nheko
- fractal
- fluffychat web
- fluffychat android
- cinny web
- element android
- element X android
Fixes: #2
This change is fully automated, except the `rustfmt.toml` changes and
a few clippy directives to allow specific functions with too many lines
because they are longer now.