This was missed in the initial fix in 9a50c244 ("validate event type and
membership for create_join and create_invite"), but significantly less
impactful than the original vulnerability because it only affects
invite/join events that are able to pass auth/signature checks with our
server's signature. You could use this to forge invite events from a
local user, but not much else.
This fixes a vulnerability where an attacker with a a malicious remote
server and a user on the local server can trick the local server into
signing arbitrary events. The attacker issue a remote leave as the local
user to a room on the malicious server. Without any validation of the
make_leave response, the local server would sign the attacker-controlled
event and pass it back to the malicious server with send_leave.
The join endpoints is also fixed in this commit, but is less useful for
exploitation because the local server replaces the "content" field
returned by the remote server. Remote invites are unaffected because we
already check that the event returned from /invite has the same event ID
as the event passed to it.
Both of these endpoints sign the received event so without the
validation a malicious server can use these endpoints to trick our
server into signing effectively arbitrary forged events from local
users.
Rebased from a continuwuity patch by nex. The create_join changes were
not present in the continuwuity version because these checks were
already present there.
Co-authored-by: Olivia Lee <olivia@computer.surgery>
Servers are required to preserve unknown properties in event content,
since they may be added by a future version of the spec. Round-tripping
through RoomRedactionEventContent results in dropping all unknown
properties.
Previously we were only using trust-dns for resolving SRV records in
server discovery, and then for resolving the hostname from the SRV
record target if one exists. With the previous behavior, admins need to
ensure that both their system resolver and trust-dns are working
correctly in order for outgoing traffic to work reliably. This can be
confusing to debug, because it's not obvious to the admin if or when
each resolver are being used. Now, everything goes through trust-dns and
outgoing federation DNS should fail/succeed more predictably.
I also expect some performance improvement from having an in-process DNS
cache, but haven't taken measurements yet.
Previously, read receipts would only be forwarded via federation
incidentally when some PDU was later sent to the destination server.
Trigger a send without any event to collect EDUs and get read receipts
out directly.
The primary motivation for this change is to support databases that
don't take a path, e.g. out of process databases.
This configuration structure leaves the door open for other media
storage mechanisms in the future, such as S3.
It's also structured to avoid `#[serde(flatten)]` so that we can use
`#[serde(deny_unknown_fields)]`.
Previously we required every alias in a canonical alias event sent by a
client to be valid, and would only validate local aliases. This
prevented clients from adding/removing canonical aliases if there were
existing remote or invalid aliases.
Previously, we would only attempt to validate the aliases in the event
content if we were able to parse the event, and would silently allow it
otherwise.
I think it's most important for people to read the "Chat with us", "Can
I use it?" and "Expectations management" sections, though I'm not sure
what the best ordering is. This is probably fine.
This is also a good spot to link to the pre-built binaries. And since I
did that, I can also remove the bit about not publishing binary builds
from the introduction section.
The previous logic would increment the backoff counter both when a
request actually fails and when we do not make a request because the
server was already in backoff. This lead to a positive feedback loop
where every request made while a server is in backoff increases the
backoff delay, making it impossible to recover from backoff unless the
entire backoff delay elapses with zero requests.
Failing to reset the backoff state resulted in a monotonically
increasing backoff delay. If a remote server was temporarily
unavailable, we would have a persistently increased rate of key query
failures until the backoff state was reset by a server restart. If
enough key queries were attempted while the remote was unavailable, it
can accumulate an arbitrarily long backoff delay and effectively block
all future key queries to this server.
* Capitalize Conduit and Grapevine
* Replace a "Not" with "Note"
* Change a "Conduit 0.7.0" to "Conduit <0.8.0"
* Expand "db" to "database" and "database schema" as appropriate
This was based on the similar section in the hedgedoc. I dropped the bit
about grapevine users getting banned from the conduwuit rooms cause we
haven't heard about that happening for a while and the original case was
pretty unclear.
If all join requests to resident servers fail or if the joining server
is the only resident server (i.e. the room is local-only), we would
previously send a 500 error, even if the more correct response would be
M_UNAUTHORIZED (e.g. if the user tries to join an invite-only room).
To fix this, we now return the error generated by attempting the join
locally, which correctly informs the client about why their request
failed.