Commit graph

2815 commits

Author SHA1 Message Date
Lambda
6a44d0af2b Fix u8_slice_to_hex()
For bytes <0x10, this would omit the leading zero.
2024-10-20 18:50:05 +00:00
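
A minimal sketch of the behavior this fix describes. The real body of `u8_slice_to_hex()` is not shown in this log, so the implementation below is an assumption; only the function name and the leading-zero requirement come from the commit message.

```rust
use std::fmt::Write;

/// Hex-encode a byte slice, always emitting two digits per byte.
/// (Illustrative reimplementation, not the project's actual code.)
fn u8_slice_to_hex(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for byte in bytes {
        // `{:02x}` zero-pads to two digits. A plain `{:x}` would render
        // 0x0a as "a", which is the bug described above.
        write!(out, "{byte:02x}").expect("writing to a String cannot fail");
    }
    out
}
```

Without the padding, two different inputs such as `[0x01, 0x00]` and `[0x10]` could produce colliding or ambiguous output.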
Andreas Fuchs
76a633cb66
Log failed remote resident server join requests 2024-10-13 19:26:35 -07:00
Andreas Fuchs
e001356653
Return local join error if all remote joins fail
If all join requests to resident servers fail or if the joining server
is the only resident server (i.e. the room is local-only), we would
previously send a 500 error, even if the more correct response would be
M_UNAUTHORIZED (e.g. if the user tries to join an invite-only room).

To fix this, we now return the error generated by attempting the join
locally, which correctly informs the client about why their request
failed.
2024-10-13 19:10:58 -07:00
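
The fallback described above could look roughly like this. All names (`JoinError`, `join_with_fallback`) are hypothetical, and the order in which the local and remote attempts run is an assumption; the commit message only states that the locally generated error is what gets returned.

```rust
/// Hypothetical error type, standing in for the server's real one.
#[derive(Debug, Clone, PartialEq)]
enum JoinError {
    Unauthorized,
    RemoteFailed(String),
}

/// Try the local join, then each resident server in turn. If nothing
/// succeeds, report the local error instead of a generic failure.
fn join_with_fallback<L, R>(local_join: L, remote_joins: Vec<R>) -> Result<(), JoinError>
where
    L: FnOnce() -> Result<(), JoinError>,
    R: FnOnce() -> Result<(), JoinError>,
{
    let local_error = match local_join() {
        Ok(()) => return Ok(()),
        Err(error) => error,
    };

    for remote_join in remote_joins {
        match remote_join() {
            Ok(()) => return Ok(()),
            // Remote failures are logged but do not decide the response.
            Err(error) => eprintln!("remote join failed: {error:?}"),
        }
    }

    // Also covers the local-only room case, where `remote_joins` is empty.
    Err(local_error)
}
```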
Charles Hall
5a490a4397
fix mod/use order
Yes, it does actually bother me, thanks for asking.
2024-10-03 15:28:24 -07:00
mikoto
287f6b9163
refactor calculate_invite_state
That was terribly named and terribly implemented.

Co-authored-by: Charles Hall <charles@computer.surgery>
2024-10-03 10:52:07 -07:00
Lambda
e14b7f28f2
Implement federation self-test 2024-09-27 10:51:32 -07:00
Lambda
6022d56094
Use enums for options to send_request(), add allow_loopback 2024-09-27 10:48:12 -07:00
Lambda
94d523ebcb
Reload TLS config on SIGHUP 2024-09-27 09:51:17 -07:00
Lambda
39880cc6ac
Abstract over sd_notify 2024-09-27 09:50:51 -07:00
Charles Hall
6ab87f97dd
include traceresponse header if possible
This can help with debugging.
2024-09-26 19:01:15 -07:00
Benjamin Lee
9add9a1e96
fix room version comparisons
Fixes a set of bugs introduced by 00b77144c1,
where we replaced explicit `RoomVersionId` matches with `version < V11`
comparisons. The `Ord` impl on `RoomVersionId` does not work like that,
and is in fact a lexicographic string comparison[1]. The most visible
effect of these bugs is that incoming redaction events would sometimes
be ignored.

Instead of reverting to the explicit matches, which were quite verbose,
I implemented a `RoomVersion` struct that has flags for each property
that we care about. This is similar to the approach used by ruma[2] and
synapse[3].

[1]: 7cfa3be0c6/crates/ruma-common/src/identifiers/room_version_id.rs (L136)
[2]: 7cfa3be0c6/crates/ruma-state-res/src/room_version.rs
[3]: c856ae4724/synapse/api/room_versions.py
2024-09-26 13:01:25 -07:00
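
To illustrate why `version < V11` goes wrong under a lexicographic `Ord`, here is a self-contained sketch. `VersionId` is a stand-in for a string-backed ID and is not ruma's `RoomVersionId`; the `RoomVersion` field name is hypothetical and only shows the flags-per-property idea.

```rust
/// Stand-in for a string-backed version ID. The derived `Ord` compares
/// the inner string lexicographically, like the impl described above.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct VersionId(&'static str);

/// One flag per property we care about (field name is illustrative).
struct RoomVersion {
    /// Whether the redaction rules introduced in room version 11 apply.
    v11_redaction_rules: bool,
}

impl RoomVersion {
    /// Map an ID to its flags explicitly instead of comparing IDs.
    fn try_from_id(id: &VersionId) -> Option<Self> {
        let v11_redaction_rules = match id.0 {
            "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" | "10" => false,
            "11" => true,
            // Unknown versions are rejected rather than guessed at.
            _ => return None,
        };
        Some(Self { v11_redaction_rules })
    }
}
```

With this stand-in, `VersionId("9") < VersionId("11")` is false because `'9'` sorts after `'1'`, so a `version < V11` check silently misclassifies versions 2 through 9.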
Charles Hall
ad37eae869
use OnceLock instead of RwLock for SERVICES
It actually has the semantics we need. Until we get rid of SERVICES.
2024-09-25 10:43:05 -07:00
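
A sketch of the write-once semantics that `std::sync::OnceLock` provides. The `Services` struct and the helper function names here are placeholders, not the project's actual definitions.

```rust
use std::sync::OnceLock;

/// Placeholder for the real services container.
struct Services {
    server_name: String,
}

/// Set exactly once at startup, then only read.
static SERVICES: OnceLock<Services> = OnceLock::new();

/// Returns the rejected value as `Err` if the global was already set.
fn init_services(services: Services) -> Result<(), Services> {
    SERVICES.set(services)
}

/// Readers get a plain `&'static` reference, with no lock guard to hold.
fn services() -> &'static Services {
    SERVICES
        .get()
        .expect("SERVICES should be initialized before use")
}
```

Compared to an `RwLock<Option<_>>`, this rules out re-initialization and removes read-lock acquisition from every access.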
Charles Hall
032e1ca3c6
hide global services jank in service module
Mainly to make it easier to initialize the SERVICES global correctly in
more than one place.

Also this stuff really shouldn't live at the crate root anyway.
2024-09-25 10:43:05 -07:00
Charles Hall
1fd20cdeba
factor server_name change check into a reusable fn 2024-09-25 10:43:05 -07:00
Charles Hall
c2c6083277
make load_or_create *only* load_or_create
Extracted the other logic to its current singular callsite for now.

The load_or_create function finally does nothing other than load or
create the database (and do some related checks, which is fine). This
paves the way for more/better database surgery tooling.
2024-09-25 10:39:46 -07:00
Charles Hall
e9caf228b3
move config check into config load function 2024-09-25 10:39:46 -07:00
Charles Hall
75ef57e0ce
remove config check
* Database load function is the wrong place for this
* There's no good lower bound to check for this
* Surely people setting this to a small value would realize what they're
  in for
2024-09-25 10:39:42 -07:00
Benjamin Lee
279c6472c5
split some logic out of KeyValueDatabase::load_or_create
This method did _a lot_ of things at the same time. In order to use
`KeyValueDatabase` for the migrate-db command, we need to be able to
open a db without attempting to apply all the migrations and without
spawning a bunch of unrelated background tasks.

The state after this refactor is still not great, but it's enough to do
a migration tool.
2024-09-24 20:57:57 -07:00
Benjamin Lee
059dfe54e3
factor out helper function for db migrations 2024-09-24 16:02:04 -07:00
Benjamin Lee
e2318cad8a
fix serving tls by setting rustls default crypto provider
The rustls version bump in c24f79b79b
introduced a panic when serving listeners with 'tls = true':

> thread 'main' panicked at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/rustls-0.23.13/src/crypto/mod.rs:265:14:
> no process-level CryptoProvider available -- call CryptoProvider::install_default() before this point

This commit fixes this by setting the default provider to ring. I chose
ring (the old rustls default) over aws-lc-rs (the new default) for a few
reasons:

 - Judging by github issues, aws-lc-rs seems to have a lot of build problems.
   We don't need more of that.
 - The "motivation" section in the aws-lc-rs docs only talks about FIPS,
   which we do not care about.
 - My past experience with things that start with "aws-" has been very
   negative.
2024-09-23 23:39:23 -07:00
Lambda
084d862e51
Allow configuring served components per listener 2024-09-23 16:43:52 -07:00
Lambda
d62d0e2f0e
Split routes into components 2024-09-23 16:43:52 -07:00
Charles Hall
b0d1cc1b63
bump otel to v0.24.0
Someone contributed opentelemetry-prometheus support for v0.24 and this
version also doesn't put stupid requirements on the tokio version. This
version of the OTel ecosystem also fixes an apparent bug with some hacks
I plan on doing in the future...
2024-09-23 14:22:55 -07:00
Charles Hall
c24f79b79b
update rust deps except rocksdb and otel clownery
* OTel v0.25.0 requires downgrading Tokio to 1.38 [0]
* They have a fix for this but aren't cutting a release just for release
  schedule reasons [1]
* Prometheus support (at least for server-pull) was dropped at OTel
  v0.23 and isn't planned to be picked up again until OTel v1 [2]
* No real reasoning was provided for this decision AFAICT [3] [4]
* So many compiler errors
* Unhelpful changelogs

The last two points are what made me give up on trying to upgrade to
OTel v0.24 too.

RocksDB isn't updated because we'd need to update our nixpkgs too but
that causes other problems, such as an upstream bug in liburing when
building for musl.

[0]: https://github.com/open-telemetry/opentelemetry-rust/issues/2094
[1]: https://github.com/open-telemetry/opentelemetry-rust/issues/2094#issuecomment-2346834030
[2]: https://docs.rs/opentelemetry-prometheus/0.17.0/opentelemetry_prometheus/index.html
[3]: https://github.com/open-telemetry/opentelemetry-rust/pull/1792
[4]: https://github.com/open-telemetry/opentelemetry-rust/pull/1792#issuecomment-2121514344
2024-09-23 14:22:55 -07:00
Benjamin Lee
c1bf4a8ee3
changelog entry for CLI compatibility break 2024-09-21 14:11:40 -07:00
Benjamin Lee
5315bac0c5
split out separate error type for serve command 2024-09-21 14:11:40 -07:00
Benjamin Lee
86515d53cc
move 'serve' command logic into a submodule of 'cli'
The changes to 'main.rs' and 'cli/serve.rs' in this commit are almost
pure code-motion.
2024-09-21 14:11:39 -07:00
Benjamin Lee
be87774a3b
set up structure for multiple cli commands
The previous cli is now behind the 'serve' subcommand.
2024-09-21 14:11:26 -07:00
Charles Hall
1ee3bbb316
oops, i dropped my fork
The maintainers had a discussion internally and decided it's unlikely
that we'll have the capacity to try to do a rewrite, which was the
original reason for the suffix's presence. So, now we can get rid of it.
2024-09-20 16:52:05 -07:00
Charles Hall
d388994657
rewrite media key parser
Fixes a regression in e2cba15ed2 where the
Content-Type and Content-Disposition parts are extracted in the wrong
order.

Fixes a long-standing issue in b6d721374f
where the Content-Type part was allowed to be completely missing rather
than present and 0 bytes long.

Improves the error messages for various parsing failures to be unique
and more obvious.
2024-09-19 15:27:10 -07:00
Charles Hall
88b009a8d4
update changelog 2024-09-19 15:23:59 -07:00
Charles Hall
b34d78a030
skip over broken keys instead of aborting
Errors will show up in the logs in this case with detailed information
about what broke.

In the future we should add some kind of database integrity check
functionality and also functionality to repair/delete broken data, but
for now this at least makes it work 99.99% of the time.
2024-09-19 15:23:59 -07:00
Charles Hall
cb3e0c620a
improve media key decoding logs
On my HS I observed 5 instances of keys with the following format:

* MXC bytes.
* A 0xFF byte.
* 4 bytes where the width and height are supposed to be, even though
  those fields are supposed to be 8 bytes in length.
* 3 consecutive 0xFF bytes. This means that the `content-type` and
  `content-disposition` sections both parse as the empty string, and
  there's an extra separator at the end too.
* Extra bytes, all of which were `image/png`.

The 4 bytes where the width and height are supposed to be were one of:

* 003ED000
* 003EE000
* 003EF001

Which seems to have some kind of pattern to it...

After much digging, we have absolutely no idea what could've caused
this. Cursed.
2024-09-19 15:23:20 -07:00
Charles Hall
d848e787d3
ignore files that were probably never created
File data is inserted into the database before being created on disk,
which means that it's possible for data to exist in the database that
doesn't exist on disk. In this case, the media deletion functions should
simply ignore this error.
2024-09-19 12:29:51 -07:00
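
One way to express "ignore this error" using only the standard library. The helper name is hypothetical; the actual media deletion code is not shown in this log.

```rust
use std::{fs, io, path::Path};

/// Delete a file, treating "it was never there" as success.
fn remove_file_if_exists(path: &Path) -> io::Result<()> {
    match fs::remove_file(path) {
        // The database entry may exist without a file on disk, so a
        // missing file is not an error for deletion purposes.
        Err(error) if error.kind() == io::ErrorKind::NotFound => Ok(()),
        // Success and all other errors (permissions, etc.) pass through.
        other => other,
    }
}
```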
Lambda
ca6bc74074 Fix X-Matrix signature validation for incoming requests
For HTTP/1 requests, an inbound Request's URI contains only the path and
query parameters, since there's no way to synthesize the authority part.
This is exactly what we need for the X-Matrix "uri" field.

HTTP/2 requests however can contain the :authority pseudo-header, which
is used to populate the Request's URI. Using a URL that includes an
authority breaks the signature check.

Largely inspired by conduit MR !631
(https://gitlab.com/famedly/conduit/-/merge_requests/631).

Co-authored-by: strawberry <strawberry@puppygock.gay>
2024-09-19 16:25:23 +00:00
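
The idea of the fix, reduced to plain string handling: whatever form the request target arrives in, only the path and query should be fed into the signature check. This is a simplified sketch with a hypothetical helper name; real code would more likely use the `http` crate's `Uri::path_and_query()`, and edge cases such as a query with no path are not handled here.

```rust
/// Reduce a request target to its path-and-query part.
fn path_and_query(target: &str) -> &str {
    // Origin-form (typical for HTTP/1): already just path and query.
    if target.starts_with('/') {
        return target;
    }
    // Absolute-form (e.g. built from HTTP/2's :authority): skip the
    // scheme and authority, keep everything from the first '/' onward.
    let Some(scheme_end) = target.find("://") else {
        return target;
    };
    let after_scheme = &target[scheme_end + 3..];
    match after_scheme.find('/') {
        Some(path_start) => &after_scheme[path_start..],
        None => "/",
    }
}
```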
Lambda
0d6a7eb968 Disable unauthenticated media access 2024-09-18 20:33:28 +00:00
Charles Hall
b9ee898920
require client base_url, rename from authority
The previous code used `server_name` as a fallback but in reality there
is no real relationship between `server_name` and the location clients
are supposed to make requests to.

Additionally, the `insecure` option is gone, because we now allow users
to control the entire URL, so they're free to choose the scheme.
2024-09-18 13:03:49 -07:00
Benjamin Lee
48850605b0
changelog entry for media deletion admin commands 2024-09-17 19:31:54 -07:00
Benjamin Lee
ba7b224c38
add dry-run mode to delete-remote-media-files admin command 2024-09-17 19:31:54 -07:00
Benjamin Lee
9d14c5d461
add admin command to delete all remote media files 2024-09-17 19:31:51 -07:00
Benjamin Lee
d7087c66bb
add admin command to delete individual media files 2024-09-17 19:13:54 -07:00
Benjamin Lee
7672cc8473
use OwnedMxcUri in media service
Not using `MxcData` because it borrows its fields, and so we wouldn't
be able to return an owned `MxcData` from functions that read the db.
2024-09-15 00:32:17 -07:00
Benjamin Lee
e2cba15ed2
factor out helper for parsing media keys
Leaving this private in `database::key_value::media` because the way
the metadata is encoded in media keys is a mess. I want to fix that in
the future, and want to limit the number of things that rely on it for
now.
2024-09-15 00:32:17 -07:00
Lambda
3bb4a25c1d Include old verify keys in _matrix/key/v2/server response 2024-09-13 17:02:30 +00:00
Lambda
296824fef4 Always use local keypair instead of trying to find our own keys in cache 2024-09-13 17:02:30 +00:00
Lambda
458a7458dc Support specifying old_verify_keys in config 2024-09-13 17:02:30 +00:00
Lambda
5691cf0868 Better debugging for signing key fetching 2024-09-13 13:31:04 +00:00
Charles Hall
9e6a5e6604
update changelog 2024-09-08 14:08:32 -07:00
Charles Hall
449c27642c
hide sliding sync behind explicit option
We want to make sure users know this sliding sync impl is pretty buggy
before they attempt to use it.
2024-09-08 14:08:32 -07:00
Charles Hall
806cc0cb28
serve well-known client and server config
This way users can have a simpler time configuring this stuff, and we can
worry about the spec compliance parts and the parts that involve
specifying the same thing over and over.
2024-09-08 13:35:38 -07:00