Commit graph

2689 commits

Author SHA1 Message Date
Charles Hall
07d05fa1f9
add subcmd to repair some persistent state 2024-11-01 14:40:45 -07:00
Charles Hall
5be1e20eb4
call maximize_fd_limit at top of main
This way we don't shoot ourselves in the foot by forgetting to do it for
other subcommands (e.g. that manipulate the database) in the future.
2024-11-01 13:16:36 -07:00
Andreas Fuchs
9529d43a21 ChangeLog entry for check-config subcmd 2024-11-01 12:16:49 -04:00
Andreas Fuchs
dcf64f03fb Validate generated config file in the nixos module
This uses the usual pkgs.runCommand pattern to ensure that no
non-parseable config files can make it to the command line.
2024-11-01 12:10:01 -04:00
Andreas Fuchs
a02c551a5e Disallow any unknown fields in configuration files
This will break backwards compatibility of configurations, but
ensures that a previously-configured setting won't get dropped
arbitrarily. Pretty much worth it, I think.
2024-11-01 12:09:58 -04:00
Andreas Fuchs
26ba489aa3 Add a "check-config" command to validate config files & tests for it 2024-11-01 12:08:17 -04:00
Lambda
70ee206031 Extract source address for requests 2024-10-25 20:48:38 +00:00
Lambda
3247c64cd8 Add support for HAProxy proxy protocol for listeners 2024-10-25 20:48:38 +00:00
Lambda
99f3e2aecd Refactor server listener spawning 2024-10-25 20:47:04 +00:00
Charles Hall
86481fd651
make reload_handles optional for creating Services
This will be useful for instantiating services in CLI subcommands, which
have different requirements around observeability.
2024-10-25 11:27:11 -07:00
Charles Hall
b03c2a15b3
add observability infrastructure for cli subcmds 2024-10-25 11:27:11 -07:00
K900
b93c39ee93 fix: use non-alias name in nixos module 2024-10-24 14:46:44 +00:00
Charles Hall
ce7efc1eff
move lasttimelinecount_cache to service 2024-10-20 13:29:33 -07:00
Charles Hall
107f4521e0
move appservice_in_room_cache to service 2024-10-20 13:29:33 -07:00
Charles Hall
9d62865b28
move our_real_users_cache to service 2024-10-20 13:29:33 -07:00
Charles Hall
d3b62e598d
move shortstatekey_cache to service 2024-10-20 13:29:33 -07:00
Charles Hall
190b788683
move statekeyshort_cache to service 2024-10-20 13:29:33 -07:00
Charles Hall
2b2b4169df
move eventidshort_cache to service 2024-10-20 13:29:33 -07:00
Charles Hall
095ee483ac
move auth_chain_cache to service 2024-10-20 13:29:33 -07:00
Charles Hall
47502d1f36
move shorteventid_cache to service 2024-10-20 13:29:33 -07:00
Charles Hall
7563360bee
move pdu_cache to service 2024-10-20 13:29:32 -07:00
Charles Hall
fb534d8140
move userdevicesessionid_uiaarequest to service 2024-10-20 13:29:32 -07:00
Charles Hall
a1fe0f3fff
add service type for short
This will be necessary in the near future.
2024-10-20 13:29:32 -07:00
Charles Hall
e0cf163486
delete useless admin commands
To clear caches, restart the server. We may want to consider adding the
cache sizes and database memory usage as metrics in the future.
2024-10-20 13:29:28 -07:00
Lambda
6a44d0af2b Fix u8_slice_to_hex()
For bytes <0x10, this would omit the leading zero.
2024-10-20 18:50:05 +00:00
Andreas Fuchs
76a633cb66
Log failed remote resident server join requests 2024-10-13 19:26:35 -07:00
Andreas Fuchs
e001356653
Return local join error if all remote joins fail
If all join requests to resident servers fail or if the joining server
is the only resident server (i.e. the room is local-only), we would
previously send a 500 error, even if the more correct response would be
M_UNAUTHORIZED (e.g. if the user tries to join an invite-only room).

To fix this, we now return the error generated by attempting the join
locally, which correctly informs the client about why their request
failed.
2024-10-13 19:10:58 -07:00
Charles Hall
5a490a4397
fix mod/use order
Yes, it does actually bother me, thanks for asking.
2024-10-03 15:28:24 -07:00
mikoto
287f6b9163
refactor calculate_invite_state
That was terribly named and terribly implemented.

Co-authored-by: Charles Hall <charles@computer.surgery>
2024-10-03 10:52:07 -07:00
Lambda
e14b7f28f2
Implement federation self-test 2024-09-27 10:51:32 -07:00
Lambda
6022d56094
Use enums for options to send_request(), add allow_loopback 2024-09-27 10:48:12 -07:00
Lambda
94d523ebcb
Reload TLS config on SIGHUP 2024-09-27 09:51:17 -07:00
Lambda
39880cc6ac
Abstract over sd_notify 2024-09-27 09:50:51 -07:00
Charles Hall
6ab87f97dd
include traceresponse header if possible
This can help with debugging.
2024-09-26 19:01:15 -07:00
Benjamin Lee
9add9a1e96
fix room version comparisons
Fixes a set of bugs introduced by 00b77144c1,
where we replaced explicit `RoomVersionId` matches with `version < V11`
comparisons. The `Ord` impl on `RoomVersionId` does not work like that,
and is in fact a lexicographic string comparison[1]. The most visible
effect of these bugs is that incoming redaction events would sometimes
be ignored.

Instead of reverting to the explicit matches, which were quite verbose,
I implemented a `RoomVersion` struct that has flags for each property
that we care about. This is similar to the approach used by ruma[2] and
synapse[3].

[1]: 7cfa3be0c6/crates/ruma-common/src/identifiers/room_version_id.rs (L136)
[2]: 7cfa3be0c6/crates/ruma-state-res/src/room_version.rs
[3]: c856ae4724/synapse/api/room_versions.py
2024-09-26 13:01:25 -07:00
Charles Hall
ad37eae869
use OnceLock instead of RwLock for SERVICES
It actually has the semantics we need. Until we get rid of SERVICES.
2024-09-25 10:43:05 -07:00
Charles Hall
032e1ca3c6
hide global services jank in service module
Mainly to make it easier to initialize the SERVICES global correctly in
more than one place.

Also this stuff really shouldn't live at the crate root anyway.
2024-09-25 10:43:05 -07:00
Charles Hall
1fd20cdeba
factor server_name change check into a reusable fn 2024-09-25 10:43:05 -07:00
Charles Hall
c2c6083277
make load_or_create *only* load_or_create
Extracted the other logic to its current singular callsite for now.

The load_or_create function finally does nothing other than load or
create the database (and do some related checks, which is fine). This
paves the way for more/better database surgery tooling.
2024-09-25 10:39:46 -07:00
Charles Hall
e9caf228b3
move config check into config load function 2024-09-25 10:39:46 -07:00
Charles Hall
75ef57e0ce
remove config check
* Database load function is the wrong place for this
* There's no good lower bound to check for this
* Surely people setting this to a small value would realize what they're
  in for
2024-09-25 10:39:42 -07:00
Benjamin Lee
279c6472c5
split some logic out of KeyValueDatabase::load_or_create
This method did _a lot_ of things at the same time. In order to use
`KeyValueDatabase` for the migrate-db command, we need to be able to
open a db without attempting to apply all the migrations and without
spawning a bunch of unrelated background tasks.

The state after this refactor is still not great, but it's enough to do
a migration tool.
2024-09-24 20:57:57 -07:00
Benjamin Lee
059dfe54e3
factor out helper function for db migrations 2024-09-24 16:02:04 -07:00
Benjamin Lee
e2318cad8a
fix serving tls by setting rustls default crypto provider
The rustls version bump in c24f79b79b
introduced a panic when serving listeners with 'tls = true':

> thread 'main' panicked at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/rustls-0.23.13/src/crypto/mod.rs:265:14:
> no process-level CryptoProvider available -- call CryptoProvider::install_default() before this point

This commit fixes this by setting the default provider to ring. I chose
ring (the old rustls default) over aws-lc-rs (the new default) for a few
reasons:

 - Judging by github issues, aws-lc-rs seems to have a lot of build problems.
   We don't need more of that.
 - The "motivation" section in the aws-lc-rs docs only talks about FIPS,
   which we do not care about.
 - My past experience with things that start with "aws-" has been very
   negative.
2024-09-23 23:39:23 -07:00
Lambda
084d862e51
Allow configuring served components per listener 2024-09-23 16:43:52 -07:00
Lambda
d62d0e2f0e
Split routes into components 2024-09-23 16:43:52 -07:00
Charles Hall
b0d1cc1b63
bump otel to v0.24.0
Someone contributed opentelemetry-prometheus support for v0.24 and this
version also doesn't put stupid requirements on the tokio version. This
version of the OTel ecosystem also fixes an apparent bug with some hacks
I plan on doing in the future...
2024-09-23 14:22:55 -07:00
Charles Hall
c24f79b79b
update rust deps except rocksdb and otel clownery
* OTel v0.25.0 requires downgrading Tokio to 1.38 [0]
* They have a fix for this but aren't cutting a release just for release
  schedule reasons [1]
* Prometheus support (at least for server-pull) was dropped at OTel
  v0.23 and isn't planned to be picked up again until OTel v1 [2]
* No real reasoning was provided for this decision AFAICT [3] [4]
* So many compiler errors
* Unhelpful changelogs

The last two points are what made me give up on trying to upgrade to
OTel v0.24 too.

RocksDB isn't updated because we'd need to update our nixpkgs too but
that causes other problems, such as an upstream bug in liburing when
building for musl.

[0]: https://github.com/open-telemetry/opentelemetry-rust/issues/2094
[1]: https://github.com/open-telemetry/opentelemetry-rust/issues/2094#issuecomment-2346834030
[2]: https://docs.rs/opentelemetry-prometheus/0.17.0/opentelemetry_prometheus/index.html
[3]: https://github.com/open-telemetry/opentelemetry-rust/pull/1792
[4]: https://github.com/open-telemetry/opentelemetry-rust/pull/1792#issuecomment-2121514344
2024-09-23 14:22:55 -07:00
Benjamin Lee
c1bf4a8ee3
changelog entry for CLI compatibility break 2024-09-21 14:11:40 -07:00
Benjamin Lee
5315bac0c5
split out separate error type for serve command 2024-09-21 14:11:40 -07:00