This is useful to easily distinguish missing files from corrupted keys.
All existing usage sites have been modified so there is no behavior
change in this commit.
Our current code should never write these, because we have an allowlist
of thumbnail sizes. None of the allowed sizes contain a 0xFF byte. We
have observed keys with a 0xFF in the thumbnail size a couple times on
real servers, and believe an early version of conduit wrote these before
the allowlist was added.
These keys were originally handled correctly, and were broken by
e2cba15ed2. Before that commit, we were
parsing media keys backwards, and never tried to read the thumbnail size
or mxc url.
Fixes a regression in e2cba15ed2 where the
Content-Type and Content-Disposition parts are extracted in the wrong
order.
Fixes a long-standing issue in b6d721374f
where the Content-Type part was allowed to be completely missing rather
than present and 0 bytes long.
Improves the error messages for various parsing failures to be unique
and more obvious.
On my HS I observed 5 instances of keys with the following format:
* MXC bytes.
* A 0xFF byte.
* 4 bytes where the width and height are supposed to be, which are
supposed to be 8 bytes in length.
* 3 consecutive 0xFF bytes. This means that the `content-type` and
`content-disposition` sections both parse as the empty string, and
there's an extra separator at the end too.
* Extra bytes, all of which were `image/png`.
The 4 bytes where the width and height are supposed to be were one of:
* 003ED000
* 003EE000
* 003EF001
Which seems to have some kind of pattern to it...
After much digging, we have absolutely no idea what could've caused
this. Cursed.
Leaving this private in `database::key_value::media` because the way
the metadata is encoded in media keys is a mess. I want to fix that in
the future, and want to limit the number of things that rely on it for
now.
Previously, we only fetched keys once, only requesting them again if we have any missing, allowing for ancient keys to be used to sign PDUs and transactions
Now we refresh keys that either have or are about to expire, preventing attacks that make use of leaked private keys of a homeserver
We also ensure that when validating PDUs or transactions, that they are valid at the origin_server_ts or time of us receiving the transaction respectfully
As to not break event authorization for old rooms, we need to keep old keys around
We move verify_keys which we no longer see in direct requests to the origin to old_verify_keys
We keep old_verify_keys indefinitely as mentioned above, as to not break event authorization (at least until a future MSC addresses this)
Original patch by Matthias. Benjamin just rebased it onto grapevine and
fixed clippy/rustc warnings.
Co-authored-by: Benjamin Lee <benjamin@computer.surgery>
This ensures that the tokenization algorithm will remain in sync between
querying, indexing, and deindexing. The existing code had slightly
different behavior for querying, because it did not discard words with
>50 bytes. This was inconsequential, because >50 byte tokens are never
present in the index.
Unfortunately we need to pull tracing-opentelemetry from git because
there hasn't been a release including the dependency bump on the other
opentelemetry crates.
This change is fully automated, except the `rustfmt.toml` changes and
a few clippy directives to allow specific functions with too many lines
because they are longer now.