Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC3757: Restricting who can overwrite a state event #3757

Open
wants to merge 39 commits into
base: main
Choose a base branch
from
Open
Changes from 1 commit
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
ff5fd48
Add 'Restricting who can overwrite a state event'
andybalaam Mar 25, 2022
610f244
Rename MSC3757 to its proper number
andybalaam Mar 25, 2022
bfde329
Clarify wording of comment
andybalaam Mar 31, 2022
344e876
Add unstable room version
andybalaam Mar 31, 2022
ccb7e52
Note that this requires a new room version
andybalaam Mar 31, 2022
1ce7e0e
Refer to MSC3760 as an alternative
andybalaam Mar 31, 2022
6df6109
Re-word link to MSC3760
andybalaam Mar 31, 2022
6e108b3
Update to m.self to match latest thinking on MSC3672
andybalaam Apr 26, 2022
bd4176f
Remove the user of 'unprivileged'
andybalaam May 11, 2022
e352a1d
Add a note about allowed characters
andybalaam May 11, 2022
5e95ff3
Reword proposed auth rule
AndrewFerr Sep 4, 2024
68dc97f
Nitpick: always use formatted text for state_key
AndrewFerr Sep 11, 2024
17890fd
Nitpick: remove trailing whitespace
AndrewFerr Sep 11, 2024
f962bf3
Change recommended room versions to apply on
AndrewFerr Sep 11, 2024
dd9b33e
Mention that _ can't be in any form of server name
AndrewFerr Sep 24, 2024
eb0eed6
Add issue of incompatibility with long MXIDs
AndrewFerr Sep 24, 2024
ac24510
Add issue of underscores in domain names
AndrewFerr Sep 24, 2024
9490cbd
Fix typo
AndrewFerr Sep 24, 2024
486b0cd
Use device ID suffix in location beacon example
AndrewFerr Sep 26, 2024
590ff96
Increase state key size limit & set suffix limit
AndrewFerr Sep 26, 2024
d9b149d
Keep original size limit on unprefixed state keys
AndrewFerr Sep 27, 2024
ae17437
Move paragraph to alternative section
AndrewFerr Sep 27, 2024
8222738
Add alternative of state key arrays
AndrewFerr Sep 27, 2024
63955d7
Add alternative of field for non-state events
AndrewFerr Sep 27, 2024
07d784a
Clarify state subkey/array relevance to user IDs
AndrewFerr Sep 27, 2024
99698ef
Fix formatting of auth rule's numeric list
AndrewFerr Sep 27, 2024
9f4f31a
Rephrase the current restrictions on state events
AndrewFerr Oct 7, 2024
75f03da
Better explain limitations of current restrictions
AndrewFerr Oct 7, 2024
e833e8a
Reword and reformat
AndrewFerr Oct 7, 2024
a4b40b5
Remove redundant explanation for separating with _
AndrewFerr Oct 7, 2024
a0da59b
Scope proposal to a future room version
AndrewFerr Oct 7, 2024
5855a7f
Elaborate on multi-component state key alternative
AndrewFerr Oct 8, 2024
deba3b8
Add sub-headers to alternatives section
AndrewFerr Oct 8, 2024
8090f69
Propose sender-scoped state with ownership flag
AndrewFerr Oct 8, 2024
3a0d095
Fix typo
AndrewFerr Oct 8, 2024
e16482a
Mention impact of sender-scoped state on servers
AndrewFerr Oct 8, 2024
1ddddb6
Say how the ownership flag impacts administration
AndrewFerr Oct 9, 2024
fd87b8a
Fix contradictions for flag alternative
AndrewFerr Oct 10, 2024
2a77266
Change intro example from locations to MatrixRTC
AndrewFerr Dec 19, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Nitpick: always use formatted text for state_key
  • Loading branch information
AndrewFerr committed Sep 11, 2024
commit 68dc97f807ca824ae2e7cc60d1478dfa1820c73c
18 changes: 9 additions & 9 deletions proposals/3757-restricting-who-can-overwrite-a-state-event.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ so we need a mechanism to prevent other peers from doing so.

[MSC3489](https://github.com/matrix-org/matrix-spec-proposals/pull/3489) originally proposed that the event type could be made variable,
appending an ID to each separately posted event so that each one could
separately be locked to the same mxid in the state_key. However, this is
separately be locked to the same mxid in the `state_key`. However, this is
problematic because you can't proactively refer to these event types in the
`events` field of the `m.room.power_levels` event to allow users to post
them - and they also are awkward for some client implementations to
Expand All @@ -27,12 +27,12 @@ manipulate.
## Proposal
andybalaam marked this conversation as resolved.
Show resolved Hide resolved

Therefore, we need a different way to state that a given state event may only
be written by its owner. **We propose that if a state event's state_key *starts with* a matrix ID (followed by an underscore), only the sender with that matrix ID (or higher PL users) can set the state event.** This is an extension of the current behaviour where state events may be overwritten only if the sender's mxid *exactly equals* the state_key.
be written by its owner. **We propose that if a state event's `state_key` *starts with* a matrix ID (followed by an underscore), only the sender with that matrix ID (or higher PL users) can set the state event.** This is an extension of the current behaviour where state events may be overwritten only if the sender's mxid *exactly equals* the `state_key`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it valid to have an empty string as the suffix? E.g @kegan:matrix.org_? What about having arbitrary unicode characters? Null bytes? Please let's be sensible and force sensible restrictions on this new user-defined variable, lest we end up in another hell of poor validation causing problems for clients/servers.

I would suggest a very strong restriction of [a-z0-9] unless there are good reasons to allow other characters (there almost certainly are not).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allowing an empty suffix probably won't hurt. I can imagine cases of the suffix being a property that may take on an empty/nullish value.

For a device ID to be usable as a suffix, the suffix charset must include all characters that can be used as a device ID. Unfortunately, there's no specced device ID charset that I can find, and in practice it is quite broad:

  • Device IDs generated by QR code logins (at least on Element X Android) can contain / and +.
  • /_matrix/client/v3/login allows setting a custom device ID of any string, and the Synapse implementation allows all sorts of characters (and lets me set a device ID of a b ?!@#$%^&*()-=_+/).

So if we want device ID suffixes now, maybe defining a suffix charset is premature, unless this MSC also defines a stricter device ID charset. But that raises the issue of what to with existing device IDs that don't follow the charset...

Copy link
Member

@kegsay kegsay Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is surely too literal though. You can map device IDs deterministically to a valid character set, at its most basic SHA256(device_id).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea. I was worried that the input size of device IDs would be too large to safely hash, but the point is to avoid hash collisions, the chance of which should still be small.

Then, this MSC can do away with trying to define a suffix length / charset that can fit a raw device ID.

I will still propose a few non-word characters to be in the suffix charset, because it may be handy to have a suffix containing multiple properties that are easy to separate with a non-word character (eg. @user:hs_prop1_prop2_prop3). The spec already uses a charset of [0-9a-zA-Z.=_-] for email login secrets/tokens, and would be a decent charset to use here as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, a handy consequence of not restricting the suffix charset is the ability to prefix any existing state key with a user's MXID to scope its ownership to that user. With a restricted suffix charset, there may be some state keys that would be invalid as a suffix.

IMO as long as the charset of state keys in general is left unrestricted, there's little benefit in restricting the suffix charset, unless doing so is a step towards restricting the general state key charset.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that there's no point in restricting these specific state keys when all other state keys are still arbitrary strings. To take advantage of this MSC, you must give users permission to send state events of certain types. That means users will be able to send unprefixed state keys too, which will not have any character set restrictions.

Restricting state keys in general might be a good idea to do in combination with MSC2828

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree with the attitude of "we don't have good validation, so let's not add validation". There's zero reason to punt this down the line to another MSC. It's absolutely trivial to expand the character set or length limits in another MSC. It's very difficult to restrict it once there are client implementations in the wild relying on there being no restrictions. Validation is important to reduce the attack surface of any newly added features. The suffix string will be stored in new places where the state key is not (e.g DBs will likely either have this as a column, or indexed in such a way to allow efficient ordered lookups), which means there will be new code written to read this input. Validate the input, please.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the intent of a restricted suffix charset is to ease parsing out the user ID prefix, then the usefulness of the restriction might be limited by how the prefix must be able to contain any printable ASCII character, due to having to support historical user IDs. Also, since the parsing is concerned more with the user ID prefix than with the suffix, locking down the suffix charset might not impact much.

Besides, the "suffix string" isn't really meant to be a new kind of syntax for state keys, but is just a result of the semantic of user-scoped state keys via string packing. It's perhaps better to think of a scoped state key not as a user ID + a suffix, but as an ordinary state key prefixed by a user ID; or rather, that this MSC proposes allowing state keys to optionally be prefixed by a user ID. Being prefixed doesn't change the nature of the content of the rest of the state key, but applying different restrictions to prefixed & unprefixed state keys would imply that it does.

Though if restricting the charset is non-negotiable, then maybe a compromise is to apply a broad charset, like all printable ASCII characters.


We also allow users with higher PL than the original sender to overwrite state
events even if their mxid doesn't match the event's state_key. This fixes an abuse
events even if their mxid doesn't match the event's `state_key`. This fixes an abuse
vector where a user can immutably graffiti the state within a room
by sending state events whose state_key is their matrix ID.
by sending state events whose `state_key` is their matrix ID.

Practically speaking, this means modifying the [authorization rules](https://spec.matrix.org/v1.2/rooms/v9/#authorization-rules) such that rule 8:

Expand Down Expand Up @@ -83,9 +83,9 @@ for each unique event.
An earlier draft of this MSC proposed putting a flag on the contents of the
event (outside of the E2EE payload) called `m.peer_unwritable: true` to indicate
if other users were prohibited from overwriting the event or not. However, this
unravelled when it became clear that there wasn't a good value for the state_key,
unravelled when it became clear that there wasn't a good value for the `state_key`,
which needs to be unique and not subject to races from other malicious users.
By scoping who can set the state_key to be the mxid of the sender, this problem
By scoping who can set the `state_key` to be the mxid of the sender, this problem
goes away.

[MSC3760](https://github.com/matrix-org/matrix-spec-proposals/pull/3760)
Expand All @@ -101,14 +101,14 @@ introduce an attack on state resolution. For instance: if a user had higher
PL at some point in the past, will they be able to abuse somehow this to
overwrite the state event, despite not being its owner?

When using a state_key prefix to restrict who can write the event, we have
When using a `state_key` prefix to restrict who can write the event, we have
deliberately chosen an underscore to terminate the mxid prefix, as underscores
are not allowed in domain names. A pure prefix match will **not** be sufficient,
as `@matthew:matrix.org` will match a state_key of form `@matthew:matrix.org.evil.com:id1`.
as `@matthew:matrix.org` will match a `state_key` of form `@matthew:matrix.org.evil.com:id1`.

This changes auth rules in a backwards incompatible way, which will break any
use cases which assume that higher PL users cannot overwrite state events whose
state_key is a different mxid. This is considered a feature rather than a bug,
`state_key` is a different mxid. This is considered a feature rather than a bug,
fixing an abuse vector where users could send arbitrary state events
which could never be overwritten.

andybalaam marked this conversation as resolved.
Show resolved Hide resolved
Expand Down