Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] MSC3814: Dehydrated devices with SSSS #3814
base: main
Are you sure you want to change the base?
[WIP] MSC3814: Dehydrated devices with SSSS #3814
Changes from 14 commits
80243a4
ed2c5eb
703281e
0a149c5
a4e87a6
3827bc0
12acd43
f756db3
6223db4
e3c9ac8
7f24f0d
f85c18d
4954c27
087154a
e7c8266
cf5ae99
d751d33
11149e4
1500897
5742c52
a58288a
21a3d67
6be9078
ec17903
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This property isn't defined anywhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can say something like: the server is allowed to discard any non-
m.room.encrypted
to-device message that it receives for the dehydrated device. There's no point in keeping key requests sent to the dehydrated device because it won't send anything back.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why include a
device_id
if you can only have a single dehydrated device? It has implications that you can provide more than one (and causes additional error checking of whether the provided device ID matches the dehydrated device ID).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the device ID is redundant. Though since this MSC has been written a new use-case for this endpoint has been found.
In the sliding sync world we have split out the fetching of to-device events into a separate sync loop. Namely one of the biggest problems of the existing
/sync
mechanism is that you get too much data all at once and the downloading and processing of that data prevents the UI from being updated.To-device events are one of those things that are not directly related to the things that a client will want to display in a room or room list, so putting it into a separate sync loop allows the main loop to quickly send updates while to-device moves along in the background. More info here: matrix-org/matrix-rust-sdk#1928
I think that old sync could handle such a split as well, so I would suggest here to rename the endpoint to become
/sync/to_device/{device_id}
wheredevice_id
might be optional and used only in the case of a dehydrated device.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for the
device_id
parameter is that, while one client is fetching events, another client could create a new dehydrated device. Without thedevice_id
parameter, the server could think that the client wants to fetch the events for the new device which, if there are any, it won't be able to decrypt since it's for a device that it doesn't know about. With thedevice_id
parameter, the server will at least be able to say that there are no more events (since the device has been replaced by a new one).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds quite racy to me -- how does the server know that one dehydrated device is claimed? How would the client know to make a new one instead of claim the old one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's OK for multiple clients to rehydrate the same device (unlike in the previous proposal), because it never becomes a real device. So the server can just wait until some client fetches all the events before dropping the device.
Making a new device and rehydrating an old one are two different use cases. Rehydration happens after you log in, and you're setting up encryption and trying to get keys. It only happens once in the device's lifetime. Creating a new dehydrated device would happen after you've already set up your encryption and already attempted to rehydrate a device.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a POST and not a GET like /sync and /messages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, the rationale was because the call has side-effects (deleting the device).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit weird that it doesn't follow the pattern of /messages, /events or /sync imo. I'll try implementing it as a GET without the device deletion first and see how that works out, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A GET endpoint with side-effects seems like a big no-no to me. Everyone expects a GET request to have approximately zero side-effects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, but we're also proposing removing the side-effects? SGTM in that case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, the current implementation no longer automatically deletes the device on the server side, but relies on the client to delete/create a new device. So we're going to try to make this a GET.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not what https://spec.matrix.org/v1.9/appendices/#pagination says should happen. I'd like to see this changed before this stabilises.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original reason why this endpoint used an empty
events
array to signal the end was that the client using the finalnext_batch
token would signal to the server that it could delete the to-device messages. Since we aren't doing that any more, we can make it work like the appendix says it should.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably specify what happens when we use
POST /delete_devices
andDELETE /devices/{deviceId}
on the dehydrated device. AlsoPOST /logout/all
. Presumably those would delete the dehydrated device. So maybe we don't need our ownDELETE
endpoint here? Though this endpoint allows you to delete the dehydrated device without knowing the device ID, so might still be useful.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we only need one fallback key. Normal clients store two fallback keys because they upload a new fallback key after the current fallback key is used. But with dehydrated devices, we will never upload a new fallback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise, I don't think we need this because we won't upload new OTKs after we upload the dehydrated device. It also makes assumptions about the structure of the OTK IDs, which may not be true for all olm implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need these, because the key ID doesn't get used when decrypting events, and any OTKs that are included in the dehydrated device will surely have been published -- any unpublished OTKs should just be omitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if I'm reading the iOS implementation correctly, the key is encoded with unpadded base64 (as is done with the other keys in secret storage)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we doing this [replacing the dehydrated device periodically] or not?
It seems like both have serious downsides. If we do replace it, we have a very racy operation that is certain to cause UTDs in practice. If we don't replace it, then we'll end up with no remaining OTKs at all, and an incredibly long list of to-device messages all of which have to be downloaded and decrypted by any new clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this sill works with v2? can we still sync on the dehydrated device?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't sync as a different device in this proposal. You can fetch the events for that device, but this proposal implicitly deletes the device in that case, which means you can't keep the device alive after that. So imo your only option is to replace it (which is somewhat easy to do, but you might need to authenticate the new signature upload/device?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section should contain a discussion of malicious dehydrated devices injected by the server.
The MSC in its current form is almost purely focused on the mechanics of creation and rehydration of dehydrated devices. There is very little discussion of how these devices should be treated by message senders.
And yet, these are not ordinary devices, as evidenced by the fact that we are proposing special UI/UX for them in the section concerning the
dehydrated
flag. The obvious concern is that dehydrated devices could end up being hidden or obscured from the user's view—even more so than ordinary devices—and therefore bear an even greater risk of malicious injection.The long-term plan is to move to a model where a user's devices must be signed by the user's cryptographic identity in order to be considered valid (see MSC4153) . Given that context, and the fact that dehydrated devices are a completely new feature, I strongly recommend that this MSC should require that a dehydrated device MUST be signed by a pinned (TOFU-trusted) user identity in order to be considered valid. If the dehydrated device is not signed, or is signed by a user identity which is not the one that is currently pinned by the client, the dehydrated device MUST be ignored by senders as if it it does no exist. That is, clients MUST NOT send any to-device messages to such a device nor accept any to-device messages from it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dehydrated devices shouldn't be sending to-device messages, so it's probably safe to say that we should not accept any to-device messages from any devices marked as dehydrated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Client implementation: https://gitlab.com/famedly/company/frontend/famedlysdk/-/merge_requests/1111
Server implementation: matrix-org/synapse#13581
Both not merged yet and notably missing is the dehydrated device format.