-
Notifications
You must be signed in to change notification settings - Fork 385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSC4048: Authenticated key backup #4048
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a sensible idea. Just have a few questions.
proposals/4048-signed-key-backup.md
Outdated
The `AuthData` object for the [`m.megolm_backup.v1.curve25519-aes-sha2` key | ||
backup | ||
algorithm](https://spec.matrix.org/unstable/client-server-api/#backup-algorithm-mmegolm_backupv1curve25519-aes-sha2) | ||
has a new optional property called `signing_public_key`, contains the public |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the rationale behind using a new key for backup signing instead of reusing the master signing key or device key? Is it to make it possible to revoke the backup signing key without affecting any other keys?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In several MSC we are seeing introduction of new keys (tofu, constraint membership), looks like we are going to soon have tons of such keys. It's mainly annoying because adding a new key requires to update the exisiting 4S in order to update the account recovery backup, and this will require users to enter their passphrase on upgrade.
Could we try to go with scoped signatures?
As per:
Given a scope named my-scope
, a signature of a content would mean to take the canonical json of content then append my-scope
to that string then sign it with an existing key (e.g MSK).
scoped_signature = sign(canonical_json_content + "my-scope", alice_msk).
And the signature should be uploaded with the ed25519. scheme:
"signatures": {
"@alice:example.com": {
"ed25519.myscope:base64+master+signing+public+key": "base64+scoped+signature+of+content"
}
},
The scope is added in the string to sign so that the home server cannot switch signature from another scope (or no scope) of the same content.
In this case I suppose that using the scoped signature by the SSK makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi there! GitHub likes to notify me whenever someone uses my name with an @ in front of it as an example. I'll untag myself from this, but just a heads up :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a comment in the "Alternatives" section
Co-authored-by: Andrew Ferrazzutti <[email protected]>
proposals/4048-signed-key-backup.md
Outdated
Rather than using a new signing key, we could use an existing signing key, such | ||
as one of the cross-signing keys. This would remove the need for users to | ||
enter their Secret Storage key to add the new signing key. However, this means | ||
that a user cannot create a key backup without also using cross-signing. Using | ||
a separate key also allows the user to give someone else (such as a bot) | ||
permission to write to their backups without allowing them to perform any | ||
cross-signing operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@poljar, @davidegirardi, and myself discussed another alternative that avoids adding a new cryptographically-independent key while also avoiding all of the downsides mentioned above.
We propose deriving the backup signing key from the existing backup encryption key using a KDF. This retains the ability to have write-only bots, by sharing the derived signing key without sharing the backup recovery key. It also avoids having to set up cross-signing.
The derivation step could look something like this:
signing_key = HKDF("", backup_recovery_secret, "BACKUP_SIGNING_KEY", 32)
Where backup_recovery_secret
are the raw secret bytes of the backup recovery key, prior to clamping and other modifications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you meant:
signing_key = HKDF("", decryption_key, "BACKUP_SIGNING_KEY", 32)
Where the decryption_key
is the Backup recovery key before it was clamped into a Curve25519
key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then there's no reason to even keep using signatures as the auth method, since we could treat the derived key as a symmetric one and use the more efficient HMAC instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could treat the derived key as a symmetric one and use the more efficient HMAC instead.
Do you mean to reinstate the outer MAC, or to use the signing key to produce a MAC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean to reinstate the outer MAC, or to use the signing key to produce a MAC?
The MAC has to be on the outside so that we don't even touch the data before we authenticate it. The "signing" key is then no longer a signing key, but a symmetric key used for HMAC:
hmac_key = HKDF("", decryption_key, "BACKUP_AUTH_KEY", 32)
(The mechanics of it are the same, but the intention is different.)
If we'd like to avoid changing the backup algorithm version, we could always name the property differently to avoid collision with the old MAC property. But given it's likely we'll need a new backup algorithm version for other reasons, I'm not sure it makes sense to try to avoid it.
I think you meant:
signing_key = HKDF("", decryption_key, "BACKUP_SIGNING_KEY", 32)
Where the
decryption_key
is the Backup recovery key before it was clamped into aCurve25519
key.
Right, I did. I mixed up the specifics of that but that's indeed the right solution. I'll edit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT, this will still require us to put the signing/MAC key in SSSS (so that devices that can't read can still write), but it allows getting both the decryption and signing/MAC key from a single key.
How would we give access to just the MAC key in the SSSS to such devices?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would we give access to just the MAC key in the SSSS to such devices?
In theory, Secret Storage can encrypt different sets of secrets with different keys, though in practice all clients just use the default key.
However, if the secrets are being shared from one device to another, it can share the MAC key without sharing the backup decryption key.
Hmm. Maybe we don't have to store the MAC key, but just share it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think it's pretty simple. If the Secret Storage key has encrypted the backup key, then we don't need to encrypt the MAC key, as we can just derive it. If the Secret Storage key has not encrypted the backup key, but we want it to encrypt the MAC key, then we can store it using the name m.key_backup.mac
. The MAC key can also be shared using the same name. I'll update the MSC to say something about that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue we never want to store m.key_backup.mac
in SSSS, to simplify reasoning. Clients which have access to the SSSS should be expected to have full (write and read) access to the key backup anyway.
Sharing m.key_backup.mac
using the Secret Sharing API makes sense though because it gives us an opportunity to support write-only clients.
In theory, Secret Storage can encrypt different sets of secrets with different keys, though in practice all clients just use the default key.
I would avoid introducing this, so that we don't needlessly increase complexity and make everything harder to reason about. I don't think we even have any kind of key ID there, so a client would have to try decrypting with each SSSS key it has in succession? Edit: Correction, we have exactly that, but clients don't typically use it, instead only ever using the default_key
. So I still maintain it's better to not pull in this relatively complex and untested functionality just now. (In other news, I now know what the term default_key
mentioned the other day was referring to. TIL.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you may be right about simplifying reasoning. I'll leave it in for now, but will probably remove it later.
proposals/4048-signed-key-backup.md
Outdated
- a `signatures` property: the `SessionData` is a [signed JSON | ||
object](https://spec.matrix.org/unstable/appendices/#signing-json), signed | ||
using the backup signing key, using the public key (encoded in unpadded | ||
base64) as the key ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that SessionData
is the plaintext, it sounds like we would have to decrypt the backup in order to find out whether the signature is correct. Given that we are also removing the outer MAC, that would leave us operating on unauthenticated cryptographic data, which is a no-no.
As an aside, I think this MSC would benefit from a "Mechanics" section describing concrete steps in a "You take THIS to perform THAT" kind of manner rather than leaving it implicit from the description of the components.
proposals/4048-signed-key-backup.md
Outdated
Rather than using a new signing key, we could use an existing signing key, such | ||
as one of the cross-signing keys. This would remove the need for users to | ||
enter their Secret Storage key to add the new signing key. However, this means | ||
that a user cannot create a key backup without also using cross-signing. Using | ||
a separate key also allows the user to give someone else (such as a bot) | ||
permission to write to their backups without allowing them to perform any | ||
cross-signing operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to add, the primary advantage here is that the auth mechanism should protect the outer ciphertext rather than the plaintext, to prevent chosen-ciphertext attacks. It being a MAC rather than a signature is just a more elegant approach and a better-fitting tool for the job, but both would work.
Will give it another read.
proposals/4048-signed-key-backup.md
Outdated
Rather than using a new signing key, we could use an existing signing key, such | ||
as one of the cross-signing keys. This would remove the need for users to | ||
enter their Secret Storage key to add the new signing key. However, this means | ||
that a user cannot create a key backup without also using cross-signing. Using | ||
a separate key also allows the user to give someone else (such as a bot) | ||
permission to write to their backups without allowing them to perform any | ||
cross-signing operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT, this will still require us to put the signing/MAC key in SSSS (so that devices that can't read can still write), but it allows getting both the decryption and signing/MAC key from a single key.
How would we give access to just the MAC key in the SSSS to such devices?
proposals/4048-signed-key-backup.md
Outdated
The following changes are made to the cleartext `session_data` property of the | ||
`KeyBackupData` object is deprecated: | ||
|
||
- a new `mac2` [FIXME: get a better name. suggestions?] property is added, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd go for backup_mac
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that we may have to change the backup algorithm name for other reasons. If we're doing that, we might get away with reusing the MAC field name.
Co-authored-by: Denis Kasak <[email protected]>
|
||
The backup MAC key can be shared/stored using [the Secrets | ||
module](https://spec.matrix.org/unstable/client-server-api/#secrets) using the | ||
name `m.megolm_backup.v1.mac`. Note that if the backup decryption key (the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a bit more detail on that, and maybe the use case?
Something like:
It is possible in the protocol to only make part of the secrets available to certain devices. With that mecanism it's possible to have a session that is not authorised to access history (m.megolm_backup.v1
is not shared), but that yet remains able to participate in the backup for new keys (needs then only the m.megolm_backup.v1.mac
). It would also allow clients to not cache the backup decryption key, but still participate in the backup.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also #4048 (comment) for some more nuance on this.
TL;DR, I would:
- Only store the backup decryption key in SSSS. That is, never store the MAC key.
- To be able to support write-only clients, the MAC key could be shared using Secret Sharing.
- The MAC key would therefore only ever be derived or shared directly via to-device.
Avoiding the storage of the MAC key will lower complexity and ease reasoning, because we will avoid hard-to-debug situations, such as the SSSS containing only the MAC key, and therefore all clients being able to write to the backup but none of them being able to read it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it feels weird to store an intermediate value (the MAC key) in SSSS.
proposals/4048-signed-key-backup.md
Outdated
`m.megolm_backup.v1.curve25519-aes-sha2` key backup algorithm. | ||
|
||
The following changes are made to the cleartext `session_data` property of the | ||
`KeyBackupData` object is deprecated: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice that the server side is using some metadata of KeyBackupData
to decide what to do when the same key is uploaded: keep or replace existing one.
The new meta data added to the session_data
should be considered by this algorithm.
Co-authored-by: Denis Kasak <[email protected]>
proposals/4048-signed-key-backup.md
Outdated
`KeyBackupData` object: | ||
|
||
- a new `mac2` [FIXME: get a better name. suggestions?] property is added, | ||
which is a MAC of the `SessionData` ciphertext (prior to base64-encoding), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the MAC cover the ephemeral
property too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why we would want to leave it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I'm not sure that it's strictly necessary, but not much reason no to, and better to do so just in case. I think the only reasons for leaving it out are: the original MAC left it out, and if we include it, we need to decide on a format. But for the format, I think we can do the same thing as Signed JSON.
requires the decryption key for the backup. In addition, the deniability | ||
property mainly refers to the fact that a recipient cannot prove the | ||
authenticity of the message to a third party, and usually is not concerned with | ||
preventing self-incrimination. And in fact, a confiscated device may already |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Just because I created a megolm session doesn’t mean that I was the one who encrypted the messages in it, as megolm is symmetric? So proving I own the creation of a key doesn’t achieve much in terms of deniability aiui; a given message could have been fabricated by the other party? (at least until you try to send a msg with the same ratchet key - but i guess the same would be true if the megolm session was entirely fabricated, in terms of happening at the wrong place relative to other megolm sessions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Megolm session has a signing key that only the creator knows the private part. So while anyone can encrypt a message with the Megolm session, they won't be able to produce a correct signature, so the message won't be validated.
- define new HMAC method that covers other fields - indicate key source for unauthenticated keys - define migration
backup. However this removes the ability for a client to be able to write to | ||
the backup without being able to read from it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This MSC could really do with explaining why the ability to write to the backup but not read from it it is useful.
Something something bots, I gather?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not just bots, but also as a prerequisite component for less privileged clients which don't have the ability to read all of history while still being able to participate in a room.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do any such clients or bots exist currently?
I feel like a bot or client writing but not reading from backup really should be an implementation requirement of this MSC to prove that it is actually useful to keep the ability to write but not read from the backup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure whether they exist at the moment. But retaining the ability to have clients of reduced power (rather than requiring that they are all of equal, maximal power) seems like an obviously useful property.
Can you expand on how an example implementation would prove this further than a thought experiment? Or why the benefits are not obvious from the thought experiment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not obvious to me that a non-maximally-powerful client is useful. Maybe I have not been privy to some discussions about such potential use-cases?
Even the bot case that @richvdh mentioned doesn't really make sense to me. Why would I provide the bot access to my account?
I also don't know of any other chat network which has the concept of less-privileged clients (so as far as I know we are not copying prior art). It feels like such a client would be very confusing to end users.
I'm just a bit wary of doing things because it's a "useful property" when the property has not actually been shown to be useful (think: the overcomplicated megolm ratchet which has a mechanism for skipping by multiples of
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this MSC is a good opportunity to fix the issues with the existing backups (namely that the MAC is useless due to a libolm bug matrix-org/matrix-spec#1712). There doesn't seem to be any reason to maintain backwards compatibility, and instead we should just hard-fork the backups (the old backups can be for old keys and new keys can go into the new backup).
I'm also still not entirely convinced that an asymmetric backup is the way to go, but I'm open to being shown examples of that being a useful feature.
|
||
The backup MAC key can be shared/stored using [the Secrets | ||
module](https://spec.matrix.org/unstable/client-server-api/#secrets) using the | ||
name `m.megolm_backup.v1.mac`. Note that if the backup decryption key (the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it feels weird to store an intermediate value (the MAC key) in SSSS.
3. Using the shared secret, generate 80 bytes by performing an HKDF using | ||
SHA-256 as the hash, with a salt of 32 bytes of 0, and with the empty string | ||
as the info. The first 32 bytes are used as the AES key, the next 32 bytes | ||
are discarded, and the last 16 bytes are used as the AES initialization | ||
vector. (This is the same as the key generation for | ||
`m.megolm_backup.v1.curve25519-aes-sha2`, except that the generated MAC key | ||
is discarded since it is unused.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason to derive 80 bytes when we are discarding 32 of them? It seems it's just for compatibility with the (broken) m.megolm_backup.v1.curve25519-aes-sha2
algorithm?
and `session_key` properties defined for | ||
`m.megolm_backup.v1.curve25519-aes-sha2`. | ||
|
||
### `m.megolm_backup.v1.curve25519-aes-sha2` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just want to clarify my understanding here. We are modifying the definition of the existing m.megolm_backup.v1.curve25519-aes-sha2
algorithm that is already specified here to make it compatible with m.backup.v2.curve25519-aes-sha2
?
Why? I don't see any reason for uploading new keys to an old backup.
5. Pass the raw encrypted data (prior to base64 encoding) through HMAC-SHA-256 | ||
using the MAC key generated above. The first 8 bytes of the resulting MAC | ||
are base64-encoded, and become the `mac` property of the `session_data`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not how the current algorithm works. See matrix-org/matrix-spec#1712
It passes an empty byte string instead of the raw encrypted data.
backup. However this removes the ability for a client to be able to write to | ||
the backup without being able to read from it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do any such clients or bots exist currently?
I feel like a bot or client writing but not reading from backup really should be an implementation requirement of this MSC to prove that it is actually useful to keep the ability to write but not read from the backup.
Rendered