Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC4048: Authenticated key backup #4048

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

uhoreg
Copy link
Member

@uhoreg uhoreg commented Aug 23, 2023

@uhoreg uhoreg changed the title MSCxxxx: Signed key backup MSC4048: Signed key backup Aug 23, 2023
@uhoreg uhoreg added e2e proposal A matrix spec change proposal client-server Client-Server API kind:feature MSC for not-core and not-maintenance stuff needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. labels Aug 23, 2023
Copy link
Member

@AndrewFerr AndrewFerr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a sensible idea. Just have a few questions.

proposals/4048-signed-key-backup.md Outdated Show resolved Hide resolved
proposals/4048-signed-key-backup.md Outdated Show resolved Hide resolved
proposals/4048-signed-key-backup.md Show resolved Hide resolved
proposals/4048-signed-key-backup.md Outdated Show resolved Hide resolved
proposals/4048-signed-key-backup.md Outdated Show resolved Hide resolved
The `AuthData` object for the [`m.megolm_backup.v1.curve25519-aes-sha2` key
backup
algorithm](https://spec.matrix.org/unstable/client-server-api/#backup-algorithm-mmegolm_backupv1curve25519-aes-sha2)
has a new optional property called `signing_public_key`, contains the public
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale behind using a new key for backup signing instead of reusing the master signing key or device key? Is it to make it possible to revoke the backup signing key without affecting any other keys?

Copy link
Member

@BillCarsonFr BillCarsonFr Aug 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In several MSC we are seeing introduction of new keys (tofu, constraint membership), looks like we are going to soon have tons of such keys. It's mainly annoying because adding a new key requires to update the exisiting 4S in order to update the account recovery backup, and this will require users to enter their passphrase on upgrade.

Could we try to go with scoped signatures?

As per:

Given a scope named my-scope, a signature of a content would mean to take the canonical json of content then append my-scope to that string then sign it with an existing key (e.g MSK).

scoped_signature = sign(canonical_json_content + "my-scope", alice_msk).

And the signature should be uploaded with the ed25519. scheme:

"signatures": {
    "@alice:example.com": {
        "ed25519.myscope:base64+master+signing+public+key": "base64+scoped+signature+of+content"
    }
},

The scope is added in the string to sign so that the home server cannot switch signature from another scope (or no scope) of the same content.

In this case I suppose that using the scoped signature by the SSK makes sense

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi there! GitHub likes to notify me whenever someone uses my name with an @ in front of it as an example. I'll untag myself from this, but just a heads up :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment in the "Alternatives" section

AndrewFerr
AndrewFerr previously approved these changes Sep 19, 2023
Comment on lines 82 to 88
Rather than using a new signing key, we could use an existing signing key, such
as one of the cross-signing keys. This would remove the need for users to
enter their Secret Storage key to add the new signing key. However, this means
that a user cannot create a key backup without also using cross-signing. Using
a separate key also allows the user to give someone else (such as a bot)
permission to write to their backups without allowing them to perform any
cross-signing operations.
Copy link
Member

@dkasak dkasak Sep 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@poljar, @davidegirardi, and myself discussed another alternative that avoids adding a new cryptographically-independent key while also avoiding all of the downsides mentioned above.

We propose deriving the backup signing key from the existing backup encryption key using a KDF. This retains the ability to have write-only bots, by sharing the derived signing key without sharing the backup recovery key. It also avoids having to set up cross-signing.

The derivation step could look something like this:

signing_key = HKDF("", backup_recovery_secret, "BACKUP_SIGNING_KEY", 32)

Where backup_recovery_secret are the raw secret bytes of the backup recovery key, prior to clamping and other modifications.

Copy link
Contributor

@poljar poljar Sep 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you meant:

signing_key = HKDF("", decryption_key, "BACKUP_SIGNING_KEY", 32)

Where the decryption_key is the Backup recovery key before it was clamped into a Curve25519 key.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then there's no reason to even keep using signatures as the auth method, since we could treat the derived key as a symmetric one and use the more efficient HMAC instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could treat the derived key as a symmetric one and use the more efficient HMAC instead.

Do you mean to reinstate the outer MAC, or to use the signing key to produce a MAC?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean to reinstate the outer MAC, or to use the signing key to produce a MAC?

The MAC has to be on the outside so that we don't even touch the data before we authenticate it. The "signing" key is then no longer a signing key, but a symmetric key used for HMAC:

hmac_key = HKDF("", decryption_key, "BACKUP_AUTH_KEY", 32)

(The mechanics of it are the same, but the intention is different.)

If we'd like to avoid changing the backup algorithm version, we could always name the property differently to avoid collision with the old MAC property. But given it's likely we'll need a new backup algorithm version for other reasons, I'm not sure it makes sense to try to avoid it.

I think you meant:

signing_key = HKDF("", decryption_key, "BACKUP_SIGNING_KEY", 32)

Where the decryption_key is the Backup recovery key before it was clamped into a Curve25519 key.

Right, I did. I mixed up the specifics of that but that's indeed the right solution. I'll edit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, this will still require us to put the signing/MAC key in SSSS (so that devices that can't read can still write), but it allows getting both the decryption and signing/MAC key from a single key.

How would we give access to just the MAC key in the SSSS to such devices?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would we give access to just the MAC key in the SSSS to such devices?

In theory, Secret Storage can encrypt different sets of secrets with different keys, though in practice all clients just use the default key.

However, if the secrets are being shared from one device to another, it can share the MAC key without sharing the backup decryption key.

Hmm. Maybe we don't have to store the MAC key, but just share it?

Copy link
Member Author

@uhoreg uhoreg Sep 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think it's pretty simple. If the Secret Storage key has encrypted the backup key, then we don't need to encrypt the MAC key, as we can just derive it. If the Secret Storage key has not encrypted the backup key, but we want it to encrypt the MAC key, then we can store it using the name m.key_backup.mac. The MAC key can also be shared using the same name. I'll update the MSC to say something about that.

Copy link
Member

@dkasak dkasak Sep 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue we never want to store m.key_backup.mac in SSSS, to simplify reasoning. Clients which have access to the SSSS should be expected to have full (write and read) access to the key backup anyway.

Sharing m.key_backup.mac using the Secret Sharing API makes sense though because it gives us an opportunity to support write-only clients.

In theory, Secret Storage can encrypt different sets of secrets with different keys, though in practice all clients just use the default key.

I would avoid introducing this, so that we don't needlessly increase complexity and make everything harder to reason about. I don't think we even have any kind of key ID there, so a client would have to try decrypting with each SSSS key it has in succession? Edit: Correction, we have exactly that, but clients don't typically use it, instead only ever using the default_key. So I still maintain it's better to not pull in this relatively complex and untested functionality just now. (In other news, I now know what the term default_key mentioned the other day was referring to. TIL.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you may be right about simplifying reasoning. I'll leave it in for now, but will probably remove it later.

Comment on lines 47 to 50
- a `signatures` property: the `SessionData` is a [signed JSON
object](https://spec.matrix.org/unstable/appendices/#signing-json), signed
using the backup signing key, using the public key (encoded in unpadded
base64) as the key ID
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that SessionData is the plaintext, it sounds like we would have to decrypt the backup in order to find out whether the signature is correct. Given that we are also removing the outer MAC, that would leave us operating on unauthenticated cryptographic data, which is a no-no.

As an aside, I think this MSC would benefit from a "Mechanics" section describing concrete steps in a "You take THIS to perform THAT" kind of manner rather than leaving it implicit from the description of the components.

@AndrewFerr AndrewFerr dismissed their stale review September 22, 2023 21:16

My review shouldn't be final

proposals/4048-signed-key-backup.md Outdated Show resolved Hide resolved
Comment on lines 82 to 88
Rather than using a new signing key, we could use an existing signing key, such
as one of the cross-signing keys. This would remove the need for users to
enter their Secret Storage key to add the new signing key. However, this means
that a user cannot create a key backup without also using cross-signing. Using
a separate key also allows the user to give someone else (such as a bot)
permission to write to their backups without allowing them to perform any
cross-signing operations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to add, the primary advantage here is that the auth mechanism should protect the outer ciphertext rather than the plaintext, to prevent chosen-ciphertext attacks. It being a MAC rather than a signature is just a more elegant approach and a better-fitting tool for the job, but both would work.

Will give it another read.

Comment on lines 82 to 88
Rather than using a new signing key, we could use an existing signing key, such
as one of the cross-signing keys. This would remove the need for users to
enter their Secret Storage key to add the new signing key. However, this means
that a user cannot create a key backup without also using cross-signing. Using
a separate key also allows the user to give someone else (such as a bot)
permission to write to their backups without allowing them to perform any
cross-signing operations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, this will still require us to put the signing/MAC key in SSSS (so that devices that can't read can still write), but it allows getting both the decryption and signing/MAC key from a single key.

How would we give access to just the MAC key in the SSSS to such devices?

proposals/4048-signed-key-backup.md Outdated Show resolved Hide resolved
The following changes are made to the cleartext `session_data` property of the
`KeyBackupData` object is deprecated:

- a new `mac2` [FIXME: get a better name. suggestions?] property is added,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd go for backup_mac.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we may have to change the backup algorithm name for other reasons. If we're doing that, we might get away with reusing the MAC field name.

proposals/4048-signed-key-backup.md Outdated Show resolved Hide resolved
@uhoreg uhoreg changed the title MSC4048: Signed key backup MSC4048: Authenticated key backup Sep 29, 2023

The backup MAC key can be shared/stored using [the Secrets
module](https://spec.matrix.org/unstable/client-server-api/#secrets) using the
name `m.megolm_backup.v1.mac`. Note that if the backup decryption key (the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a bit more detail on that, and maybe the use case?
Something like:
It is possible in the protocol to only make part of the secrets available to certain devices. With that mecanism it's possible to have a session that is not authorised to access history (m.megolm_backup.v1 is not shared), but that yet remains able to participate in the backup for new keys (needs then only the m.megolm_backup.v1.mac). It would also allow clients to not cache the backup decryption key, but still participate in the backup.

WDYT?

Copy link
Member

@dkasak dkasak Sep 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also #4048 (comment) for some more nuance on this.

TL;DR, I would:

  1. Only store the backup decryption key in SSSS. That is, never store the MAC key.
  2. To be able to support write-only clients, the MAC key could be shared using Secret Sharing.
  3. The MAC key would therefore only ever be derived or shared directly via to-device.

Avoiding the storage of the MAC key will lower complexity and ease reasoning, because we will avoid hard-to-debug situations, such as the SSSS containing only the MAC key, and therefore all clients being able to write to the backup but none of them being able to read it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it feels weird to store an intermediate value (the MAC key) in SSSS.

`m.megolm_backup.v1.curve25519-aes-sha2` key backup algorithm.

The following changes are made to the cleartext `session_data` property of the
`KeyBackupData` object is deprecated:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice that the server side is using some metadata of KeyBackupData to decide what to do when the same key is uploaded: keep or replace existing one.
The new meta data added to the session_data should be considered by this algorithm.

`KeyBackupData` object:

- a new `mac2` [FIXME: get a better name. suggestions?] property is added,
which is a MAC of the `SessionData` ciphertext (prior to base64-encoding),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the MAC cover the ephemeral property too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why we would want to leave it out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I'm not sure that it's strictly necessary, but not much reason no to, and better to do so just in case. I think the only reasons for leaving it out are: the original MAC left it out, and if we include it, we need to decide on a format. But for the format, I think we can do the same thing as Signed JSON.

requires the decryption key for the backup. In addition, the deniability
property mainly refers to the fact that a recipient cannot prove the
authenticity of the message to a third party, and usually is not concerned with
preventing self-incrimination. And in fact, a confiscated device may already
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Just because I created a megolm session doesn’t mean that I was the one who encrypted the messages in it, as megolm is symmetric? So proving I own the creation of a key doesn’t achieve much in terms of deniability aiui; a given message could have been fabricated by the other party? (at least until you try to send a msg with the same ratchet key - but i guess the same would be true if the megolm session was entirely fabricated, in terms of happening at the wrong place relative to other megolm sessions)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Megolm session has a signing key that only the creator knows the private part. So while anyone can encrypt a message with the Megolm session, they won't be able to produce a correct signature, so the message won't be validated.

- define new HMAC method that covers other fields
- indicate key source for unauthenticated keys
- define migration
Comment on lines +21 to +22
backup. However this removes the ability for a client to be able to write to
the backup without being able to read from it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This MSC could really do with explaining why the ability to write to the backup but not read from it it is useful.

Something something bots, I gather?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not just bots, but also as a prerequisite component for less privileged clients which don't have the ability to read all of history while still being able to participate in a room.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do any such clients or bots exist currently?

I feel like a bot or client writing but not reading from backup really should be an implementation requirement of this MSC to prove that it is actually useful to keep the ability to write but not read from the backup.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether they exist at the moment. But retaining the ability to have clients of reduced power (rather than requiring that they are all of equal, maximal power) seems like an obviously useful property.

Can you expand on how an example implementation would prove this further than a thought experiment? Or why the benefits are not obvious from the thought experiment?

Copy link
Contributor

@sumnerevans sumnerevans Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious to me that a non-maximally-powerful client is useful. Maybe I have not been privy to some discussions about such potential use-cases?

Even the bot case that @richvdh mentioned doesn't really make sense to me. Why would I provide the bot access to my account?

I also don't know of any other chat network which has the concept of less-privileged clients (so as far as I know we are not copying prior art). It feels like such a client would be very confusing to end users.

I'm just a bit wary of doing things because it's a "useful property" when the property has not actually been shown to be useful (think: the overcomplicated megolm ratchet which has a mechanism for skipping by multiples of $$2^n$$ but is not used for >100 messages in most cases rendering the complexity pointless).

Copy link
Contributor

@sumnerevans sumnerevans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this MSC is a good opportunity to fix the issues with the existing backups (namely that the MAC is useless due to a libolm bug matrix-org/matrix-spec#1712). There doesn't seem to be any reason to maintain backwards compatibility, and instead we should just hard-fork the backups (the old backups can be for old keys and new keys can go into the new backup).

I'm also still not entirely convinced that an asymmetric backup is the way to go, but I'm open to being shown examples of that being a useful feature.


The backup MAC key can be shared/stored using [the Secrets
module](https://spec.matrix.org/unstable/client-server-api/#secrets) using the
name `m.megolm_backup.v1.mac`. Note that if the backup decryption key (the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it feels weird to store an intermediate value (the MAC key) in SSSS.

Comment on lines +65 to +71
3. Using the shared secret, generate 80 bytes by performing an HKDF using
SHA-256 as the hash, with a salt of 32 bytes of 0, and with the empty string
as the info. The first 32 bytes are used as the AES key, the next 32 bytes
are discarded, and the last 16 bytes are used as the AES initialization
vector. (This is the same as the key generation for
`m.megolm_backup.v1.curve25519-aes-sha2`, except that the generated MAC key
is discarded since it is unused.)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to derive 80 bytes when we are discarding 32 of them? It seems it's just for compatibility with the (broken) m.megolm_backup.v1.curve25519-aes-sha2 algorithm?

and `session_key` properties defined for
`m.megolm_backup.v1.curve25519-aes-sha2`.

### `m.megolm_backup.v1.curve25519-aes-sha2`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want to clarify my understanding here. We are modifying the definition of the existing m.megolm_backup.v1.curve25519-aes-sha2 algorithm that is already specified here to make it compatible with m.backup.v2.curve25519-aes-sha2?

Why? I don't see any reason for uploading new keys to an old backup.

Comment on lines +152 to +154
5. Pass the raw encrypted data (prior to base64 encoding) through HMAC-SHA-256
using the MAC key generated above. The first 8 bytes of the resulting MAC
are base64-encoded, and become the `mac` property of the `session_data`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not how the current algorithm works. See matrix-org/matrix-spec#1712

It passes an empty byte string instead of the raw encrypted data.

Comment on lines +21 to +22
backup. However this removes the ability for a client to be able to write to
the backup without being able to read from it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do any such clients or bots exist currently?

I feel like a bot or client writing but not reading from backup really should be an implementation requirement of this MSC to prove that it is actually useful to keep the ability to write but not read from the backup.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client-server Client-Server API e2e kind:feature MSC for not-core and not-maintenance stuff needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. proposal A matrix spec change proposal
Projects
Status: Scheduled - v1.10
Development

Successfully merging this pull request may close these issues.

10 participants