Submitted by SecureKeys' Baha Shaaban, edited by DIF's Chris Kelly.
In short, ECDH-1PU is a key derivation process that allows for sender authenticity and enables a “Perfect Forward Secrecy” mechanism, in addition to significant performance gains over a JWS message nested in a JWE envelope, as used by existing ECDH-ES approaches. This article walks through how ECDH-ES works step by step, showing how it achieves sender identity authentication using nested messages (JWS in JWE), and finally showing how ECDH-1PU is a better choice for authenticating the sender. This helps maintain the use of a single JWE message (without JWS) to meet the need for constrained agents (such as IoT devices) by reducing their communication footprint.
Elliptic-curve Diffie–Hellman (ECDH) is a protocol that allows two parties to establish a secure and private channel over an insecure and observed network. It forms the basis for the secure encryption in many popular messaging apps, including Facebook Messenger, WhatsApp, Signal and Skype, as well as DIDComm V1, which is being superseded by a more broadly interoperable V2. So-called “Diffie-Hellman” key exchanges ensure that messages sent over the channel they create can only be correctly interpreted by the sender and intended recipient, regardless of eavesdropping or security breaches on the communications channel.
As described in the Message Encryption section of the DIDComm v2 specification, the DIDComm protocol for transmitting Decentralized IDentifiers (DIDs) requires protecting a message using either Anonymous Encryption (aka `Anoncrypt`) or Sender Authenticated Encryption (aka `Authcrypt`). Each of these methods takes different inputs and creates a secure channel between two or more parties.
In the “AnonCrypt” handshake, there is no pre-existing “sender key” involved and it is intended for only the recipients of the message. This mechanism requires encryption with a Content Encryption Key (a “cek”) being wrapped with a key agreement mechanism using `ECDH-ES` for each recipient. DIDComm uses this “key-wrapping mode” with ECDH-ES to ensure only the intended recipients can decrypt the final message. The benefit of using `ECDH-ES` is that it's widely used and available in many crypto libraries for most modern languages. Building JWE envelopes using this type of encryption should be relatively easy using existing JOSE libraries in your preferred language.
“AuthCrypt”, however, requires a pre-existing (and published/discoverable) sender key to encrypt the message for the sender, and against which the sender can be authenticated. This encryption mechanism uses the ECDH-1PU specification from IETF, the main topic of this article. As a growing number of independent implementations get announced, 1PU advances on the Standards Track at IETF and achieves maturity as a specification; for this reason, we are offering this educational resource so that understanding of its mechanisms can grow as well.
The all-important “Z” in ECDH-ES, first step in understanding ECDH-1PU
To understand how ECDH-1PU is significant, knowledge about the internals of ECDH-ES is required. ECDH-ES key agreement requires the sender to execute the following steps for each recipient to derive the key used to wrap the `cek`:
1. Generate an ephemeral key (aka `epk`).
2. Build `apu`, the producer (sender) identity. For Anoncrypt, this will represent the X value of `epk`, base64URL-encoded.
3. Build `apv`, the receiver (recipient) identity. It can optionally contain the recipient `kid`, base64URL-encoded.
4. Compute `Z`: the key derivation process output of ECDH for each recipient using the above values with the **private** `epk` key and the recipient’s **public** key on the sender side. An example is found in appendix C of the IETF RFC7518. On their end, the recipient will get the **public** `epk` and therefore does the same computation with their own **private** key.
5. Finally, the computed derived key is used to wrap the `cek`, the symmetric key used to encrypt/decrypt the payload content (the `ciphertext` section) of the JWE envelope.
So, to be clear, ECDH-ES takes as mandatory inputs an ephemeral key generated by the sender and the recipient's public key (a static long-lived key) to compute the key derivation `Z`; for this reason, the ES notation means Ephemeral-Static in ECDH-ES. This derivation neatly fits the requirement to protect messages for recipients without revealing the sender's identity (i.e., no static sender key is used in the key derivation process when the recipient derives `Z`).
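The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses X25519 as the curve, a textbook (not constant-time) Montgomery ladder transcribed from RFC 7748 so that the example needs nothing beyond the standard library, and the Concat KDF layout from RFC 7518 section 4.6.2. The recipient `kid` and the helper names (`x25519`, `concat_kdf`) are hypothetical, and production code should rely on a vetted JOSE or crypto library instead.

```python
import hashlib
import os

P = 2**255 - 19                      # Curve25519 field prime
A24 = 121665                         # ladder constant from RFC 7748
BASE_POINT = (9).to_bytes(32, "little")

def x25519(scalar: bytes, u_coord: bytes) -> bytes:
    """Textbook X25519 ladder (illustration only, NOT constant-time)."""
    k = int.from_bytes(scalar, "little")
    k = (k & ~7 & ((1 << 254) - 1)) | (1 << 254)      # clamp the scalar
    x1 = int.from_bytes(u_coord, "little") & ((1 << 255) - 1)
    x2, z2, x3, z3, swap = 1, 0, x1, 1, 0
    for t in reversed(range(255)):
        bit = (k >> t) & 1
        if swap ^ bit:
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = bit
        a, b = (x2 + z2) % P, (x2 - z2) % P
        c, d = (x3 + z3) % P, (x3 - z3) % P
        aa, bb = a * a % P, b * b % P
        e = (aa - bb) % P
        da, cb = d * a % P, c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3, z2, z3 = x3, x2, z3, z2
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")

def concat_kdf(z: bytes, alg: str, apu: bytes, apv: bytes, key_bits: int) -> bytes:
    """Concat KDF with SHA-256, using the OtherInfo layout of RFC 7518 4.6.2."""
    def lp(data: bytes) -> bytes:                     # 32-bit big-endian length prefix
        return len(data).to_bytes(4, "big") + data
    other_info = lp(alg.encode()) + lp(apu) + lp(apv) + key_bits.to_bytes(4, "big")
    out, counter = b"", 1
    while len(out) * 8 < key_bits:
        out += hashlib.sha256(counter.to_bytes(4, "big") + z + other_info).digest()
        counter += 1
    return out[: key_bits // 8]

# Recipient's static (long-lived) key pair.
recipient_priv = os.urandom(32)
recipient_pub = x25519(recipient_priv, BASE_POINT)

# Step 1: the sender generates an ephemeral key pair (epk).
epk_priv = os.urandom(32)
epk_pub = x25519(epk_priv, BASE_POINT)

# Steps 2-3: apu/apv as raw bytes (hypothetical values for this sketch).
apu = epk_pub
apv = b"did:example:bob#key-1"

# Step 4: Z on the sender side (private epk + recipient public key) ...
z_sender = x25519(epk_priv, recipient_pub)
# ... and on the recipient side (public epk + recipient private key).
z_recipient = x25519(recipient_priv, epk_pub)

# Step 5: both sides derive the same key-wrapping key for the cek.
kek_sender = concat_kdf(z_sender, "ECDH-ES+A256KW", apu, apv, 256)
kek_recipient = concat_kdf(z_recipient, "ECDH-ES+A256KW", apu, apv, 256)
```

Note that nothing in this derivation depends on a sender-owned static key, which is why the recipient learns nothing about who sent the message.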
Key derivation beyond ECDH-ES to enable AuthCrypt
In the previous section, key derivation using an ephemeral key does not reveal who sent the message. This is useful for messages requiring anonymity of the author, e.g. a router agent receiving a message does not need to authenticate the sender in any way; its only purpose is to route the message to an end recipient. For this router agent, passing along Anoncrypt messages is acceptable.
In most cases, an end recipient requires authenticating the original sender. This means recipients will need to hold the sender's public key prior to receiving their messages in order to authenticate them. Since ECDH-ES does not involve the sender key, the only way to authenticate a sender is to nest a JWS in a JWE message, which is “heavier” than a plain JWE-only message, i.e., more complex and therefore more expensive to process and route.
Another, newer, option would be to use a new key derivation process that involves the sender's key. ECDH-1PU was introduced for this specific purpose; it uses the sender's static key in the key derivation process. The following section is dedicated to this process.
The Advantages of using ECDH-1PU (adapted from the DIDComm v2 Introduction)
The advantages of public key authenticated encryption with ECDH-1PU compared to using nested, signed-then-encrypted documents include:
Size and Efficiency
The resulting message size is more compact, as an additional layer of headers and base64url-encoding is avoided. A 500-byte payload when encrypted and authenticated with ECDH-1PU (with P-256 keys and "A256GCM" Content Encryption Method) results in a 1087-byte JWE in Compact Encoding. An equivalent nested, signed-then-encrypted JOSE message using the same keys and encryption method is 1489 bytes (37% larger).
In both cases, though, the same cryptographic primitives achieve the same levels of confidentiality and authenticity, so these savings in message size, so crucial for constrained environments, come at no cost to privacy and security outcomes.
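To get a feel for where the extra bytes come from, the short sketch below checks the arithmetic behind the 37% figure and shows how a second layer of base64url-encoding alone inflates a 500-byte payload. It is an illustration only: it ignores the headers, signature, and authentication tag that a real JWS or JWE also carries, so its lengths are not meant to reproduce the 1087 and 1489 byte totals quoted above.

```python
import base64
import os

def b64url(data: bytes) -> bytes:
    """base64url without padding, as used by JOSE."""
    return base64.urlsafe_b64encode(data).rstrip(b"=")

# Overhead reported above: nested JWS-in-JWE versus a single ECDH-1PU JWE.
overhead = (1489 - 1087) / 1087

payload = os.urandom(500)
encoded_once = b64url(payload)        # one layer: a plain JWE
encoded_twice = b64url(encoded_once)  # two layers: a JWS re-encoded inside a JWE

print(f"reported overhead: {overhead:.1%}")
print(f"one layer: {len(encoded_once)} chars, two layers: {len(encoded_twice)} chars")
```

Each base64url layer grows its input by roughly a third, which is why avoiding the nested layer pays off on constrained links.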
Increased Security
The generic composition of signatures and public key encryption involves a number of subtle details that are essential to security (namely, to the traits of Public Key Authenticated Encryption or PKAE). Providing a dedicated algorithm for public key authenticated encryption reduces complexity for users of JOSE libraries, which lowers the incidence of human error and design flaws with cybersecurity implications.
Flexibility
ECDH-1PU provides only authenticity and not the stronger security properties of non-repudiation or third-party verifiability. This can be an advantage in applications where privacy, anonymity, or plausible deniability are goals.
ECDH-1PU for sender authentication
Similar to ECDH-ES, the 1PU process executes key derivation to compute a `Z`, but performs two computations rather than one, with the final result being formed by concatenating both as described here (adapted from the Key Derivation section of RFC7518):
1. The first is called `Ze`, which is the exact same key derivation as ECDH-ES using the private ephemeral key (`epk`) and the public recipient key on the sender side. (The recipient side will involve the public `epk` and the private recipient key).
2. The second is called `Zs`; in this second computation, we use the sender's static (long-lived) key instead of `epk`. So on the sender side we derive `Zs` by using the sender's private key and the recipient's public key. (The recipient side will use the sender's public key and the recipient's private key on their end).
The final `Z` is the concatenation of `Ze` and `Zs` which is then used in the key-wrapping, analogously to ECDH-ES.
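A minimal sketch of the two agreements and their concatenation follows. As an assumption for illustration it uses X25519 with a textbook (not constant-time) Montgomery ladder transcribed from RFC 7748, so that it runs with the standard library alone; the function names `derive_z_sender` and `derive_z_recipient` are hypothetical, and real implementations should use a vetted crypto library.

```python
import os

P = 2**255 - 19                      # Curve25519 field prime
A24 = 121665                         # ladder constant from RFC 7748
BASE_POINT = (9).to_bytes(32, "little")

def x25519(scalar: bytes, u_coord: bytes) -> bytes:
    """Textbook X25519 ladder (illustration only, NOT constant-time)."""
    k = int.from_bytes(scalar, "little")
    k = (k & ~7 & ((1 << 254) - 1)) | (1 << 254)      # clamp the scalar
    x1 = int.from_bytes(u_coord, "little") & ((1 << 255) - 1)
    x2, z2, x3, z3, swap = 1, 0, x1, 1, 0
    for t in reversed(range(255)):
        bit = (k >> t) & 1
        if swap ^ bit:
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = bit
        a, b = (x2 + z2) % P, (x2 - z2) % P
        c, d = (x3 + z3) % P, (x3 - z3) % P
        aa, bb = a * a % P, b * b % P
        e = (aa - bb) % P
        da, cb = d * a % P, c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3, z2, z3 = x3, x2, z3, z2
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")

def derive_z_sender(epk_priv: bytes, sender_priv: bytes, recipient_pub: bytes) -> bytes:
    ze = x25519(epk_priv, recipient_pub)       # ephemeral-static, as in ECDH-ES
    zs = x25519(sender_priv, recipient_pub)    # static-static, binds the sender
    return ze + zs                             # Z = Ze || Zs

def derive_z_recipient(recipient_priv: bytes, epk_pub: bytes, sender_pub: bytes) -> bytes:
    ze = x25519(recipient_priv, epk_pub)
    zs = x25519(recipient_priv, sender_pub)
    return ze + zs

# Static key pairs of sender and recipient, plus the sender's ephemeral pair.
sender_priv, recipient_priv, epk_priv = os.urandom(32), os.urandom(32), os.urandom(32)
sender_pub = x25519(sender_priv, BASE_POINT)
recipient_pub = x25519(recipient_priv, BASE_POINT)
epk_pub = x25519(epk_priv, BASE_POINT)

z_sender = derive_z_sender(epk_priv, sender_priv, recipient_pub)
z_recipient = derive_z_recipient(recipient_priv, epk_pub, sender_pub)
```

Because `Zs` can only be computed with the sender's private key or the recipient's private key, a recipient who arrives at the same `Z` has evidence that the holder of the sender's static key took part in the exchange.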
There are special considerations in the process to protect against sender impersonation as described in Section 2.1 of the draft:
- In Key Agreement with Key Wrapping mode, the JWE Authentication Tag is included in the input to the Key Derivation Function as described in Section 2.3. This ensures that the content of the JWE was produced by the original sender and not by another recipient, as described in the Key Management Algorithms Section of the RFC.
- Key Agreement with Key Wrapping mode MUST only be used with content encryption algorithms that are compactly committing AEADs as described in the Authenticated Encryption with Associated Data (AEAD) specification.
- The AES_CBC_HMAC_SHA2 algorithms described in section 5.2 of RFC7518 are compactly committing and can be used with ECDH-1PU in Key Agreement with Key Wrapping mode. Other content encryption algorithms MUST be rejected.
- In Direct Key Agreement mode, any JWE content encryption algorithm MAY be used. This mode is NOT supported in DIDComm V2.
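These rules lend themselves to a small header check before any key derivation takes place. The sketch below is one hypothetical way to express them; the function name `check_1pu_header` and the `didcomm_v2` flag are our own, and the `alg` identifiers are the ones we understand the ECDH-1PU draft to register.

```python
# Key Agreement with Key Wrapping variants of ECDH-1PU (per the draft).
KEY_WRAP_ALGS = {"ECDH-1PU+A128KW", "ECDH-1PU+A192KW", "ECDH-1PU+A256KW"}
DIRECT_ALG = "ECDH-1PU"

# The AES_CBC_HMAC_SHA2 family from section 5.2 of RFC7518.
COMMITTING_ENCS = {"A128CBC-HS256", "A192CBC-HS384", "A256CBC-HS512"}

def check_1pu_header(alg: str, enc: str, didcomm_v2: bool = True) -> None:
    """Raise ValueError if the alg/enc pair breaks the rules listed above."""
    if alg in KEY_WRAP_ALGS:
        if enc not in COMMITTING_ENCS:
            raise ValueError(f"{enc} must be rejected in key wrapping mode")
    elif alg == DIRECT_ALG:
        if didcomm_v2:
            raise ValueError("Direct Key Agreement is not supported in DIDComm V2")
        # Outside DIDComm, any JWE content encryption algorithm may be used.
    else:
        raise ValueError(f"{alg} is not an ECDH-1PU algorithm")

check_1pu_header("ECDH-1PU+A256KW", "A256CBC-HS512")    # accepted
try:
    check_1pu_header("ECDH-1PU+A256KW", "A256GCM")      # not compactly committing
except ValueError as err:
    print("rejected:", err)
```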
The requirement to include the JWE Authentication Tag in the input to the Key Derivation Function implies an adjustment to the order of operations performed during JWE Message Encryption described in section 5.1 of [RFC7516]. Steps 3-8 are deferred until after step 15, using the randomly generated CEK from step 2 for encryption of the message content.
To sum up, these considerations require:
1. The use of the `AES_CBC_HMAC_SHA2` family of content encryption algorithms to encrypt the payload. Currently, JWE supports the following three algorithms in this family:
1. A128CBC-HS256
2. A192CBC-HS384
3. A256CBC-HS512
Note, however, that the DIDComm v2 specification constrains payload encryption options to minimize interoperability issues across implementations, so only the third-listed encryption algorithm, A256CBC-HS512, should be used for DIDComm v2 purposes.
2. Encrypt the payload prior to wrapping `cek` with the key derived from `Z`. The outputs are labelled `ciphertext` and `tag`.
3. Use the resulting `tag` from the previous step, encoded as `len(tag)`+`tag`, as the `cctag` value fed into the key derivation function alongside `Ze` and `Zs`.
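The reordering can be sketched as follows. This is a hedged illustration, not a full JWE implementation: the Python standard library has no AES, so random bytes stand in for the AES-CBC ciphertext and for `Z`, and the final AES key wrap of the `cek` is left out. The tag computation follows the A256CBC-HS512 layout of RFC7518 section 5.2, and the sketch assumes, per our reading of the draft, that `cctag` is appended after `keydatalen` in the KDF's SuppPubInfo field. All names and identifier values are hypothetical.

```python
import hashlib
import hmac
import os

def length_prefixed(data: bytes) -> bytes:
    """32-bit big-endian length followed by the data, i.e. len(x) + x."""
    return len(data).to_bytes(4, "big") + data

def a256cbc_hs512_tag(cek: bytes, aad: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Authentication tag: HMAC-SHA-512 truncated to 32 bytes (RFC7518 5.2)."""
    mac_key = cek[:32]                                   # first half of the 64-byte cek
    aad_bit_length = (len(aad) * 8).to_bytes(8, "big")   # "AL" in RFC7518
    mac = hmac.new(mac_key, aad + iv + ciphertext + aad_bit_length, hashlib.sha512)
    return mac.digest()[:32]

def derive_kek_1pu(z: bytes, alg: str, apu: bytes, apv: bytes,
                   tag: bytes, key_bits: int = 256) -> bytes:
    """Concat KDF (SHA-256) over Z = Ze || Zs with cctag mixed in."""
    cctag = length_prefixed(tag)
    supp_pub_info = key_bits.to_bytes(4, "big") + cctag  # assumed placement of cctag
    other_info = (length_prefixed(alg.encode()) + length_prefixed(apu)
                  + length_prefixed(apv) + supp_pub_info)
    out, counter = b"", 1
    while len(out) * 8 < key_bits:
        out += hashlib.sha256(counter.to_bytes(4, "big") + z + other_info).digest()
        counter += 1
    return out[: key_bits // 8]

# 1. Generate the cek and encrypt the payload FIRST.
cek = os.urandom(64)                       # A256CBC-HS512 uses a 512-bit cek
iv = os.urandom(16)
aad = b"protected-header-placeholder"      # stand-in for the encoded protected header
ciphertext = os.urandom(48)                # stand-in for the AES-CBC output
tag = a256cbc_hs512_tag(cek, aad, iv, ciphertext)

# 2. Only now derive the key-wrapping key, binding it to the tag.
z = os.urandom(64)                         # stand-in for Ze || Zs
kek = derive_kek_1pu(z, "ECDH-1PU+A256KW",
                     b"did:example:alice#key-1", b"placeholder-apv", tag)
# 3. kek would then wrap cek for this recipient (AES key wrap, not shown).
```

Because the tag feeds the derivation, a recipient who re-encrypts different content under the same `cek` ends up with a different tag, and therefore cannot produce a wrapped key that other recipients would accept as coming from the original sender.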
Additionally, the `skid` protected header is also introduced as a `kid` (key ID) to reference the sender key. This will help recipients resolve the key behind `skid` and execute the ECDH-1PU process explained in the previous section.
For the sake of consistency, `apu` and `apv` must be set to the values mentioned in section 5.8 of the DIDComm Messaging protocol to further restrain and protect the message.
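As an illustration, the sketch below assembles the sender-related protected headers for an Authcrypt message. It assumes the DIDComm v2 convention, as we understand it, that `apu` is the base64url-encoded `skid` and `apv` is the base64url-encoded SHA-256 hash of the recipients' `kid` values, sorted and joined with `.`; check the specification before relying on this. The key IDs and the `authcrypt_header` helper are hypothetical, and the `epk` header is omitted for brevity.

```python
import base64
import hashlib
import json

def b64url(data: bytes) -> str:
    """base64url without padding, as used by JOSE."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def authcrypt_header(skid: str, recipient_kids: list) -> dict:
    """Sender-related protected headers (epk intentionally left out)."""
    joined_kids = ".".join(sorted(recipient_kids))       # order-independent
    apv_digest = hashlib.sha256(joined_kids.encode()).digest()
    return {
        "alg": "ECDH-1PU+A256KW",
        "enc": "A256CBC-HS512",
        "skid": skid,                      # lets recipients resolve the sender key
        "apu": b64url(skid.encode()),
        "apv": b64url(apv_digest),
    }

header = authcrypt_header(
    "did:example:alice#key-1",
    ["did:example:carol#key-2", "did:example:bob#key-1"],
)
print(json.dumps(header, indent=2))
```

A recipient can decode `apu`, compare it with `skid`, and recompute `apv` from the recipient list before running the ECDH-1PU derivation.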
Conclusion
ECDH-1PU is a public-key derivation process that allows for sender authentication and offers not only increased security, but also performance gains as mentioned above, especially when compared to a JWS message nested in a JWE envelope. This article walked through the implementation of ECDH-ES, showing how it achieves sender identity authentication using nested messages (JWS in JWE), and finally showing how ECDH-1PU is a better choice for authenticating the sender. This helps maintain the use of a single JWE message (without JWS) to meet the need for constrained agents (such as IoT devices) by reducing their communication footprint, and makes ECDH-1PU suitable where anonymity and privacy are of particular concern.
To find out more about joining the DIDComm Working Group at DIF, visit their page here, or follow their work on GitHub here.