diff mbox series

RFC: Cryptographic attestation for email-based patch workflows

Message ID 20190910121324.GA6867@pure.paranoia.local (mailing list archive)
State New, archived
Headers show
Series RFC: Cryptographic attestation for email-based patch workflows | expand

Commit Message

Konstantin Ryabitsev Sept. 10, 2019, 12:13 p.m. UTC
Hello, all:

This is a very "raw" idea that stems from a handful of conversations
that took place at the Kernel Summit. I wanted to pass it along to this
list in hopes that it can generate some workable ideas (or shot down and
allowed to die early).

# Problem

One of the recurring concerns raised by kernel developers is the fact
that email-based patch workflow offers no git-native mechanism of
cryptographic integrity attestation. In other words, the only mechanism
for someone to verify that patch contents have not been altered is
via PGP-signed email. For a slew of reasons, this is not a sufficiently
good solution:

- PGP support in mail clients continues to be sub-par
- Patch archival and management tools (like patchwork) remove easy
  ability to verify PGP signatures because they need to modify email
  bodies (but not patch content), e.g. to add Reviewed-By: or similar
  taglines
- Tools like git-am have no native support for verifying PGP signatures

# Proposed approach

I recommend that we provide a way to include cryptographic signature
information natively using git-format-patch, using roughly the following
process:

- generate a signify-compatible cryptographic signature of the verbatim
  patch content, perhaps slightly normalized for things like LF vs. CRLF
  line endings (see minisign/libsodium for crypto details)
- include both the signature and the public key in the area below '---',
  using "Minisig:" and "Minikey:" taglines

For example:

---8<---
From b41a2a0f817caddc9a76f43c3c9ed7d8edd6b2de Mon Sep 17 00:00:00 2001
From: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Date: Tue, 10 Sep 2019 06:15:36 -0400
Subject: [PATCH] Second commit

Change the greeting.

Signed-off-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
---
 foo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

 Minisig: RWT9fcUvSnHPLiqWgXEnn98sgk8nl4FteDRkD+9lVK+He//eLOxNZ5QjCROoKJgPGpL4uzoHicN+f6gB54qmtO1cQtfvjS+++QU=
 Minikey: RWT9fcUvSnHPLqqyfLbkGBMEscBWciFFp2iBj2XnZPzW69OVIoYwZ25q

--ᐞ
2.21.0
---8<---

When git-am encounters a signed patch, it should:

1. check if the email in From: matches existing entries in git's TOFU
   (trust on first use) database, which is a simple key-value store
   like:

   konstantin@linuxfoundation.org: RWT9fcUvSnHPLqqyfLbkGBMEscBWciFFp2iBj2XnZPzW69OVIoYwZ25q

2. if no matches, add a new entry to the TOFU tracking database and
   consider the key automatically trusted (perhaps configurable)
3. if there are existing matches:

   a. compare the keys to make sure they haven't changed
   b. if keys changed, emit a warning and let developer decide if they
      trust the key change
   c. if keys did not change, validate the signature
   d. if validation failed, alert the developer and error out

4. if the TOFU db exists at all, git-am should check if the email
   address in From: matches any existing records and alert if the patch
   carries no signature (in case it's been removed by a malicious
   attacker).

All of these operations should be sufficiently fast, since both ECC
crypto and key-value lookups are fast operations that don't require a
lot of resources.

# Why minisigs?

In my experience, the kinds of developers who submit patches to mailing
lists would consider PGP/GnuPG too cumbersome to bootstrap, which is why I
lean towards managing keys natively by git. In my mind, the process
would go like this:

- developer sends patches to the mailing list
- maintainer responds with "looks good, but please sign and resubmit
  by passing --minisign to git-format-patch"
- developer runs `git format-patch --minisign`, which walks them through
  generating the key and storing it in a dedicated file
- git can take care of passphrase handling by hooking into
  credential-helper and credential-cache routines

# Coupling with PGP

Communities relying on the PGP web of trust can tie minikeys with their
PGP identity by creating a UID entry containing their minisign public key,
e.g.:

pub   rsa4096/E63EDCA9329DD07E 2011-11-07 [SC]
      DE0E66E32F1FDD0902666B96E63EDCA9329DD07E
uid                 [ultimate] Konstantin Ryabitsev <konstantin@linuxfoundation.org>
uid                 [ultimate] Konstantin Ryabitsev <RWT9fcUvSnHPLqqyfLbkGBMEscBWciFFp2iBj2XnZPzW69OVIoYwZ25q>

Such UIDs can be revoked as necessary and new ones can be created --
plus they are searchable using standard gnupg/keyserver tools.

# Comments?

I'd love to hear your feedback on the idea. Even if this scheme is not
used by maintainers directly, it offers ways of verifying if patches
stored in public archives (such as public-inbox) have been modified and
provides some developer attestation of email-based workflows.

-K

Comments

Dave Huseby Sept. 27, 2019, 3:24 p.m. UTC | #1
On 10.09.2019 08:13, Konstantin Ryabitsev wrote:
># Proposed approach
>
>I recommend that we provide a way to include cryptographic signature
>information natively using git-format-patch, using roughly the following
>process:
>
>- generate a signify-compatible cryptographic signature of the verbatim
>  patch content, perhaps slightly normalized for things like LF vs. CRLF
>  line endings (see minisign/libsodium for crypto details)
>- include both the signature and the public key in the area below '---',
>  using "Minisig:" and "Minikey:" taglines

I like where you're heading with this suggestion however there are some
issues. It is not clear what bytes the signature was calculated over.
Does it include the "From:" line of the email? How about the
"Signed-off-by"? If there is no binding of the identity of the submitter
to the key pair then you'll have problems with the TOFU policy you
describe further down (explained later). Also, since we're trying to
move to a Git that supports signatures from multiple different signing
tools and to also support multi-sig sign-offs (e.g. first the author,
then the reviewer, then the merger) these taglines need to be more
compex. At the very least either there needs to be a signature type
tagline or the type of the signature needs to be baked into the key and
signture values (see Secure Scuttlebutt encoding of keys/sigs). Also, if
we want to do chained multisign there should be some framing of what is
signed by each signature. If you're looking for something email like, we
could borrow from mime email attachment encoding to provide framing.

>For example:
>
>---8<---
>From b41a2a0f817caddc9a76f43c3c9ed7d8edd6b2de Mon Sep 17 00:00:00 2001
>From: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
>Date: Tue, 10 Sep 2019 06:15:36 -0400
>Subject: [PATCH] Second commit
>
>Change the greeting.
>
>Signed-off-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
>---
> foo.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Minisig: RWT9fcUvSnHPLiqWgXEnn98sgk8nl4FteDRkD+9lVK+He//eLOxNZ5QjCROoKJgPGpL4uzoHicN+f6gB54qmtO1cQtfvjS+++QU=
> Minikey: RWT9fcUvSnHPLqqyfLbkGBMEscBWciFFp2iBj2XnZPzW69OVIoYwZ25q
>
>diff --git a/foo.c b/foo.c
>index d40a2b9..dcfad55 100644
>--- a/foo.c
>+++ b/foo.c
>@@ -1,5 +1,5 @@
> #include <stdio.h>
> int main()
> {
>-    printf("Hello, World!");
>+    printf("Hello, Signed World!");
> }
>--ᐞ
>2.21.0
>---8<---

I would instead have at the very least the signature tool in the value:

	Key: minisign|RWT9fcUvSnHPLqqyfLbkGBMEscBWciFFp2iBj2XnZPzW69OVIoYwZ25q
  Sig: minisign|RWT9fcUvSnHPLiqWgXEnn98sgk8nl4FteDRkD+9lVK+He//eLOxNZ5QjCROoKJgPGpL4uzoHicN+f6gB54qmtO1cQtfvjS+++QU=

But this doesn't solve multi-sig. 

>When git-am encounters a signed patch, it should:
>
>1. check if the email in From: matches existing entries in git's TOFU
>   (trust on first use) database, which is a simple key-value store
>   like:
>
>   konstantin@linuxfoundation.org: RWT9fcUvSnHPLqqyfLbkGBMEscBWciFFp2iBj2XnZPzW69OVIoYwZ25q
>
>2. if no matches, add a new entry to the TOFU tracking database and
>   consider the key automatically trusted (perhaps configurable)
>3. if there are existing matches:
>
>   a. compare the keys to make sure they haven't changed
>   b. if keys changed, emit a warning and let developer decide if they
>      trust the key change
>   c. if keys did not change, validate the signature
>   d. if validation failed, alert the developer and error out
>
>4. if the TOFU db exists at all, git-am should check if the email
>   address in From: matches any existing records and alert if the patch
>   carries no signature (in case it's been removed by a malicious
>   attacker).
>
>All of these operations should be sufficiently fast, since both ECC
>crypto and key-value lookups are fast operations that don't require a
>lot of resources.

TOFU has the problem of not providing cryptographic provenance over keys
while maintaining provenance on the binding between other identity
attributes and those keys (e.g. author string). In step 3 above there's
no way to know for sure that the submitter is actually who they claim to
be. Reviewers have no way of knowing that the new key used with the
patch is a legitimate key update. The only option with this design is to
do some out-of-band key verification (i.e. call the submitter and have
them read the key to you over the phone). Out-of-band key validation
hasn't scaled for GPG and it won't scale here either.

Instead of TOFU, a more secure design would require key enrollment, key
rotation, key recovery, and key revocation to all be separate,
cryptographically verified updates to the attribute-to-key database in
the repo. Your instinct for storing the attribute+key data in the repo
itself (i.e. in-band) is correct because it makes Git repos
self-verifiable in that cloning is all you need to do to get all of the
data necessary for verifying all of the digital signatures.

Key enrollment should be a separate patch submission that adds the
author and public key to the database file. The patch must be signed
using the key that is being added to the database. This provides the
provenance anchor for the key and also binds the attribute to the key.
The record in the data should also contain some data that enables secure
key recovery/rotation. A simple hashed secret passphrase and nonce
works. In the future, key recovery/rotation can be done by the owner by
submitting a new patch that updates the database record with a new key
and recovery data, signed with the new key and also including the nonce
and secret passphrase used to generate the previous key recovery hash.
Key revocation is a patch that removes the record from the database that
is signed by the key that is being removed or a new key plus the
recovery secret passphrase and nonce.

So back to the original proposal, I like the simplicity but with a few
tweaks, it could be an air tight digital signature scheme for emailed
patches. If air tight provenance is not what you're aiming for, then why
are you even using cryptography?

That's my 2p on this. I like where you're head is at Kontantin.

Cheers!
Dave
Konstantin Ryabitsev Sept. 30, 2019, 3:37 p.m. UTC | #2
On Fri, Sep 27, 2019 at 08:24:37AM -0700, dwh@linuxprogrammer.org wrote:
>>- generate a signify-compatible cryptographic signature of the 
>>verbatim
>> patch content, perhaps slightly normalized for things like LF vs. CRLF
>> line endings (see minisign/libsodium for crypto details)
>>- include both the signature and the public key in the area below '---',
>> using "Minisig:" and "Minikey:" taglines
>
>I like where you're heading with this suggestion however there are some
>issues. It is not clear what bytes the signature was calculated over.

Just the actual patch.

>Does it include the "From:" line of the email? How about the
>"Signed-off-by"? If there is no binding of the identity of the submitter
>to the key pair then you'll have problems with the TOFU policy you
>describe further down (explained later). 

Ok, I'll argue my point on this later. :)

>Also, since we're trying to
>move to a Git that supports signatures from multiple different signing
>tools and to also support multi-sig sign-offs (e.g. first the author,
>then the reviewer, then the merger) these taglines need to be more
>compex. At the very least either there needs to be a signature type
>tagline or the type of the signature needs to be baked into the key and
>signture values (see Secure Scuttlebutt encoding of keys/sigs). Also, if
>we want to do chained multisign there should be some framing of what is
>signed by each signature. If you're looking for something email like, we
>could borrow from mime email attachment encoding to provide framing.

No, we definitely don't want to go down the MIME path (it's routinely 
mangled by archivers, so we're much more likely to lose anything that 
comes in via MIME attachments).

>I would instead have at the very least the signature tool in the value:
>
>	Key: minisign|RWT9fcUvSnHPLqqyfLbkGBMEscBWciFFp2iBj2XnZPzW69OVIoYwZ25q
> Sig: minisign|RWT9fcUvSnHPLiqWgXEnn98sgk8nl4FteDRkD+9lVK+He//eLOxNZ5QjCROoKJgPGpL4uzoHicN+f6gB54qmtO1cQtfvjS+++QU=

I don't think it's worth it to abstract this out. The main benefits of
minisigs are:

- it's an emerging standard (www.kernel.org should soon offer minisig 
  signatures on tarball downloads)
- it's short enough to include both the key and the signature into a few 
  bytes of information -- attempting to do the same with PGP would 
  balloon the message into kilobytes, even if ECC keys/subkeys are used

>But this doesn't solve multi-sig.

It doesn't attempt to, but it can be achieved in a number of clever 
ways. For example, the reviewer can sign the minisig signature on the 
original patch. E.g.:

Reviewed-by: Alter Ego <mricon@kernel.org>
Reviewed-minisig: {minisig signature of RWT9fcUvS...fvjS+++QU=}
Reviewed-minikey: {reviewer pubkey}

This would give you a chain of attestation to the original patch.

For series, this would be more complicated, since Reviewed-by: is 
usually posted for the cover letter. Perhaps series cover letters could 
include a signature of all individual patch signatures.

That said, this is not something I'm trying to solve -- my goal is to 
provide tamper-evident attestation of patches sent to mailing lists, and 
I expound on that further down.

>TOFU has the problem of not providing cryptographic provenance over 
>keys
>while maintaining provenance on the binding between other identity
>attributes and those keys (e.g. author string). In step 3 above there's
>no way to know for sure that the submitter is actually who they claim to
>be. 

Correct, but the same is true for any other key distribution mechanism.  
Even with PGP, most people I work with use the TOFU approach -- if a key 
is in their keyring, it's considered automatically trusted. My goal is 
not really to come up with a tamper-proof solution, but to offer a chain 
of cryptographic attestation. Developers can then *choose* to 
incorporate tamper-proof features of it into their workflows via tool 
support.

>Reviewers have no way of knowing that the new key used with the
>patch is a legitimate key update. The only option with this design is to
>do some out-of-band key verification (i.e. call the submitter and have
>them read the key to you over the phone). Out-of-band key validation
>hasn't scaled for GPG and it won't scale here either.

Yes, because delegated trust is *hard*. :) We either must rely on 
delegated trust via certification authorities -- with all the potential 
for abuse there -- or we must make trust decisions on our own, which 
doesn't scale well. As far as I can tell, there is no easy solution to 
this problem. TOFU just formalizes everyone's current coping mechanism.

>Instead of TOFU, a more secure design would require key enrollment, key
>rotation, key recovery, and key revocation to all be separate,
>cryptographically verified updates to the attribute-to-key database in
>the repo. 

Right, as long as it's understood that this delegates trust to people 
with write access to this repository (and infrastructure admins). This 
also has important drawbacks in the case of the Linux kernel, for 
example:

- there are thousands of people committing patches to the Linux kernel, 
  and it's not the same thousands of people with each mainline release, 
  so keeping key information for all of them in the repository would be 
  next to impossible
- while torvalds/linux.git is considered "canonical" for Linux, most 
  developers would be working off of other trees (netdev, arm, etc), so 
  it would make little sense for them to put their key information into 
  the mainline repository

>Your instinct for storing the attribute+key data in the repo
>itself (i.e. in-band) is correct because it makes Git repos
>self-verifiable in that cloning is all you need to do to get all of the
>data necessary for verifying all of the digital signatures.

Right, this is an important problem to solve, but it's not the one I'm 
trying to address. :) I'm specifically interested in cryptographic 
attestation of patches sent to mailing lists -- *before* code even makes 
it into git. Since there is no way to translate signatures on patches 
into git commit signatures, I'm not even attempting to solve that 
problem.

>Key enrollment should be a separate patch submission that adds the
>author and public key to the database file. The patch must be signed
>using the key that is being added to the database. This provides the
>provenance anchor for the key and also binds the attribute to the key.

I like it, but this won't scale for Linux kernel due to the reasons I've 
described above -- thousands of developers who come and go, plus 
multiple "canonical" linux.git trees, depending on the component you're 
working on.

>So back to the original proposal, I like the simplicity but with a few
>tweaks, it could be an air tight digital signature scheme for emailed
>patches. If air tight provenance is not what you're aiming for, then why
>are you even using cryptography?

My primary goal is to remove one of the last bits where we explicitly 
trust infrastructure in the Linux Kernel development -- vger.kernel.org 
and lore.kernel.org. If either of these systems are compromised, an 
attacker would be able to modify patches on the fly in order to insert 
malicious code without leaving a trace (e.g. intercept a patch as it is 
sent to the actual maintainer, but not when sent to others or to the 
archiver). Adding basic cryptographic signatures to the process will 
hopefully make this attack vector less likely and will at least offer a 
mechanism to perform post-mortem forensic examinations -- without 
introducing a central certification authority.

-K
diff mbox series

Patch

diff --git a/foo.c b/foo.c
index d40a2b9..dcfad55 100644
--- a/foo.c
+++ b/foo.c
@@ -1,5 +1,5 @@ 
 #include <stdio.h>
 int main()
 {
-    printf("Hello, World!");
+    printf("Hello, Signed World!");
 }