[RFC,0/2] Opaque author and committer identifiers

Message ID 20220919145231.48245-1-sandals@crustytoothpaste.net (mailing list archive)

brian m. carlson Sept. 19, 2022, 2:52 p.m. UTC
There have been frequent discussions on the list about the mailmap and
how it's not currently the ideal way to map former identities to current
identities.  This is especially true for transgender people, who often
don't want to associate their deadname with their current name in plain
text.

This is an RFC series that I discussed with some folks at Git Merge.
Roughly, this documents a format for an opaque identifier which is
compatible with existing implementations by overloading the email
address field with something that is not a real email address and cannot
be confused with one.  This opaque identifier is the fingerprint of some
key, which in most cases will be an SSH key.
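To make the idea concrete, here is a rough sketch of deriving a key fingerprint and placing it in the email field of an ident.  The fingerprint computation follows the usual OpenSSH SHA256 display form (unpadded base64 of the SHA-256 digest of the decoded key blob).  The layout produced by `opaque_ident` is purely an assumption for illustration; the actual format is whatever the anonymous-id document in this series specifies, which is not reproduced here.

```python
import base64
import hashlib


def ssh_sha256_fingerprint(public_key_line: str) -> str:
    """Return the OpenSSH-style SHA256 fingerprint for a public key line.

    A public key line has the form "<type> <base64 blob> [comment]".
    """
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<type> <base64 blob> [comment]'")
    blob = base64.b64decode(fields[1])
    digest = hashlib.sha256(blob).digest()
    # OpenSSH displays the digest as base64 without trailing '=' padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def opaque_ident(name: str, fingerprint: str) -> str:
    """Build an ident whose email field holds the fingerprint.

    HYPOTHETICAL layout, chosen only for this sketch.  It contains no '@',
    so it cannot be mistaken for a real email address.
    """
    return f"{name} <{fingerprint}>"


if __name__ == "__main__":
    # Placeholder blob, not a real key; substitute a line from a .pub file.
    fake_blob = base64.b64encode(b"not a real key").decode("ascii")
    fp = ssh_sha256_fingerprint(f"ssh-ed25519 {fake_blob} user@host")
    print(opaque_ident("A U Thor", fp))
```

Existing tools would still see a syntactically valid ident, which is the compatibility property described above; only the meaning of the email field changes.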

It also proposes moving the mailmap out of the main history and into a
special ref for this purpose.  Notably, so as not to make the same
mistake we did with grafts, where they are not pushed by default and so
nobody uses them, the proposal here is to change tooling so that the
mailmap refs are easy to push and pull and that this is done by default
(with an easy way to opt out).  By default, local changes to the mailmap
ref are squashed into the current commit such that there is only one
commit on the ref.  This preserves the existing mapping while not
retaining former identities, which we don't really need. (Who wants to
send email to a contributor's address at a former employer which doesn't
work anymore?)
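The squashing behaviour can be modelled with a toy in-memory sketch.  This is not Git code and the names (`MailmapCommit`, `squash_update`, `history_length`) are hypothetical; it only illustrates how replacing the single commit on the ref keeps the current mapping while leaving no earlier identity reachable from it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class MailmapCommit:
    """Toy stand-in for a commit on the mailmap ref (hypothetical model)."""
    entries: Dict[str, str]  # opaque ID -> current "Name <email>"
    parent: Optional["MailmapCommit"] = None


def squash_update(tip: Optional[MailmapCommit],
                  changes: Dict[str, str]) -> MailmapCommit:
    """Fold local changes into the tip and return a new parentless commit."""
    merged = dict(tip.entries) if tip is not None else {}
    merged.update(changes)  # newer entries replace older ones
    # No parent link: the previous commit, and any former identity it
    # recorded, is not reachable from the new tip.
    return MailmapCommit(entries=merged, parent=None)


def history_length(tip: Optional[MailmapCommit]) -> int:
    """Count the commits reachable from the tip."""
    count = 0
    while tip is not None:
        count += 1
        tip = tip.parent
    return count


if __name__ == "__main__":
    tip = squash_update(None, {"id-1": "Old Name <old@example.com>"})
    tip = squash_update(tip, {"id-1": "New Name <new@example.com>"})
    print(history_length(tip), tip.entries)
```

However many updates are applied, the ref in this model always has exactly one commit, and that commit holds only the latest mapping for each identifier.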

Since this series needs a way to cart mailmap information around in
patches and I would not like to repeat the same design as base-commit,
I've proposed a separate header to carry this information.  I'm
not terribly attached to this proposal and am open to other ideas if
folks like them better, but I feel it moves us in a useful direction to
being able to include other metadata in a structured way and to sending
signed commits by patch, which other folks wanted to do (and I am in
favour of).  (I'm willing to implement such a feature based on this
approach in the future if folks desire.)

All of these changes will be optional to adopt.  Projects need not use
them if they don't want to.  However, I am proposing that they be
advertised prominently as a preferred option (for example, in the "Tell
me who you are" message) to encourage adoption.  Appropriate tooling
will be included to make this easy.

In addition, besides the general benefits for trans folks and the
ability to operate anonymously or pseudonymously, I also think using an
opaque identifier will cut down on spam.  I have received many unwelcome
solicitations from employers and survey-toting academics to my email
address, as I'm sure others have.  Receiving fewer of these in the
future will be a nice bonus.

For those folks using forges, it should be noted that associating an
identifier with an account should be very easy, since the forge usually
has SSH key support and commit verification, and thus already has the
user's keys.  There is therefore no change to workflow on forges once
they implement this feature.  For those forges which use the user's
personal name in the UI,
this can simply be replaced by the personal name the user has registered
with the forge.

None of this deals with rewriting identities in existing commits.  We
have what we have now and can't change it, but we can do something
different going forward.  If there is interest in the hashed mailmap
approach or another similar approach, I'm open to resurrecting that in
addition, provided we agree as a project not to write tools which
trivially invert the hashed mailmap (which was the reason I dropped that
series in the first place).

I realize this is a radical departure from what we've done historically,
so this is an RFC series.  It's to gauge interest in this proposal and
design and to discuss alternatives before implementation. If we like
this approach, I will agree to implement it as my time allows, which I
expect could be done in a single series of under 30 patches.

I've CC'd some of the folks I talked to about this and some folks who I
think might be interested, but of course any constructive feedback is
welcome.

brian m. carlson (2):
  doc: specify a header for including arbitrary format-patch metadata
  docs: document a format for anonymous author and committer IDs

 Documentation/technical/anonymous-id.txt      | 143 ++++++++++++++++++
 .../technical/format-patch-metadata.txt       |  58 +++++++
 2 files changed, 201 insertions(+)
 create mode 100644 Documentation/technical/anonymous-id.txt
 create mode 100644 Documentation/technical/format-patch-metadata.txt