Message ID | 5590a68c5ba7081cd7e64c708b5c25db23f5e95b.1597406877.git.martin.agren@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Series | more SHA-256 documentation |
Martin Ågren <martin.agren@gmail.com> writes:

> Document that in SHA-1 repositories, we use SHA-1 for "want"s and
> "have"s, and in SHA-256 repositories, we use SHA-256.

Ehh, doesn't this directly contradict the transition plan of "on the
wire everything will use SHA-1 version for now?"

> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
>  Documentation/technical/http-protocol.txt | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt
> index 51a79e63de..507f28f9b3 100644
> --- a/Documentation/technical/http-protocol.txt
> +++ b/Documentation/technical/http-protocol.txt
> @@ -401,8 +401,9 @@ at all in the request stream:
>  The stream is terminated by a pkt-line flush (`0000`).
>
>  A single "want" or "have" command MUST have one hex formatted
> -SHA-1 as its value. Multiple SHA-1s MUST be sent by sending
> -multiple commands.
> +object name as its value. Multiple object names MUST be sent by sending
> +multiple commands. (An object name is a SHA-1 hash in a SHA-1 repo
> +and a SHA-256 hash in a SHA-256 repo.)
>
>  The `have` list is created by popping the first 32 commits
>  from `c_pending`. Less can be supplied if `c_pending` empties.
On 2020-08-14 at 17:28:27, Junio C Hamano wrote:
> Martin Ågren <martin.agren@gmail.com> writes:
>
> > Document that in SHA-1 repositories, we use SHA-1 for "want"s and
> > "have"s, and in SHA-256 repositories, we use SHA-256.
>
> Ehh, doesn't this directly contradict the transition plan of "on the
> wire everything will use SHA-1 version for now?"

SHA-256 repositories interoperate currently using SHA-256 object IDs.
It was originally intended that we wouldn't update the protocol, but
that leads to much of the testsuite failing since it's impossible to
move objects from one place to another.

If we wanted to be more pedantically correct and optimize for the
future, we could say that the values use the format negotiated by the
"object-format" protocol extension and SHA-1 otherwise.
On Fri, 14 Aug 2020 at 22:23, brian m. carlson <sandals@crustytoothpaste.net> wrote:
>
> On 2020-08-14 at 17:28:27, Junio C Hamano wrote:
> > Martin Ågren <martin.agren@gmail.com> writes:
> >
> > > Document that in SHA-1 repositories, we use SHA-1 for "want"s and
> > > "have"s, and in SHA-256 repositories, we use SHA-256.
> >
> > Ehh, doesn't this directly contradict the transition plan of "on the
> > wire everything will use SHA-1 version for now?"

Yes, the transition plan would probably need updating there. I'm just
trying to document what we have.

> SHA-256 repositories interoperate currently using SHA-256 object IDs.
> It was originally intended that we wouldn't update the protocol, but
> that leads to much of the testsuite failing since it's impossible to
> move objects from one place to another.
>
> If we wanted to be more pedantically correct and optimize for the
> future, we could say that the values use the format negotiated by the
> "object-format" protocol extension and SHA-1 otherwise.

Hmm, I didn't think of that. Would we ever regret that we've painted
such a "big" picture and wish to refine it somehow? Compared to
admittedly being fairly narrow as I am here, then loosen things later.

I'll think about it, but I think I could go either way.

Martin

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> On 2020-08-14 at 17:28:27, Junio C Hamano wrote:
>> Martin Ågren <martin.agren@gmail.com> writes:
>>
>> > Document that in SHA-1 repositories, we use SHA-1 for "want"s and
>> > "have"s, and in SHA-256 repositories, we use SHA-256.
>>
>> Ehh, doesn't this directly contradict the transition plan of "on the
>> wire everything will use SHA-1 version for now?"
>
> SHA-256 repositories interoperate currently using SHA-256 object IDs.
> It was originally intended that we wouldn't update the protocol, but
> that leads to much of the testsuite failing since it's impossible to
> move objects from one place to another.
>
> If we wanted to be more pedantically correct and optimize for the
> future, we could say that the values use the format negotiated by the
> "object-format" protocol extension and SHA-1 otherwise.

Yup. I think a reasonable evolution path is

 0) everything on the wire is SHA-1 and no local operation knows
    SHA-256 (i.e. a few releases ago)

 1) local operations are either SHA-1 or SHA-256 but not both. On
    the wire, only protocol for SHA-1 repositories are defined, so
    SHA-256 repositories cannot talk with anybody using any official
    protocol, but a "borked" SHA-1 protocol that naturally extends
    the object names width exists and SHA-256 repositories can
    interoperate with each other. This will be a backward
    compatibility nightmare, as Git from SHA-256 repository that
    tries to talk to SHA-1 repository will fail but without grace
    (i.e. the current situation).

 2) on-the-wire protocol gains just one new capability to safely
    unleash SHA-256 repositories to talk to the wider world. The
    "borked" SHA-1 protocol above will become official when the
    object-format=sha256 capability is negotiated by both ends. At
    this stage, SHA-256 repositories still cannot talk with SHA-1
    repositories, but at least they can talk among themselves as
    long as they use new-enough version of Git that knows about the
    new capability.

 3) on-the-fly SHA-1 vs SHA-256 migration gets implemented. SHA-256
    repositories trying to talk to somebody else, after discovering
    that the other end lacks object-format=sha256 capability,
    on-the-fly converts its SHA-256 objects to SHA-1 and vice versa.
    Between SHA-256 repositories, the capability above in 2) will
    allow native conversation with SHA-256.

Reaching 3) may be a lot of work, but at least we should get to 2)
to be able to safely let SHA-256 repositories to talk to the outside
world (yes, I consider it OK for SHA-256 repositories talking among
themselves in a private setting in the current state, and it would
be a good milestone and also test towards the eventual goal of
reaching 3), and with much smaller effort.

Thanks.
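The stage 2) behaviour in the list above can be illustrated with a rough sketch. This is not Git's implementation: the function and exception names are hypothetical, and treating a peer that advertises no `object-format` capability as SHA-1 is an assumption taken from the "SHA-1 otherwise" suggestion earlier in the thread. The sketch picks the object format to use on the wire and fails cleanly when the two ends disagree.

```python
from typing import Iterable


class ObjectFormatMismatch(Exception):
    """Hypothetical error: the two ends use different object formats."""


def peer_object_format(capabilities: Iterable[str]) -> str:
    """Return the format named by an object-format=<algo> capability.

    Assumption: a peer that advertises no such capability speaks SHA-1.
    """
    for cap in capabilities:
        if cap.startswith("object-format="):
            return cap.split("=", 1)[1]
    return "sha1"


def negotiate(local_format: str, capabilities: Iterable[str]) -> str:
    """Stage 2: same-format peers proceed, mixed pairs fail gracefully."""
    peer = peer_object_format(capabilities)
    if peer != local_format:
        raise ObjectFormatMismatch(
            f"local repository uses {local_format}, peer uses {peer}"
        )
    return local_format
```

Under stage 3), the mismatch branch would presumably convert object names on the fly instead of raising; that part is not sketched here.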
Junio C Hamano <gitster@pobox.com> writes:

> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
>
>> On 2020-08-14 at 17:28:27, Junio C Hamano wrote:
>>> Martin Ågren <martin.agren@gmail.com> writes:
>>>
>>> > Document that in SHA-1 repositories, we use SHA-1 for "want"s and
>>> > "have"s, and in SHA-256 repositories, we use SHA-256.
>>>
>>> Ehh, doesn't this directly contradict the transition plan of "on the
>>> wire everything will use SHA-1 version for now?"
>>
>> SHA-256 repositories interoperate currently using SHA-256 object IDs.
>> It was originally intended that we wouldn't update the protocol, but
>> that leads to much of the testsuite failing since it's impossible to
>> move objects from one place to another.
>>
>> If we wanted to be more pedantically correct and optimize for the
>> future, we could say that the values use the format negotiated by the
>> "object-format" protocol extension and SHA-1 otherwise.

Yes, that's wonderful. I was confused when I said about the
evolution path. We still would want to eventually do the on-the-fly
migration over the wire to make SHA-1 and SHA-256 repositories
interoperate, but at least we already can allow SHA-256 repositories
safely attempt to talk to SHA-1 repositories and gracefully fail.

Thanks.
diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt
index 51a79e63de..507f28f9b3 100644
--- a/Documentation/technical/http-protocol.txt
+++ b/Documentation/technical/http-protocol.txt
@@ -401,8 +401,9 @@ at all in the request stream:
 The stream is terminated by a pkt-line flush (`0000`).
 
 A single "want" or "have" command MUST have one hex formatted
-SHA-1 as its value. Multiple SHA-1s MUST be sent by sending
-multiple commands.
+object name as its value. Multiple object names MUST be sent by sending
+multiple commands. (An object name is a SHA-1 hash in a SHA-1 repo
+and a SHA-256 hash in a SHA-256 repo.)
 
 The `have` list is created by popping the first 32 commits
 from `c_pending`. Less can be supplied if `c_pending` empties.
Document that in SHA-1 repositories, we use SHA-1 for "want"s and
"have"s, and in SHA-256 repositories, we use SHA-256.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 Documentation/technical/http-protocol.txt | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
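For illustration only, here is a minimal sketch of the rule the patch documents: each "want" or "have" command carries exactly one hex object name, 40 hex digits in a SHA-1 repository and 64 in a SHA-256 repository, and the stream ends with a `0000` flush. The helper names are hypothetical, the lowercase-only check is an assumption, and the sketch leaves out the capability list that a real client appends to its first "want" line.

```python
# Hex digest lengths of the two hash functions.
HEX_LEN = {"sha1": 40, "sha256": 64}
HEX_DIGITS = set("0123456789abcdef")

# A pkt-line flush, which terminates the request stream.
FLUSH = b"0000"


def pkt_line(payload: bytes) -> bytes:
    """Prefix payload with its length as 4 hex digits.

    The length counts the 4 prefix bytes themselves.
    """
    return f"{len(payload) + 4:04x}".encode("ascii") + payload


def command_line(command: str, object_name: str, object_format: str) -> bytes:
    """Build one "want" or "have" pkt-line carrying a single object name."""
    if command not in ("want", "have"):
        raise ValueError(f"unsupported command: {command!r}")
    expected = HEX_LEN[object_format]
    if len(object_name) != expected or not set(object_name) <= HEX_DIGITS:
        raise ValueError(
            f"expected {expected} lowercase hex digits for {object_format}"
        )
    return pkt_line(f"{command} {object_name}\n".encode("ascii"))


def request(wants, haves, object_format: str) -> bytes:
    """Multiple object names are sent as multiple commands, then a flush."""
    lines = [command_line("want", w, object_format) for w in wants]
    lines += [command_line("have", h, object_format) for h in haves]
    return b"".join(lines) + FLUSH
```

With a 40-digit name the line length prefix is `0032`; with a 64-digit name it is `004a`, which is the only visible difference between the two formats at this level.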