Message ID | 20180927012455.234876-1-steadmon@google.com (mailing list archive) |
---|---|
Series | Add proto v2 archive command with HTTP support |

On Wed, Sep 26, 2018 at 6:25 PM Josh Steadmon <steadmon@google.com> wrote:
>
> This is the second version of my series to add a new protocol v2 command
> for archiving, with support for HTTP(S).
>
> NEEDSWORK: a server built with this series is not backwards-compatible
> with clients that set GIT_PROTOCOL=version=2 or configure
> protocol.version=2. The old client will unconditionally send "argument
> ..." packet lines, which breaks the server's expectations of a
> "command=archive" request,

So if an old client sets protocol to v2, it would only apply that
protocol version to fetch, not archive, so it would start following
a v0 conversation, but as the protocol version is set, it would be
transmitted to the server. This sounds like a bug in the client?

> while the server's capability advertisement
> in turn breaks the clients expectation of either an ACK or NACK.

Could a modern client send either another protocol version (3?)
or a special capability along the protocol version ("fixed_archive")

> I've been discussing workarounds for this with Jonathan Nieder, but
> please let me know if you have any suggestions for v3 of this series.

Care to open the discussion to the list? What are the different
approaches, what are the pros/cons?

Stefan Beller wrote:
> On Wed, Sep 26, 2018 at 6:25 PM Josh Steadmon <steadmon@google.com> wrote:

>> I've been discussing workarounds for this with Jonathan Nieder, but
>> please let me know if you have any suggestions for v3 of this series.
>
> Care to open the discussion to the list? What are the different
> approaches, what are the pros/cons?

Do you mean sending video of chatting in the office?

Josh and I discussed that

 1. Clients sending version=2 when they do not, in fact, speak protocol
    v2 for a service is a (serious) bug. (Separately from this
    series) we should fix it.

 2. That bug is already in the wild, alas. Fortunately the semantics of
    GIT_PROTOCOL as a list of key/value pairs is well defined. So we
    have choices of (a) bump version to version=3 (b) pass another
    value 'version=2:yesreallyversion=2' (c) etc.

 3. This is likely to affect push, too.

Thanks and hope that helps,
Jonathan

On 2018.09.27 11:20, Stefan Beller wrote:
> On Wed, Sep 26, 2018 at 6:25 PM Josh Steadmon <steadmon@google.com> wrote:
> >
> > This is the second version of my series to add a new protocol v2 command
> > for archiving, with support for HTTP(S).
> >
> > NEEDSWORK: a server built with this series is not backwards-compatible
> > with clients that set GIT_PROTOCOL=version=2 or configure
> > protocol.version=2. The old client will unconditionally send "argument
> > ..." packet lines, which breaks the server's expectations of a
> > "command=archive" request,
>
> So if an old client sets protocol to v2, it would only apply that
> protocol version
> to fetch, not archive, so it would start following a v0 conversation, but
> as the protocol version is set, it would be transmitted to the server.
> This sounds like a bug in the client?

Yeah, basically. We're telling the server we support v2, even if the
specific operation we're trying to do doesn't have a v2 implementation
on the client. So this is going to make it ugly to replace existing
commands.

> > while the server's capability advertisement
> > in turn breaks the clients expectation of either an ACK or NACK.
>
> Could a modern client send either another protocol version (3?)
> or a special capability along the protocol version ("fixed_archive")
>
> > I've been discussing workarounds for this with Jonathan Nieder, but
> > please let me know if you have any suggestions for v3 of this series.
>
> Care to open the discussion to the list? What are the different
> approaches, what are the pros/cons?

Jonathan suggested something along the lines of what you said above,
adding a new field in GIT_PROTOCOL. So we'd send something like
"version=2:archive_version=2" and have the server detect the latter.
I'm not sure if that's the best way to go about this since I'm not
familiar with the version detection code for other parts of the system.

I worry that it will lead us down the path of having to specify a version for every command that we eventually convert to protocol v2. On the other hand, I don't see any other way to work around this, at least in the archive case. We can't peek at the client's transmissions on the server, because v2 requires that the server speaks first...
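
For illustration only, here is a rough sketch of what "have the server detect the latter" could look like, assuming GIT_PROTOCOL is a colon-separated list of key=value pairs as described in this thread. The helper names `protocol_lookup` and `archive_protocol_version` are hypothetical and are not taken from git's source; the `archive_version` key is only the proposal above, not an existing protocol field.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: look up `key` in a colon-separated list of
 * key=value pairs such as "version=2:archive_version=2".
 * Returns 1 and copies the value into `out` (NUL-terminated, truncated
 * to outlen - 1 bytes) when the key is present, 0 otherwise.
 */
static int protocol_lookup(const char *list, const char *key,
                           char *out, size_t outlen)
{
    size_t keylen = strlen(key);
    const char *p = list;

    if (!list || !outlen)
        return 0;
    while (*p) {
        const char *end = strchr(p, ':');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        /* The entry must be exactly "<key>=<value>", not a longer key. */
        if (len > keylen && !strncmp(p, key, keylen) && p[keylen] == '=') {
            size_t vlen = len - keylen - 1;

            if (vlen >= outlen)
                vlen = outlen - 1;
            memcpy(out, p + keylen + 1, vlen);
            out[vlen] = '\0';
            return 1;
        }
        p += len;
        if (*p == ':')
            p++;
    }
    return 0;
}

/*
 * Hypothetical server-side policy: only an explicit archive_version
 * entry selects a newer archive protocol. A bare "version=2" from an
 * old client is ignored and the v0 conversation is used.
 */
static int archive_protocol_version(const char *git_protocol)
{
    char value[16];

    if (protocol_lookup(git_protocol, "archive_version", value, sizeof(value)))
        return atoi(value);
    return 0;
}
```

With this policy, a client that sends only "version=2" keeps getting the old ACK/NACK conversation, which is the backwards-compatibility property under discussion. The cost is the one raised above: every converted command would need its own key.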

Jonathan Nieder <jrnieder@gmail.com> writes:

> 1. Clients sending version=2 when they do not, in fact, speak protocol
>    v2 for a service is a (serious) bug. (Separately from this
>    series) we should fix it.
>
> 2. That bug is already in the wild, alas. Fortunately the semantics of
>    GIT_PROTOCOL as a list of key/value pairs is well defined. So we
>    have choices of (a) bump version to version=3 (b) pass another
>    value 'version=2:yesreallyversion=2' (c) etc.
>
> 3. This is likely to affect push, too.

Do you mean that existing "git push", "git fetch" and "git archive"
sends version=2 even when they are not capable of speaking protocol
v2? I thought that "git archive [--remote]" was left outside of the
protocol update (that was the reason why the earlier attempt took a
hacky route of "shallow clone followed by local archive"), so there
is no "git archive" in the wild that can even say "version=$n"
(which requires you to be at least version=1)?

On 2018.09.27 15:20, Junio C Hamano wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:
>
> > 1. Clients sending version=2 when they do not, in fact, speak protocol
> >    v2 for a service is a (serious) bug. (Separately from this
> >    series) we should fix it.
> >
> > 2. That bug is already in the wild, alas. Fortunately the semantics of
> >    GIT_PROTOCOL as a list of key/value pairs is well defined. So we
> >    have choices of (a) bump version to version=3 (b) pass another
> >    value 'version=2:yesreallyversion=2' (c) etc.
> >
> > 3. This is likely to affect push, too.
>
> Do you mean that existing "git push", "git fetch" and "git archive"
> sends version=2 even when they are not capable of speaking protocol
> v2? I thought that "git archive [--remote]" was left outside of the
> protocol update (that was the reason why the earlier attempt took a
> hacky route of "shallow clone followed by local archive"), so there
> is no "git archive" in the wild that can even say "version=$n"
> (which requires you to be at least version=1)?

Yes, the version on my desktop sends version=2 when archiving:

∫ which git
/usr/bin/git
∫ git --version
git version 2.19.0.605.g01d371f741-goog
∫ GIT_TRACE_PACKET=${HOME}/server_trace git daemon \
    --enable=upload-archive \
    --base-path=${HOME}/src/bare-repos &
[1] 258496
∫ git archive --remote git://localhost/test-repo.git HEAD >! test.tar
∫ grep version ~/server_trace
15:31:22.377869 pkt-line.c:80 packet: git< git-upload-archive /test-repo.git\0host=localhost\0\0version=2\0

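
To make the traced request easier to read, here is a minimal sketch of how the NUL-separated fields in that packet could be scanned. It assumes the git:// request layout of command and path, NUL, an optional host parameter, NUL, then an empty field followed by NUL-terminated key=value "extra parameters". The function name `extra_param` is hypothetical and is not git's own parser.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: scan a git:// request such as
 *   "git-upload-archive /test-repo.git\0host=localhost\0\0version=2\0"
 * and return a pointer to the value of extra parameter `key`,
 * or NULL if it is absent. `len` is the full request length,
 * including embedded NUL bytes.
 */
static const char *extra_param(const char *buf, size_t len, const char *key)
{
    size_t keylen = strlen(key);
    size_t pos = 0;
    int in_extra = 0;

    while (pos < len) {
        const char *field = buf + pos;
        const char *nul = memchr(field, '\0', len - pos);
        size_t flen;

        if (!nul)
            break; /* unterminated trailing field: ignore it */
        flen = (size_t)(nul - field);

        if (!in_extra) {
            /* An empty field after the command marks the extra parameters. */
            if (pos > 0 && flen == 0)
                in_extra = 1;
        } else if (flen > keylen && !strncmp(field, key, keylen) &&
                   field[keylen] == '=') {
            return field + keylen + 1;
        }
        pos += flen + 1;
    }
    return NULL;
}
```

Applied to the trace above, looking up "version" yields "2", while "host" is not reported because it precedes the empty field. That is the point of the trace: the old client advertises version=2 for git-upload-archive even though it goes on to speak the v0 archive conversation.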

Josh Steadmon <steadmon@google.com> writes:

> Yes, the version on my desktop sends version=2 when archiving:
>
> ∫ which git
> /usr/bin/git
> ∫ git --version
> git version 2.19.0.605.g01d371f741-goog
> ∫ GIT_TRACE_PACKET=${HOME}/server_trace git daemon \
>     --enable=upload-archive \
>     --base-path=${HOME}/src/bare-repos &
> [1] 258496
> ∫ git archive --remote git://localhost/test-repo.git HEAD >! test.tar
> ∫ grep version ~/server_trace
> 15:31:22.377869 pkt-line.c:80 packet: git< git-upload-archive /test-repo.git\0host=localhost\0\0version=2\0

Ah, that's truly broken.

Come to think of it, do we need to be using uniform versions across
different endpoints? The archive request could be at v3 while fetch
request could still be at v2, in which case the design to use a
single protocol.version variable is probably the root cause of the
confusion? Perhaps like protocol.<name>.allow, we would want
protocol.<name>.version or something like that (and no
protocol.version) to make it clear that protocol v2 used for fetching
has nothing to do with protocol v1 or v2 or v3 used for archiving?

Luckily, protocol.version is still marked as experimental so it is
not too bad that we caught the design mistake (if it is one) and can
now correct it before the damage spreads too widely.

This is the second version of my series to add a new protocol v2 command
for archiving, with support for HTTP(S).

NEEDSWORK: a server built with this series is not backwards-compatible
with clients that set GIT_PROTOCOL=version=2 or configure
protocol.version=2. The old client will unconditionally send "argument
..." packet lines, which breaks the server's expectations of a
"command=archive" request, while the server's capability advertisement
in turn breaks the clients expectation of either an ACK or NACK.

I've been discussing workarounds for this with Jonathan Nieder, but
please let me know if you have any suggestions for v3 of this series.

Josh Steadmon (4):
  archive: follow test standards around assertions
  archive: use packet_reader for communications
  archive: implement protocol v2 archive command
  archive: allow archive over HTTP(S) with proto v2

 Documentation/technical/protocol-v2.txt | 21 ++++++++-
 builtin/archive.c                       | 58 +++++++++++++++++++------
 builtin/upload-archive.c                | 27 ++++++++++--
 http-backend.c                          | 13 +++++-
 serve.c                                 |  7 +++
 t/t5000-tar-tree.sh                     | 33 +++++++-------
 t/t5701-git-serve.sh                    |  1 +
 transport-helper.c                      |  7 +--
 8 files changed, 130 insertions(+), 37 deletions(-)

Range-diff against v1:
-:  ---------- > 1:  c2e371ad24 archive: follow test standards around assertions
1:  b514184273 ! 2:  a65f73f627 archive: use packet_reader for communications
    @@ -6,7 +6,10 @@
         handling, which will make implementation of protocol v2 support in
         git-archive easier.

    +    This refactoring does not change the behavior of "git archive".
    +
         Signed-off-by: Josh Steadmon <steadmon@google.com>
    +    Reviewed-by: Stefan Beller <sbeller@google.com>

         diff --git a/builtin/archive.c b/builtin/archive.c
    @@ -42,24 +45,24 @@
     -    if (!buf)
     +    status = packet_reader_read(&reader);
     +
    -+    if (status == PACKET_READ_FLUSH)
    ++    if (status != PACKET_READ_NORMAL || reader.pktlen <= 0)
              die(_("git archive: expected ACK/NAK, got a flush packet"));
     -    if (strcmp(buf, "ACK")) {
     -        if (starts_with(buf, "NACK "))
     -            die(_("git archive: NACK %s"), buf + 5);
     -        if (starts_with(buf, "ERR "))
     -            die(_("remote error: %s"), buf + 4);
    -+    if (strcmp(reader.buffer, "ACK")) {
    -+        if (starts_with(reader.buffer, "NACK "))
    -+            die(_("git archive: NACK %s"), reader.buffer + 5);
    -+        if (starts_with(reader.buffer, "ERR "))
    -+            die(_("remote error: %s"), reader.buffer + 4);
    ++    if (strcmp(reader.line, "ACK")) {
    ++        if (starts_with(reader.line, "NACK "))
    ++            die(_("git archive: NACK %s"), reader.line + 5);
    ++        if (starts_with(reader.line, "ERR "))
    ++            die(_("remote error: %s"), reader.line + 4);
              die(_("git archive: protocol error"));
          }

     -    if (packet_read_line(fd[0], NULL))
     +    status = packet_reader_read(&reader);
    -+    if (status != PACKET_READ_FLUSH)
    ++    if (status == PACKET_READ_NORMAL && reader.pktlen > 0)
              die(_("git archive: expected a flush"));

          /* Now, start reading from fd[0] and spit it out to stdout */
2:  1518c15dc1 < -:  ---------- archive: implement protocol v2 archive command
-:  ---------- > 3:  0a8cc5e331 archive: implement protocol v2 archive command
3:  1b7ad8d8f6 ! 4:  97a1424f32 archive: allow archive over HTTP(S) with proto v2
    @@ -10,16 +10,20 @@
     +++ b/builtin/archive.c
     @@
              status = packet_reader_read(&reader);
    -         if (status != PACKET_READ_FLUSH)
    +         if (status == PACKET_READ_NORMAL && reader.pktlen > 0)
                  die(_("git archive: expected a flush"));
     -    }
     +    } else if (version == protocol_v2 &&
    -+               starts_with(transport->url, "http"))
    ++               (starts_with(transport->url, "http://") ||
    ++                starts_with(transport->url, "https://")))
     +        /*
     +         * Commands over HTTP require two requests, so there's an
    -+         * additional server response to parse.
    ++         * additional server response to parse. We do only basic sanity
    ++         * checking here that the versions presented match across
    ++         * requests.
     +         */
    -+        discover_version(&reader);
    ++        if (version != discover_version(&reader))
    ++            die(_("git archive: received different protocol versions in subsequent requests"));

          /* Now, start reading from fd[0] and spit it out to stdout */
          rv = recv_sideband("archive", fd[0], 1);
    @@ -40,7 +44,10 @@
          struct strbuf buf = STRBUF_INIT;

     +    if (!strcmp(service_name, "git-upload-archive")) {
    -+        /* git-upload-archive doesn't need --stateless-rpc */
    ++        /*
    ++         * git-upload-archive doesn't need --stateless-rpc, because it
    ++         * always handles only a single request.
    ++         */
     +        argv[1] = ".";
     +        argv[2] = NULL;
     +    }
    @@ -63,10 +70,12 @@
     --- a/transport-helper.c
     +++ b/transport-helper.c
     @@
    +         strbuf_addf(&cmdbuf, "connect %s\n", name);
              ret = run_connect(transport, &cmdbuf);
          } else if (data->stateless_connect &&
    -            (get_protocol_version_config() == protocol_v2) &&
    +-           (get_protocol_version_config() == protocol_v2) &&
     -           !strcmp("git-upload-pack", name)) {
    ++           get_protocol_version_config() == protocol_v2 &&
     +           (!strcmp("git-upload-pack", name) ||
     +            !strcmp("git-upload-archive", name))) {
              strbuf_addf(&cmdbuf, "stateless-connect %s\n", name);
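
As a side note on the URL check that changed between v1 and v2 of the last patch, the small sketch below shows why matching the bare prefix "http" is looser than matching the two scheme prefixes. `has_prefix` is a local stand-in written for this example (git has its own `starts_with()`), and `is_http_url` and the sample URLs are hypothetical.

```c
#include <string.h>

/* Local stand-in for a prefix test; returns 1 if `str` begins with `prefix`. */
static int has_prefix(const char *str, const char *prefix)
{
    return !strncmp(str, prefix, strlen(prefix));
}

/*
 * Hypothetical helper mirroring the v2 form of the check: accept only
 * URLs whose scheme is exactly http or https.
 */
static int is_http_url(const char *url)
{
    return has_prefix(url, "http://") || has_prefix(url, "https://");
}
```

A relative path such as "http-mirror/repo.git" passes the bare "http" prefix test but is rejected by `is_http_url`, which is presumably the kind of false positive the stricter check is meant to avoid.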