Message ID | 3ed7cf87aaa40ee66b20aa929d89d28fefcec312.1589393036.git.liu.denton@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | remote-curl: partial fix for a deadlock with stateless rpc | expand |
On Wed, May 13, 2020 at 02:04:57PM -0400, Denton Liu wrote: > Currently, remote-curl acts as a proxy and blindly forwards packets > between an HTTP server and fetch-pack. In the case of a stateless RPC > connection where the connection is terminated with a partially written > packet, remote-curl will blindly send the partially written packet > before waiting on more input from fetch-pack. Meanwhile, fetch-pack will > read the partial packet and continue reading, expecting more input. This > results in a deadlock between the two processes. > > Instead of blindly forwarding packets, inspect each packet to ensure > that it is a full packet, erroring out if a partial packet is sent. Hmm. Right now there's no assumption in rpc_in that we're writing pktlines. Will this always be the case? I think the answer is probably yes. If there's a case where it isn't, it might be something like v0 git-over-http against a server which doesn't have the sideband capability. > diff --git a/remote-curl.c b/remote-curl.c > index da3e07184a..8b740354e5 100644 > --- a/remote-curl.c > +++ b/remote-curl.c > @@ -682,6 +682,8 @@ static curlioerr rpc_ioctl(CURL *handle, int cmd, void *clientp) > struct rpc_in_data { > struct rpc_state *rpc; > struct active_request_slot *slot; > + struct strbuf len_buf; > + int remaining; This remaining is "remaining in the current packet", I assume? An "int" should be OK for that. Using a strbuf feels a bit like overkill, because we have to remember to free it (which I think doesn't actually happen in your patch). Could we just use a "char len_buf[4]" (you'd need an extra int to count how many bytes you've received there). > @@ -702,7 +705,42 @@ static size_t rpc_in(char *ptr, size_t eltsize, > return size; > if (size) > data->rpc->any_written = 1; > - write_or_die(data->rpc->in, ptr, size); > + > + while (unwritten) { > + if (!data->remaining) { > + int digits_remaining = 4 - data->len_buf.len; > + if (digits_remaining > unwritten) > + digits_remaining = unwritten; > + strbuf_add(&data->len_buf, ptr, digits_remaining); > + ptr += digits_remaining; > + unwritten -= digits_remaining; So we actually save up the 4 bytes, not sending them at all until we get them. I wonder if this might be easier to follow if our count was simply advisory. I.e., continue to write data as we get it, but keep a small state machine tracking pktlines (which could even be done in its own separate struct/function pair). I dunno. It might be about the same level of confusing, but it makes it easy to keep the logic separate from rpc_in, and it lets us put all of the policy bits at the end, after we know we've received all of the data (in post_rpc): if (data->len_buf.len < 4) die("incomplete packet header"); if (data->remaining) die("incomplete packet"); /* imagine we kept the actual pktlen value, instead of just counting * down remaining */ if (data->pktlen) die("did not end in flush"); Notably, I'm not sure if your code will complain if the connection dies will reading the 4-byte header (remaining would still be 0). That won't leave the caller trying to read the packet (we never sent them the header), but they may still be waiting for a response. > + if (data->len_buf.len == 4) { > + data->remaining = packet_length(data->len_buf.buf); > + if (data->remaining < 0) { > + die(_("remote-curl: bad line length character: %.4s"), data->len_buf.buf); > + } else if (data->remaining <= 1) { > + data->remaining = 0; This treats 0001 delimiters the same as a 0000 flush. Expecting 0 more bytes is the right thing, but would we later want to differentiate in post_rpc()? I.e., is it ever correct for the server data to end with a delim? > + } else if (data->remaining < 4) { > + die(_("remote-curl: bad line length %d"), data->remaining); We don't use an 0002 or 0003 packet for anything yet, but this would need to be updated if we ever did. I wonder if we should treat them also as zero-length packets and quietly pass through, which is likely the right thing to do. OTOH, this should complain loudly enough that somebody would presumably notice as soon as they tried to use those packets. Regarding testing, I think the ideal thing would be a proxy layer we could insert that simply eats all data after N bytes. That would be easy to do if we could use --upload-pack='git-upload-pack | head -c 500' or something. But we need it to happen between curl and the server side of Git. Possibly we could insert something between apache and git-http-backend (simialr to how we handle broken-smart-http.sh. -Peff
Hi Peff, On Fri, May 15, 2020 at 05:38:44PM -0400, Jeff King wrote: > On Wed, May 13, 2020 at 02:04:57PM -0400, Denton Liu wrote: > > > Currently, remote-curl acts as a proxy and blindly forwards packets > > between an HTTP server and fetch-pack. In the case of a stateless RPC > > connection where the connection is terminated with a partially written > > packet, remote-curl will blindly send the partially written packet > > before waiting on more input from fetch-pack. Meanwhile, fetch-pack will > > read the partial packet and continue reading, expecting more input. This > > results in a deadlock between the two processes. > > > > Instead of blindly forwarding packets, inspect each packet to ensure > > that it is a full packet, erroring out if a partial packet is sent. > > Hmm. Right now there's no assumption in rpc_in that we're writing > pktlines. Will this always be the case? > > I think the answer is probably yes. If there's a case where it isn't, it > might be something like v0 git-over-http against a server which doesn't > have the sideband capability. As far as I can tell from skimming the code, v0 uses always pktlines although I'm far from being an expert on Git's networking stuff. Perhaps we could implement this such that the line-length checking only happens for stateless_connect()?
On Mon, May 18, 2020 at 05:08:57AM -0400, Denton Liu wrote: > > Hmm. Right now there's no assumption in rpc_in that we're writing > > pktlines. Will this always be the case? > > > > I think the answer is probably yes. If there's a case where it isn't, it > > might be something like v0 git-over-http against a server which doesn't > > have the sideband capability. > > As far as I can tell from skimming the code, v0 uses always pktlines > although I'm far from being an expert on Git's networking stuff. Perhaps > we could implement this such that the line-length checking only happens > for stateless_connect()? Yeah, that would certainly limit the possibility of unintended side effects (and I don't think there's any benefit to this patch for the non-stateless-connect cases). We'd still be locking stateless-connect into always using pktlines, but I think that's probably OK in practice. AFAICT it's the case now, and I doubt we'd have any desire to change it in the future (and if we do, this is all local-ish code that we could modify). One unfortunate thing about the protocol (but not new to your patch) is that this will be a problem for _any_ remote-helper which claims to support stateless-connect. So they'd all need to carry similar code to deal with this issue. Right now remote-curl is the only one, but probably git-remote-ext and git-remote-fd could support this, too. Those are pretty exotic, though (I don't think anyone has even bothered to make them support v2 yet). -Peff
diff --git a/remote-curl.c b/remote-curl.c index da3e07184a..8b740354e5 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -682,6 +682,8 @@ static curlioerr rpc_ioctl(CURL *handle, int cmd, void *clientp) struct rpc_in_data { struct rpc_state *rpc; struct active_request_slot *slot; + struct strbuf len_buf; + int remaining; }; /* @@ -692,6 +694,7 @@ static size_t rpc_in(char *ptr, size_t eltsize, size_t nmemb, void *buffer_) { size_t size = eltsize * nmemb; + size_t unwritten = size; struct rpc_in_data *data = buffer_; long response_code; @@ -702,7 +705,42 @@ static size_t rpc_in(char *ptr, size_t eltsize, return size; if (size) data->rpc->any_written = 1; - write_or_die(data->rpc->in, ptr, size); + + while (unwritten) { + if (!data->remaining) { + int digits_remaining = 4 - data->len_buf.len; + if (digits_remaining > unwritten) + digits_remaining = unwritten; + strbuf_add(&data->len_buf, ptr, digits_remaining); + ptr += digits_remaining; + unwritten -= digits_remaining; + + if (data->len_buf.len == 4) { + data->remaining = packet_length(data->len_buf.buf); + if (data->remaining < 0) { + die(_("remote-curl: bad line length character: %.4s"), data->len_buf.buf); + } else if (data->remaining <= 1) { + data->remaining = 0; + } else if (data->remaining < 4) { + die(_("remote-curl: bad line length %d"), data->remaining); + } else { + data->remaining -= 4; + } + write_or_die(data->rpc->in, data->len_buf.buf, 4); + strbuf_reset(&data->len_buf); + } + } + + if (data->remaining) { + int remaining = data->remaining; + if (remaining > unwritten) + remaining = unwritten; + write_or_die(data->rpc->in, ptr, remaining); + ptr += remaining; + unwritten -= remaining; + data->remaining -= remaining; + } + } return size; } @@ -920,6 +958,8 @@ static int post_rpc(struct rpc_state *rpc, int flush_received) curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, rpc_in); rpc_in_data.rpc = rpc; rpc_in_data.slot = slot; + strbuf_init(&rpc_in_data.len_buf, 4); + rpc_in_data.remaining = 0; curl_easy_setopt(slot->curl, CURLOPT_FILE, &rpc_in_data); curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0); @@ -936,6 +976,9 @@ static int post_rpc(struct rpc_state *rpc, int flush_received) if (!rpc->any_written) err = -1; + if (rpc_in_data.remaining) + err = error(_("%d bytes are still expected"), rpc_in_data.remaining); + curl_slist_free_all(headers); free(gzip_body); return err;
Currently, remote-curl acts as a proxy and blindly forwards packets between an HTTP server and fetch-pack. In the case of a stateless RPC connection where the connection is terminated with a partially written packet, remote-curl will blindly send the partially written packet before waiting on more input from fetch-pack. Meanwhile, fetch-pack will read the partial packet and continue reading, expecting more input. This results in a deadlock between the two processes. Instead of blindly forwarding packets, inspect each packet to ensure that it is a full packet, erroring out if a partial packet is sent. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Denton Liu <liu.denton@gmail.com> --- Notes: Unfortunately, I'm not really sure how to test this. remote-curl.c | 45 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 44 insertions(+), 1 deletion(-)