From patchwork Tue Sep 14 15:30:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 12493967 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EE3E8C433EF for ; Tue, 14 Sep 2021 15:30:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D8F4E61157 for ; Tue, 14 Sep 2021 15:30:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234721AbhINPbk (ORCPT ); Tue, 14 Sep 2021 11:31:40 -0400 Received: from cloud.peff.net ([104.130.231.41]:47008 "EHLO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234724AbhINPbj (ORCPT ); Tue, 14 Sep 2021 11:31:39 -0400 Received: (qmail 24297 invoked by uid 109); 14 Sep 2021 15:30:21 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 14 Sep 2021 15:30:21 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 24213 invoked by uid 111); 14 Sep 2021 15:30:20 -0000 Received: from coredump.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 14 Sep 2021 11:30:20 -0400 Authentication-Results: peff.net; auth=none Date: Tue, 14 Sep 2021 11:30:20 -0400 From: Jeff King To: git@vger.kernel.org Cc: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason Subject: [PATCH 1/9] serve: rename is_command() to parse_command() Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The is_command() function not only tells us whether the pktline is a valid command string, but it also parses out the command (and complains if we see a duplicate). Let's rename it to make those extra functions a bit more obvious. Signed-off-by: Jeff King --- Obviously not strictly necessary, but the name caused me a fair bit of confusion at first while touching this code. serve.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/serve.c b/serve.c index 1817edc7f5..fd88b95343 100644 --- a/serve.c +++ b/serve.c @@ -163,7 +163,7 @@ static int is_valid_capability(const char *key) return c && c->advertise(the_repository, NULL); } -static int is_command(const char *key, struct protocol_capability **command) +static int parse_command(const char *key, struct protocol_capability **command) { const char *out; @@ -251,7 +251,7 @@ static int process_request(void) BUG("Should have already died when seeing EOF"); case PACKET_READ_NORMAL: /* collect request; a sequence of keys and values */ - if (is_command(reader.line, &command) || + if (parse_command(reader.line, &command) || is_valid_capability(reader.line)) strvec_push(&keys, reader.line); else From patchwork Tue Sep 14 15:30:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 12493969 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEE67C433EF for ; Tue, 14 Sep 2021 15:30:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A3B866113B for ; Tue, 14 Sep 2021 15:30:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234682AbhINPcJ (ORCPT ); Tue, 14 Sep 2021 11:32:09 -0400 Received: from cloud.peff.net ([104.130.231.41]:47012 "EHLO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234167AbhINPcI (ORCPT ); Tue, 14 Sep 2021 11:32:08 -0400 Received: (qmail 24303 invoked by uid 109); 14 Sep 2021 15:30:51 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 14 Sep 2021 15:30:51 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 24223 invoked by uid 111); 14 Sep 2021 15:30:50 -0000 Received: from coredump.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 14 Sep 2021 11:30:50 -0400 Authentication-Results: peff.net; auth=none Date: Tue, 14 Sep 2021 11:30:50 -0400 From: Jeff King To: git@vger.kernel.org Cc: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason Subject: [PATCH 2/9] serve: return capability "value" from get_capability() Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org When the client sends v2 capabilities, they may be simple, like: foo or have a value like: foo=bar (all of the current capabilities actually expect a value, but the protocol allows for boolean ones). We use get_capability() to make sure the client's pktline matches a capability. In doing so, we parse enough to see the "=" and the value (if any), but we immediately forget it. Nobody cares for now, because they end up parsing the values out later using has_capability(). But in preparation for changing that, let's pass back a pointer so the callers know what we found. Note that unlike has_capability(), we'll return NULL for a "simple" capability. Distinguishing these will be useful for some future patches. Signed-off-by: Jeff King --- We get rid of has_capability() later, so the inconsistency in return types will go away. serve.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/serve.c b/serve.c index fd88b95343..78a4e83554 100644 --- a/serve.c +++ b/serve.c @@ -139,7 +139,7 @@ void protocol_v2_advertise_capabilities(void) strbuf_release(&value); } -static struct protocol_capability *get_capability(const char *key) +static struct protocol_capability *get_capability(const char *key, const char **value) { int i; @@ -149,16 +149,25 @@ static struct protocol_capability *get_capability(const char *key) for (i = 0; i < ARRAY_SIZE(capabilities); i++) { struct protocol_capability *c = &capabilities[i]; const char *out; - if (skip_prefix(key, c->name, &out) && (!*out || *out == '=')) + if (!skip_prefix(key, c->name, &out)) + continue; + if (!*out) { + *value = NULL; return c; + } + if (*out++ == '=') { + *value = out; + return c; + } } return NULL; } static int is_valid_capability(const char *key) { - const struct protocol_capability *c = get_capability(key); + const char *value; + const struct protocol_capability *c = get_capability(key, &value); return c && c->advertise(the_repository, NULL); } @@ -168,7 +177,8 @@ static int parse_command(const char *key, struct protocol_capability **command) const char *out; if (skip_prefix(key, "command=", &out)) { - struct protocol_capability *cmd = get_capability(out); + const char *value; + struct protocol_capability *cmd = get_capability(out, &value); if (*command) die("command '%s' requested after already requesting command '%s'", From patchwork Tue Sep 14 15:31:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 12493971 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 446B0C433EF for ; Tue, 14 Sep 2021 15:31:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2748160234 for ; Tue, 14 Sep 2021 15:31:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233642AbhINPc0 (ORCPT ); Tue, 14 Sep 2021 11:32:26 -0400 Received: from cloud.peff.net ([104.130.231.41]:47016 "EHLO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233202AbhINPcZ (ORCPT ); Tue, 14 Sep 2021 11:32:25 -0400 Received: (qmail 24309 invoked by uid 109); 14 Sep 2021 15:31:08 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 14 Sep 2021 15:31:08 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 24227 invoked by uid 111); 14 Sep 2021 15:31:07 -0000 Received: from coredump.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 14 Sep 2021 11:31:07 -0400 Authentication-Results: peff.net; auth=none Date: Tue, 14 Sep 2021 11:31:07 -0400 From: Jeff King To: git@vger.kernel.org Cc: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason Subject: [PATCH 3/9] serve: add "receive" method for v2 capabilities table Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org We have a capabilities table that tells us what we should tell the client we are capable of, and what to do when a client gives us a particular command (e.g., "command=ls-refs"). But it doesn't tell us what to do when the client sends us back a capability (e.g., "object-format=sha256"). We just collect them all in a strvec and hope somebody can use them later. Instead, let's provide a function pointer in the table to act on these. This will eventually help us avoid collecting the strings, which will be more efficient and less prone to mischief. Using the new method is optional, which helps in two ways: - we can move existing capabilities over to this new system gradually in individual commits - some capabilities we don't actually do anything with anyway. For example, the client is free to say "agent=git/1.2.3" to us, but we do not act on the information in any way. Signed-off-by: Jeff King --- serve.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/serve.c b/serve.c index 78a4e83554..a161189984 100644 --- a/serve.c +++ b/serve.c @@ -70,6 +70,16 @@ struct protocol_capability { * This field should be NULL for capabilities which are not commands. */ int (*command)(struct repository *r, struct packet_reader *request); + + /* + * Function called when a client requests the capability as a + * non-command. This may be NULL if the capability does nothing. + * + * For a capability of the form "foo=bar", the value string points to + * the content after the "=" (i.e., "bar"). For simple capabilities + * (just "foo"), it is NULL. + */ + void (*receive)(struct repository *r, const char *value); }; static struct protocol_capability capabilities[] = { @@ -164,12 +174,17 @@ static struct protocol_capability *get_capability(const char *key, const char ** return NULL; } -static int is_valid_capability(const char *key) +static int receive_client_capability(const char *key) { const char *value; const struct protocol_capability *c = get_capability(key, &value); - return c && c->advertise(the_repository, NULL); + if (!c || !c->advertise(the_repository, NULL)) + return 0; + + if (c->receive) + c->receive(the_repository, value); + return 1; } static int parse_command(const char *key, struct protocol_capability **command) @@ -262,7 +277,7 @@ static int process_request(void) case PACKET_READ_NORMAL: /* collect request; a sequence of keys and values */ if (parse_command(reader.line, &command) || - is_valid_capability(reader.line)) + receive_client_capability(reader.line)) strvec_push(&keys, reader.line); else die("unknown capability '%s'", reader.line); From patchwork Tue Sep 14 15:31:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 12493973 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C0F3BC433EF for ; Tue, 14 Sep 2021 15:31:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A664E604D1 for ; Tue, 14 Sep 2021 15:31:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234825AbhINPcw (ORCPT ); Tue, 14 Sep 2021 11:32:52 -0400 Received: from cloud.peff.net ([104.130.231.41]:47020 "EHLO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234274AbhINPct (ORCPT ); Tue, 14 Sep 2021 11:32:49 -0400 Received: (qmail 24316 invoked by uid 109); 14 Sep 2021 15:31:31 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 14 Sep 2021 15:31:31 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 24231 invoked by uid 111); 14 Sep 2021 15:31:30 -0000 Received: from coredump.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 14 Sep 2021 11:31:30 -0400 Authentication-Results: peff.net; auth=none Date: Tue, 14 Sep 2021 11:31:30 -0400 From: Jeff King To: git@vger.kernel.org Cc: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason Subject: [PATCH 4/9] serve: provide "receive" function for object-format capability Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org We get any "object-format" specified by the client by searching for it in the collected list of capabilities the client sent. We can instead just handle it as soon as they send it. This is slightly more efficient, and gets us one step closer to dropping that collected list. Note that we do still have to do our final hash check after receiving all capabilities (because they might not have sent an object-format line at all, and we still have to check that the default matches our repository algorithm). Since the check_algorithm() function would now be done to a single if() statement, I've just inlined it in its only caller. There should be no change of behavior here, except for two broken-protocol cases: - if the client sends multiple conflicting object-format capabilities (which they should not), we'll now choose the last one rather than the first. We could also detect and complain about the duplicates quite easily now, which we could not before, but I didn't do so here. - if the client sends a bogus "object-format" with no equals sign, we'll now say so, rather than "unknown object format: ''" Signed-off-by: Jeff King --- serve.c | 34 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/serve.c b/serve.c index a161189984..f6ea2953eb 100644 --- a/serve.c +++ b/serve.c @@ -10,6 +10,7 @@ #include "upload-pack.h" static int advertise_sid = -1; +static int client_hash_algo = GIT_HASH_SHA1; static int always_advertise(struct repository *r, struct strbuf *value) @@ -33,6 +34,17 @@ static int object_format_advertise(struct repository *r, return 1; } +static void object_format_receive(struct repository *r, + const char *algo_name) +{ + if (!algo_name) + die("object-format capability requires an argument"); + + client_hash_algo = hash_algo_by_name(algo_name); + if (client_hash_algo == GIT_HASH_UNKNOWN) + die("unknown object format '%s'", algo_name); +} + static int session_id_advertise(struct repository *r, struct strbuf *value) { if (advertise_sid == -1 && @@ -104,6 +116,7 @@ static struct protocol_capability capabilities[] = { { .name = "object-format", .advertise = object_format_advertise, + .receive = object_format_receive, }, { .name = "session-id", @@ -228,22 +241,6 @@ static int has_capability(const struct strvec *keys, const char *capability, return 0; } -static void check_algorithm(struct repository *r, struct strvec *keys) -{ - int client = GIT_HASH_SHA1, server = hash_algo_by_ptr(r->hash_algo); - const char *algo_name; - - if (has_capability(keys, "object-format", &algo_name)) { - client = hash_algo_by_name(algo_name); - if (client == GIT_HASH_UNKNOWN) - die("unknown object format '%s'", algo_name); - } - - if (client != server) - die("mismatched object format: server %s; client %s\n", - r->hash_algo->name, hash_algos[client].name); -} - enum request_state { PROCESS_REQUEST_KEYS, PROCESS_REQUEST_DONE, @@ -317,7 +314,10 @@ static int process_request(void) if (!command) die("no command requested"); - check_algorithm(the_repository, &keys); + if (client_hash_algo != hash_algo_by_ptr(the_repository->hash_algo)) + die("mismatched object format: server %s; client %s\n", + the_repository->hash_algo->name, + hash_algos[client_hash_algo].name); if (has_capability(&keys, "session-id", &client_sid)) trace2_data_string("transfer", NULL, "client-sid", client_sid); From patchwork Tue Sep 14 15:33:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 12493975 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DEE38C433EF for ; Tue, 14 Sep 2021 15:33:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BE55B60F21 for ; Tue, 14 Sep 2021 15:33:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234162AbhINPe3 (ORCPT ); Tue, 14 Sep 2021 11:34:29 -0400 Received: from cloud.peff.net ([104.130.231.41]:47024 "EHLO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232656AbhINPe1 (ORCPT ); Tue, 14 Sep 2021 11:34:27 -0400 Received: (qmail 24322 invoked by uid 109); 14 Sep 2021 15:33:09 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 14 Sep 2021 15:33:09 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 24235 invoked by uid 111); 14 Sep 2021 15:33:09 -0000 Received: from coredump.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 14 Sep 2021 11:33:09 -0400 Authentication-Results: peff.net; auth=none Date: Tue, 14 Sep 2021 11:33:09 -0400 From: Jeff King To: git@vger.kernel.org Cc: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason Subject: [PATCH 5/9] serve: provide "receive" function for session-id capability Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Rather than pulling the session-id string from the list of collected capabilities, we can handle it as soon as we receive it. This gets us closer to dropping the collected list entirely. The behavior should be the same, with one exception. Previously if the client sent us multiple session-id lines, we'd report only the first. Now we'll pass each one along to trace2. This shouldn't matter in practice, since clients shouldn't do that (and if they do, it's probably sensible to log them all). As this removes the last caller of the static has_capability(), we can remove it, as well (and in fact we must to avoid -Wunused-function complaining). Signed-off-by: Jeff King --- I had originally dropped has_capability() in a separate patch, to keep this one more readable. That breaks bisectability, but only with -Werror. I'm not sure where we should fall on that spectrum (I generally bisect with -Wno-error just because warnings may come and go when working with different compilers than what was normal at the time). Not that big a deal either way for this patch, but I wonder if people have opinions in general. serve.c | 33 +++++++++------------------------ 1 file changed, 9 insertions(+), 24 deletions(-) diff --git a/serve.c b/serve.c index f6ea2953eb..6bbf54cbbe 100644 --- a/serve.c +++ b/serve.c @@ -57,6 +57,14 @@ static int session_id_advertise(struct repository *r, struct strbuf *value) return 1; } +static void session_id_receive(struct repository *r, + const char *client_sid) +{ + if (!client_sid) + client_sid = ""; + trace2_data_string("transfer", NULL, "client-sid", client_sid); +} + struct protocol_capability { /* * The name of the capability. The server uses this name when @@ -121,6 +129,7 @@ static struct protocol_capability capabilities[] = { { .name = "session-id", .advertise = session_id_advertise, + .receive = session_id_receive, }, { .name = "object-info", @@ -221,26 +230,6 @@ static int parse_command(const char *key, struct protocol_capability **command) return 0; } -static int has_capability(const struct strvec *keys, const char *capability, - const char **value) -{ - int i; - for (i = 0; i < keys->nr; i++) { - const char *out; - if (skip_prefix(keys->v[i], capability, &out) && - (!*out || *out == '=')) { - if (value) { - if (*out == '=') - out++; - *value = out; - } - return 1; - } - } - - return 0; -} - enum request_state { PROCESS_REQUEST_KEYS, PROCESS_REQUEST_DONE, @@ -252,7 +241,6 @@ static int process_request(void) struct packet_reader reader; struct strvec keys = STRVEC_INIT; struct protocol_capability *command = NULL; - const char *client_sid; packet_reader_init(&reader, 0, NULL, 0, PACKET_READ_CHOMP_NEWLINE | @@ -319,9 +307,6 @@ static int process_request(void) the_repository->hash_algo->name, hash_algos[client_hash_algo].name); - if (has_capability(&keys, "session-id", &client_sid)) - trace2_data_string("transfer", NULL, "client-sid", client_sid); - command->command(the_repository, &reader); strvec_clear(&keys); From patchwork Tue Sep 14 15:33:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 12493977 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 64DAAC433F5 for ; Tue, 14 Sep 2021 15:33:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4497060F21 for ; Tue, 14 Sep 2021 15:33:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234357AbhINPeh (ORCPT ); Tue, 14 Sep 2021 11:34:37 -0400 Received: from cloud.peff.net ([104.130.231.41]:47030 "EHLO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234274AbhINPeh (ORCPT ); Tue, 14 Sep 2021 11:34:37 -0400 Received: (qmail 24329 invoked by uid 109); 14 Sep 2021 15:33:19 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 14 Sep 2021 15:33:19 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 24239 invoked by uid 111); 14 Sep 2021 15:33:18 -0000 Received: from coredump.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 14 Sep 2021 11:33:18 -0400 Authentication-Results: peff.net; auth=none Date: Tue, 14 Sep 2021 11:33:18 -0400 From: Jeff King To: git@vger.kernel.org Cc: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason Subject: [PATCH 6/9] serve: drop "keys" strvec Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org We collect the set of capabilities the client sends us in a strvec. While this is usually small, there's no limit to the number of capabilities the client can send us (e.g., they could just send us "agent" pkt-lines over and over, and we'd keep adding them to the list). Since all code has been converted away from using this list, let's get rid of it. This avoids a potential attack where clients waste our memory. Note that we do have to replace it with a flag, because some of the flush-packet logic checks whether we've seen any valid commands or keys. Signed-off-by: Jeff King --- serve.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/serve.c b/serve.c index 6bbf54cbbe..baa0a17502 100644 --- a/serve.c +++ b/serve.c @@ -239,7 +239,7 @@ static int process_request(void) { enum request_state state = PROCESS_REQUEST_KEYS; struct packet_reader reader; - struct strvec keys = STRVEC_INIT; + int seen_capability_or_command = 0; struct protocol_capability *command = NULL; packet_reader_init(&reader, 0, NULL, 0, @@ -263,10 +263,11 @@ static int process_request(void) /* collect request; a sequence of keys and values */ if (parse_command(reader.line, &command) || receive_client_capability(reader.line)) - strvec_push(&keys, reader.line); + seen_capability_or_command = 1; else die("unknown capability '%s'", reader.line); + /* Consume the peeked line */ packet_reader_read(&reader); break; @@ -275,7 +276,7 @@ static int process_request(void) * If no command and no keys were given then the client * wanted to terminate the connection. */ - if (!keys.nr) + if (!seen_capability_or_command) return 1; /* @@ -309,7 +310,6 @@ static int process_request(void) command->command(the_repository, &reader); - strvec_clear(&keys); return 0; } From patchwork Tue Sep 14 15:37:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 12493989 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 63384C433F5 for ; Tue, 14 Sep 2021 15:37:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 430D960241 for ; Tue, 14 Sep 2021 15:37:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234274AbhINPiZ (ORCPT ); Tue, 14 Sep 2021 11:38:25 -0400 Received: from cloud.peff.net ([104.130.231.41]:47032 "EHLO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233202AbhINPiZ (ORCPT ); Tue, 14 Sep 2021 11:38:25 -0400 Received: (qmail 24340 invoked by uid 109); 14 Sep 2021 15:37:07 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 14 Sep 2021 15:37:07 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 24284 invoked by uid 111); 14 Sep 2021 15:37:06 -0000 Received: from coredump.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 14 Sep 2021 11:37:06 -0400 Authentication-Results: peff.net; auth=none Date: Tue, 14 Sep 2021 11:37:06 -0400 From: Jeff King To: git@vger.kernel.org Cc: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason Subject: [PATCH 7/9] ls-refs: ignore very long ref-prefix counts Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Because each "ref-prefix" capability from the client comes in its own pkt-line, there's no limit to the number of them that a misbehaving client may send. We read them all into a strvec, which means the client can waste arbitrary amounts of our memory by just sending us "ref-prefix foo" over and over. One possible solution is to just drop the connection when the limit is reached. If we set it high enough, then only misbehaving or malicious clients would hit it. But "high enough" is vague, and it's unfriendly if we guess wrong and a legitimate client hits this. But we can do better. Since supporting the ref-prefix capability is optional anyway, the client has to further cull the response based on their own patterns. So we can simply ignore the patterns once we cross a certain threshold. Note that we have to ignore _all_ patterns, not just the ones past our limit (since otherwise we'd send too little data). The limit here is fairly arbitrary, and probably much higher than anyone would need in practice. It might be worth limiting it further, if only because we check it linearly (so with "m" local refs and "n" patterns, we do "m * n" string comparisons). But if we care about optimizing this, an even better solution may be a more advanced data structure anyway. I didn't bother making the limit configurable, since it's so high and since Git should behave correctly in either case. It wouldn't be too hard to do, but it makes both the code and documentation more complex. Signed-off-by: Jeff King --- We're perhaps bending "optional" a little here. The client does know if we said "yes, we support ref-prefix" and until now, that meant they could trust us to cull. But no version of Git has ever relied on that (we tell the transport code "if you can limit by these prefixes, go for it" but then just post-process the result). The other option is that we could just say "no, you're sending too many prefixes" and hangup. This seemed friendlier to me (though either way, I really find it quite unlikely anybody would legitimately hit this limit). ls-refs.c | 19 +++++++++++++++++-- t/t5701-git-serve.sh | 31 +++++++++++++++++++++++++++++++ 2 files changed, 48 insertions(+), 2 deletions(-) diff --git a/ls-refs.c b/ls-refs.c index a1a0250607..839fb0caa9 100644 --- a/ls-refs.c +++ b/ls-refs.c @@ -40,6 +40,12 @@ static void ensure_config_read(void) config_read = 1; } +/* + * The maximum number of "ref-prefix" lines we'll allow the client to send. + * If they go beyond this, we'll avoid using the prefix feature entirely. + */ +#define MAX_ALLOWED_PREFIXES 65536 + /* * Check if one of the prefixes is a prefix of the ref. * If no prefixes were provided, all refs match. @@ -141,6 +147,7 @@ static int ls_refs_config(const char *var, const char *value, void *data) int ls_refs(struct repository *r, struct packet_reader *request) { struct ls_refs_data data; + int too_many_prefixes = 0; memset(&data, 0, sizeof(data)); strvec_init(&data.prefixes); @@ -156,8 +163,16 @@ int ls_refs(struct repository *r, struct packet_reader *request) data.peel = 1; else if (!strcmp("symrefs", arg)) data.symrefs = 1; - else if (skip_prefix(arg, "ref-prefix ", &out)) - strvec_push(&data.prefixes, out); + else if (skip_prefix(arg, "ref-prefix ", &out)) { + if (too_many_prefixes) { + /* ignore any further ones */ + } else if (data.prefixes.nr >= MAX_ALLOWED_PREFIXES) { + strvec_clear(&data.prefixes); + too_many_prefixes = 1; + } else { + strvec_push(&data.prefixes, out); + } + } else if (!strcmp("unborn", arg)) data.unborn = allow_unborn; } diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh index 930721f053..b095bfa0ac 100755 --- a/t/t5701-git-serve.sh +++ b/t/t5701-git-serve.sh @@ -158,6 +158,37 @@ test_expect_success 'refs/heads prefix' ' test_cmp expect actual ' +test_expect_success 'ignore very large set of prefixes' ' + # generate a large number of ref-prefixes that we expect + # to match nothing; the value here exceeds MAX_ALLOWED_PREFIXES + # from ls-refs.c. + { + echo command=ls-refs && + echo object-format=$(test_oid algo) + echo 0001 && + perl -le "print \"refs/heads/$_\" for (1..65536+1)" && + echo 0000 + } | + test-tool pkt-line pack >in && + + # and then confirm that we see unmatched prefixes anyway (i.e., + # that the prefix was not applied). + cat >expect <<-EOF && + $(git rev-parse HEAD) HEAD + $(git rev-parse refs/heads/dev) refs/heads/dev + $(git rev-parse refs/heads/main) refs/heads/main + $(git rev-parse refs/heads/release) refs/heads/release + $(git rev-parse refs/tags/annotated-tag) refs/tags/annotated-tag + $(git rev-parse refs/tags/one) refs/tags/one + $(git rev-parse refs/tags/two) refs/tags/two + 0000 + EOF + + test-tool serve-v2 --stateless-rpc out && + test-tool pkt-line unpack actual && + test_cmp expect actual +' + test_expect_success 'peel parameter' ' test-tool pkt-line pack >in <<-EOF && command=ls-refs From patchwork Tue Sep 14 15:37:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 12493991 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B381BC433FE for ; Tue, 14 Sep 2021 15:37:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 96C3860F36 for ; Tue, 14 Sep 2021 15:37:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234388AbhINPik (ORCPT ); Tue, 14 Sep 2021 11:38:40 -0400 Received: from cloud.peff.net ([104.130.231.41]:47036 "EHLO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233202AbhINPij (ORCPT ); Tue, 14 Sep 2021 11:38:39 -0400 Received: (qmail 24345 invoked by uid 109); 14 Sep 2021 15:37:22 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 14 Sep 2021 15:37:22 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 24288 invoked by uid 111); 14 Sep 2021 15:37:21 -0000 Received: from coredump.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 14 Sep 2021 11:37:21 -0400 Authentication-Results: peff.net; auth=none Date: Tue, 14 Sep 2021 11:37:21 -0400 From: Jeff King To: git@vger.kernel.org Cc: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason Subject: [PATCH 8/9] serve: reject bogus v2 "command=ls-refs=foo" Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org When we see a line from the client like "command=ls-refs", we parse everything after the equals sign as a capability, which we check against our capabilities table. If we don't recognize the command (e.g., "command=foo"), we'll reject it. But we use the same parser that checks for regular capabilities like "object-format=sha256". And so we'll accept "ls-refs=foo", even though everything after the equals is bogus, and simply ignored. This isn't really hurting anything, but the request does violate the spec. Let's tighten it up to prevent any surprising behavior. Signed-off-by: Jeff King --- serve.c | 2 +- t/t5701-git-serve.sh | 10 ++++++++++ 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/serve.c b/serve.c index baa0a17502..123abbaa83 100644 --- a/serve.c +++ b/serve.c @@ -220,7 +220,7 @@ static int parse_command(const char *key, struct protocol_capability **command) if (*command) die("command '%s' requested after already requesting command '%s'", out, (*command)->name); - if (!cmd || !cmd->advertise(the_repository, NULL) || !cmd->command) + if (!cmd || !cmd->advertise(the_repository, NULL) || !cmd->command || value) die("invalid command '%s'", out); *command = cmd; diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh index b095bfa0ac..ebb41657ab 100755 --- a/t/t5701-git-serve.sh +++ b/t/t5701-git-serve.sh @@ -72,6 +72,16 @@ test_expect_success 'request invalid command' ' test_i18ngrep "invalid command" err ' +test_expect_success 'requested command is command=value' ' + test-tool pkt-line pack >in <<-\EOF && + command=ls-refs=whatever + object-format=$(test_oid algo) + 0000 + EOF + test_must_fail test-tool serve-v2 --stateless-rpc 2>err in <<-EOF && command=fetch From patchwork Tue Sep 14 15:37:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 12493993 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 14887C433EF for ; Tue, 14 Sep 2021 15:37:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E97A660F36 for ; Tue, 14 Sep 2021 15:37:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234426AbhINPjB (ORCPT ); Tue, 14 Sep 2021 11:39:01 -0400 Received: from cloud.peff.net ([104.130.231.41]:47040 "EHLO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233202AbhINPjB (ORCPT ); Tue, 14 Sep 2021 11:39:01 -0400 Received: (qmail 24351 invoked by uid 109); 14 Sep 2021 15:37:43 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 14 Sep 2021 15:37:43 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 24292 invoked by uid 111); 14 Sep 2021 15:37:42 -0000 Received: from coredump.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 14 Sep 2021 11:37:42 -0400 Authentication-Results: peff.net; auth=none Date: Tue, 14 Sep 2021 11:37:42 -0400 From: Jeff King To: git@vger.kernel.org Cc: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason Subject: [PATCH 9/9] serve: reject commands used as capabilities Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Our table of v2 "capabilities" contains everything we might tell the client we support. But there are differences in how we expect the client to respond. Some of the entries are true capabilities (i.e., we expect the client to say "yes, I support this"), and some are ones we expect them to send as commands (with "command=ls-refs" or similar). When we receive a capability used as a command, we complain about that. But when we receive a command used as a capability (e.g., just "ls-refs" in a pkt-line by itself), we silently ignore it. This isn't really hurting anything (clients shouldn't send it, and we'll ignore it), but we can tighten up the protocol to match what we expect to happen. There are two new tests here. The first one checks a capability used as a command, which already passes. The second tests a command as a capability, which this patch fixes. Signed-off-by: Jeff King --- serve.c | 2 +- t/t5701-git-serve.sh | 19 +++++++++++++++++++ 2 files changed, 20 insertions(+), 1 deletion(-) diff --git a/serve.c b/serve.c index 123abbaa83..9149fbb2f2 100644 --- a/serve.c +++ b/serve.c @@ -201,7 +201,7 @@ static int receive_client_capability(const char *key) const char *value; const struct protocol_capability *c = get_capability(key, &value); - if (!c || !c->advertise(the_repository, NULL)) + if (!c || c->command || !c->advertise(the_repository, NULL)) return 0; if (c->receive) diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh index ebb41657ab..209122b747 100755 --- a/t/t5701-git-serve.sh +++ b/t/t5701-git-serve.sh @@ -72,6 +72,25 @@ test_expect_success 'request invalid command' ' test_i18ngrep "invalid command" err ' +test_expect_success 'request capability as command' ' + test-tool pkt-line pack >in <<-\EOF && + command=agent + 0000 + EOF + test_must_fail test-tool serve-v2 --stateless-rpc 2>err in <<-\EOF && + command=ls-refs + fetch + 0000 + EOF + test_must_fail test-tool serve-v2 --stateless-rpc 2>err in <<-\EOF && command=ls-refs=whatever