Message ID | pull.1767.v4.git.1722489776279.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [v4] http: do not ignore proxy path | expand |
On Thu, Aug 01, 2024 at 05:22:56AM +0000, Ryan Hendrickson via GitGitGadget wrote: > From: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> > > The documentation for `http.proxy` describes that option, and the > environment variables it overrides, as supporting "the syntax understood > by curl". curl allows SOCKS proxies to use a path to a Unix domain > socket, like `socks5h://localhost/path/to/socket.sock`. Git should > therefore include, if present, the path part of the proxy URL in what it > passes to libcurl. > > Co-authored-by: Jeff King <peff@peff.net> > Signed-off-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> Thanks for crediting me. I'll add my: Signed-off-by: Jeff King <peff@peff.net> to be explicit that the proxy script is under the DCO. > - curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); > + strbuf_addstr(&proxy, proxy_auth.host); > + if (proxy_auth.path) { > + curl_version_info_data *ver = curl_version_info(CURLVERSION_NOW); > + > + if (ver->version_num < 0x075400) > + die("libcurl 7.84 or later is required to support paths in proxy URLs"); > + > + if (!starts_with(proxy_auth.protocol, "socks")) > + die("Invalid proxy URL '%s': only SOCKS proxies support paths", > + curl_http_proxy); > + > + if (strcasecmp(proxy_auth.host, "localhost")) > + die("Invalid proxy URL '%s': host must be localhost if a path is present", > + curl_http_proxy); > + > + strbuf_addch(&proxy, '/'); > + strbuf_add_percentencode(&proxy, proxy_auth.path, 0); This all looks good to me. I wondered briefly whether proxy_auth.protocol could ever be NULL, but I think we refuse to parse such a URL. > +start_socks() { > + mkfifo socks_output && > + { > + "$PERL_PATH" "$TEST_DIRECTORY/socks4-proxy.pl" "$1" >socks_output & > + echo $! > "$TRASH_DIRECTORY/socks.pid" > + } && > + read line <socks_output && > + test "$line" = ready > +} > + > +# The %30 tests that the correct amount of percent-encoding is applied to the > +# proxy string passed to curl. > +test_lazy_prereq SOCKS_PROXY 'test_have_prereq PERL && start_socks "$TRASH_DIRECTORY/%30.sock"' OK, I see you figured out that the lazy prereq requires giving the full path to the socket. :) I had forgotten that we also run the prereq in a subshell to avoid side effects, but you caught that, as well. All of this to me is good evidence that the non-lazy version you had originally is a better approach. But I don't think it's worth spending time fighting over, so I'm OK either way. > +test_atexit 'test ! -e "$TRASH_DIRECTORY/socks.pid" || kill "$(cat "$TRASH_DIRECTORY/socks.pid")"' Ah, good. I had meant to mention cleaning up at the end (as we do for git-daemon and apache), but forgot. I'm glad you included it here. > +test_expect_success SOCKS_PROXY 'clone via Unix socket' ' > + test_when_finished "rm -rf clone" && > + test_config_global http.proxy "socks4://localhost$PWD/%2530.sock" && { > + { > + GIT_TRACE_CURL=$PWD/trace git clone "$HTTPD_URL/smart/repo.git" clone 2>err && > + grep -i "SOCKS4 request granted." trace > + } || > + grep "^fatal: libcurl 7\.84 or later" err > + } > +' Looks good. Usually I do not bother with escaping "." in grep messages, as it is already a loose match and it keeps the test easier to read. But it is OK to do so. We do have a test_grep wrapper which gives nicer output when the match fails, but maybe that is a bad choice here, since we accept either of the two messages. (Likewise for using test_cmp, I suppose). The use of "||" in the command-chaining is unusual enough that it's probably worth calling out either in a comment or in the commit message. > +test_expect_success 'Unix socket requires socks*:' ' > + ! git clone -c http.proxy=localhost/path https://example.com/repo.git 2>err && { > + grep "^fatal: Invalid proxy URL '\''localhost/path'\'': only SOCKS proxies support paths" err || > + grep "^fatal: libcurl 7\.84 or later" err > + } > +' > + > +test_expect_success 'Unix socket requires localhost' ' > + ! git clone -c http.proxy=socks4://127.0.0.1/path https://example.com/repo.git 2>err && { > + grep "^fatal: Invalid proxy URL '\''socks4://127\.0\.0\.1/path'\'': host must be localhost if a path is present" err || > + grep "^fatal: libcurl 7\.84 or later" err > + } > +' > + Likewise here, I'd probably just match the single-quotes with "." for readability (but it's OK to write it as you did). You can (since a week or so ago) also use a here-doc body like: test_expect_success 'Unix socket requires socks*:' - <<\EOT ... test_grep "^fatal: Invalid proxy URL 'localhost/path': only SOCKS proxies support paths" err || ... EOT but I'm OK with it either way. The same "||" comment applies. -Peff
Jeff King <peff@peff.net> writes: > On Thu, Aug 01, 2024 at 05:22:56AM +0000, Ryan Hendrickson via GitGitGadget wrote: > >> From: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> >> >> The documentation for `http.proxy` describes that option, and the >> environment variables it overrides, as supporting "the syntax understood >> by curl". curl allows SOCKS proxies to use a path to a Unix domain >> socket, like `socks5h://localhost/path/to/socket.sock`. Git should >> therefore include, if present, the path part of the proxy URL in what it >> passes to libcurl. >> >> Co-authored-by: Jeff King <peff@peff.net> >> Signed-off-by: Ryan Hendrickson <ryan.hendrickson@alum.mit.edu> > > Thanks for crediting me. I'll add my: > > Signed-off-by: Jeff King <peff@peff.net> > > to be explicit that the proxy script is under the DCO. OK, I'll amend it while queuing this v4. Thanks. >> +# The %30 tests that the correct amount of percent-encoding is applied to the >> +# proxy string passed to curl. >> +test_lazy_prereq SOCKS_PROXY 'test_have_prereq PERL && start_socks "$TRASH_DIRECTORY/%30.sock"' > > OK, I see you figured out that the lazy prereq requires giving the full > path to the socket. :) I had forgotten that we also run the prereq in a > subshell to avoid side effects, but you caught that, as well. ;-) > All of this to me is good evidence that the non-lazy version you had > originally is a better approach. But I don't think it's worth spending > time fighting over, so I'm OK either way. I'd be OK either way, too. Thanks, both.
diff --git a/Documentation/config/http.txt b/Documentation/config/http.txt index 162b33fc52f..a14371b5c96 100644 --- a/Documentation/config/http.txt +++ b/Documentation/config/http.txt @@ -5,8 +5,8 @@ http.proxy:: proxy string with a user name but no password, in which case git will attempt to acquire one in the same way it does for other credentials. See linkgit:gitcredentials[7] for more information. The syntax thus is - '[protocol://][user[:password]@]proxyhost[:port]'. This can be overridden - on a per-remote basis; see remote.<name>.proxy + '[protocol://][user[:password]@]proxyhost[:port][/path]'. This can be + overridden on a per-remote basis; see remote.<name>.proxy + Any proxy, however configured, must be completely transparent and must not modify, transform, or buffer the request or response in any way. Proxies which diff --git a/http.c b/http.c index 623ed234891..6c6cc5c822a 100644 --- a/http.c +++ b/http.c @@ -1227,6 +1227,8 @@ static CURL *get_curl_handle(void) */ curl_easy_setopt(result, CURLOPT_PROXY, ""); } else if (curl_http_proxy) { + struct strbuf proxy = STRBUF_INIT; + if (starts_with(curl_http_proxy, "socks5h")) curl_easy_setopt(result, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5_HOSTNAME); @@ -1265,7 +1267,27 @@ static CURL *get_curl_handle(void) if (!proxy_auth.host) die("Invalid proxy URL '%s'", curl_http_proxy); - curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); + strbuf_addstr(&proxy, proxy_auth.host); + if (proxy_auth.path) { + curl_version_info_data *ver = curl_version_info(CURLVERSION_NOW); + + if (ver->version_num < 0x075400) + die("libcurl 7.84 or later is required to support paths in proxy URLs"); + + if (!starts_with(proxy_auth.protocol, "socks")) + die("Invalid proxy URL '%s': only SOCKS proxies support paths", + curl_http_proxy); + + if (strcasecmp(proxy_auth.host, "localhost")) + die("Invalid proxy URL '%s': host must be localhost if a path is present", + curl_http_proxy); + + strbuf_addch(&proxy, '/'); + strbuf_add_percentencode(&proxy, proxy_auth.path, 0); + } + curl_easy_setopt(result, CURLOPT_PROXY, proxy.buf); + strbuf_release(&proxy); + var_override(&curl_no_proxy, getenv("NO_PROXY")); var_override(&curl_no_proxy, getenv("no_proxy")); curl_easy_setopt(result, CURLOPT_NOPROXY, curl_no_proxy); diff --git a/t/socks4-proxy.pl b/t/socks4-proxy.pl new file mode 100644 index 00000000000..4c3a35c0083 --- /dev/null +++ b/t/socks4-proxy.pl @@ -0,0 +1,48 @@ +use strict; +use IO::Select; +use IO::Socket::UNIX; +use IO::Socket::INET; + +my $path = shift; + +unlink($path); +my $server = IO::Socket::UNIX->new(Listen => 1, Local => $path) + or die "unable to listen on $path: $!"; + +$| = 1; +print "ready\n"; + +while (my $client = $server->accept()) { + sysread $client, my $buf, 8; + my ($version, $cmd, $port, $ip) = unpack 'CCnN', $buf; + next unless $version == 4; # socks4 + next unless $cmd == 1; # TCP stream connection + + # skip NUL-terminated id + while (sysread $client, my $char, 1) { + last unless ord($char); + } + + # version(0), reply(5a == granted), port (ignored), ip (ignored) + syswrite $client, "\x00\x5a\x00\x00\x00\x00\x00\x00"; + + my $remote = IO::Socket::INET->new(PeerHost => $ip, PeerPort => $port) + or die "unable to connect to $ip/$port: $!"; + + my $io = IO::Select->new($client, $remote); + while ($io->count) { + for my $fh ($io->can_read(0)) { + for my $pair ([$client, $remote], [$remote, $client]) { + my ($from, $to) = @$pair; + next unless $fh == $from; + + my $r = sysread $from, my $buf, 1024; + if (!defined $r || $r <= 0) { + $io->remove($from); + next; + } + syswrite $to, $buf; + } + } + } +} diff --git a/t/t5564-http-proxy.sh b/t/t5564-http-proxy.sh index bb35b87071d..039d7fc748e 100755 --- a/t/t5564-http-proxy.sh +++ b/t/t5564-http-proxy.sh @@ -39,4 +39,45 @@ test_expect_success 'clone can prompt for proxy password' ' expect_askpass pass proxuser ' +start_socks() { + mkfifo socks_output && + { + "$PERL_PATH" "$TEST_DIRECTORY/socks4-proxy.pl" "$1" >socks_output & + echo $! > "$TRASH_DIRECTORY/socks.pid" + } && + read line <socks_output && + test "$line" = ready +} + +# The %30 tests that the correct amount of percent-encoding is applied to the +# proxy string passed to curl. +test_lazy_prereq SOCKS_PROXY 'test_have_prereq PERL && start_socks "$TRASH_DIRECTORY/%30.sock"' + +test_atexit 'test ! -e "$TRASH_DIRECTORY/socks.pid" || kill "$(cat "$TRASH_DIRECTORY/socks.pid")"' + +test_expect_success SOCKS_PROXY 'clone via Unix socket' ' + test_when_finished "rm -rf clone" && + test_config_global http.proxy "socks4://localhost$PWD/%2530.sock" && { + { + GIT_TRACE_CURL=$PWD/trace git clone "$HTTPD_URL/smart/repo.git" clone 2>err && + grep -i "SOCKS4 request granted." trace + } || + grep "^fatal: libcurl 7\.84 or later" err + } +' + +test_expect_success 'Unix socket requires socks*:' ' + ! git clone -c http.proxy=localhost/path https://example.com/repo.git 2>err && { + grep "^fatal: Invalid proxy URL '\''localhost/path'\'': only SOCKS proxies support paths" err || + grep "^fatal: libcurl 7\.84 or later" err + } +' + +test_expect_success 'Unix socket requires localhost' ' + ! git clone -c http.proxy=socks4://127.0.0.1/path https://example.com/repo.git 2>err && { + grep "^fatal: Invalid proxy URL '\''socks4://127\.0\.0\.1/path'\'': host must be localhost if a path is present" err || + grep "^fatal: libcurl 7\.84 or later" err + } +' + test_done