perf: do allow `GIT_PERF_*` to be overridden again

Message ID	pull.1900.git.1743764167548.gitgitgadget@gmail.com (mailing list archive)
State	New
Headers	show Received: from mail-wr1-f50.google.com (mail-wr1-f50.google.com [209.85.221.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7A6431BEF87 for <git@vger.kernel.org>; Fri, 4 Apr 2025 10:56:10 +0000 (UTC) Message-Id: <pull.1900.git.1743764167548.gitgitgadget@gmail.com> Date: Fri, 04 Apr 2025 10:56:07 +0000 Subject: [PATCH] perf: do allow `GIT_PERF_*` to be overridden again Fcc: Sent Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk MIME-Version: 1.0 To: git@vger.kernel.org Cc: Patrick Steinhardt <ps@pks.im>, Johannes Schindelin <johannes.schindelin@gmx.de>, Johannes Schindelin <johannes.schindelin@gmx.de> From: Johannes Schindelin <johannes.schindelin@gmx.de>
Series	perf: do allow `GIT_PERF_` to be overridden again \| expand perf: do allow `GIT_PERF_` to be overridden again

Message ID

pull.1900.git.1743764167548.gitgitgadget@gmail.com (mailing list archive)

State

New

Headers

Message-Id: <pull.1900.git.1743764167548.gitgitgadget@gmail.com>
Date: Fri, 04 Apr 2025 10:56:07 +0000
Subject: [PATCH] perf: do allow `GIT_PERF_*` to be overridden again
Fcc: Sent
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>,
    Johannes Schindelin <johannes.schindelin@gmx.de>,
    Johannes Schindelin <johannes.schindelin@gmx.de>
From: Johannes Schindelin <johannes.schindelin@gmx.de>

Series

perf: do allow `GIT_PERF_*` to be overridden again | expand

Commit Message

Johannes Schindelin April 4, 2025, 10:56 a.m. UTC

From: Johannes Schindelin <johannes.schindelin@gmx.de>

A common way to run Git's performance benchmarks on repositories other
than Git's own repository (which is not exactly large when compared to
actually large repositories) is to run them like this:

	GIT_PERF_LARGE_REPO=/path/to/my/large/repo \
	./p1234-*.sh -ivx

Contrary to developers' common expectations, this failed to work when
Git was built with a different `GIT_PERF_LARGE_REPO` value specified at
build time: That build-time option would have been written to the
`GIT-BUILD-OPTIONS` file, which in turn would have been sourced by
`test-lib.sh`, which in turn would have been sourced by `perf-lib.sh`,
which in turn would have been sourced by the perf test script,
_overriding_ the environment variable specified in the way illustrated
above.

Since perf tests are not run as part of the build, this most likely
unintended behavior was not caught and certainly not fixed, as the
`GIT_PERF_*` values would have been empty at build-time.

However, in 4638e8806e3a (Makefile: use common template for
GIT-BUILD-OPTIONS, 2024-12-06), a subtle change of behavior was
introduced: Whereas before, a couple of build-time options (the
`GIT_PERF_*` ones included) were written to `GIT-BUILD-OPTIONS` only
when their values were non-empty. With this commit, they are also
written when they are empty.

The consequence is that above-mentioned way to run the perf tests will
not only fail to pick up the desired `GIT_PERF_*` settings when they
were specified differently while building Git, instead the desired
settings will be only respected when specified _while building_ Git.

Let's work around the original issue, i.e. let `GIT_PERF_*` environment
variables override what is recorded in `GIT-BUILD-OPTIONS`.

Note that this is just the tip of the iceberg, there are a couple of
`GIT_TEST_*` options that may want a similar fix in `test-lib.sh`. Due
to time constraints on my side, this here patch focuses exclusively on
the `GIT_PERF_*` settings.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    perf: do allow GIT_PERF_* to be overridden again
    
    This issue was noticed when working on large-scale issues.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1900%2Fdscho%2Fsupport-ad-hoc-git-perf-settings-again-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1900/dscho/support-ad-hoc-git-perf-settings-again-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1900

 t/perf/perf-lib.sh | 12 ++++++++++++
 1 file changed, 12 insertions(+)


base-commit: 683c54c999c301c2cd6f715c411407c413b1d84e

Comments

Derrick Stolee April 4, 2025, 12:13 p.m. UTC | #1

On 4/4/2025 6:56 AM, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> A common way to run Git's performance benchmarks on repositories other
> than Git's own repository (which is not exactly large when compared to
> actually large repositories) is to run them like this:
> 
> 	GIT_PERF_LARGE_REPO=/path/to/my/large/repo \
> 	./p1234-*.sh -ivx
> 

This issue also extends to other necessary variables such as
GIT_PERF_REPEAT_COUNT.  
> +# GIT-BUILD-OPTIONS, sourced by test-lib.sh, overwrites the `GIT_PERF_*`
> +# values that are set by the user (if any). Let's stash them away as
> +# `eval`-able assignments.
> +git_perf_settings="$(env |
> +	sed -n "/^GIT_PERF_/{
> +		# escape all single-quotes in the value
> +		s/'/'\\\\''/g
> +		# turn this into an eval-able assignment
> +		s/^\\([^=]*=\\)\\(.*\\)/\\1'\\2'/p
> +	}")"
> +
>  . ../test-lib.sh
> +eval "$git_perf_settings"
I verified this fix in my local environment. Thanks so much for digging
in and finding the solution here!

-Stolee

Jeff King April 19, 2025, 3:54 a.m. UTC | #2

On Fri, Apr 04, 2025 at 10:56:07AM +0000, Johannes Schindelin via GitGitGadget wrote:

> However, in 4638e8806e3a (Makefile: use common template for
> GIT-BUILD-OPTIONS, 2024-12-06), a subtle change of behavior was
> introduced: Whereas before, a couple of build-time options (the
> `GIT_PERF_*` ones included) were written to `GIT-BUILD-OPTIONS` only
> when their values were non-empty. With this commit, they are also
> written when they are empty.

It doesn't look like Junio picked this up, so I wanted to chime in that
this regression bit me today, too (specifically for GIT_PERF_LARGE_REPO,
but also another variable which I'll detail in a moment).

This is also the same issue that hit the interop suite discussed in:

  https://lore.kernel.org/git/Z8IX2bMJe+V80idE@nand.local/

(there I was a bit dismissive of it, because I think GIT_*_MAKE_OPTS
would usually be set at build time, but for these other variables,
they're almost always going to come from the environment).

> Let's work around the original issue, i.e. let `GIT_PERF_*` environment
> variables override what is recorded in `GIT-BUILD-OPTIONS`.

I like this direction. It not only fixes the regression, but makes
things generally behave more as I'd expect them to.

I think it doesn't quite fix everything, though. I noticed that
GIT_PERF_REPEAT_COUNT no longer correctly defaults to "3" when using the
"run" script. And there are two problems here:

  1. The "run" script itself sources GIT-BUILD-OPTIONS, so it would need
     similar treatment.

  2. But even if we did, that, in my case I am not setting
     PERF_REPEAT_COUNT at all. So there is no local env variable that
     we'd use to take precedence. The problem is in the way the "run"
     script assigns defaults. If it sees an environment variable set
     (whether actually from the environment or from GIT-BUILD-OPTIONS)
     it accepts it as-is, rather than installing its fallback. So it
     will never use its default of "3", and instead retain the blank
     string (which, by a stroke of luck, does still cause it to run each
     trial at least once).

So I think we either need to rewrite the "run" script's fallback code,
or teach the GIT-BUILD-OPTIONS writer to avoid mentioning unset
variables (which is the real source of the problem in 4638e8806e3a).

So...

> Note that this is just the tip of the iceberg, there are a couple of
> `GIT_TEST_*` options that may want a similar fix in `test-lib.sh`. Due
> to time constraints on my side, this here patch focuses exclusively on
> the `GIT_PERF_*` settings.

...yes, this is definitely the tip of the iceberg. I don't mind doing
this patch as an incremental step forward (and because it is an
improvement in behavior even if 4638e8806e3a were reverted). But the
issue is far from solved overall.

> +# GIT-BUILD-OPTIONS, sourced by test-lib.sh, overwrites the `GIT_PERF_*`
> +# values that are set by the user (if any). Let's stash them away as
> +# `eval`-able assignments.
> +git_perf_settings="$(env |
> +	sed -n "/^GIT_PERF_/{
> +		# escape all single-quotes in the value
> +		s/'/'\\\\''/g
> +		# turn this into an eval-able assignment
> +		s/^\\([^=]*=\\)\\(.*\\)/\\1'\\2'/p
> +	}")"

The implementation here looks correct to me, including the quoting. The
number of backslashes is a bit vomit-inducing, but I think unavoidable
(we could put the sed command into single-quotes, but then you'd have to
escape out of it to show the single-quotes you do need to mention).

-Peff

diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index 8ab6d9c4694..39c37284452 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -25,7 +25,19 @@  TEST_OUTPUT_DIRECTORY=$(pwd)
 TEST_NO_CREATE_REPO=t
 TEST_NO_MALLOC_CHECK=t
 
+# GIT-BUILD-OPTIONS, sourced by test-lib.sh, overwrites the `GIT_PERF_*`
+# values that are set by the user (if any). Let's stash them away as
+# `eval`-able assignments.
+git_perf_settings="$(env |
+	sed -n "/^GIT_PERF_/{
+		# escape all single-quotes in the value
+		s/'/'\\\\''/g
+		# turn this into an eval-able assignment
+		s/^\\([^=]*=\\)\\(.*\\)/\\1'\\2'/p
+	}")"
+
 . ../test-lib.sh
+eval "$git_perf_settings"
 
 unset GIT_CONFIG_NOSYSTEM
 GIT_CONFIG_SYSTEM="$TEST_DIRECTORY/perf/config"

perf: do allow `GIT_PERF_*` to be overridden again

Commit Message

Comments

Patch