[v5,13/14] core.fsyncmethod: performance tests for batch mode

Message ID 26be6ecb28bc1f76fba380fdd10acf59820df997.1648616734.git.gitgitgadget@gmail.com
State New, archived
Series core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects

Commit Message

Neeraj Singh (WINDOWS-SFS) March 30, 2022, 5:05 a.m. UTC
From: Neeraj Singh <neerajsi@microsoft.com>

Add basic performance tests for git commands that can add data to the
object database. We cover:
* git add
* git stash
* git update-index (via git stash)
* git unpack-objects
* git commit --all

We cover all currently available fsync methods as well.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 t/perf/p0008-odb-fsync.sh | 81 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 81 insertions(+)
 create mode 100755 t/perf/p0008-odb-fsync.sh

Comments

Neeraj Singh March 31, 2022, 4:09 a.m. UTC | #1
On Tue, Mar 29, 2022 at 10:05 PM Neeraj Singh via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Neeraj Singh <neerajsi@microsoft.com>
>
> Add basic performance tests for git commands that can add data to the
> object database. We cover:
> * git add
> * git stash
> * git update-index (via git stash)
> * git unpack-objects
> * git commit --all
>
> We cover all currently available fsync methods as well.
>
> Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
> ---
>  t/perf/p0008-odb-fsync.sh | 81 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 81 insertions(+)
>  create mode 100755 t/perf/p0008-odb-fsync.sh
>
> diff --git a/t/perf/p0008-odb-fsync.sh b/t/perf/p0008-odb-fsync.sh
> new file mode 100755
> index 00000000000..87092c2627e
> --- /dev/null
> +++ b/t/perf/p0008-odb-fsync.sh
> @@ -0,0 +1,81 @@
> +#!/bin/sh
> +#
> +# This test measures the performance of adding new files to the object
> +# database. The test was originally added to measure the effect of the
> +# core.fsyncMethod=batch mode, which is why we are testing different values of
> +# that setting explicitly and creating a lot of unique objects.
> +
> +test_description="Tests performance of adding things to the object database"
> +
> +. ./perf-lib.sh
> +
> +. $TEST_DIRECTORY/lib-unique-files.sh
> +
> +test_perf_fresh_repo
> +test_checkout_worktree
> +
> +dir_count=10
> +files_per_dir=50
> +total_files=$((dir_count * files_per_dir))
> +
> +populate_files () {
> +       test_create_unique_files $dir_count $files_per_dir files
> +}
> +
> +setup_repo () {
> +       (rm -rf .git || true) &&
> +       git init &&
> +       test_commit first &&
> +       populate_files
> +}
> +
> +test_perf_fsync_cfgs () {
> +       local method cfg &&
> +       for method in none fsync batch writeout-only
> +       do
> +               case $method in
> +               none)
> +                       cfg="-c core.fsync=none"
> +                       ;;
> +               *)
> +                       cfg="-c core.fsync=loose-object -c core.fsyncMethod=$method"
> +               esac &&
> +

In the last round, I said I'd go with Ævar's scheme for iterating over
configs.  But when looking at the test output, I decided that I wanted
a shorter label for each config rather than the actual command line, to
make the output more readable.

> +               # Set GIT_TEST_FSYNC=1 explicitly since fsync is normally
> +               # disabled by t/test-lib.sh.
> +               if ! test_perf "$1 (fsyncMethod=$method)" \
> +                                               --setup "$2" \
> +                                               "GIT_TEST_FSYNC=1 git $cfg $3"
> +               then
> +                       break
> +               fi
> +       done
> +}

So here I split the 'git $cfg' invocation off from the actual command
being executed, since the best way to structure this shell script
wasn't clear to me.

The overall effect I want to achieve is to be able to iterate over
every config for each test case so that the different configs of the
same test appear next to each other in the output.
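
The iteration described above can be sketched outside the perf framework
as follows. This is only an illustration under stated assumptions:
fsync_cfg_for and print_cfgs are hypothetical names that do not appear in
the patch, and print_cfgs echoes the label and command line instead of
calling test_perf.

```shell
#!/bin/sh

# Hypothetical helper (not in the patch): map a short method label to the
# "git -c" options that the patch builds in its case statement.
fsync_cfg_for () {
	case $1 in
	none)
		echo "-c core.fsync=none"
		;;
	*)
		echo "-c core.fsync=loose-object -c core.fsyncMethod=$1"
		;;
	esac
}

# Hypothetical stand-in for the test_perf loop: print one line per config
# so that all configs of the same test case end up next to each other.
# $1 is the test title, $2 is the git subcommand and its arguments.
print_cfgs () {
	for method in none fsync batch writeout-only
	do
		echo "$1 (fsyncMethod=$method): git $(fsync_cfg_for "$method") $2"
	done
}

print_cfgs "add 500 files" "add -- files"
```

Each output line pairs the short label with the full command line, which
is the readability trade-off mentioned earlier: the label stays compact
while the real options are still derived from it in one place.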

> +
> +test_perf_fsync_cfgs "add $total_files files" \
> +       "setup_repo" \
> +       "add -- files"
> +

I initially tried leaving the $cfg variable unsubstituted in the test,
like this:
'git $cfg add -- files'

and then using eval in test_perf_fsync_cfgs to make the variable
substitution happen later.
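
A minimal sketch of that eval variant, for illustration only:
run_with_cfgs is a hypothetical stand-in that is not part of the patch,
it covers just two of the methods, and it echoes the command instead of
running git or test_perf.

```shell
#!/bin/sh

# Hypothetical stand-in for test_perf_fsync_cfgs (assumption, not the
# patch's code). $1 is a command string that still contains the literal
# text '$cfg' because the caller passed it in single quotes.
run_with_cfgs () {
	for method in none batch
	do
		case $method in
		none)
			cfg="-c core.fsync=none"
			;;
		*)
			cfg="-c core.fsync=loose-object -c core.fsyncMethod=$method"
			;;
		esac
		# eval re-parses the string, so $cfg is expanded here, using
		# the value assigned in this iteration.
		eval "echo $1"
	done
}

# Single quotes keep $cfg unexpanded until eval runs inside the helper.
run_with_cfgs 'git $cfg add -- files'
```

The cost of this form is that the caller's string is parsed twice, so
any other shell metacharacters in it need extra quoting.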

Is there a better way to write this?

Thanks,
Neeraj
Patch

diff --git a/t/perf/p0008-odb-fsync.sh b/t/perf/p0008-odb-fsync.sh
new file mode 100755
index 00000000000..87092c2627e
--- /dev/null
+++ b/t/perf/p0008-odb-fsync.sh
@@ -0,0 +1,81 @@ 
+#!/bin/sh
+#
+# This test measures the performance of adding new files to the object
+# database. The test was originally added to measure the effect of the
+# core.fsyncMethod=batch mode, which is why we are testing different values of
+# that setting explicitly and creating a lot of unique objects.
+
+test_description="Tests performance of adding things to the object database"
+
+. ./perf-lib.sh
+
+. $TEST_DIRECTORY/lib-unique-files.sh
+
+test_perf_fresh_repo
+test_checkout_worktree
+
+dir_count=10
+files_per_dir=50
+total_files=$((dir_count * files_per_dir))
+
+populate_files () {
+	test_create_unique_files $dir_count $files_per_dir files
+}
+
+setup_repo () {
+	(rm -rf .git || true) &&
+	git init &&
+	test_commit first &&
+	populate_files
+}
+
+test_perf_fsync_cfgs () {
+	local method cfg &&
+	for method in none fsync batch writeout-only
+	do
+		case $method in
+		none)
+			cfg="-c core.fsync=none"
+			;;
+		*)
+			cfg="-c core.fsync=loose-object -c core.fsyncMethod=$method"
+		esac &&
+
+		# Set GIT_TEST_FSYNC=1 explicitly since fsync is normally
+		# disabled by t/test-lib.sh.
+		if ! test_perf "$1 (fsyncMethod=$method)" \
+						--setup "$2" \
+						"GIT_TEST_FSYNC=1 git $cfg $3"
+		then
+			break
+		fi
+	done
+}
+
+test_perf_fsync_cfgs "add $total_files files" \
+	"setup_repo" \
+	"add -- files"
+
+test_perf_fsync_cfgs "stash $total_files files" \
+	"setup_repo" \
+	"stash push -u -- files"
+
+test_perf_fsync_cfgs "unpack $total_files files" \
+	"
+	setup_repo &&
+	git -c core.fsync=none add -- files &&
+	git -c core.fsync=none commit -q -m second &&
+	echo HEAD | git pack-objects -q --stdout --revs >test_pack.pack &&
+	setup_repo
+	" \
+	"unpack-objects -q <test_pack.pack"
+
+test_perf_fsync_cfgs "commit $total_files files" \
+	"
+	setup_repo &&
+	git -c core.fsync=none add -- files &&
+	populate_files
+	" \
+	"commit -q -a -m test"
+
+test_done