Message ID | 20241118110154.3711777-1-linux@rasmusvillemoes.dk (mailing list archive) |
---|---|
State | New |
Series | [v3] setlocalversion: work around "git describe" performance |
On Mon, Nov 18, 2024 at 8:01 PM Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
>
> Contrary to expectations, passing a single candidate tag to "git
> describe" is slower than not passing any --match options.
>
> $ time git describe --debug
> ...
> traversed 10619 commits
> ...
> v6.12-rc5-63-g0fc810ae3ae1
>
> real 0m0.169s
>
> $ time git describe --match=v6.12-rc5 --debug
> ...
> traversed 1310024 commits
> v6.12-rc5-63-g0fc810ae3ae1
>
> real 0m1.281s
>
> In fact, the --debug output shows that git traverses all or most of
> history. For some repositories and/or git versions, those 1.3s are
> actually 10-15 seconds.
>
> This has been acknowledged as a performance bug in git [1], and a fix
> is on its way [2]. However, no solution is yet in git.git, and even
> when one lands, it will take quite a while before it finds its way to
> a release and for $random_kernel_developer to pick that up.
>
> So rewrite the logic to use plumbing commands. For each of the
> candidate values of $tag, we ask: (1) is $tag even an annotated
> tag? (2) Is it eligible to describe HEAD, i.e. an ancestor of
> HEAD? (3) If so, how many commits are in $tag..HEAD?
>
> I have tested that this produces the same output as the current script
> for ~700 random commits between v6.9..v6.10. For those 700 commits,
> and in my git repo, the 'make -s kernelrelease' command is on average
> ~4 times faster with this patch applied (geometric mean of ratios).
>
> For the commit mentioned in Josh's original report [3], the
> time-consuming part of setlocalversion goes from
>
> $ time git describe --match=v6.12-rc5 c1e939a21eb1
> v6.12-rc5-44-gc1e939a21eb1
>
> real 0m1.210s
>
> to
>
> $ time git rev-list --count --left-right v6.12-rc5..c1e939a21eb1
> 0 44
>
> real 0m0.037s
>
> [1] https://lore.kernel.org/git/20241101113910.GA2301440@coredump.intra.peff.net/
> [2] https://lore.kernel.org/git/20241106192236.GC880133@coredump.intra.peff.net/
> [3] https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org/
>
> Reported-by: Sean Christopherson <seanjc@google.com>
> Closes: https://lore.kernel.org/lkml/ZPtlxmdIJXOe0sEy@google.com/
> Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
> Tested-by: Josh Poimboeuf <jpoimboe@kernel.org>
> Closes: https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org/
> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> ---
> v3:
>
> - Update trailer tag list, per Masahiro.
> - Drop redundant quotes around the word tag
> - Add a shellcheck disable directive
>
> Masahiro, I decided to keep the changes minimal, in particular not to
> change anything around the logic or the (unused) return values, in
> order not to invalidate Josh' T-b tag. I think it's more important for
> this to make it to 6.13-rc1 (if that is even still possible, given
> that the MW is already open).

This is not urgent because it has been broken for more than a year.

Your "|| return 1" may not live long.

https://lore.kernel.org/linux-kbuild/20241118231534.1351938-1-masahiroy@kernel.org/T/#u

If you write try_tag() like you wrote, my patch can become even simpler.

> +
> +	# $2 is the number of commits in the range $tag..HEAD, possibly 0.
> +	count="$2"

count=$2 is enough because double-quotes are not required
on the RHS of an assignment.

> -	# If we are at the tagged commit, we ignore it because the version is
> -	# well-defined.
> -	if [ "${tag}" != "${desc}" ]; then
> +	# If we are at the tagged commit, we ignore it because the
> +	# version is well-defined. If none of the attempted tags exist
> +	# or were usable, $count is still empty.
> +	if [ -z "${count}" ] || [ "${count}" -gt 0 ]; then

Is this code equivalent to the following?

	if [ "${count}" != 0 ]; then
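The two review questions above can be checked with a small standalone script. This is only a sketch, not part of the patch: the function names (two_tests, one_test, check_equivalence, assign_unquoted) are hypothetical, and it assumes $count is either empty or a plain decimal number without leading zeros, which is what 'git rev-list --count' is expected to print.

```shell
#!/bin/sh
# Condition from the patch.
two_tests() {
	[ -z "$1" ] || [ "$1" -gt 0 ]
}

# Condition suggested in the review.
one_test() {
	[ "$1" != 0 ]
}

# Compare both conditions for an empty count, zero, and positive counts.
check_equivalence() {
	for c in "" 0 1 44 14595; do
		if two_tests "$c"; then a=yes; else a=no; fi
		if one_test "$c"; then b=yes; else b=no; fi
		if [ "$a" != "$b" ]; then
			echo "differ for '$c'"
			return 1
		fi
	done
	echo "equivalent"
}

# An unquoted expansion on the right-hand side of an assignment is not
# subject to field splitting or pathname expansion.
assign_unquoted() {
	set -- "a  b *"
	v=$1
	printf '%s\n' "$v"
}

check_equivalence
assign_unquoted
```

Under the stated assumption both conditions agree for every tested value. They would differ for a non-canonical value such as "00", which the string comparison treats as non-zero while -gt treats it as zero.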
On Mon, Nov 18, 2024 at 8:01 PM Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
>
> Contrary to expectations, passing a single candidate tag to "git
> describe" is slower than not passing any --match options.
> [...]
> ---
> v3:

Applied to linux-kbuild because this is better than v4 at least.
Thanks.
diff --git a/scripts/setlocalversion b/scripts/setlocalversion
index 38b96c6797f4..5818465abba9 100755
--- a/scripts/setlocalversion
+++ b/scripts/setlocalversion
@@ -30,6 +30,27 @@ if test $# -gt 0 -o ! -d "$srctree"; then
 	usage
 fi
 
+try_tag() {
+	tag="$1"
+
+	# Is $tag an annotated tag?
+	[ "$(git cat-file -t "$tag" 2> /dev/null)" = tag ] || return 1
+
+	# Is it an ancestor of HEAD, and if so, how many commits are in $tag..HEAD?
+	# shellcheck disable=SC2046 # word splitting is the point here
+	set -- $(git rev-list --count --left-right "$tag"...HEAD 2> /dev/null)
+
+	# $1 is 0 if and only if $tag is an ancestor of HEAD. Use
+	# string comparison, because $1 is empty if the 'git rev-list'
+	# command somehow failed.
+	[ "$1" = 0 ] || return 1
+
+	# $2 is the number of commits in the range $tag..HEAD, possibly 0.
+	count="$2"
+
+	return 0
+}
+
 scm_version()
 {
 	local short=false
@@ -61,33 +82,33 @@ scm_version()
 	# stable kernel: 6.1.7 -> v6.1.7
 	version_tag=v$(echo "${KERNELVERSION}" | sed -E 's/^([0-9]+\.[0-9]+)\.0(.*)$/\1\2/')
 
+	# try_tag initializes count if the tag is usable.
+	count=
+
 	# If a localversion* file exists, and the corresponding
 	# annotated tag exists and is an ancestor of HEAD, use
 	# it. This is the case in linux-next.
-	tag=${file_localversion#-}
-	desc=
-	if [ -n "${tag}" ]; then
-		desc=$(git describe --match=$tag 2>/dev/null)
+	if [ -n "${file_localversion#-}" ] ; then
+		try_tag "${file_localversion#-}"
 	fi
 
 	# Otherwise, if a localversion* file exists, and the tag
 	# obtained by appending it to the tag derived from
 	# KERNELVERSION exists and is an ancestor of HEAD, use
 	# it. This is e.g. the case in linux-rt.
-	if [ -z "${desc}" ] && [ -n "${file_localversion}" ]; then
-		tag="${version_tag}${file_localversion}"
-		desc=$(git describe --match=$tag 2>/dev/null)
+	if [ -z "${count}" ] && [ -n "${file_localversion}" ]; then
+		try_tag "${version_tag}${file_localversion}"
 	fi
 
 	# Otherwise, default to the annotated tag derived from KERNELVERSION.
-	if [ -z "${desc}" ]; then
-		tag="${version_tag}"
-		desc=$(git describe --match=$tag 2>/dev/null)
+	if [ -z "${count}" ]; then
+		try_tag "${version_tag}"
 	fi
 
-	# If we are at the tagged commit, we ignore it because the version is
-	# well-defined.
-	if [ "${tag}" != "${desc}" ]; then
+	# If we are at the tagged commit, we ignore it because the
+	# version is well-defined. If none of the attempted tags exist
+	# or were usable, $count is still empty.
+	if [ -z "${count}" ] || [ "${count}" -gt 0 ]; then
 
 		# If only the short version is requested, don't bother
 		# running further git commands
@@ -95,14 +116,15 @@ scm_version()
 			echo "+"
 			return
 		fi
+
 		# If we are past the tagged commit, we pretty print it.
 		# (like 6.1.0-14595-g292a089d78d3)
-		if [ -n "${desc}" ]; then
-			echo "${desc}" | awk -F- '{printf("-%05d", $(NF-1))}'
+		if [ -n "${count}" ]; then
+			printf "%s%05d" "-" "${count}"
 		fi
 
 		# Add -g and exactly 12 hex chars.
-		printf '%s%s' -g "$(echo $head | cut -c1-12)"
+		printf '%s%.12s' -g "$head"
 	fi
 
 	if ${no_dirty}; then
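
To see how the "<left> <right>" output of 'git rev-list --count --left-right' ends up in the version suffix, the parsing and formatting steps of the patch can be exercised without a git repository. The following is a sketch under stated assumptions: parse_left_right and format_suffix are hypothetical helpers that mirror the patch's 'set --' word splitting and its two printf calls, and the 40-character hash is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: parse the "<left> <right>" line that
# 'git rev-list --count --left-right "$tag"...HEAD' prints.
# Sets $count and returns 0 only when the left count is 0,
# i.e. when the tag is an ancestor of HEAD.
parse_left_right() {
	count=
	# shellcheck disable=SC2086 # word splitting is the point here
	set -- $1
	# String comparison, so an empty result also counts as failure.
	[ "$1" = 0 ] || return 1
	count=$2
}

# Hypothetical helper mirroring the two printf calls in the patch:
# a zero-padded commit count (if any), then -g and at most 12 characters
# of the commit hash. The real script only reaches this code when
# $count is empty or greater than 0.
format_suffix() {
	if [ -n "$1" ]; then
		printf '%s%05d' - "$1"
	fi
	printf '%s%.12s' -g "$2"
}

# Sample inputs, made up for illustration.
head=0123456789abcdef0123456789abcdef01234567
if parse_left_right "0 44"; then
	format_suffix "$count" "$head"
	echo
fi
```

With these sample inputs the script prints -00044-g0123456789ab. The '%.12s' conversion truncates the hash inside printf itself, which is why the patch no longer needs the 'echo | cut' pipeline.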