Message ID | a1e1c977d0978424fb07c97be0479f43a325cbea.1603889270.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | blame: enable funcname blaming with userdiff driver | expand |
"Philippe Blain via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Philippe Blain <levraiphilippeblain@gmail.com> > > In blame.c::cmd_blame, we send the 'path' field of the 'sb' 'struct > blame_scoreboard' as the 'path' argument to > 'line-range.c::parse_range_arg', but 'sb.path' is not set yet; it's set > to the local variable 'path' a few lines later at line 1137. > > This 'path' argument is only used in 'parse_range_arg' if we are blaming > a funcname, i.e. `git blame -L :<funcname> <path>`, and in that case it > is sent to 'parse_range_funcname', where it is used to determine if a > userdiff driver should be used for said <path> to match the given > funcname. > > Since 'path' is yet unset, the userdiff driver is never used, so we fall > back to the default funcname regex, which is usually not appropriate for > paths that are set to use a specific userdiff driver, and thus either we > match some unrelated lines, or we die with > > fatal: -L parameter '<funcname>' starting at line 1: no match > > This has been the case ever since `git blame` learned to blame a > funcname in 13b8f68c1f (log -L: :pattern:file syntax to find by > funcname, 2013-03-28). Good find. > Enable funcname blaming for paths using specific userdiff drivers by > sending the local variable 'path' to 'parse_range_arg' instead of the > yet unset 'sb.path'. Hmph. The reason why sb.path is not set to path at this point but later gets set is? I am wondering if this is leaving an unneeded room for sb.path and path to diverge by mistake. IOW, I wonder if it is better to set sb.path as early as we figure out which path we are interested in, forget we have local variable "path" and use sb.path consistently throughout the code. But that is merely a potential future clean-up. The local variable path is still used one more time in the error message given when this parse_range_arg() fails, so at least this change makes the use of path more consistent. I like the simplicity of this fix. > Add a regression test in 'annotate-tests.sh', which is sourced in > t8001-annotate.sh and t8002-blame.sh, leveraging an existing file used > to test the userdiff patterns in t4018-diff-funcname. Thanks, that change also makes sense.
> Le 29 oct. 2020 à 16:40, Junio C Hamano <gitster@pobox.com> a écrit : > > "Philippe Blain via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> From: Philippe Blain <levraiphilippeblain@gmail.com> >> >> In blame.c::cmd_blame, we send the 'path' field of the 'sb' 'struct >> blame_scoreboard' as the 'path' argument to >> 'line-range.c::parse_range_arg', but 'sb.path' is not set yet; it's set >> to the local variable 'path' a few lines later at line 1137. >> >> This 'path' argument is only used in 'parse_range_arg' if we are blaming >> a funcname, i.e. `git blame -L :<funcname> <path>`, and in that case it >> is sent to 'parse_range_funcname', where it is used to determine if a >> userdiff driver should be used for said <path> to match the given >> funcname. >> >> Since 'path' is yet unset, the userdiff driver is never used, so we fall >> back to the default funcname regex, which is usually not appropriate for >> paths that are set to use a specific userdiff driver, and thus either we >> match some unrelated lines, or we die with >> >> fatal: -L parameter '<funcname>' starting at line 1: no match >> >> This has been the case ever since `git blame` learned to blame a >> funcname in 13b8f68c1f (log -L: :pattern:file syntax to find by >> funcname, 2013-03-28). > > Good find. > >> Enable funcname blaming for paths using specific userdiff drivers by >> sending the local variable 'path' to 'parse_range_arg' instead of the >> yet unset 'sb.path'. > > Hmph. The reason why sb.path is not set to path at this point but > later gets set is? I am wondering if this is leaving an unneeded > room for sb.path and path to diverge by mistake. IOW, I wonder if > it is better to set sb.path as early as we figure out which path we > are interested in, forget we have local variable "path" and use > sb.path consistently throughout the code. I asked myself the same question, and could not come up with a good answer. I don't know if you read what I wrote in the cover letter (which I would have included in the in-patch commentary for this here patch if GitGitGadget had this feature), but I'm copying it here just in case you did not: * In patch 6, I considered fixing that bug in a different way by initializing sb.path inside blame.c::setup_scoreboard. This function already receives 'path' as its second argument, so changing that would not impact the API. Probably Thomas thought that sb.path was already initialized in sb when he modified builtin/blame.c::prepare_blame_range to also send sb.path to line-range.c::parse_range_arg in 13b8f68c1f (log -L: :pattern:file syntax to find by funcname, 2013-03-28). Initializing the path in setup_scoreboard would mean we could also simplify the API of blame.c::setup_blame_bloom_data since it would not need to receive path separately and so its second argument could be removed. I went for the simpler alternative of just sending 'path' to parse_range_arg instead of sb.path since it felt simpler, but if we feel it would be better to actually initialize sb.path in setup_scoreboard, I'll gladly tweak that for v2. > But that is merely a potential future clean-up. The local variable > path is still used one more time in the error message given when > this parse_range_arg() fails, so at least this change makes the use > of path more consistent. I like the simplicity of this fix. I also like its simplicity, and that's why I chose to submit this as v1. But I completely agree with you that it is "dangerous" in the sense that some further modifications to the code could then make the same mistake and use 'sb.path' thinking it is defined when it is not. So I'm thinking of instead initializing it in setup_scoreboard for v2. Cheers, Philippe.
Philippe Blain <levraiphilippeblain@gmail.com> writes: > * In patch 6, I considered fixing that bug in a different way by > initializing sb.path inside blame.c::setup_scoreboard. This function > already receives 'path' as its second argument, so changing that would not > impact the API. Probably Thomas thought that sb.path was already > initialized in sb when he modified builtin/blame.c::prepare_blame_range > to also send sb.path to line-range.c::parse_range_arg in 13b8f68c1f (log > -L: :pattern:file syntax to find by funcname, 2013-03-28). > > Initializing the path in setup_scoreboard would mean we could also > simplify the API of blame.c::setup_blame_bloom_data since it would not > need to receive path separately and so its second argument could be > removed. I went for the simpler alternative of just sending 'path' to > parse_range_arg instead of sb.path since it felt simpler, but if we feel > it would be better to actually initialize sb.path in setup_scoreboard, > I'll gladly tweak that for v2. > >> But that is merely a potential future clean-up. The local variable >> path is still used one more time in the error message given when >> this parse_range_arg() fails, so at least this change makes the use >> of path more consistent. I like the simplicity of this fix. > > I also like its simplicity, and that's why I chose to submit this as v1. > But I completely agree with you that it is "dangerous" in the sense > that some further modifications to the code could then make the same mistake > and use 'sb.path' thinking it is defined when it is not. > > So I'm thinking of instead initializing it in setup_scoreboard for v2. That does sound like a sensible way to clean it up. Thanks.
diff --git a/builtin/blame.c b/builtin/blame.c index 05f69211dd..917fedc635 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -1104,7 +1104,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix) long bottom, top; if (parse_range_arg(range_list.items[range_i].string, nth_line_cb, &sb, lno, anchor, - &bottom, &top, sb.path, + &bottom, &top, path, the_repository->index)) usage(blame_usage); if ((!lno && (top || bottom)) || lno < bottom) diff --git a/t/annotate-tests.sh b/t/annotate-tests.sh index d933af5714..3aee61d2cc 100644 --- a/t/annotate-tests.sh +++ b/t/annotate-tests.sh @@ -479,6 +479,24 @@ test_expect_success 'blame -L ^:RE (absolute: end-of-file)' ' check_count -f hello.c -L$n -L^:ma.. F 4 G 1 H 1 ' +test_expect_success 'setup -L :funcname with userdiff driver' ' + echo "fortran-* diff=fortran" >.gitattributes && + fortran_file=fortran-external-function && + orig_file="$TEST_DIRECTORY/t4018/$fortran_file" && + cp $orig_file . && + git add $fortran_file && + GIT_AUTHOR_NAME="A" GIT_AUTHOR_EMAIL="A@test.git" \ + git commit -m "add fortran file" && + sed -e "s/ChangeMe/IWasChanged/" <"$orig_file" >$fortran_file && + git add $fortran_file && + GIT_AUTHOR_NAME="B" GIT_AUTHOR_EMAIL="B@test.git" \ + git commit -m "change fortran file" +' + +test_expect_success 'blame -L :funcname with userdiff driver' ' + check_count -f fortran-external-function -L:RIGHT A 7 B 1 +' + test_expect_success 'setup incremental' ' ( GIT_AUTHOR_NAME=I &&