Message ID | 20211202233836.2024510-2-keescook@chromium.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | patchwork-bot: Ignore empty context lines | expand |
On Thu, Dec 02, 2021 at 03:38:35PM -0800, Kees Cook wrote: > It seems that there is some mismatch in diff output (or storage?) between > git and patchwork under conditions I haven't figured out (MUA or MTA > whitespace stripping?): patchwork appears to ignore empty context lines > (i.e. a line that is only a single space). As a result, the patchwork > hash will not match, and commits will not be identified in patchwork. > > As an example of a commit in upstream that patchwork-bot cannot locate > in patchwork due to a different hash, due to the differing empty context > lines: > > $ MSGID=20210911102818.3804-1-len.baker@gmx.com > $ wget -qO- https://patchwork.kernel.org/project/linux-hardening/patch/$MSGID/raw/ | \ > grep '^ $' | wc -l > 0 > > $ SHA=f11ee2ad25b22c2ee587045dd6999434375532f7 > $ git diff ${SHA}~..${SHA} | \ > grep '^ $' | wc -l > 2 > > Teaching patchwork-bot to ignore empty context lines allows pwhashes to > match in these cases, which lets those patches get located again. With regard to the revert, do you not see the above problem with your repos? I'm working against the same patchwork. I guess there is some other root cause?
On Thu, Dec 16, 2021 at 02:11:08PM -0800, Kees Cook wrote: > > $ MSGID=20210911102818.3804-1-len.baker@gmx.com > > $ wget -qO- https://patchwork.kernel.org/project/linux-hardening/patch/$MSGID/raw/ | \ > > grep '^ $' | wc -l > > 0 > > > > $ SHA=f11ee2ad25b22c2ee587045dd6999434375532f7 > > $ git diff ${SHA}~..${SHA} | \ > > grep '^ $' | wc -l > > 2 > > > > Teaching patchwork-bot to ignore empty context lines allows pwhashes to > > match in these cases, which lets those patches get located again. > > With regard to the revert, do you not see the above problem with your > repos? I'm working against the same patchwork. I guess there is some > other root cause? I'm curious if you have local git configs that affect your "git diff" output. If you "export GIT_CONFIG_GLOBAL=/dev/null", does this make any difference to your experience? -K
On Fri, Dec 17, 2021 at 11:17:44AM -0500, Konstantin Ryabitsev wrote: > On Thu, Dec 16, 2021 at 02:11:08PM -0800, Kees Cook wrote: > > > $ MSGID=20210911102818.3804-1-len.baker@gmx.com > > > $ wget -qO- https://patchwork.kernel.org/project/linux-hardening/patch/$MSGID/raw/ | \ > > > grep '^ $' | wc -l > > > 0 > > > > > > $ SHA=f11ee2ad25b22c2ee587045dd6999434375532f7 > > > $ git diff ${SHA}~..${SHA} | \ > > > grep '^ $' | wc -l > > > 2 > > > > > > Teaching patchwork-bot to ignore empty context lines allows pwhashes to > > > match in these cases, which lets those patches get located again. > > > > With regard to the revert, do you not see the above problem with your > > repos? I'm working against the same patchwork. I guess there is some > > other root cause? > > I'm curious if you have local git configs that affect your "git diff" output. > If you "export GIT_CONFIG_GLOBAL=/dev/null", does this make any difference to > your experience? $ export SHA=f11ee2ad25b22c2ee587045dd6999434375532f7 $ GIT_CONFIG_GLOBAL=/dev/null git diff $SHA~..$SHA | grep '^ $' | wc -l 2 $ git --version git version 2.30.2 Does this report "0" for you?
On Fri, Dec 17, 2021 at 08:41:41AM -0800, Kees Cook wrote: > > I'm curious if you have local git configs that affect your "git diff" output. > > If you "export GIT_CONFIG_GLOBAL=/dev/null", does this make any difference to > > your experience? > > $ export SHA=f11ee2ad25b22c2ee587045dd6999434375532f7 > $ GIT_CONFIG_GLOBAL=/dev/null git diff $SHA~..$SHA | grep '^ $' | wc -l > 2 > $ git --version > git version 2.30.2 > > Does this report "0" for you? No, I get 2 as well. However, what I'm driving at is that the hashing function has to be identical to what patchwork uses, which is why it's a verbatim lift: https://github.com/getpatchwork/patchwork/blob/master/patchwork/hasher.py#L18 vs. https://git.kernel.org/pub/scm/linux/kernel/git/mricon/korg-helpers.git/tree/git-patchwork-bot.py#n479 If it generates a different hash from an email message than it does from the "git diff" output, then something else is happening. -K
diff --git a/git-patchwork-bot.py b/git-patchwork-bot.py index 2ebe5b4e88e4..4e10385de629 100755 --- a/git-patchwork-bot.py +++ b/git-patchwork-bot.py @@ -528,7 +528,9 @@ def get_patchwork_hash(diff): line_nos = list(map(fn, hunk_match.groups())) line = '@@ -%d +%d @@' % tuple(line_nos) elif line[0] in prefixes: - # if we have a +, - or context line, leave as-is + # if we have a +, - or context line, leave as-is, except blank lines + if line == ' ': + continue pass else: # other lines are ignored
It seems that there is some mismatch in diff output (or storage?) between git and patchwork under conditions I haven't figured out (MUA or MTA whitespace stripping?): patchwork appears to ignore empty context lines (i.e. a line that is only a single space). As a result, the patchwork hash will not match, and commits will not be identified in patchwork. As an example of a commit in upstream that patchwork-bot cannot locate in patchwork due to a different hash, due to the differing empty context lines: $ MSGID=20210911102818.3804-1-len.baker@gmx.com $ wget -qO- https://patchwork.kernel.org/project/linux-hardening/patch/$MSGID/raw/ | \ grep '^ $' | wc -l 0 $ SHA=f11ee2ad25b22c2ee587045dd6999434375532f7 $ git diff ${SHA}~..${SHA} | \ grep '^ $' | wc -l 2 Teaching patchwork-bot to ignore empty context lines allows pwhashes to match in these cases, which lets those patches get located again. Signed-off-by: Kees Cook <keescook@chromium.org> --- git-patchwork-bot.py | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)