Message ID | 20181010111351.5045-3-rv@rasmusvillemoes.dk (mailing list archive) |
---|---|
State | New, archived |
Series | send-email: Also pick up cc addresses from -by trailers |
On Wed, Oct 10 2018, Rasmus Villemoes wrote:

> +			if ($c !~ /.+@.+|<.+>/) {
> +				printf("(body) Ignoring %s from line '%s'\n",
> +					$what, $_) unless $quiet;
> +				next;
> +			}
>  			push @cc, $c;
>  			printf(__("(body) Adding cc: %s from line '%s'\n"),
>  				$c, $_) unless $quiet;

There's an extract_valid_address() function in git-send-email already,
shouldn't this be:

    if (!extract_valid_address($c)) {
        [...]

Or is there a good reason not to use that function in this case?
On 2018-10-10 14:57, Ævar Arnfjörð Bjarmason wrote:
>
> On Wed, Oct 10 2018, Rasmus Villemoes wrote:
>
>> +			if ($c !~ /.+@.+|<.+>/) {
>> +				printf("(body) Ignoring %s from line '%s'\n",
>> +					$what, $_) unless $quiet;
>> +				next;
>> +			}
>>  			push @cc, $c;
>>  			printf(__("(body) Adding cc: %s from line '%s'\n"),
>>  				$c, $_) unless $quiet;
>
> There's an extract_valid_address() function in git-send-email already,
> shouldn't this be:
>
>     if (!extract_valid_address($c)) {
>         [...]
>
> Or is there a good reason not to use that function in this case?
>

I considered that (and also had a version where I simply insisted on a @
being present), but that means the user no longer would get prompted
about the cases where the address was just slightly obfuscated, e.g. the

    Cc: John Doe <john at doe.com>

cases, which would be a regression, I guess. So I do want to pass such
cases through, and have them be dealt with when process_address_list
gets called.

So this is just a rather minimal and simple heuristic, which should
still be able to handle the vast majority of cases correctly, and at
least almost never exclude anything that might have a chance of becoming
a real address.

Rasmus
Rasmus Villemoes <rv@rasmusvillemoes.dk> writes:

> I considered that (and also had a version where I simply insisted on a @
> being present), but that means the user no longer would get prompted
> about the cases where the address was just slightly obfuscated, e.g. the
>
>     Cc: John Doe <john at doe.com>
>
> cases, which would be a regression, I guess. So I do want to pass such
> cases through, and have them be dealt with when process_address_list
> gets called.

We are only tightening with this patch, and we were passing any
random things through with the original code anyway, so without
[PATCH 3/3], this step must be making it only better, but I have to
wonder one thing.

You keep saying "get prompted" but are we sure we always stop and
ask (and preferably---fail and abort when the end user is not
available at the terminal to interact) when we have such a
questionable address?
On 2018-10-11 08:06, Junio C Hamano wrote:
> Rasmus Villemoes <rv@rasmusvillemoes.dk> writes:
>
>> I considered that (and also had a version where I simply insisted on a @
>> being present), but that means the user no longer would get prompted
>> about the cases where the address was just slightly obfuscated, e.g. the
>>
>>     Cc: John Doe <john at doe.com>
>>
>> cases, which would be a regression, I guess. So I do want to pass such
>> cases through, and have them be dealt with when process_address_list
>> gets called.
>
> We are only tightening with this patch, and we were passing any
> random things through with the original code anyway, so without
> [PATCH 3/3], this step must be making it only better, but I have to
> wonder one thing.
>
> You keep saying "get prompted" but are we sure we always stop and
> ask (and preferably---fail and abort when the end user is not
> available at the terminal to interact) when we have such a
> questionable address?
>

I dunno. I guess I've never considered non-interactive use of
send-email. But the ask() in validate_address does have default q[uit],
which I suppose gets used if stdin is /dev/null? I did do an experiment
adding a bunch of the random odd patterns found in kernel commit
messages to see how send-email reacted before/after this, and the only
things that got filtered away (i.e., no longer prompted about) were
things where the user probably couldn't easily fix it anyway. In the
cases where there was a "Cc: stable" that might be fixed to the proper
stable@vger.kernel.org, the logic in extract_valid_address simply saw
that as a local address, so we didn't use to be prompted, but simply
sent to stable@localhost. Now we simply don't pass that through.

So, for non-interactive use, I guess the effect of this patch is to
allow more cases to complete successfully, since we filter away (some)
cases where extract_valid_address would cause us to prompt (and thus quit).

So, it seems you're ok with this tightening, but some comment on the
non-interactive use case should be made in the commit log? Or am I
misunderstanding?

Thanks,
Rasmus
Rasmus Villemoes <rv@rasmusvillemoes.dk> writes:

> So, it seems you're ok with this tightening, but some comment on the
> non-interactive use case should be made in the commit log? Or am I
> misunderstanding?

I do not think we need any immediate action on this step. I was
just wondering if we want two classes of "I am not running you
interactively, so assume I said 'yes' when you need to ask me any
confirmation on X and Y" and "I am not running you interactively, so
assume I said 'no' for safety when you need to ask me any
confirmation on Z" supported in the future. Lines with both @ and
<> fall into the first class, while lines with only <> fall into the
second camp, I would guess.
diff --git a/git-send-email.perl b/git-send-email.perl
index 2be5dac337..1916159d2a 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1694,6 +1694,11 @@ sub process_file {
 				next if $suppress_cc{'sob'} and $what =~ /Signed-off-by/i;
 				next if $suppress_cc{'bodycc'} and $what =~ /Cc/i;
 			}
+			if ($c !~ /.+@.+|<.+>/) {
+				printf("(body) Ignoring %s from line '%s'\n",
+					$what, $_) unless $quiet;
+				next;
+			}
 			push @cc, $c;
 			printf(__("(body) Adding cc: %s from line '%s'\n"),
 				$c, $_) unless $quiet;
While the address sanitization routines do accept local addresses, that
is almost never what is meant in a Cc or Signed-off-by trailer.

Looking through all the signed-off-by lines in the linux kernel tree
without a @, there are mostly two patterns: Either just a full name, or
a full name followed by <user at domain.com> (i.e., with the word at
instead of a @), and minor variations. For cc lines, the same patterns
appear, along with lots of "cc stable" variations that do not actually
name stable@vger.kernel.org

    Cc: stable # introduced pre-git times
    cc: stable.kernel.org

In the <user at domain.com> cases, one gets a chance to interactively
fix it. But when there is no <> pair, it seems we end up just using the
first word as a (local) address.

As the number of cases where a local address really was meant is
likely (and anecdotally) quite small compared to the number of cases
where we end up cc'ing a garbage address, insist on at least a @ or a <>
pair being present.

This is also preparation for the next patch, where we are likely to
encounter even more non-addresses in -by lines, such as

    Reported-by: Coverity
    Patch-generated-by: Coccinelle

Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>
---
 git-send-email.perl | 5 +++++
 1 file changed, 5 insertions(+)
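
For readers who want to try the heuristic outside of git-send-email, here is a minimal standalone sketch. It applies the same regular expression as the patch to the example trailer values mentioned in the thread and the commit message. The helper name `passes_trailer_filter` is hypothetical and does not exist in git-send-email.perl. The sketch also skips the garbage-stripping and suppress-cc checks that run before this test in process_file, so treat it as an illustration of the pattern only, not of the full code path.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of git-send-email): returns 1 when a
# trailer value contains either "something@something" or a non-empty
# <...> pair, which is the condition the patch requires before the
# value is added to @cc; returns 0 otherwise.
sub passes_trailer_filter {
	my ($c) = @_;
	return $c =~ /.+@.+|<.+>/ ? 1 : 0;
}

# Trailer values taken from the examples in the thread and commit message.
my @samples = (
	'Rasmus Villemoes <rv@rasmusvillemoes.dk>',  # real address
	'John Doe <john at doe.com>',                # obfuscated, has <> pair
	'stable # introduced pre-git times',         # no @ and no <> pair
	'stable.kernel.org',
	'Coverity',
	'Coccinelle',
);

for my $c (@samples) {
	printf("%-42s %s\n", $c,
		passes_trailer_filter($c) ? 'passed on' : 'ignored');
}
```

With this pattern the first two values are passed on (the obfuscated one is what later triggers the interactive prompt discussed above), while the "stable" variants and the bare tool names are ignored.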