[7/6] format-patch: fix leak of empty header string

Message ID 20240322095951.GA529578@coredump.intra.peff.net (mailing list archive)
State Accepted
Commit 1c10b8e5b0819598b42c042754c7c860a8637ce3
Series [1/6] shortlog: stop setting pp.print_email_subject

Commit Message

Jeff King March 22, 2024, 9:59 a.m. UTC
On Tue, Mar 19, 2024 at 08:25:55PM -0400, Jeff King wrote:

>   [1/6]: shortlog: stop setting pp.print_email_subject
>   [2/6]: pretty: split oneline and email subject printing
>   [3/6]: pretty: drop print_email_subject flag
>   [4/6]: log: do not set up extra_headers for non-email formats
>   [5/6]: format-patch: return an allocated string from log_write_email_headers()
>   [6/6]: format-patch: simplify after-subject MIME header handling

These patches introduce a small leak into format-patch. I didn't notice
before because the "leaks" CI jobs were broken due to sanitizer problems
in the base image (which now seem fixed?).

Here's a fix that can go on top of jk/pretty-subject-cleanup. That topic
is not in 'next' yet, so I could also re-roll. The issue was subtle
enough that a separate commit is not such a bad thing, but I'm happy to
squash it in if we'd prefer.

-- >8 --
Subject: [PATCH] format-patch: fix leak of empty header string

The log_write_email_headers() function recently learned to return the
"extra_headers_p" variable to the caller as an allocated string. We
start by copying rev_info.extra_headers into a strbuf, and then detach
the strbuf at the end of the function. If there are no extra headers, we
leave the strbuf empty. Likewise, if there are no headers to return, we
pass back NULL.

This misses a corner case which can cause a leak. The "do we have any
headers to copy" check is done by looking for a NULL opt->extra_headers.
But the "do we have a non-empty string to return" check is done by
checking the length of the strbuf. That means if opt->extra_headers is
the empty string, we'll "copy" it into the strbuf, triggering an
allocation, but then leak the buffer when we return NULL from the
function.
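
To make the corner case concrete, here is a minimal sketch of the mismatch between the two checks. It does not use Git's real strbuf code: `struct sketch_buf`, `sketch_addstr()`, `headers_before_fix()`, and the `leaked` out-parameter are hypothetical names invented for illustration, and unlike the real strbuf the stand-in starts out with a NULL buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for Git's strbuf; not the real API. */
struct sketch_buf {
	size_t alloc; /* bytes allocated, 0 if nothing was allocated */
	size_t len;   /* string length, excluding the trailing NUL */
	char *buf;    /* NULL until the first append (assumption of this sketch) */
};

/* Append a string. As described above, even "" triggers an allocation. */
static void sketch_addstr(struct sketch_buf *sb, const char *s)
{
	size_t n = strlen(s);
	char *p = realloc(sb->buf, sb->len + n + 1);

	if (!p)
		abort();
	sb->buf = p;
	sb->alloc = sb->len + n + 1;
	memcpy(sb->buf + sb->len, s, n + 1); /* copies the NUL as well */
	sb->len += n;
}

/*
 * Mimics the shape of the pre-fix logic: copy whenever the input is
 * non-NULL, but hand the buffer back only when its length is non-zero.
 * *leaked is set to 1 when an allocation would be dropped on the floor.
 */
static char *headers_before_fix(const char *extra_headers, int *leaked)
{
	struct sketch_buf headers = { 0, 0, NULL };

	*leaked = 0;
	if (extra_headers)
		sketch_addstr(&headers, extra_headers);

	if (headers.len)
		return headers.buf; /* caller now owns the buffer */

	/* Passing back NULL here loses the only pointer to the buffer. */
	if (headers.alloc) {
		*leaked = 1;
		free(headers.buf); /* freed only so the sketch itself is clean */
	}
	return NULL;
}
```

In this sketch, `headers_before_fix("", &leaked)` returns NULL with `leaked` set to 1, while a NULL input or a non-empty input leaves `leaked` at 0.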

We can solve this in one of two ways:

  1. Rather than checking headers->len at the end, we could check
     headers->alloc to see if we allocated anything. That retains the
     original behavior before the recent change, where an empty
     extra_headers string is "passed through" to the caller. In practice
     this doesn't matter, though (the code which eventually looks at the
     result treats NULL or the empty string the same).

  2. Only bother copying a non-empty string into the strbuf. This has
     the added bonus of avoiding a pointless allocation.

     Arguably strbuf_addstr() could do this optimization itself, though
     it may be slightly dangerous to do so (some existing callers may
     not get a fresh allocation when they expect to). In theory callers
     are all supposed to use strbuf_detach() in such a case, but there's
     no guarantee that this is the case.

This patch uses option 2. Without it, building with SANITIZE=leak shows
many errors in t4021 and elsewhere.
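
As a rough illustration of how the two options differ, the sketch below implements both against a simplified stand-in for strbuf. Everything here is hypothetical and invented for the example (`struct sketch_buf`, `sketch_addstr()`, `alloc_calls`, `headers_option1()`, `headers_option2()`); it is not Git's actual code, only a model of the two checks under the assumption that appending an empty string still allocates.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for Git's strbuf; not the real API. */
struct sketch_buf {
	size_t alloc; /* bytes allocated, 0 if nothing was allocated */
	size_t len;   /* string length, excluding the trailing NUL */
	char *buf;    /* NULL until the first append (assumption of this sketch) */
};

static int alloc_calls; /* counts allocations made by sketch_addstr() */

/* Append a string; allocates even when s is "". */
static void sketch_addstr(struct sketch_buf *sb, const char *s)
{
	size_t n = strlen(s);
	char *p = realloc(sb->buf, sb->len + n + 1);

	if (!p)
		abort();
	alloc_calls++;
	sb->buf = p;
	sb->alloc = sb->len + n + 1;
	memcpy(sb->buf + sb->len, s, n + 1); /* copies the NUL as well */
	sb->len += n;
}

/* Option 1: return the buffer whenever something was allocated. */
static char *headers_option1(const char *extra_headers)
{
	struct sketch_buf headers = { 0, 0, NULL };

	if (extra_headers)
		sketch_addstr(&headers, extra_headers);
	return headers.alloc ? headers.buf : NULL; /* "" is passed through */
}

/* Option 2: never copy an empty string, so nothing can be left behind. */
static char *headers_option2(const char *extra_headers)
{
	struct sketch_buf headers = { 0, 0, NULL };

	if (extra_headers && *extra_headers)
		sketch_addstr(&headers, extra_headers);
	return headers.len ? headers.buf : NULL;
}
```

With an empty input, `headers_option1()` allocates once and returns an empty string that the caller must free, whereas `headers_option2()` returns NULL without allocating at all. Both return NULL for a NULL input.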

Signed-off-by: Jeff King <peff@peff.net>
---
 log-tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Kristoffer Haugsbakk March 22, 2024, 10:03 a.m. UTC | #1
On Fri, Mar 22, 2024, at 10:59, Jeff King wrote:
> On Tue, Mar 19, 2024 at 08:25:55PM -0400, Jeff King wrote:
>
>>   [1/6]: shortlog: stop setting pp.print_email_subject
>>   [2/6]: pretty: split oneline and email subject printing
>>   [3/6]: pretty: drop print_email_subject flag
>>   [4/6]: log: do not set up extra_headers for non-email formats
>>   [5/6]: format-patch: return an allocated string from log_write_email_headers()
>>   [6/6]: format-patch: simplify after-subject MIME header handling
>
> These patches introduce a small leak into format-patch. I didn't notice
> before because the "leaks" CI jobs were broken due to sanitizer problems
> in the base image (which now seem fixed?).
>
> Here's a fix that can go on top of jk/pretty-subject-cleanup. That topic
> is not in 'next' yet, so I could also re-roll. The issue was subtle
> enough that a separate commit is not such a bad thing, but I'm happy to
> squash it in if we'd prefer.
>
> -- >8 --
> Subject: [PATCH] format-patch: fix leak of empty header string
> [snip]

Hi Peff, and thanks a lot for making this series.

I’ll have a look at it this evening.

Thanks

Junio C Hamano March 22, 2024, 4:50 p.m. UTC | #2
Jeff King <peff@peff.net> writes:

> On Tue, Mar 19, 2024 at 08:25:55PM -0400, Jeff King wrote:
>
>>   [1/6]: shortlog: stop setting pp.print_email_subject
>>   [2/6]: pretty: split oneline and email subject printing
>>   [3/6]: pretty: drop print_email_subject flag
>>   [4/6]: log: do not set up extra_headers for non-email formats
>>   [5/6]: format-patch: return an allocated string from log_write_email_headers()
>>   [6/6]: format-patch: simplify after-subject MIME header handling
>
> These patches introduce a small leak into format-patch. I didn't notice
> before because the "leaks" CI jobs were broken due to sanitizer problems
> in the base image (which now seem fixed?).
>
> Here's a fix that can go on top of jk/pretty-subject-cleanup. That topic
> is not in 'next' yet, so I could also re-roll. The issue was subtle
> enough that a separate commit is not such a bad thing, but I'm happy to
> squash it in if we'd prefer.

Indeed it is subtle and I like the corner case described separately
like this one does.  Very much appreciated.

Thanks.

> -- >8 --
> Subject: [PATCH] format-patch: fix leak of empty header string
>
> The log_write_email_headers() function recently learned to return the
> "extra_headers_p" variable to the caller as an allocated string. We
> start by copying rev_info.extra_headers into a strbuf, and then detach
> the strbuf at the end of the function. If there are no extra headers, we
> leave the strbuf empty. Likewise, if there are no headers to return, we
> pass back NULL.
>
> This misses a corner case which can cause a leak. The "do we have any
> headers to copy" check is done by looking for a NULL opt->extra_headers.
> But the "do we have a non-empty string to return" check is done by
> checking the length of the strbuf. That means if opt->extra_headers is
> the empty string, we'll "copy" it into the strbuf, triggering an
> allocation, but then leak the buffer when we return NULL from the
> function.
>
> We can solve this in one of two ways:
>
>   1. Rather than checking headers->len at the end, we could check
>      headers->alloc to see if we allocated anything. That retains the
>      original behavior before the recent change, where an empty
>      extra_headers string is "passed through" to the caller. In practice
>      this doesn't matter, though (the code which eventually looks at the
>      result treats NULL or the empty string the same).
>
>   2. Only bother copying a non-empty string into the strbuf. This has
>      the added bonus of avoiding a pointless allocation.
>
>      Arguably strbuf_addstr() could do this optimization itself, though
>      it may be slightly dangerous to do so (some existing callers may
>      not get a fresh allocation when they expect to). In theory callers
>      are all supposed to use strbuf_detach() in such a case, but there's
>      no guarantee that this is the case.
>
> This patch uses option 2. Without it, building with SANITIZE=leak shows
> many errors in t4021 and elsewhere.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  log-tree.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/log-tree.c b/log-tree.c
> index eb2e841046..59eeaef1f7 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -480,7 +480,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
>  
>  	*need_8bit_cte_p = 0; /* unknown */
>  
> -	if (opt->extra_headers)
> +	if (opt->extra_headers && *opt->extra_headers)
>  		strbuf_addstr(&headers, opt->extra_headers);
>  
>  	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);

Kristoffer Haugsbakk March 22, 2024, 10:16 p.m. UTC | #3
On Fri, Mar 22, 2024, at 10:59, Jeff King wrote:
> On Tue, Mar 19, 2024 at 08:25:55PM -0400, Jeff King wrote:
>
>>   [1/6]: shortlog: stop setting pp.print_email_subject
>>   [2/6]: pretty: split oneline and email subject printing
>>   [3/6]: pretty: drop print_email_subject flag
>>   [4/6]: log: do not set up extra_headers for non-email formats
>>   [5/6]: format-patch: return an allocated string from log_write_email_headers()
>>   [6/6]: format-patch: simplify after-subject MIME header handling
>
> These patches introduce a small leak into format-patch. I didn't notice
> before because the "leaks" CI jobs were broken due to sanitizer problems
> in the base image (which now seem fixed?).
>
> Here's a fix that can go on top of jk/pretty-subject-cleanup. That topic
> is not in 'next' yet, so I could also re-roll. The issue was subtle
> enough that a separate commit is not such a bad thing, but I'm happy to
> squash it in if we'd prefer.
>
> -- >8 --
> Subject: [PATCH] format-patch: fix leak of empty header string
>
> The log_write_email_headers() function recently learned to return the
> "extra_headers_p" variable to the caller as an allocated string. We
> start by copying rev_info.extra_headers into a strbuf, and then detach
> the strbuf at the end of the function. If there are no extra headers, we
> leave the strbuf empty. Likewise, if there are no headers to return, we
> pass back NULL.
>
> This misses a corner case which can cause a leak. The "do we have any
> headers to copy" check is done by looking for a NULL opt->extra_headers.
> But the "do we have a non-empty string to return" check is done by
> checking the length of the strbuf. That means if opt->extra_headers is
> the empty string, we'll "copy" it into the strbuf, triggering an
> allocation, but then leak the buffer when we return NULL from the
> function.
>
> We can solve this in one of two ways:
>
>   1. Rather than checking headers->len at the end, we could check
>      headers->alloc to see if we allocated anything. That retains the
>      original behavior before the recent change, where an empty
>      extra_headers string is "passed through" to the caller. In practice
>      this doesn't matter, though (the code which eventually looks at the
>      result treats NULL or the empty string the same).
>
>   2. Only bother copying a non-empty string into the strbuf. This has
>      the added bonus of avoiding a pointless allocation.
>
>      Arguably strbuf_addstr() could do this optimization itself, though
>      it may be slightly dangerous to do so (some existing callers may
>      not get a fresh allocation when they expect to). In theory callers
>      are all supposed to use strbuf_detach() in such a case, but there's
>      no guarantee that this is the case.
>
> This patch uses option 2. Without it, building with SANITIZE=leak shows
> many errors in t4021 and elsewhere.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  log-tree.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/log-tree.c b/log-tree.c
> index eb2e841046..59eeaef1f7 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -480,7 +480,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
>
>  	*need_8bit_cte_p = 0; /* unknown */
>
> -	if (opt->extra_headers)
> +	if (opt->extra_headers && *opt->extra_headers)
>  		strbuf_addstr(&headers, opt->extra_headers);
>
>  	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);
> --
> 2.44.0.682.g01e1dab148

I was wondering if the new empty-string check now makes the condition
look non-obvious. I mean given that

• You explain how headers-to-copy-check and have-non-empty-string are
  not the same
• You explain how strbuf_addstr() could do this itself (which makes
  sense) but how it could be risky

The condition looks bare without a comment. But the empty-string check
of course makes sense without this context. And it could also be read as
an optimization (and not a leak fix).

And maybe most people just `git log -S'*opt->extra_headers'` if they
have questions in their head. So no information is really missing.

Patch

diff --git a/log-tree.c b/log-tree.c
index eb2e841046..59eeaef1f7 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -480,7 +480,7 @@  void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 
 	*need_8bit_cte_p = 0; /* unknown */
 
-	if (opt->extra_headers)
+	if (opt->extra_headers && *opt->extra_headers)
 		strbuf_addstr(&headers, opt->extra_headers);
 
 	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);