diff mbox series

[2/6] commit: copy saved getenv() result

Message ID 20190111221539.GB10188@sigill.intra.peff.net (mailing list archive)
State New, archived
Headers show
Series getenv() timing fixes | expand

Commit Message

Jeff King Jan. 11, 2019, 10:15 p.m. UTC
We save the result of $GIT_INDEX_FILE so that we can restore it after
setting it to a new value and running add--interactive. However, the
pointer returned by getenv() is not guaranteed to be valid after calling
setenv(). This _usually_ works fine, but can fail if libc needs to
reallocate the environment block during the setenv().

Let's just duplicate the string, so we know that it remains valid.

In the long run it may be more robust to teach interactive_add() to take
a set of environment variables to pass along to run-command when it
execs add--interactive. And then we would not have to do this
save/restore dance at all. But this is an easy fix in the meantime.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/commit.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Junio C Hamano Jan. 12, 2019, 3:07 a.m. UTC | #1
Jeff King <peff@peff.net> writes:

> We save the result of $GIT_INDEX_FILE so that we can restore it after
> setting it to a new value and running add--interactive. However, the
> pointer returned by getenv() is not guaranteed to be valid after calling
> setenv(). This _usually_ works fine, but can fail if libc needs to
> reallocate the environment block during the setenv().
>
> Let's just duplicate the string, so we know that it remains valid.
>
> In the long run it may be more robust to teach interactive_add() to take
> a set of environment variables to pass along to run-command when it
> execs add--interactive. And then we would not have to do this
> save/restore dance at all. But this is an easy fix in the meantime.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  builtin/commit.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/commit.c b/builtin/commit.c
> index 004b816635..7d2e0b61e5 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -351,7 +351,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
>  		if (write_locked_index(&the_index, &index_lock, 0))
>  			die(_("unable to create temporary index"));
>  
> -		old_index_env = getenv(INDEX_ENVIRONMENT);
> +		old_index_env = xstrdup_or_null(getenv(INDEX_ENVIRONMENT));
>  		setenv(INDEX_ENVIRONMENT, get_lock_file_path(&index_lock), 1);
>  
>  		if (interactive_add(argc, argv, prefix, patch_interactive) != 0)
> @@ -361,6 +361,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
>  			setenv(INDEX_ENVIRONMENT, old_index_env, 1);
>  		else
>  			unsetenv(INDEX_ENVIRONMENT);
> +		FREE_AND_NULL(old_index_env);
>  
>  		discard_cache();
>  		read_cache_from(get_lock_file_path(&index_lock));

Even though it is not wrong per-se to assign a NULL to the
now-no-longer-referenced variable, I do not quite get why it is
free-and-null, not a straight free.  This may be a taste-thing,
though.

Even if a future update needs to make it possible to access
old_index_env somewhere in the block after discard_cache() gets
called, we would need to push down the free (or free-and-null) to
prolong its lifetime a bit anyway, so...
Jeff King Jan. 12, 2019, 10:26 a.m. UTC | #2
On Fri, Jan 11, 2019 at 07:07:15PM -0800, Junio C Hamano wrote:

> > diff --git a/builtin/commit.c b/builtin/commit.c
> > index 004b816635..7d2e0b61e5 100644
> > --- a/builtin/commit.c
> > +++ b/builtin/commit.c
> > @@ -351,7 +351,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
> >  		if (write_locked_index(&the_index, &index_lock, 0))
> >  			die(_("unable to create temporary index"));
> >  
> > -		old_index_env = getenv(INDEX_ENVIRONMENT);
> > +		old_index_env = xstrdup_or_null(getenv(INDEX_ENVIRONMENT));
> >  		setenv(INDEX_ENVIRONMENT, get_lock_file_path(&index_lock), 1);
> >  
> >  		if (interactive_add(argc, argv, prefix, patch_interactive) != 0)
> > @@ -361,6 +361,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
> >  			setenv(INDEX_ENVIRONMENT, old_index_env, 1);
> >  		else
> >  			unsetenv(INDEX_ENVIRONMENT);
> > +		FREE_AND_NULL(old_index_env);
> >  
> >  		discard_cache();
> >  		read_cache_from(get_lock_file_path(&index_lock));
> 
> Even though it is not wrong per-se to assign a NULL to the
> now-no-longer-referenced variable, I do not quite get why it is
> free-and-null, not a straight free.  This may be a taste-thing,
> though.
> 
> Even if a future update needs to make it possible to access
> old_index_env somewhere in the block after discard_cache() gets
> called, we would need to push down the free (or free-and-null) to
> prolong its lifetime a bit anyway, so...

My thinking was that if we simply call free(), then the variable is left
as a dangling pointer for the rest of the function, making it easy to
accidentally use-after-free.

But certainly it would not be the first such instance in our code base.
In theory a static analyzer should easily be able to figure out such a
problem, too, so maybe it is not worth being defensive about.

-Peff
Johannes Schindelin Jan. 15, 2019, 2:05 p.m. UTC | #3
Hi Peff,

On Sat, 12 Jan 2019, Jeff King wrote:

> On Fri, Jan 11, 2019 at 07:07:15PM -0800, Junio C Hamano wrote:
> 
> > > diff --git a/builtin/commit.c b/builtin/commit.c
> > > index 004b816635..7d2e0b61e5 100644
> > > --- a/builtin/commit.c
> > > +++ b/builtin/commit.c
> > > @@ -351,7 +351,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
> > >  		if (write_locked_index(&the_index, &index_lock, 0))
> > >  			die(_("unable to create temporary index"));
> > >  
> > > -		old_index_env = getenv(INDEX_ENVIRONMENT);
> > > +		old_index_env = xstrdup_or_null(getenv(INDEX_ENVIRONMENT));
> > >  		setenv(INDEX_ENVIRONMENT, get_lock_file_path(&index_lock), 1);
> > >  
> > >  		if (interactive_add(argc, argv, prefix, patch_interactive) != 0)
> > > @@ -361,6 +361,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
> > >  			setenv(INDEX_ENVIRONMENT, old_index_env, 1);
> > >  		else
> > >  			unsetenv(INDEX_ENVIRONMENT);
> > > +		FREE_AND_NULL(old_index_env);
> > >  
> > >  		discard_cache();
> > >  		read_cache_from(get_lock_file_path(&index_lock));
> > 
> > Even though it is not wrong per-se to assign a NULL to the
> > now-no-longer-referenced variable, I do not quite get why it is
> > free-and-null, not a straight free.  This may be a taste-thing,
> > though.
> > 
> > Even if a future update needs to make it possible to access
> > old_index_env somewhere in the block after discard_cache() gets
> > called, we would need to push down the free (or free-and-null) to
> > prolong its lifetime a bit anyway, so...
> 
> My thinking was that if we simply call free(), then the variable is left
> as a dangling pointer for the rest of the function, making it easy to
> accidentally use-after-free.

FWIW I thought that was your reasoning (and did not think of asking you
about it) and totally agree with it.

It is *too* easy not to realize that the `free()` call needs to be moved,
but a segmentation fault is a very strong indicator that it should be
moved.

> But certainly it would not be the first such instance in our code base.

Just because a lot of our code has grown historically does not mean that
we need to add code that shares the same shortcomings. FREE_AND_NULL() was
not available for a long time, after all, so it is understandable that we
did not use it back then. But it is available now, so we no longer have an
excuse to add less defensive code.

> In theory a static analyzer should easily be able to figure out such a
> problem, too, so maybe it is not worth being defensive about.

How often do you run a static analyzer?

My point being: if we can prevent future mistakes easily, and it does not
add too much code churn, why not just do it. No need to rely on fancy
stuff that might not even be available on your preferred platform.

Thanks,
Dscho
Jeff King Jan. 15, 2019, 7:17 p.m. UTC | #4
On Tue, Jan 15, 2019 at 03:05:50PM +0100, Johannes Schindelin wrote:

> > But certainly it would not be the first such instance in our code base.
> 
> Just because a lot of our code has grown historically does not mean that
> we need to add code that shares the same shortcomings. FREE_AND_NULL() was
> not available for a long time, after all, so it is understandable that we
> did not use it back then. But it is available now, so we no longer have an
> excuse to add less defensive code.

Fair enough. I am happy to start using FREE_AND_NULL() consistently if
nobody things it's too opaque (or that it creates confusion that we
somehow expect to look at the variable again and need it to be NULL). I
think the compiler should generally be able to optimize out the NULL
assignment in most cases anyway.

> > In theory a static analyzer should easily be able to figure out such a
> > problem, too, so maybe it is not worth being defensive about.
> 
> How often do you run a static analyzer?

Stefan was routinely running coverity, though I haven't seen results in
a while. I think we should make sure that continues, as it did turn up
some useful results (and a lot of cruft, too, but on the whole I have
found it useful).

I also count gcc as a static analyzer, since it can and does point out
many simple problems. I don't know that it can point out an obvious
user-after-free like this, though.

That said....

> My point being: if we can prevent future mistakes easily, and it does not
> add too much code churn, why not just do it. No need to rely on fancy
> stuff that might not even be available on your preferred platform.

Yeah, I do agree (which is why I included it in the patch ;) ).

-Peff
Stefan Beller Jan. 15, 2019, 7:25 p.m. UTC | #5
> Stefan was routinely running coverity, though I haven't seen results in
> a while. I think we should make sure that continues, as it did turn up
> some useful results (and a lot of cruft, too, but on the whole I have
> found it useful).

coverity had some outage (end of last year?-ish) and then changed the
way it dealt with automated uploads IIRC.
I have not looked into redoing the automation again since then.

Since 7th of Jan, they seem to have issues with hosting (for everyone
or just the open source projects?)
https://community.synopsys.com/s/article/Coverity-Scan-Update

For reference, the script that used to work is at
https://github.com/stefanbeller/git/commit/039be8078bb0379db271135e0c0d7315c34fe243
(which is on the `coverity` branch of that repo)
Jeff King Jan. 15, 2019, 7:32 p.m. UTC | #6
On Tue, Jan 15, 2019 at 11:25:45AM -0800, Stefan Beller wrote:

> > Stefan was routinely running coverity, though I haven't seen results in
> > a while. I think we should make sure that continues, as it did turn up
> > some useful results (and a lot of cruft, too, but on the whole I have
> > found it useful).
> 
> coverity had some outage (end of last year?-ish) and then changed the
> way it dealt with automated uploads IIRC.
> I have not looked into redoing the automation again since then.
> 
> Since 7th of Jan, they seem to have issues with hosting (for everyone
> or just the open source projects?)
> https://community.synopsys.com/s/article/Coverity-Scan-Update

Yuck. :(

> For reference, the script that used to work is at
> https://github.com/stefanbeller/git/commit/039be8078bb0379db271135e0c0d7315c34fe243
> (which is on the `coverity` branch of that repo)

Thanks for the update. I may take a look at trying to make this work
again at some point (but I won't be sad if somebody else gets to it
first).

-Peff
Johannes Schindelin Jan. 16, 2019, 2:06 p.m. UTC | #7
Hi Peff,

On Tue, 15 Jan 2019, Jeff King wrote:

> On Tue, Jan 15, 2019 at 11:25:45AM -0800, Stefan Beller wrote:
> 
> > > Stefan was routinely running coverity, though I haven't seen results in
> > > a while. I think we should make sure that continues, as it did turn up
> > > some useful results (and a lot of cruft, too, but on the whole I have
> > > found it useful).
> > 
> > coverity had some outage (end of last year?-ish) and then changed the
> > way it dealt with automated uploads IIRC.
> > I have not looked into redoing the automation again since then.
> > 
> > Since 7th of Jan, they seem to have issues with hosting (for everyone
> > or just the open source projects?)
> > https://community.synopsys.com/s/article/Coverity-Scan-Update
> 
> Yuck. :(

What a timely coincidence to corroborate my argument, eh?

Ciao,
Dscho
diff mbox series

Patch

diff --git a/builtin/commit.c b/builtin/commit.c
index 004b816635..7d2e0b61e5 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -351,7 +351,7 @@  static const char *prepare_index(int argc, const char **argv, const char *prefix
 		if (write_locked_index(&the_index, &index_lock, 0))
 			die(_("unable to create temporary index"));
 
-		old_index_env = getenv(INDEX_ENVIRONMENT);
+		old_index_env = xstrdup_or_null(getenv(INDEX_ENVIRONMENT));
 		setenv(INDEX_ENVIRONMENT, get_lock_file_path(&index_lock), 1);
 
 		if (interactive_add(argc, argv, prefix, patch_interactive) != 0)
@@ -361,6 +361,7 @@  static const char *prepare_index(int argc, const char **argv, const char *prefix
 			setenv(INDEX_ENVIRONMENT, old_index_env, 1);
 		else
 			unsetenv(INDEX_ENVIRONMENT);
+		FREE_AND_NULL(old_index_env);
 
 		discard_cache();
 		read_cache_from(get_lock_file_path(&index_lock));