[2/2] wrapper: use a CSPRNG to generate random file names

Message ID 20211116033542.3247094-3-sandals@crustytoothpaste.net (mailing list archive)
State New, archived
Series Generate temporary files using a CSPRNG

Commit Message

brian m. carlson Nov. 16, 2021, 3:35 a.m. UTC
The current way we generate random file names is by taking the seconds
and microseconds, plus the PID, and mixing them together, then encoding
them.  If this fails, we increment the value by 7777, and try again up
to TMP_MAX times.

Unfortunately, this is not the best idea from a security perspective.
If we're writing into TMPDIR, an attacker can guess these values easily
and prevent us from creating any temporary files at all by creating them
all first.  POSIX only requires TMP_MAX to be 25, so this is achievable
in some contexts, even if unlikely to occur in practice.

Fortunately, we can simply solve this by using the system
cryptographically secure pseudorandom number generator (CSPRNG) to
generate a random 64-bit value, and use that as before.  Note that there
is still a small bias here, but because a six-character sequence chosen
out of 62 characters provides about 36 bits of entropy, the bias here is
less than 2^-28, which is acceptable, especially considering we'll retry
several times.

Note that many libcs also use a CSPRNG when generating temporary file
names.  glibc recently changed from an approach similar to
ours to using a CSPRNG, and FreeBSD and OpenBSD also use a CSPRNG in
this case.  Even if the likelihood of an attack is low, we should still
be at least as responsible in creating temporary files as libc is.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 wrapper.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)
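
For readers following along, the encoding step described in the commit message can be sketched roughly as below. This is an illustration only, not the code from wrapper.c: the helper name fill_template() and the exact ordering of the alphabet are assumptions, as is the division by the alphabet size after each character (the hunk context does not show that line). Six characters drawn from 62 give 62^6 = 56,800,235,584 possible names, or about 35.7 bits, so reducing a uniform 64-bit value modulo that range skews any single name's probability by roughly 62^6 / 2^64, about 2^-28.3, which matches the bound quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed 62-character alphabet; the ordering here is illustrative. */
static const char letters[] =
	"abcdefghijklmnopqrstuvwxyz"
	"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	"0123456789";

#define NUM_LETTERS (sizeof(letters) - 1)	/* 62 */
#define NUM_X 6					/* length of "XXXXXX" */

/*
 * Hypothetical helper (not part of the patch): derive NUM_X characters
 * from v, least significant base-62 digit first, and NUL-terminate.
 */
static void fill_template(char out[NUM_X + 1], uint64_t v)
{
	size_t i;

	for (i = 0; i < NUM_X; i++) {
		out[i] = letters[v % NUM_LETTERS];
		v /= NUM_LETTERS;
	}
	out[NUM_X] = '\0';
}
```

Only the low ~36 bits of the value influence the result, which is why a single 64-bit draw per attempt is more than enough.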

Comments

Jeff King Nov. 16, 2021, 3:36 p.m. UTC | #1
On Tue, Nov 16, 2021 at 03:35:42AM +0000, brian m. carlson wrote:

> The current way we generate random file names is by taking the seconds
> and microseconds, plus the PID, and mixing them together, then encoding
> them.  If this fails, we increment the value by 7777, and try again up
> to TMP_MAX times.
> 
> Unfortunately, this is not the best idea from a security perspective.
> If we're writing into TMPDIR, an attacker can guess these values easily
> and prevent us from creating any temporary files at all by creating them
> all first.  POSIX only requires TMP_MAX to be 25, so this is achievable
> in some contexts, even if unlikely to occur in practice.

I think we unconditionally define TMP_MAX as 16384. I don't think that
changes the fundamental issue that somebody could race us and win,
though.

> @@ -485,12 +483,13 @@ int git_mkstemps_mode(char *pattern, int suffix_len, int mode)
>  	 * Replace pattern's XXXXXX characters with randomness.
>  	 * Try TMP_MAX different filenames.
>  	 */
> -	gettimeofday(&tv, NULL);
> -	value = ((uint64_t)tv.tv_usec << 16) ^ tv.tv_sec ^ getpid();
>  	filename_template = &pattern[len - num_x - suffix_len];
>  	for (count = 0; count < TMP_MAX; ++count) {
> -		uint64_t v = value;
>  		int i;
> +		uint64_t v;
> +		if (csprng_bytes(&v, sizeof(v)) < 0)
> +			return -1;

If csprng_bytes() fails, the resulting errno is likely to be confusing.
E.g., if /dev/urandom doesn't exist we'd get ENOENT. But the caller is
likely to say something like:

  error: unable to create temporary file: no such file or directory

which is misleading. It's probably worth doing:

  return error_errno("unable to get random bytes for temporary file");

or similar here. That's verbose on top of the error that the caller will
give, but this is something we don't expect to fail in practice.

I actually wonder if we should simply die() in such a case. That's not
very friendly from a libification stand-point, but we really can't
progress on much without being able to generate random bytes.

-Peff
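
The error-reporting suggestion above could look roughly like the following standalone sketch. It does not use Git's own error_errno(); instead it prints a similar "error: ...: <strerror>" line by hand so that it compiles on its own. The names random_fn, random_value_or_report(), failing_source() and working_source() are all hypothetical and exist only for this illustration; the two stub sources stand in for a CSPRNG that fails with ENOENT and one that succeeds.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical signature for a random-byte source; returns < 0 on error. */
typedef int (*random_fn)(void *buf, size_t len);

/* Stub that fails the way a missing /dev/urandom might. */
static int failing_source(void *buf, size_t len)
{
	(void)buf;
	(void)len;
	errno = ENOENT;
	return -1;
}

/* Stub that "succeeds" with fixed bytes; obviously not random. */
static int working_source(void *buf, size_t len)
{
	memset(buf, 1, len);
	return 0;
}

/*
 * Fetch a 64-bit value, and on failure say explicitly that the random
 * source failed, so the caller's later "unable to create temporary
 * file" message is not the only explanation the user sees.
 */
static int random_value_or_report(random_fn source, uint64_t *v)
{
	if (source(v, sizeof(*v)) < 0) {
		int saved_errno = errno;

		fprintf(stderr,
			"error: unable to get random bytes for temporary file: %s\n",
			strerror(saved_errno));
		errno = saved_errno;	/* fprintf() may have clobbered it */
		return -1;
	}
	return 0;
}
```

Calling random_value_or_report(failing_source, &v) returns -1 and leaves errno set to ENOENT, so a caller that goes on to print its own message still has the original cause available.
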
Taylor Blau Nov. 16, 2021, 6:28 p.m. UTC | #2
On Tue, Nov 16, 2021 at 10:36:51AM -0500, Jeff King wrote:
> On Tue, Nov 16, 2021 at 03:35:42AM +0000, brian m. carlson wrote:
>
> > The current way we generate random file names is by taking the seconds
> > and microseconds, plus the PID, and mixing them together, then encoding
> > them.  If this fails, we increment the value by 7777, and try again up
> > to TMP_MAX times.
> >
> > Unfortunately, this is not the best idea from a security perspective.
> > If we're writing into TMPDIR, an attacker can guess these values easily
> > and prevent us from creating any temporary files at all by creating them
> > all first.  POSIX only requires TMP_MAX to be 25, so this is achievable
> > in some contexts, even if unlikely to occur in practice.
>
> I think we unconditionally define TMP_MAX as 16384. I don't think that
> changes the fundamental issue that somebody could race us and win,
> though.

Yes, we do. Right above the declaration of this function (and so hidden
from the context) we do:

    #undef TMP_MAX
    #define TMP_MAX 16384

I don't think that the value of TMP_MAX makes this substantially less
likely, so I agree that the fundamental issue is the same.

> > @@ -485,12 +483,13 @@ int git_mkstemps_mode(char *pattern, int suffix_len, int mode)
> >  	 * Replace pattern's XXXXXX characters with randomness.
> >  	 * Try TMP_MAX different filenames.
> >  	 */
> > -	gettimeofday(&tv, NULL);
> > -	value = ((uint64_t)tv.tv_usec << 16) ^ tv.tv_sec ^ getpid();
> >  	filename_template = &pattern[len - num_x - suffix_len];
> >  	for (count = 0; count < TMP_MAX; ++count) {
> > -		uint64_t v = value;
> >  		int i;
> > +		uint64_t v;
> > +		if (csprng_bytes(&v, sizeof(v)) < 0)
> > +			return -1;
>
> If csprng_bytes() fails, the resulting errno is likely to be confusing.
> E.g., if /dev/urandom doesn't exist we'd get ENOENT. But the caller is
> likely to say something like:
>
>   error: unable to create temporary file: no such file or directory
>
> which is misleading. It's probably worth doing:
>
>   return error_errno("unable to get random bytes for temporary file");
>
> or similar here. That's verbose on top of the error that the caller will
> give, but this is something we don't expect to fail in practice.
>
> I actually wonder if we should simply die() in such a case. That's not
> very friendly from a libification stand-point, but we really can't
> progress on much without being able to generate random bytes.

Alternatively, we could fall back to the existing code paths. This is
somewhat connected to my suggestion to Randall earlier in the thread.
But I would rather see that fallback done at compile-time for platforms
that don't give us an easy-to-use CSPRNG, and avoid masking legitimate
errors caused from trying to use a CSPRNG that should exist.

Thanks,
Taylor
Junio C Hamano Nov. 16, 2021, 6:57 p.m. UTC | #3
Taylor Blau <me@ttaylorr.com> writes:

>> I actually wonder if we should simply die() in such a case. That's not
>> very friendly from a libification stand-point, but we really can't
>> progress on much without being able to generate random bytes.
>
> Alternatively, we could fall back to the existing code paths. This is
> somewhat connected to my suggestion to Randall earlier in the thread.
> But I would rather see that fallback done at compile-time for platforms
> that don't give us an easy-to-use CSPRNG, and avoid masking legitimate
> errors caused from trying to use a CSPRNG that should exist.

Yeah, I do not think we are doing this because the current code is
completely broken and everybody needs to move to CSPRNG that makes
it absolutely safe---rather this is still just making it safer than
the current code, when system support is available.  So a fallback
to the current code would be a good (and easy) thing to have, I
would think.

Thanks.
Jeff King Nov. 16, 2021, 7:21 p.m. UTC | #4
On Tue, Nov 16, 2021 at 10:57:28AM -0800, Junio C Hamano wrote:

> Taylor Blau <me@ttaylorr.com> writes:
> 
> >> I actually wonder if we should simply die() in such a case. That's not
> >> very friendly from a libification stand-point, but we really can't
> >> progress on much without being able to generate random bytes.
> >
> > Alternatively, we could fall back to the existing code paths. This is
> > somewhat connected to my suggestion to Randall earlier in the thread.
> > But I would rather see that fallback done at compile-time for platforms
> > that don't give us an easy-to-use CSPRNG, and avoid masking legitimate
> > errors caused from trying to use a CSPRNG that should exist.
> 
> Yeah, I do not think we are doing this because the current code is
> completely broken and everybody needs to move to CSPRNG that makes
> it absolutely safe---rather this is still just making it safer than
> the current code, when system support is available.  So a fallback
> to the current code would be a good (and easy) thing to have, I
> would think.

One challenge for any fallback is that there are security implications.
In particular:

  - the fallback probably needs to be specific to the mktemp code; we
    don't have any callers yet of csprng_bytes(), but anybody using it
    for, say, actual cryptography would be very unhappy if it quietly
    fell back to insecure bytes.

    (I don't have any plans to use it and we don't do very much actual
    crypto ourselves, but one place that _could_ use it is the
    generation of the push-cert nonce seed).

  - I'm not sure if we should fall back for runtime errors or not. E.g.,
    if we try to open /dev/urandom and it isn't there, is it OK to fall
    back to the older, less-secure tempfile method? That's convenient in
    some sense; Git continues to work inside a chroot for which you
    haven't set up /dev/urandom. But it may also be surprising, and
    erring on the side of doing the less secure thing is probably a bad
    idea.

    So the mktemp code probably needs to be aware of the difference
    between "we have no CSPRNG source" and "we were compiled with
    support for a source, but it didn't work".

-Peff
Taylor Blau Nov. 16, 2021, 7:33 p.m. UTC | #5
On Tue, Nov 16, 2021 at 02:21:22PM -0500, Jeff King wrote:
> On Tue, Nov 16, 2021 at 10:57:28AM -0800, Junio C Hamano wrote:
>
> > Taylor Blau <me@ttaylorr.com> writes:
> >
> > >> I actually wonder if we should simply die() in such a case. That's not
> > >> very friendly from a libification stand-point, but we really can't
> > >> progress on much without being able to generate random bytes.
> > >
> > > Alternatively, we could fall back to the existing code paths. This is
> > > somewhat connected to my suggestion to Randall earlier in the thread.
> > > But I would rather see that fallback done at compile-time for platforms
> > > that don't give us an easy-to-use CSPRNG, and avoid masking legitimate
> > > errors caused from trying to use a CSPRNG that should exist.
> >
> > Yeah, I do not think we are doing this because the current code is
> > completely broken and everybody needs to move to CSPRNG that makes
> > it absolutely safe---rather this is still just making it safer than
> > the current code, when system support is available.  So a fallback
> > to the current code would be a good (and easy) thing to have, I
> > would think.
>
> One challenge for any fallback is that there are security implications.
> In particular:
>
>   - the fallback probably needs to be specific to the mktemp code; we
>     don't have any callers yet of csprng_bytes(), but anybody using it
>     for, say, actual cryptography would be very unhappy if it quietly
>     fell back to insecure bytes.
>
>     (I don't have any plans to use it and we don't do very much actual
>     crypto ourselves, but one place that _could_ use it is the
>     generation of the push-cert nonce seed).
>
>   - I'm not sure if we should fall back for runtime errors or not. E.g.,
>     if we try to open /dev/urandom and it isn't there, is it OK to fall
>     back to the older, less-secure tempfile method? That's convenient in
>     some sense; Git continues to work inside a chroot for which you
>     haven't set up /dev/urandom. But it may also be surprising, and
>     erring on the side of doing the less secure thing is probably a bad
>     idea.
>
>     So the mktemp code probably needs to be aware of the difference
>     between "we have no CSPRNG source" and "we were compiled with
>     support for a source, but it didn't work".

My opinion is that we should probably not fall back for runtime errors
where we do have a CSPRNG and any errors trying to use it are
legitimate.

I would probably have csprng_bytes() itself only be compiled where we
know we have a CSPRNG. And then I think our implementation of
git_mkstemps_mode() would depend on whether csprng_bytes() was compiled
or not. If it was, then any errors returned by it are propagated to the
caller (or we call die()). If not, then we use the existing, insecure
implementation.

And I think that basically addresses both of your points, namely that
the fallback is specific to the mktemp code, and provides one opinion on
the matter of runtime errors.

Thanks,
Taylor
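
A minimal sketch of the compile-time split described above might look like this. Everything here is an assumption made for illustration: HAVE_CSPRNG is a hypothetical build-time macro, and struct tempname_state and next_tempname_value() are invented names. When the macro is defined, errors from csprng_bytes() are passed straight back to the caller with no fallback; when it is not, the time-and-PID scheme from the pre-patch code (seed once, then add 7777 per retry) is used.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>
#include <unistd.h>

#ifdef HAVE_CSPRNG	/* hypothetical macro set by the build system */
int csprng_bytes(void *buf, size_t len);	/* defined elsewhere */
#endif

/* Per-call state; only used by the legacy path. Zero-initialize it. */
struct tempname_state {
	uint64_t value;
	int seeded;
};

/*
 * Produce the value for the next filename attempt.
 * Returns 0 on success, -1 if the CSPRNG was compiled in but failed.
 */
static int next_tempname_value(struct tempname_state *st, uint64_t *v)
{
#ifdef HAVE_CSPRNG
	(void)st;
	/* Runtime errors are legitimate here: propagate, never fall back. */
	return csprng_bytes(v, sizeof(*v)) < 0 ? -1 : 0;
#else
	if (!st->seeded) {
		struct timeval tv;

		gettimeofday(&tv, NULL);
		st->value = ((uint64_t)tv.tv_usec << 16) ^ tv.tv_sec ^ getpid();
		st->seeded = 1;
	} else {
		st->value += 7777;	/* same step as the old retry loop */
	}
	*v = st->value;
	return 0;
#endif
}
```

Keeping the choice inside a helper that only the tempfile code calls means any other future user of csprng_bytes() never sees the insecure path, which is the first concern raised earlier in the thread.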

Patch

diff --git a/wrapper.c b/wrapper.c
index 0046f32e46..0cdb5b18ff 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -463,8 +463,6 @@  int git_mkstemps_mode(char *pattern, int suffix_len, int mode)
 	static const int num_letters = ARRAY_SIZE(letters) - 1;
 	static const char x_pattern[] = "XXXXXX";
 	static const int num_x = ARRAY_SIZE(x_pattern) - 1;
-	uint64_t value;
-	struct timeval tv;
 	char *filename_template;
 	size_t len;
 	int fd, count;
@@ -485,12 +483,13 @@  int git_mkstemps_mode(char *pattern, int suffix_len, int mode)
 	 * Replace pattern's XXXXXX characters with randomness.
 	 * Try TMP_MAX different filenames.
 	 */
-	gettimeofday(&tv, NULL);
-	value = ((uint64_t)tv.tv_usec << 16) ^ tv.tv_sec ^ getpid();
 	filename_template = &pattern[len - num_x - suffix_len];
 	for (count = 0; count < TMP_MAX; ++count) {
-		uint64_t v = value;
 		int i;
+		uint64_t v;
+		if (csprng_bytes(&v, sizeof(v)) < 0)
+			return -1;
+
 		/* Fill in the random bits. */
 		for (i = 0; i < num_x; i++) {
 			filename_template[i] = letters[v % num_letters];
@@ -506,12 +505,6 @@  int git_mkstemps_mode(char *pattern, int suffix_len, int mode)
 		 */
 		if (errno != EEXIST)
 			break;
-		/*
-		 * This is a random value.  It is only necessary that
-		 * the next TMP_MAX values generated by adding 7777 to
-		 * VALUE are different with (module 2^32).
-		 */
-		value += 7777;
 	}
 	/* We return the null string if we can't find a unique file name.  */
 	pattern[0] = '\0';