Message ID | 20220117215617.843190-3-sandals@crustytoothpaste.net (mailing list archive) |
---|---|
State | Accepted |
Commit | 47efda967cfd4ef9d39de149e1e3654b051e5d19 |
Headers | show |
Series | Generate temporary files using a CSPRNG | expand |
On Mon, Jan 17 2022, brian m. carlson wrote: > The current way we generate random file names is by taking the seconds > and microseconds, plus the PID, and mixing them together, then encoding > them. If this fails, we increment the value by 7777, and try again up > to TMP_MAX times. > > Unfortunately, this is not the best idea from a security perspective. > If we're writing into TMPDIR, an attacker can guess these values easily > and prevent us from creating any temporary files at all by creating them > all first. Even though we set TMP_MAX to 16384, this may be achievable > in some contexts, even if unlikely to occur in practice. > [...] I had a comment on v1[1] of this series which still applies here, i.e. the "if we're writing into TMPDIR[...]" here elides the fact that much of the time we're writing a tempfile into .git, so the security issue ostensibly being solved here won't be a practical issue at all. Then for out-of-repo tempfiles some OS's have a per-user $TEMPDIR you can use (e.g. systemd's /run/user/`id -u`). Finally... > Note that the use of a CSPRNG in generating temporary file names is also > used in many libcs. glibc recently changed from an approach similar to > ours to using a CSPRNG, and FreeBSD and OpenBSD also use a CSPRNG in > this case. Even if the likelihood of an attack is low, we should still > be at least as responsible in creating temporary files as libc is. ...we already have in-tree users of mkstemp(), which on glibc ostensibly tries to solve the same security issues you note here, and the reftable/* user has been added since earlier iterations of this series: o $ git grep -E '\bmkstemp\(' -- '*.[ch]' compat/mingw.c:int mkstemp(char *template) compat/mingw.h:int mkstemp(char *template); entry.c: return mkstemp(path); reftable/stack.c: tab_fd = mkstemp(temp_tab_file_name.buf); reftable/stack.c: tab_fd = mkstemp(temp_tab->buf); reftable/stack_test.c: int fd = mkstemp(fn); wrapper.c: fd = mkstemp(filename_template); This series really feels like it's adding too much complexity and potential auditing headaches (distributors worrying about us shipping a CSPRNG, having to audit it) to a low-level codepath that most of the time won't need this at all. So instead of: A. Add CSPRNG with demo test helper B. Use it in git_mkstemps_mode() I'd think we'd be much better off with: A. Split out callers of tempfile.c and mk.*temp in wrapper.c that create tempfiles in .git B. <the rest> I honestly haven't looked too much at what <the rest> is, other than what I wrote in [1], which seems to suggest that most of our codepaths won't need this. I'd also think that given the reference to CSPRNG in e.g. some glibc versions that instead of the ifdefs in csprng_bytes() we should instead directly use a secure mkstemp() (or similar) for the not-.git cases that remain after the "mktemp in a dir we chown" v.s. "mktemp in shared /tmp" are split up. Maybe these are all things you looked at and considered, but from my recollection (I didn't go back and re-read the whole discussion) you didn't chime in on this point, so *bump* :) 1. https://lore.kernel.org/git/211116.864k8bq0xm.gmgdl@evledraar.gmail.com/
Am 18.01.22 um 10:24 schrieb Ævar Arnfjörð Bjarmason: > > On Mon, Jan 17 2022, brian m. carlson wrote: > >> The current way we generate random file names is by taking the seconds >> and microseconds, plus the PID, and mixing them together, then encoding >> them. If this fails, we increment the value by 7777, and try again up >> to TMP_MAX times. >> >> Unfortunately, this is not the best idea from a security perspective. >> If we're writing into TMPDIR, an attacker can guess these values easily >> and prevent us from creating any temporary files at all by creating them >> all first. Even though we set TMP_MAX to 16384, this may be achievable >> in some contexts, even if unlikely to occur in practice. >> [...] > > I had a comment on v1[1] of this series which still applies here, > i.e. the "if we're writing into TMPDIR[...]" here elides the fact that > much of the time we're writing a tempfile into .git, so the security > issue ostensibly being solved here won't be a practical issue at all. > > Then for out-of-repo tempfiles some OS's have a per-user $TEMPDIR you > can use (e.g. systemd's /run/user/`id -u`). Finally... > >> Note that the use of a CSPRNG in generating temporary file names is also >> used in many libcs. glibc recently changed from an approach similar to >> ours to using a CSPRNG, and FreeBSD and OpenBSD also use a CSPRNG in >> this case. Even if the likelihood of an attack is low, we should still >> be at least as responsible in creating temporary files as libc is. > > ...we already have in-tree users of mkstemp(), which on glibc ostensibly > tries to solve the same security issues you note here, and the > reftable/* user has been added since earlier iterations of this series: > o > $ git grep -E '\bmkstemp\(' -- '*.[ch]' > compat/mingw.c:int mkstemp(char *template) > compat/mingw.h:int mkstemp(char *template); > entry.c: return mkstemp(path); > reftable/stack.c: tab_fd = mkstemp(temp_tab_file_name.buf); > reftable/stack.c: tab_fd = mkstemp(temp_tab->buf); > reftable/stack_test.c: int fd = mkstemp(fn); > wrapper.c: fd = mkstemp(filename_template); > > This series really feels like it's adding too much complexity and > potential auditing headaches (distributors worrying about us shipping a > CSPRNG, having to audit it) to a low-level codepath that most of the > time won't need this at all. Good point. Please let me think out loud for a moment. mkstemp() is secure (right?) and used already. mkstemps() was used as well until b2d593a779 (wrapper.c: remove unused gitmkstemps() function, 2017-02-28). What's the difference? The former requires the random characters at the end (e.g. "name-XXXXXX"), while the latter allows a suffix (e.g. "name-XXXXXX.ext"); that's what the added "s" in the name means, I guess. And apparently mkstemp is everywhere, but mkstemps is a GNU extension. git_mkstemps_mode is used by mks_tempfile_sm, mks_tempfile_tsm and git_mkstemp_mode. The latter uses no suffix, so could be implemented using mkstemp and fchmod instead. mks_tempfile_sm is called by write_locked_index, mks_tempfile_s, mks_tempfile_m and mks_tempfile. Only mks_tempfile_s uses a suffix, but is itself unused. So an implementation based on mkstemp and fchmod would suffice for mks_tempfile_sm users as well. mks_tempfile_tsm is used by mks_tempfile_ts, mks_tempfile_tm and mks_tempfile_t. Only mks_tempfile_ts uses a suffix. Its only caller is diff.c::prep_temp_blob, called only by diff.c::prepare_temp_file, called by add_external_diff_name and run_textconv in the same file. So if I didn't take a wrong turn somewhere the only temporary file name templates with suffixes are those for external diffs and textconv filters. The rest of the git_mkstemps_mode users could be switched to mkstemp+fchmod. Temporary files for external diffs and textconv filters *should* be placed in $TMPDIR. Do they need suffixes? I guess for testconv filters it doesn't matter, but graphical diff viewers might do syntax highlighting based on the extension. Can we get extensions without mktemps or git_mkstemps_mode? Yes, by using mkdtemp to create a temporary directory with a random name and creating the files in it. There already are mkdtemp users. So AFAICS all use cases of git_mkstemps_mode can be served by mkstemp+fchmod or mkdtemp. Would that help? > So instead of: > > A. Add CSPRNG with demo test helper > B. Use it in git_mkstemps_mode() > > I'd think we'd be much better off with: > > A. Split out callers of tempfile.c and mk.*temp in wrapper.c that create tempfiles in .git > B. <the rest> > > I honestly haven't looked too much at what <the rest> is, other than > what I wrote in [1], which seems to suggest that most of our codepaths > won't need this. > > I'd also think that given the reference to CSPRNG in e.g. some glibc > versions that instead of the ifdefs in csprng_bytes() we should instead > directly use a secure mkstemp() (or similar) for the not-.git cases that > remain after the "mktemp in a dir we chown" v.s. "mktemp in shared /tmp" > are split up. > > Maybe these are all things you looked at and considered, but from my > recollection (I didn't go back and re-read the whole discussion) you > didn't chime in on this point, so *bump* :) > > 1. https://lore.kernel.org/git/211116.864k8bq0xm.gmgdl@evledraar.gmail.com/
On Tue, Jan 18 2022, René Scharfe wrote: > Am 18.01.22 um 10:24 schrieb Ævar Arnfjörð Bjarmason: >> >> On Mon, Jan 17 2022, brian m. carlson wrote: >> >>> The current way we generate random file names is by taking the seconds >>> and microseconds, plus the PID, and mixing them together, then encoding >>> them. If this fails, we increment the value by 7777, and try again up >>> to TMP_MAX times. >>> >>> Unfortunately, this is not the best idea from a security perspective. >>> If we're writing into TMPDIR, an attacker can guess these values easily >>> and prevent us from creating any temporary files at all by creating them >>> all first. Even though we set TMP_MAX to 16384, this may be achievable >>> in some contexts, even if unlikely to occur in practice. >>> [...] >> >> I had a comment on v1[1] of this series which still applies here, >> i.e. the "if we're writing into TMPDIR[...]" here elides the fact that >> much of the time we're writing a tempfile into .git, so the security >> issue ostensibly being solved here won't be a practical issue at all. >> >> Then for out-of-repo tempfiles some OS's have a per-user $TEMPDIR you >> can use (e.g. systemd's /run/user/`id -u`). Finally... >> >>> Note that the use of a CSPRNG in generating temporary file names is also >>> used in many libcs. glibc recently changed from an approach similar to >>> ours to using a CSPRNG, and FreeBSD and OpenBSD also use a CSPRNG in >>> this case. Even if the likelihood of an attack is low, we should still >>> be at least as responsible in creating temporary files as libc is. >> >> ...we already have in-tree users of mkstemp(), which on glibc ostensibly >> tries to solve the same security issues you note here, and the >> reftable/* user has been added since earlier iterations of this series: >> o >> $ git grep -E '\bmkstemp\(' -- '*.[ch]' >> compat/mingw.c:int mkstemp(char *template) >> compat/mingw.h:int mkstemp(char *template); >> entry.c: return mkstemp(path); >> reftable/stack.c: tab_fd = mkstemp(temp_tab_file_name.buf); >> reftable/stack.c: tab_fd = mkstemp(temp_tab->buf); >> reftable/stack_test.c: int fd = mkstemp(fn); >> wrapper.c: fd = mkstemp(filename_template); >> >> This series really feels like it's adding too much complexity and >> potential auditing headaches (distributors worrying about us shipping a >> CSPRNG, having to audit it) to a low-level codepath that most of the >> time won't need this at all. > > Good point. Please let me think out loud for a moment. > > mkstemp() is secure (right?) and used already. mkstemps() was used as > well until b2d593a779 (wrapper.c: remove unused gitmkstemps() function, > 2017-02-28). What's the difference? The former requires the random > characters at the end (e.g. "name-XXXXXX"), while the latter allows a > suffix (e.g. "name-XXXXXX.ext"); that's what the added "s" in the name > means, I guess. And apparently mkstemp is everywhere, but mkstemps is > a GNU extension. > > git_mkstemps_mode is used by mks_tempfile_sm, mks_tempfile_tsm and > git_mkstemp_mode. The latter uses no suffix, so could be implemented > using mkstemp and fchmod instead. > > mks_tempfile_sm is called by write_locked_index, mks_tempfile_s, > mks_tempfile_m and mks_tempfile. Only mks_tempfile_s uses a suffix, > but is itself unused. So an implementation based on mkstemp and fchmod > would suffice for mks_tempfile_sm users as well. > > mks_tempfile_tsm is used by mks_tempfile_ts, mks_tempfile_tm and > mks_tempfile_t. Only mks_tempfile_ts uses a suffix. Its only caller > is diff.c::prep_temp_blob, called only by diff.c::prepare_temp_file, > called by add_external_diff_name and run_textconv in the same file. > > So if I didn't take a wrong turn somewhere the only temporary file name > templates with suffixes are those for external diffs and textconv > filters. The rest of the git_mkstemps_mode users could be switched to > mkstemp+fchmod. > > Temporary files for external diffs and textconv filters *should* be > placed in $TMPDIR. Do they need suffixes? I guess for testconv > filters it doesn't matter, but graphical diff viewers might do syntax > highlighting based on the extension. > > Can we get extensions without mktemps or git_mkstemps_mode? Yes, by > using mkdtemp to create a temporary directory with a random name and > creating the files in it. There already are mkdtemp users. > > So AFAICS all use cases of git_mkstemps_mode can be served by > mkstemp+fchmod or mkdtemp. Would that help? That seems sensible, as a further practical suggestion it seems like we'll retry around 16k times to create these files on failure, perhaps trying with a custom extension 8k times would be OK, followed by the minor UI degradation of doing the final 8k retries with the more-random OS-provided extension-less variant. More generally I'm still not sure if this is still a purely hypothetical attack mitigation, or whether there are actually users out there that have to deal with this. Wouldn't something like this also be an acceptable "solution" to this class of problem? #define TMP_MAX 1024 [...] if (count >= TMP_MAX) die(_("we tried and failed to create a tempfile using the '%s' template after %d tries!\n\n" "Is someone actively DoS you? Is your sysadmin known to be your mortal enemy?\n" "are you using Satan's shared hosting services? In any case, we give up!\n\n" "To \"retry\" set TEMPDIR to some location where other users won't race us to death"), template, count); For a bonus grade we could add a few more lines and try to stat() some of the files we failed to create, and report what UID it is that's racing you for all of these tempfile creations. >> So instead of: >> >> A. Add CSPRNG with demo test helper >> B. Use it in git_mkstemps_mode() >> >> I'd think we'd be much better off with: >> >> A. Split out callers of tempfile.c and mk.*temp in wrapper.c that create tempfiles in .git >> B. <the rest> >> >> I honestly haven't looked too much at what <the rest> is, other than >> what I wrote in [1], which seems to suggest that most of our codepaths >> won't need this. >> >> I'd also think that given the reference to CSPRNG in e.g. some glibc >> versions that instead of the ifdefs in csprng_bytes() we should instead >> directly use a secure mkstemp() (or similar) for the not-.git cases that >> remain after the "mktemp in a dir we chown" v.s. "mktemp in shared /tmp" >> are split up. >> >> Maybe these are all things you looked at and considered, but from my >> recollection (I didn't go back and re-read the whole discussion) you >> didn't chime in on this point, so *bump* :) >> >> 1. https://lore.kernel.org/git/211116.864k8bq0xm.gmgdl@evledraar.gmail.com/
Am 18.01.22 um 15:51 schrieb Ævar Arnfjörð Bjarmason: > > On Tue, Jan 18 2022, René Scharfe wrote: > >> Am 18.01.22 um 10:24 schrieb Ævar Arnfjörð Bjarmason: >>> >>> On Mon, Jan 17 2022, brian m. carlson wrote: >>> >>>> The current way we generate random file names is by taking the seconds >>>> and microseconds, plus the PID, and mixing them together, then encoding >>>> them. If this fails, we increment the value by 7777, and try again up >>>> to TMP_MAX times. >>>> >>>> Unfortunately, this is not the best idea from a security perspective. >>>> If we're writing into TMPDIR, an attacker can guess these values easily >>>> and prevent us from creating any temporary files at all by creating them >>>> all first. Even though we set TMP_MAX to 16384, this may be achievable >>>> in some contexts, even if unlikely to occur in practice. >>>> [...] >>> >>> I had a comment on v1[1] of this series which still applies here, >>> i.e. the "if we're writing into TMPDIR[...]" here elides the fact that >>> much of the time we're writing a tempfile into .git, so the security >>> issue ostensibly being solved here won't be a practical issue at all. >>> >>> Then for out-of-repo tempfiles some OS's have a per-user $TEMPDIR you >>> can use (e.g. systemd's /run/user/`id -u`). Finally... >>> >>>> Note that the use of a CSPRNG in generating temporary file names is also >>>> used in many libcs. glibc recently changed from an approach similar to >>>> ours to using a CSPRNG, and FreeBSD and OpenBSD also use a CSPRNG in >>>> this case. Even if the likelihood of an attack is low, we should still >>>> be at least as responsible in creating temporary files as libc is. >>> >>> ...we already have in-tree users of mkstemp(), which on glibc ostensibly >>> tries to solve the same security issues you note here, and the >>> reftable/* user has been added since earlier iterations of this series: >>> o >>> $ git grep -E '\bmkstemp\(' -- '*.[ch]' >>> compat/mingw.c:int mkstemp(char *template) >>> compat/mingw.h:int mkstemp(char *template); >>> entry.c: return mkstemp(path); >>> reftable/stack.c: tab_fd = mkstemp(temp_tab_file_name.buf); >>> reftable/stack.c: tab_fd = mkstemp(temp_tab->buf); >>> reftable/stack_test.c: int fd = mkstemp(fn); >>> wrapper.c: fd = mkstemp(filename_template); >>> >>> This series really feels like it's adding too much complexity and >>> potential auditing headaches (distributors worrying about us shipping a >>> CSPRNG, having to audit it) to a low-level codepath that most of the >>> time won't need this at all. >> >> Good point. Please let me think out loud for a moment. >> >> mkstemp() is secure (right?) and used already. mkstemps() was used as >> well until b2d593a779 (wrapper.c: remove unused gitmkstemps() function, >> 2017-02-28). What's the difference? The former requires the random >> characters at the end (e.g. "name-XXXXXX"), while the latter allows a >> suffix (e.g. "name-XXXXXX.ext"); that's what the added "s" in the name >> means, I guess. And apparently mkstemp is everywhere, but mkstemps is >> a GNU extension. >> >> git_mkstemps_mode is used by mks_tempfile_sm, mks_tempfile_tsm and >> git_mkstemp_mode. The latter uses no suffix, so could be implemented >> using mkstemp and fchmod instead. >> >> mks_tempfile_sm is called by write_locked_index, mks_tempfile_s, >> mks_tempfile_m and mks_tempfile. Only mks_tempfile_s uses a suffix, >> but is itself unused. So an implementation based on mkstemp and fchmod >> would suffice for mks_tempfile_sm users as well. >> >> mks_tempfile_tsm is used by mks_tempfile_ts, mks_tempfile_tm and >> mks_tempfile_t. Only mks_tempfile_ts uses a suffix. Its only caller >> is diff.c::prep_temp_blob, called only by diff.c::prepare_temp_file, >> called by add_external_diff_name and run_textconv in the same file. >> >> So if I didn't take a wrong turn somewhere the only temporary file name >> templates with suffixes are those for external diffs and textconv >> filters. The rest of the git_mkstemps_mode users could be switched to >> mkstemp+fchmod. >> >> Temporary files for external diffs and textconv filters *should* be >> placed in $TMPDIR. Do they need suffixes? I guess for testconv >> filters it doesn't matter, but graphical diff viewers might do syntax >> highlighting based on the extension. >> >> Can we get extensions without mktemps or git_mkstemps_mode? Yes, by >> using mkdtemp to create a temporary directory with a random name and >> creating the files in it. There already are mkdtemp users. >> >> So AFAICS all use cases of git_mkstemps_mode can be served by >> mkstemp+fchmod or mkdtemp. Would that help? > > That seems sensible, as a further practical suggestion it seems like > we'll retry around 16k times to create these files on failure, perhaps > trying with a custom extension 8k times would be OK, followed by the > minor UI degradation of doing the final 8k retries with the more-random > OS-provided extension-less variant. You can't use the suffix-less mkstemp if a suffix is necessary. Retries would be handled by mkstemp and mkdtemp IIUC. To an extent. E.g. https://cgit.freebsd.org/src/tree/lib/libc/stdio/mktemp.c seems to just count up, which doesn't help if an attacker guessed the first name correctly. So maybe some kind of EEXIST loop is still necessary. > More generally I'm still not sure if this is still a purely hypothetical > attack mitigation, or whether there are actually users out there that > have to deal with this. > > Wouldn't something like this also be an acceptable "solution" to this > class of problem? > > #define TMP_MAX 1024 > [...] > if (count >= TMP_MAX) > die(_("we tried and failed to create a tempfile using the '%s' template after %d tries!\n\n" > "Is someone actively DoS you? Is your sysadmin known to be your mortal enemy?\n" > "are you using Satan's shared hosting services? In any case, we give up!\n\n" > "To \"retry\" set TEMPDIR to some location where other users won't race us to death"), > template, count); You mean users should be educated to stay away from shared temporary directories on multi-user systems? Good advice. I'm also not sure how relevant it is these days -- doesn't everybody get their own VM these days? :) Anyway, there are probably some who cannot follow it, e.g. because their $HOME quota is exhausted. > For a bonus grade we could add a few more lines and try to stat() some > of the files we failed to create, and report what UID it is that's > racing you for all of these tempfile creations. Sounds fun, can enliven the office. Once the fisticuffs are over -- PLOT TWIST! Turns out the RNG handed out the same "random" numbers to everyone. ;) > >>> So instead of: >>> >>> A. Add CSPRNG with demo test helper >>> B. Use it in git_mkstemps_mode() >>> >>> I'd think we'd be much better off with: >>> >>> A. Split out callers of tempfile.c and mk.*temp in wrapper.c that create tempfiles in .git >>> B. <the rest> >>> >>> I honestly haven't looked too much at what <the rest> is, other than >>> what I wrote in [1], which seems to suggest that most of our codepaths >>> won't need this. >>> >>> I'd also think that given the reference to CSPRNG in e.g. some glibc >>> versions that instead of the ifdefs in csprng_bytes() we should instead >>> directly use a secure mkstemp() (or similar) for the not-.git cases that >>> remain after the "mktemp in a dir we chown" v.s. "mktemp in shared /tmp" >>> are split up. >>> >>> Maybe these are all things you looked at and considered, but from my >>> recollection (I didn't go back and re-read the whole discussion) you >>> didn't chime in on this point, so *bump* :) >>> >>> 1. https://lore.kernel.org/git/211116.864k8bq0xm.gmgdl@evledraar.gmail.com/ >
René Scharfe <l.s.r@web.de> writes: >> This series really feels like it's adding too much complexity and >> potential auditing headaches (distributors worrying about us shipping a >> CSPRNG, having to audit it) to a low-level codepath that most of the >> time won't need this at all. > > Good point. Please let me think out loud for a moment. Yeah, I agree you and Ævar that the topic may be over-engineering the solution for problem that we shouldn't be the ones who solve. I agree with your analysis that the "diff" tempfiles do need suffix, we SHOULD create them in $TMPDIR (not in the working tree or $GIT_DIR) to support operation in a read-only repository, but we can create a unique temporary directory and place a file (even under its original name) in it as a workaround. Thanks.
On 2022-01-18 at 09:24:58, Ævar Arnfjörð Bjarmason wrote: > I had a comment on v1[1] of this series which still applies here, > i.e. the "if we're writing into TMPDIR[...]" here elides the fact that > much of the time we're writing a tempfile into .git, so the security > issue ostensibly being solved here won't be a practical issue at all. > > Then for out-of-repo tempfiles some OS's have a per-user $TEMPDIR you > can use (e.g. systemd's /run/user/`id -u`). Finally... Some OSes do have that, but it is not universal and we can't rely on environment variables being set. They are stripped out by some programs, including Homebrew, even on OSes where they're provided. /run/user is also not a suitable temporary directory on Linux. Temporary files can be large, and /run is almost always a tmpfs, which means it sits entirely in memory. Setting anything in /run as TMPDIR is a mistake. > ...we already have in-tree users of mkstemp(), which on glibc ostensibly > tries to solve the same security issues you note here, and the > reftable/* user has been added since earlier iterations of this series: > o > $ git grep -E '\bmkstemp\(' -- '*.[ch]' > compat/mingw.c:int mkstemp(char *template) > compat/mingw.h:int mkstemp(char *template); > entry.c: return mkstemp(path); > reftable/stack.c: tab_fd = mkstemp(temp_tab_file_name.buf); > reftable/stack.c: tab_fd = mkstemp(temp_tab->buf); > reftable/stack_test.c: int fd = mkstemp(fn); > wrapper.c: fd = mkstemp(filename_template); > > This series really feels like it's adding too much complexity and > potential auditing headaches (distributors worrying about us shipping a > CSPRNG, having to audit it) to a low-level codepath that most of the > time won't need this at all. Every Rust program or Go program includes code to call a CSPRNG because it's required to avoid hash collision DoS attacks. I do plan to look into that in the future, because my guess right now is that we are in fact vulnerable to DoS attacks if someone creates crafted object IDs.[0] When I do that, I'll need this code. I also don't think adding code for a CSPRNG is an auditing problem. We use the system CSPRNG, which is the thing that literally everybody should be doing if they need good quality random numbers. If we were shipping a custom CSPRNG, then that would be an auditing problem, but I am explicitly not doing that because it's not necessary here and I'm willing to insist that the system provide one somewhere so we don't have to. > So instead of: > > A. Add CSPRNG with demo test helper > B. Use it in git_mkstemps_mode() > > I'd think we'd be much better off with: > > A. Split out callers of tempfile.c and mk.*temp in wrapper.c that create tempfiles in .git > B. <the rest> > > I honestly haven't looked too much at what <the rest> is, other than > what I wrote in [1], which seems to suggest that most of our codepaths > won't need this. > > I'd also think that given the reference to CSPRNG in e.g. some glibc > versions that instead of the ifdefs in csprng_bytes() we should instead > directly use a secure mkstemp() (or similar) for the not-.git cases that > remain after the "mktemp in a dir we chown" v.s. "mktemp in shared /tmp" > are split up. > > Maybe these are all things you looked at and considered, but from my > recollection (I didn't go back and re-read the whole discussion) you > didn't chime in on this point, so *bump* :) I did explain it in the cover letter for v2, along with the explanation in the paragraph above. The situation is that mkstemp doesn't handle all our use cases, and Randall said in the followups to v1 that mkstemp is not available on NonStop. I therefore must conclude that we'll need to implement this somewhere, even if only as a fallback. If we think mkstemp is going to work here and we don't need this, then I'll drop this series. However, I am annoyed that I'm getting conflicting information about what code is portable on different platforms, which is made especially difficult by trying to support Unix systems that don't support a 21-year-old standard and which lack suitable public documentation. If I write and polish a series based on a set of information I've been given and then it turns out that we decide to do something completely different based on conflicting information, that's not a motivator to continue sending patches. This will not be the first time I've dropped a series after several rounds of review because we totally decided to change course and do something different, and I don't want to repeat this again. [0] This type of attack is extremely trivial because the number of collisions necessary for this attack is usually on the order of 2^16, which means the work can be done in a couple seconds on a typical system. I have a proof of concept of this for PHP online.
Am 18.01.22 um 18:25 schrieb Junio C Hamano: > I agree with your analysis that the "diff" tempfiles do need suffix, > we SHOULD create them in $TMPDIR (not in the working tree or > $GIT_DIR) to support operation in a read-only repository, but we can > create a unique temporary directory and place a file (even under its > original name) in it as a workaround. Here's a proof-of-concept patch for handling just that diff use-case using mkdtemp. Indeed it's nice to see the actual filename with difftool. diff.c | 4 ++-- t/t4020-diff-external.sh | 2 +- tempfile.c | 20 ++++++++++++++++++++ tempfile.h | 1 + wrapper.c | 12 ++++++++++++ 5 files changed, 36 insertions(+), 3 deletions(-) diff --git a/diff.c b/diff.c index c862771a58..37db4743b0 100644 --- a/diff.c +++ b/diff.c @@ -4098,8 +4098,8 @@ static void prep_temp_blob(struct index_state *istate, init_checkout_metadata(&meta, NULL, NULL, oid); - /* Generate "XXXXXX_basename.ext" */ - strbuf_addstr(&tempfile, "XXXXXX_"); + /* Generate "git-blob-XXXXXX/basename.ext" */ + strbuf_addstr(&tempfile, "git-blob-XXXXXX/"); strbuf_addstr(&tempfile, base); temp->tempfile = mks_tempfile_ts(tempfile.buf, strlen(base) + 1); diff --git a/t/t4020-diff-external.sh b/t/t4020-diff-external.sh index 54bb8ef27e..e7f93f36f5 100755 --- a/t/t4020-diff-external.sh +++ b/t/t4020-diff-external.sh @@ -217,7 +217,7 @@ test_expect_success 'GIT_EXTERNAL_DIFF generates pretty paths' ' touch file.ext && git add file.ext && echo with extension > file.ext && - GIT_EXTERNAL_DIFF=echo git diff file.ext | grep ......_file\.ext && + GIT_EXTERNAL_DIFF=echo git diff file.ext | grep git-blob-....../file\.ext && git update-index --force-remove file.ext && rm file.ext ' diff --git a/tempfile.c b/tempfile.c index 94aa18f3f7..80cc9fb512 100644 --- a/tempfile.c +++ b/tempfile.c @@ -56,6 +56,21 @@ static VOLATILE_LIST_HEAD(tempfile_list); +static void remove_template_directory(struct tempfile *tempfile, + int in_signal_handler) +{ + char *end = tempfile->filename.buf + tempfile->filename.len; + char *suffix = end - tempfile->suffixlen; + if (*suffix != '/') + return; + *suffix = '\0'; + if (in_signal_handler) + rmdir(tempfile->filename.buf); + else + rmdir_or_warn(tempfile->filename.buf); + *suffix = '/'; +} + static void remove_tempfiles(int in_signal_handler) { pid_t me = getpid(); @@ -74,6 +89,7 @@ static void remove_tempfiles(int in_signal_handler) unlink(p->filename.buf); else unlink_or_warn(p->filename.buf); + remove_template_directory(p, in_signal_handler); p->active = 0; } @@ -100,6 +116,7 @@ static struct tempfile *new_tempfile(void) tempfile->owner = 0; INIT_LIST_HEAD(&tempfile->list); strbuf_init(&tempfile->filename, 0); + tempfile->suffixlen = 0; return tempfile; } @@ -170,6 +187,7 @@ struct tempfile *mks_tempfile_sm(const char *filename_template, int suffixlen, i struct tempfile *tempfile = new_tempfile(); strbuf_add_absolute_path(&tempfile->filename, filename_template); + tempfile->suffixlen = suffixlen; tempfile->fd = git_mkstemps_mode(tempfile->filename.buf, suffixlen, mode); if (tempfile->fd < 0) { deactivate_tempfile(tempfile); @@ -189,6 +207,7 @@ struct tempfile *mks_tempfile_tsm(const char *filename_template, int suffixlen, tmpdir = "/tmp"; strbuf_addf(&tempfile->filename, "%s/%s", tmpdir, filename_template); + tempfile->suffixlen = suffixlen; tempfile->fd = git_mkstemps_mode(tempfile->filename.buf, suffixlen, mode); if (tempfile->fd < 0) { deactivate_tempfile(tempfile); @@ -316,6 +335,7 @@ void delete_tempfile(struct tempfile **tempfile_p) close_tempfile_gently(tempfile); unlink_or_warn(tempfile->filename.buf); + remove_template_directory(tempfile, 0); deactivate_tempfile(tempfile); *tempfile_p = NULL; } diff --git a/tempfile.h b/tempfile.h index 4de3bc77d2..b9a60f2431 100644 --- a/tempfile.h +++ b/tempfile.h @@ -82,6 +82,7 @@ struct tempfile { FILE *volatile fp; volatile pid_t owner; struct strbuf filename; + size_t suffixlen; }; /* diff --git a/wrapper.c b/wrapper.c index 36e12119d7..358db282f9 100644 --- a/wrapper.c +++ b/wrapper.c @@ -481,6 +481,18 @@ int git_mkstemps_mode(char *pattern, int suffix_len, int mode) return -1; } + if (pattern[len - suffix_len] == '/') { + if (mode != 0600) { + errno = EINVAL; + return -1; + } + pattern[len - suffix_len] = '\0'; + if (!mkdtemp(pattern)) + return -1; + pattern[len - suffix_len] = '/'; + return open(pattern, O_CREAT | O_EXCL | O_RDWR, mode); + } + /* * Replace pattern's XXXXXX characters with randomness. * Try TMP_MAX different filenames.
Am 18.01.22 um 18:25 schrieb Junio C Hamano: > René Scharfe <l.s.r@web.de> writes: > >>> This series really feels like it's adding too much complexity and >>> potential auditing headaches (distributors worrying about us shipping a >>> CSPRNG, having to audit it) to a low-level codepath that most of the >>> time won't need this at all. >> >> Good point. Please let me think out loud for a moment. > > Yeah, I agree you and Ævar that the topic may be over-engineering > the solution for problem that we shouldn't be the ones who solve. > > I agree with your analysis that the "diff" tempfiles do need suffix, > we SHOULD create them in $TMPDIR (not in the working tree or > $GIT_DIR) to support operation in a read-only repository, but we can > create a unique temporary directory and place a file (even under its > original name) in it as a workaround. I forgot one crucial aspect, though: umask. The "m" variants of the tempfile functions set a mode, with umask applied. git_mkstemps_mode() does that by passing the mode to open(2), which applies the umask internally. Neither mkstemp(3) nor chmod(2) do that for us, so a replacement needs to call umask(2) to get it and again to restore it, which requires two system calls and is racy if multiple threads do this. Diff doesn't need a custom mode, so we can still use mkdtemp() there and thus make the suffix feature of git_mkstemps_mode() unnecessary. But a replacement for git_mkstemp_mode() with two umask(2) calls looks less attractive to me than fortifying git_mkstemps_mode() with a good source of randomness. René
René Scharfe <l.s.r@web.de> writes: > ... But > a replacement for git_mkstemp_mode() with two umask(2) calls looks less > attractive to me than fortifying git_mkstemps_mode() with a good source > of randomness. True. Also, it is not like we are supplying our own implementation of random source, but are just pluggig various system-supplied random source into our code, so I do not see the "auditatiblity" problem we heard earlier too much of an issue.
diff --git a/wrapper.c b/wrapper.c index 1052356703..3258cdb171 100644 --- a/wrapper.c +++ b/wrapper.c @@ -463,8 +463,6 @@ int git_mkstemps_mode(char *pattern, int suffix_len, int mode) static const int num_letters = ARRAY_SIZE(letters) - 1; static const char x_pattern[] = "XXXXXX"; static const int num_x = ARRAY_SIZE(x_pattern) - 1; - uint64_t value; - struct timeval tv; char *filename_template; size_t len; int fd, count; @@ -485,12 +483,13 @@ int git_mkstemps_mode(char *pattern, int suffix_len, int mode) * Replace pattern's XXXXXX characters with randomness. * Try TMP_MAX different filenames. */ - gettimeofday(&tv, NULL); - value = ((uint64_t)tv.tv_usec << 16) ^ tv.tv_sec ^ getpid(); filename_template = &pattern[len - num_x - suffix_len]; for (count = 0; count < TMP_MAX; ++count) { - uint64_t v = value; int i; + uint64_t v; + if (csprng_bytes(&v, sizeof(v)) < 0) + return error_errno("unable to get random bytes for temporary file"); + /* Fill in the random bits. */ for (i = 0; i < num_x; i++) { filename_template[i] = letters[v % num_letters]; @@ -506,12 +505,6 @@ int git_mkstemps_mode(char *pattern, int suffix_len, int mode) */ if (errno != EEXIST) break; - /* - * This is a random value. It is only necessary that - * the next TMP_MAX values generated by adding 7777 to - * VALUE are different with (module 2^32). - */ - value += 7777; } /* We return the null string if we can't find a unique file name. */ pattern[0] = '\0';
The current way we generate random file names is by taking the seconds and microseconds, plus the PID, and mixing them together, then encoding them. If this fails, we increment the value by 7777, and try again up to TMP_MAX times. Unfortunately, this is not the best idea from a security perspective. If we're writing into TMPDIR, an attacker can guess these values easily and prevent us from creating any temporary files at all by creating them all first. Even though we set TMP_MAX to 16384, this may be achievable in some contexts, even if unlikely to occur in practice. Fortunately, we can simply solve this by using the system cryptographically secure pseudorandom number generator (CSPRNG) to generate a random 64-bit value, and use that as before. Note that there is still a small bias here, but because a six-character sequence chosen out of 62 characters provides about 36 bits of entropy, the bias here is less than 2^-28, which is acceptable, especially considering we'll retry several times. Note that the use of a CSPRNG in generating temporary file names is also used in many libcs. glibc recently changed from an approach similar to ours to using a CSPRNG, and FreeBSD and OpenBSD also use a CSPRNG in this case. Even if the likelihood of an attack is low, we should still be at least as responsible in creating temporary files as libc is. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> --- wrapper.c | 15 ++++----------- 1 file changed, 4 insertions(+), 11 deletions(-)