Message ID | pull.1652.git.1707153705840.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | commit.c: ensure strchrnul() doesn't scan beyond range | expand |
Am 05.02.24 um 18:21 schrieb Chandra Pratap via GitGitGadget: > From: Chandra Pratap <chandrapratap3519@gmail.com> > > Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> > --- > commit.c: ensure strchrnul() doesn't scan beyond range > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1652%2FChand-ra%2Fstrchrnul-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1652/Chand-ra/strchrnul-v1 > Pull-Request: https://github.com/gitgitgadget/git/pull/1652 > > commit.c | 8 +------- > 1 file changed, 1 insertion(+), 7 deletions(-) > > diff --git a/commit.c b/commit.c > index ef679a0b939..a65b8e92e94 100644 > --- a/commit.c > +++ b/commit.c > @@ -1743,15 +1743,9 @@ const char *find_header_mem(const char *msg, size_t len, > int key_len = strlen(key); > const char *line = msg; > > - /* > - * NEEDSWORK: It's possible for strchrnul() to scan beyond the range > - * given by len. However, current callers are safe because they compute > - * len by scanning a NUL-terminated block of memory starting at msg. > - * Nonetheless, it would be better to ensure the function does not look > - * at msg beyond the len provided by the caller. > - */ > while (line && line < msg + len) { > const char *eol = strchrnul(line, '\n'); > + assert(eol - line <= len); Something like this might work in Verse, but C is more simple-minded. You can't undo an out-of-bounds access after the fact, and assert() would be compiled out if the code is built with NDEBUG anyway. If you want to make the code work with buffers that lack a terminating NUL then you need to replace the strchrnul() call with something that respects buffer lengths. You could e.g. call memchr(). Don't forget to check for NUL to preserve the original behavior. Or you could roll your own custom replacement, perhaps like this: char *strnchrnul(const char *s, int c, size_t len) { while (len-- && *s && *s != c) s++; return (char *)s; } A test with the new unit-test framework would be nice. It should be possible to show that the current code runs over the passed len, without causing undefined behavior. E.g. find_header_mem("foo bar", 2, "foo", &len) is safe, but returns "bar" instead of NULL. > > if (line == eol) > return NULL; > > base-commit: a54a84b333adbecf7bc4483c0e36ed5878cac17b
On Mon, Feb 5, 2024 at 9:23 AM Chandra Pratap via GitGitGadget <gitgitgadget@gmail.com> wrote: > > From: Chandra Pratap <chandrapratap3519@gmail.com> > > Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> > --- > commit.c: ensure strchrnul() doesn't scan beyond range > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1652%2FChand-ra%2Fstrchrnul-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1652/Chand-ra/strchrnul-v1 > Pull-Request: https://github.com/gitgitgadget/git/pull/1652 > > commit.c | 8 +------- > 1 file changed, 1 insertion(+), 7 deletions(-) > > diff --git a/commit.c b/commit.c > index ef679a0b939..a65b8e92e94 100644 > --- a/commit.c > +++ b/commit.c > @@ -1743,15 +1743,9 @@ const char *find_header_mem(const char *msg, size_t len, > int key_len = strlen(key); > const char *line = msg; > > - /* > - * NEEDSWORK: It's possible for strchrnul() to scan beyond the range > - * given by len. However, current callers are safe because they compute > - * len by scanning a NUL-terminated block of memory starting at msg. > - * Nonetheless, it would be better to ensure the function does not look > - * at msg beyond the len provided by the caller. > - */ > while (line && line < msg + len) { > const char *eol = strchrnul(line, '\n'); > + assert(eol - line <= len); I don't think this is sufficient to address the NEEDSWORK. `assert` is only active in debug builds, and strchrnul would have already potentially exceeded the bounds of its memory by the time this check is happening. We'd need a safe version of strchrnul that took the maximum length and never exceeded it. > > if (line == eol) > return NULL; > > base-commit: a54a84b333adbecf7bc4483c0e36ed5878cac17b > -- > gitgitgadget >
René Scharfe <l.s.r@web.de> writes: >> while (line && line < msg + len) { >> const char *eol = strchrnul(line, '\n'); >> + assert(eol - line <= len); > > Something like this might work in Verse, but C is more simple-minded. > You can't undo an out-of-bounds access after the fact, and assert() > would be compiled out if the code is built with NDEBUG anyway. Good comments. Thanks.
On Mon, Feb 05, 2024 at 08:57:46PM +0100, René Scharfe wrote: > If you want to make the code work with buffers that lack a terminating > NUL then you need to replace the strchrnul() call with something that > respects buffer lengths. You could e.g. call memchr(). Don't forget > to check for NUL to preserve the original behavior. Or you could roll > your own custom replacement, perhaps like this: I'm not sure it is worth retaining the check for NUL. The original function added by me in fe6eb7f2c5 (commit: provide a function to find a header in a buffer, 2014-08-27) just took a NUL-terminated string, so we certainly were not expecting embedded NULs. In cfc5cf428b (receive-pack.c: consolidate find header logic, 2022-01-06) we switched to taking the "len" parameter, but the new caller just passes strlen(msg) anyway. I guess you could argue that before that commit, receive-pack.c's find_header() which took a length was buggy to use strchrnul(). It gets fed with a push-cert buffer. I guess it's possible for there to be an embedded NUL there, but in practice there shouldn't be. If we are thinking of malformed or malicious input, it's not clear which behavior (finding or not finding a header past a NUL) is more harmful. So all things being equal, I would try to reduce the number of special cases here by not worrying about NULs. (Though if somebody really wants to dig, it's possible there's a clever dual-parser attack here where "\nfoo\0bar baz" finds the header "bar baz" in one parser but not in another). -Peff
Am 08.02.24 um 02:00 schrieb Jeff King: > On Mon, Feb 05, 2024 at 08:57:46PM +0100, René Scharfe wrote: > >> If you want to make the code work with buffers that lack a terminating >> NUL then you need to replace the strchrnul() call with something that >> respects buffer lengths. You could e.g. call memchr(). Don't forget >> to check for NUL to preserve the original behavior. Or you could roll >> your own custom replacement, perhaps like this: > > I'm not sure it is worth retaining the check for NUL. The original > function added by me in fe6eb7f2c5 (commit: provide a function to find a > header in a buffer, 2014-08-27) just took a NUL-terminated string, so > we certainly were not expecting embedded NULs. > > In cfc5cf428b (receive-pack.c: consolidate find header logic, > 2022-01-06) we switched to taking the "len" parameter, but the new > caller just passes strlen(msg) anyway. > > I guess you could argue that before that commit, receive-pack.c's > find_header() which took a length was buggy to use strchrnul(). It gets > fed with a push-cert buffer. I guess it's possible for there to be an > embedded NUL there, but in practice there shouldn't be. If we are > thinking of malformed or malicious input, it's not clear which behavior > (finding or not finding a header past a NUL) is more harmful. So all > things being equal, I would try to reduce the number of special cases > here by not worrying about NULs. > > (Though if somebody really wants to dig, it's possible there's a clever > dual-parser attack here where "\nfoo\0bar baz" finds the header "bar > baz" in one parser but not in another). Good point. A _mem function shouldn't worry about NULs. Its callers are responsible for that -- if necessary. No idea what an attacker could do with nonce and push-option headers with varying visibility. Version detection? Something worse? But anyway: If NULs are of no concern and we currently end parsing when we see one in all cases, why do we need a _mem function at all? The original version of the function, find_commit_header(), should suffice. check_nonce() could be run against the NUL-terminated sigcheck.payload and check_cert_push_options() parses an entire strbuf, so there is no risk of out-of-bounds access. René
René Scharfe <l.s.r@web.de> writes: > But anyway: If NULs are of no concern and we currently end parsing when > we see one in all cases, why do we need a _mem function at all? The > original version of the function, find_commit_header(), should suffice. > check_nonce() could be run against the NUL-terminated sigcheck.payload > and check_cert_push_options() parses an entire strbuf, so there is no > risk of out-of-bounds access. If I recall correctly, the caller that does not pass strlen() as the payload length gives a length that is shorter than the buffer, i.e. "stop the parsing here, do not get confused into thinking the garbage after this point contains useful payload" was the reason why we have a separate "len".
On Thu, Feb 8, 2024 at 11:48 AM Junio C Hamano <gitster@pobox.com> wrote: > > René Scharfe <l.s.r@web.de> writes: > > > But anyway: If NULs are of no concern and we currently end parsing when > > we see one in all cases, why do we need a _mem function at all? The > > original version of the function, find_commit_header(), should suffice. > > check_nonce() could be run against the NUL-terminated sigcheck.payload > > and check_cert_push_options() parses an entire strbuf, so there is no > > risk of out-of-bounds access. > > If I recall correctly, the caller that does not pass strlen() as the > payload length gives a length that is shorter than the buffer, i.e. > "stop the parsing here, do not get confused into thinking the > garbage after this point contains useful payload" was the reason why > we have a separate "len". > I just rediscovered that. I think this probably should be something that caller (check_nonce) implements, then. Having a _mem function implies to me (though I'm very new to this codebase) that it supports embedded NULs, but that's not what's happening here.
On Thu, Feb 08, 2024 at 11:48:05AM -0800, Junio C Hamano wrote: > René Scharfe <l.s.r@web.de> writes: > > > But anyway: If NULs are of no concern and we currently end parsing when > > we see one in all cases, why do we need a _mem function at all? The > > original version of the function, find_commit_header(), should suffice. > > check_nonce() could be run against the NUL-terminated sigcheck.payload > > and check_cert_push_options() parses an entire strbuf, so there is no > > risk of out-of-bounds access. > > If I recall correctly, the caller that does not pass strlen() as the > payload length gives a length that is shorter than the buffer, i.e. > "stop the parsing here, do not get confused into thinking the > garbage after this point contains useful payload" was the reason why > we have a separate "len". Yes, check_nonce() passes in a length limited by the start of the actual signature, as determined by parse_signed_buffer(). Though that generally comes after a blank line, which would also stop find_header() from parsing further. But more interestingly: even though we pass a buf/len pair to parse_signed_buffer(), it then calls get_format_by_sig() which takes only a NUL-terminated string. So: 1. It is not possible for the buf/len pair we pass to check_nonce() to contain a NUL. And thus there is no caller of find_header_mem() that can contain an embedded NUL. So switching from strchrnul() to just memchr() should be OK there. 2. That raises the question of whether parse_signed_buffer() has a similar walk-too-far problem. ;) The answer is no, because we feed it from a strbuf. But it's not a great pattern overall. -Peff
Jeff King <peff@peff.net> writes: > 1. It is not possible for the buf/len pair we pass to check_nonce() to > contain a NUL. And thus there is no caller of find_header_mem() > that can contain an embedded NUL. So switching from strchrnul() to > just memchr() should be OK there. Correct. > 2. That raises the question of whether parse_signed_buffer() has a > similar walk-too-far problem. ;) The answer is no, because we feed > it from a strbuf. But it's not a great pattern overall. True, too. Thanks.
diff --git a/commit.c b/commit.c index ef679a0b939..a65b8e92e94 100644 --- a/commit.c +++ b/commit.c @@ -1743,15 +1743,9 @@ const char *find_header_mem(const char *msg, size_t len, int key_len = strlen(key); const char *line = msg; - /* - * NEEDSWORK: It's possible for strchrnul() to scan beyond the range - * given by len. However, current callers are safe because they compute - * len by scanning a NUL-terminated block of memory starting at msg. - * Nonetheless, it would be better to ensure the function does not look - * at msg beyond the len provided by the caller. - */ while (line && line < msg + len) { const char *eol = strchrnul(line, '\n'); + assert(eol - line <= len); if (line == eol) return NULL;