Message ID | pull.884.git.1614177117508.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 6347d649bcddf531f82d400103e23d99ea8f2fd4 |
Headers | show |
Series | dir: fix malloc of root untracked_cache_dir | expand |
On Wed, Feb 24, 2021 at 02:31:57PM +0000, Jeff Hostetler via GitGitGadget wrote: > From: Jeff Hostetler <jeffhost@microsoft.com> > > Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir` > for the root directory. Get rid of unsafe code that might fail to > initialize the `name` field (if FLEX_ARRAY is not 1). This will > make it clear that we intend to have a structure with an empty > string following it. I had to read this patch message a few times over to make sure that I wasn't missing something in the patch itself, but it looks all good to me. Reviewed-by: Taylor Blau <me@ttaylorr.com> Thanks, Taylor
"Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Jeff Hostetler <jeffhost@microsoft.com> > > Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir` > for the root directory. Get rid of unsafe code that might fail to > initialize the `name` field (if FLEX_ARRAY is not 1). This will > make it clear that we intend to have a structure with an empty > string following it. > > A problem was observed on Windows where the length of the memset() was > too short, so the first byte of the name field was not zeroed. This > resulted in the name field having garbage from a previous use of that > area of memory. > > The record for the root directory was then written to the untracked-cache > extension in the index. This garbage would then be visible to future > commands when they reloaded the untracked-cache extension. > > Since the directory record for the root directory had garbage in the > `name` field, the `t/helper/test-tool dump-untracked-cache` tool > printed this garbage as the path prefix (rather than '/') for each > directory in the untracked cache as it recursed. > > Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> > --- > dir: fix malloc of root untracked_cache_dir Nicely spotted. The problematic code was introduced in 2015, a year before these FLEX_ALLOC_*() helpers were introduced. The result is of course correct and much easier to read, as the necessary "ask for a region of calloc'ed memory with an additional byte for terminating NUL beyond strlen()" is hidden in the helper. Will queue; thanks. > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-884%2Fjeffhostetler%2Funtracked-cache-corruption-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-884/jeffhostetler/untracked-cache-corruption-v1 > Pull-Request: https://github.com/gitgitgadget/git/pull/884 > > dir.c | 7 ++----- > 1 file changed, 2 insertions(+), 5 deletions(-) > > diff --git a/dir.c b/dir.c > index d153a63bbd14..fd8aa7c40faa 100644 > --- a/dir.c > +++ b/dir.c > @@ -2730,11 +2730,8 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d > return NULL; > } > > - if (!dir->untracked->root) { > - const int len = sizeof(*dir->untracked->root); > - dir->untracked->root = xmalloc(len); > - memset(dir->untracked->root, 0, len); > - } > + if (!dir->untracked->root) > + FLEX_ALLOC_STR(dir->untracked->root, name, ""); > > /* Validate $GIT_DIR/info/exclude and core.excludesfile */ > root = dir->untracked->root; > > base-commit: 966e671106b2fd38301e7c344c754fd118d0bb07
On Wed, Feb 24, 2021 at 12:08:42PM -0800, Junio C Hamano wrote: > > Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir` > > for the root directory. Get rid of unsafe code that might fail to > > initialize the `name` field (if FLEX_ARRAY is not 1). This will > > make it clear that we intend to have a structure with an empty > > string following it. > [...] > The problematic code was introduced in 2015, a year before these > FLEX_ALLOC_*() helpers were introduced. The result is of course > correct and much easier to read, as the necessary "ask for a region > of calloc'ed memory with an additional byte for terminating NUL > beyond strlen()" is hidden in the helper. When I added the FLEX_ALLOC_* helpers, I audited for existing callers to convert. But I did so by looking for places where we were doing manual size computations. The bug here was that it was not doing any computation at all (when it need to be doing "+1"). So that's my guess why it got overlooked (which is not super important, but may give a hint about how to look for similar bugs). -Peff
On 2/24/21 4:05 PM, Jeff King wrote: > On Wed, Feb 24, 2021 at 12:08:42PM -0800, Junio C Hamano wrote: > >>> Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir` >>> for the root directory. Get rid of unsafe code that might fail to >>> initialize the `name` field (if FLEX_ARRAY is not 1). This will >>> make it clear that we intend to have a structure with an empty >>> string following it. >> [...] >> The problematic code was introduced in 2015, a year before these >> FLEX_ALLOC_*() helpers were introduced. The result is of course >> correct and much easier to read, as the necessary "ask for a region >> of calloc'ed memory with an additional byte for terminating NUL >> beyond strlen()" is hidden in the helper. > > When I added the FLEX_ALLOC_* helpers, I audited for existing callers to > convert. But I did so by looking for places where we were doing manual > size computations. The bug here was that it was not doing any > computation at all (when it need to be doing "+1"). So that's my guess > why it got overlooked (which is not super important, but may give a hint > about how to look for similar bugs). There's another call to xmalloc() at [1] that does the st_add3() thing that I didn't change. It was correctly computing the size so I didn't want to change it for no reason. That and the 2 memcpy()s made it feel like we should have a different FLEX_ macro for this case -- more of a FLEX_DUP... or something. But I didn't want to distract from my bug fix here. [1] https://github.com/git/git/blob/master/dir.c#L3290 Jeff
On Wed, Feb 24, 2021 at 04:15:11PM -0500, Jeff Hostetler wrote: > > When I added the FLEX_ALLOC_* helpers, I audited for existing callers to > > convert. But I did so by looking for places where we were doing manual > > size computations. The bug here was that it was not doing any > > computation at all (when it need to be doing "+1"). So that's my guess > > why it got overlooked (which is not super important, but may give a hint > > about how to look for similar bugs). > > There's another call to xmalloc() at [1] that does the st_add3() > thing that I didn't change. It was correctly computing the size > so I didn't want to change it for no reason. That and the 2 memcpy()s > made it feel like we should have a different FLEX_ macro for this > case -- more of a FLEX_DUP... or something. But I didn't want to > distract from my bug fix here. Thanks for pointing it out. Even if we change it, I agree it's better to keep it out of your existing bugfix patch. But I do think that spot gets weird. We are copying all of the elements of the struct _except_ the name field, which comes from somewhere else. So it's tempting to do: diff --git a/dir.c b/dir.c index d153a63bbd..ab16db3f76 100644 --- a/dir.c +++ b/dir.c @@ -3287,9 +3287,9 @@ static int read_one_dir(struct untracked_cache_dir **untracked_, if (!eos || eos == end) return -1; - *untracked_ = untracked = xmalloc(st_add3(sizeof(*untracked), eos - data, 1)); + FLEX_ALLOC_MEM(untracked, name, data, eos - data); memcpy(untracked, &ud, sizeof(ud)); - memcpy(untracked->name, data, eos - data + 1); + *untracked_ = untracked; data = eos + 1; for (i = 0; i < untracked->untracked_nr; i++) { But that's wrong! The remaining memcpy using sizeof(ud) might or might not touch the first byte of the name field, depending on struct padding and the value of FLEX_ARRAY. And if it does, then we'd be overwriting the early bytes of that field with whatever was in "ud" (which I guess is uninitialized cruft, because that is a stack variable). So it would have to be: memcpy(untracked, &ud, offsetof(struct untracked_cache_dir, name)); Quite subtle. Even with a comment, I think I prefer the existing code. If we had a macro as you suggest for "allocate a flex struct, dup most of the contents from this other struct, and then copy in the flex bytes from this other spot", that would make it more readable. But I'm not sure we'd find more than one spot to use such a macro. :) -Peff
diff --git a/dir.c b/dir.c index d153a63bbd14..fd8aa7c40faa 100644 --- a/dir.c +++ b/dir.c @@ -2730,11 +2730,8 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d return NULL; } - if (!dir->untracked->root) { - const int len = sizeof(*dir->untracked->root); - dir->untracked->root = xmalloc(len); - memset(dir->untracked->root, 0, len); - } + if (!dir->untracked->root) + FLEX_ALLOC_STR(dir->untracked->root, name, ""); /* Validate $GIT_DIR/info/exclude and core.excludesfile */ root = dir->untracked->root;