Message ID | 20210929212347.1139666-1-rkovhaev@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | xfs: use kmem_cache_free() for kmem_cache objects | expand |
On Wed, Sep 29, 2021 at 02:23:47PM -0700, Rustam Kovhaev wrote: > For kmalloc() allocations SLOB prepends the blocks with a 4-byte header, > and it puts the size of the allocated blocks in that header. > Blocks allocated with kmem_cache_alloc() allocations do not have that > header. > > SLOB explodes when you allocate memory with kmem_cache_alloc() and then > try to free it with kfree() instead of kmem_cache_free(). > SLOB will assume that there is a header when there is none, read some > garbage to size variable and corrupt the adjacent objects, which > eventually leads to hang or panic. > > Let's make XFS work with SLOB by using proper free function. > > Fixes: 9749fee83f38 ("xfs: enable the xfs_defer mechanism to process extents to free") > Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com> IOWs, XFS has been broken on SLOB for over 5 years and nobody anywhere has noticed. And we've just had a discussion where the very best solution was to use kfree() on kmem_cache_alloc() objects so we didn't ahve to spend CPU doing global type table lookups or use an extra 8 bytes of memory per object to track the slab cache just so we could call kmem_cache_free() with the correct slab cache. But, of course, SLOB doesn't allow this and I was really tempted to solve that by adding a Kconfig "depends on SLAB|SLUB" option so that we don't have to care about SLOB not working. However, as it turns out that XFS on SLOB has already been broken for so long, maybe we should just not care about SLOB code and seriously consider just adding a specific dependency on SLAB|SLUB... Thoughts? Cheers, Dave.
On 9/30/21 06:42, Dave Chinner wrote: > On Wed, Sep 29, 2021 at 02:23:47PM -0700, Rustam Kovhaev wrote: >> For kmalloc() allocations SLOB prepends the blocks with a 4-byte header, >> and it puts the size of the allocated blocks in that header. >> Blocks allocated with kmem_cache_alloc() allocations do not have that >> header. >> >> SLOB explodes when you allocate memory with kmem_cache_alloc() and then >> try to free it with kfree() instead of kmem_cache_free(). >> SLOB will assume that there is a header when there is none, read some >> garbage to size variable and corrupt the adjacent objects, which >> eventually leads to hang or panic. >> >> Let's make XFS work with SLOB by using proper free function. >> >> Fixes: 9749fee83f38 ("xfs: enable the xfs_defer mechanism to process extents to free") >> Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com> > > IOWs, XFS has been broken on SLOB for over 5 years and nobody > anywhere has noticed. > > And we've just had a discussion where the very best solution was to > use kfree() on kmem_cache_alloc() objects so we didn't ahve to spend > CPU doing global type table lookups or use an extra 8 bytes of > memory per object to track the slab cache just so we could call > kmem_cache_free() with the correct slab cache. > > But, of course, SLOB doesn't allow this and I was really tempted to > solve that by adding a Kconfig "depends on SLAB|SLUB" option so that > we don't have to care about SLOB not working. > > However, as it turns out that XFS on SLOB has already been broken > for so long, maybe we should just not care about SLOB code and > seriously consider just adding a specific dependency on SLAB|SLUB... I think it's fair if something like XFS (not meant for tiny systems AFAIK?) excludes SLOB (meant for tiny systems). Clearly nobody tried to use these two together last 5 years anyway. Maybe we could also just add the 4 bytes to all SLOB objects, declare kfree() is always fine and be done with it. Yes, it will make SLOB footprint somewhat less tiny, but even whan we added kmalloc power of two alignment guarantees, the impact on SLOB was negligible. > Thoughts? > > Cheers, > > Dave. >
On Thu, Sep 30, 2021 at 10:13:40AM +0200, Vlastimil Babka wrote: > On 9/30/21 06:42, Dave Chinner wrote: > > On Wed, Sep 29, 2021 at 02:23:47PM -0700, Rustam Kovhaev wrote: > >> For kmalloc() allocations SLOB prepends the blocks with a 4-byte header, > >> and it puts the size of the allocated blocks in that header. > >> Blocks allocated with kmem_cache_alloc() allocations do not have that > >> header. > >> > >> SLOB explodes when you allocate memory with kmem_cache_alloc() and then > >> try to free it with kfree() instead of kmem_cache_free(). > >> SLOB will assume that there is a header when there is none, read some > >> garbage to size variable and corrupt the adjacent objects, which > >> eventually leads to hang or panic. > >> > >> Let's make XFS work with SLOB by using proper free function. > >> > >> Fixes: 9749fee83f38 ("xfs: enable the xfs_defer mechanism to process extents to free") > >> Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com> > > > > IOWs, XFS has been broken on SLOB for over 5 years and nobody > > anywhere has noticed. > > > > And we've just had a discussion where the very best solution was to > > use kfree() on kmem_cache_alloc() objects so we didn't ahve to spend > > CPU doing global type table lookups or use an extra 8 bytes of > > memory per object to track the slab cache just so we could call > > kmem_cache_free() with the correct slab cache. > > > > But, of course, SLOB doesn't allow this and I was really tempted to > > solve that by adding a Kconfig "depends on SLAB|SLUB" option so that > > we don't have to care about SLOB not working. > > > > However, as it turns out that XFS on SLOB has already been broken > > for so long, maybe we should just not care about SLOB code and > > seriously consider just adding a specific dependency on SLAB|SLUB... > > I think it's fair if something like XFS (not meant for tiny systems AFAIK?) > excludes SLOB (meant for tiny systems). Clearly nobody tried to use these > two together last 5 years anyway. +1 for adding Kconfig option, it seems like some things are not meant to be together. > Maybe we could also just add the 4 bytes to all SLOB objects, declare > kfree() is always fine and be done with it. Yes, it will make SLOB footprint > somewhat less tiny, but even whan we added kmalloc power of two alignment > guarantees, the impact on SLOB was negligible. I'll send a patch to add a 4-byte header for kmem_cache_alloc() allocations. > > Thoughts? > > > > Cheers, > > > > Dave. > > >
On 9/30/21 8:48 PM, Rustam Kovhaev wrote: > On Thu, Sep 30, 2021 at 10:13:40AM +0200, Vlastimil Babka wrote: >> >> I think it's fair if something like XFS (not meant for tiny systems AFAIK?) >> excludes SLOB (meant for tiny systems). Clearly nobody tried to use these >> two together last 5 years anyway. > > +1 for adding Kconfig option, it seems like some things are not meant to > be together. But if we patch SLOB, we won't need it. >> Maybe we could also just add the 4 bytes to all SLOB objects, declare >> kfree() is always fine and be done with it. Yes, it will make SLOB footprint >> somewhat less tiny, but even whan we added kmalloc power of two alignment >> guarantees, the impact on SLOB was negligible. > > I'll send a patch to add a 4-byte header for kmem_cache_alloc() > allocations. Thanks. Please report in the changelog slab usage from /proc/meminfo before and after patch (at least a snapshot after a full boot). >>> Thoughts? >>> >>> Cheers, >>> >>> Dave. >>> >>
On Thu, Sep 30, 2021 at 11:10:10PM +0200, Vlastimil Babka wrote: > On 9/30/21 8:48 PM, Rustam Kovhaev wrote: > > On Thu, Sep 30, 2021 at 10:13:40AM +0200, Vlastimil Babka wrote: > >> > >> I think it's fair if something like XFS (not meant for tiny systems AFAIK?) > >> excludes SLOB (meant for tiny systems). Clearly nobody tried to use these > >> two together last 5 years anyway. > > > > +1 for adding Kconfig option, it seems like some things are not meant to > > be together. > > But if we patch SLOB, we won't need it. OK, so we consider XFS on SLOB a supported configuration that might be used and should be tested. I'll look into maybe adding a config with CONFIG_SLOB and CONFIG_XFS_FS to syzbot. It seems that we need to patch SLOB anyway, because any other code can hit the very same issue. > >> Maybe we could also just add the 4 bytes to all SLOB objects, declare > >> kfree() is always fine and be done with it. Yes, it will make SLOB footprint > >> somewhat less tiny, but even whan we added kmalloc power of two alignment > >> guarantees, the impact on SLOB was negligible. > > > > I'll send a patch to add a 4-byte header for kmem_cache_alloc() > > allocations. > > Thanks. Please report in the changelog slab usage from /proc/meminfo > before and after patch (at least a snapshot after a full boot). OK.
On Thu, 30 Sep 2021, Rustam Kovhaev wrote: > > >> I think it's fair if something like XFS (not meant for tiny systems AFAIK?) > > >> excludes SLOB (meant for tiny systems). Clearly nobody tried to use these > > >> two together last 5 years anyway. > > > > > > +1 for adding Kconfig option, it seems like some things are not meant to > > > be together. > > > > But if we patch SLOB, we won't need it. > > OK, so we consider XFS on SLOB a supported configuration that might be > used and should be tested. > I'll look into maybe adding a config with CONFIG_SLOB and CONFIG_XFS_FS > to syzbot. > > It seems that we need to patch SLOB anyway, because any other code can > hit the very same issue. > It's probably best to introduce both (SLOB fix and Kconfig change for XFS), at least in the interim because the combo of XFS and SLOB could be broken in other ways. If syzbot doesn't complain with a patched kernel to allow SLOB to be used with XFS, then we could potentially allow them to be used together. (I'm not sure that this freeing issue is the *only* thing that is broken, nor that we have sufficient information to make that determination right now..)
On Sun, Oct 03, 2021 at 06:07:20PM -0700, David Rientjes wrote: > On Thu, 30 Sep 2021, Rustam Kovhaev wrote: > > > > >> I think it's fair if something like XFS (not meant for tiny systems AFAIK?) > > > >> excludes SLOB (meant for tiny systems). Clearly nobody tried to use these > > > >> two together last 5 years anyway. > > > > > > > > +1 for adding Kconfig option, it seems like some things are not meant to > > > > be together. > > > > > > But if we patch SLOB, we won't need it. > > > > OK, so we consider XFS on SLOB a supported configuration that might be > > used and should be tested. > > I'll look into maybe adding a config with CONFIG_SLOB and CONFIG_XFS_FS > > to syzbot. > > > > It seems that we need to patch SLOB anyway, because any other code can > > hit the very same issue. > > > > It's probably best to introduce both (SLOB fix and Kconfig change for > XFS), at least in the interim because the combo of XFS and SLOB could be > broken in other ways. If syzbot doesn't complain with a patched kernel to > allow SLOB to be used with XFS, then we could potentially allow them to be > used together. > > (I'm not sure that this freeing issue is the *only* thing that is broken, > nor that we have sufficient information to make that determination right > now..) I audited the entire xfs (kernel) codebase and didn't find any other usage errors. Thanks for the patch; I'll apply it to for-next. --D
On Tue, Oct 12, 2021 at 01:43:20PM -0700, Darrick J. Wong wrote: > On Sun, Oct 03, 2021 at 06:07:20PM -0700, David Rientjes wrote: > > On Thu, 30 Sep 2021, Rustam Kovhaev wrote: > > > > > > >> I think it's fair if something like XFS (not meant for tiny systems AFAIK?) > > > > >> excludes SLOB (meant for tiny systems). Clearly nobody tried to use these > > > > >> two together last 5 years anyway. > > > > > > > > > > +1 for adding Kconfig option, it seems like some things are not meant to > > > > > be together. > > > > > > > > But if we patch SLOB, we won't need it. > > > > > > OK, so we consider XFS on SLOB a supported configuration that might be > > > used and should be tested. > > > I'll look into maybe adding a config with CONFIG_SLOB and CONFIG_XFS_FS > > > to syzbot. > > > > > > It seems that we need to patch SLOB anyway, because any other code can > > > hit the very same issue. > > > > > > > It's probably best to introduce both (SLOB fix and Kconfig change for > > XFS), at least in the interim because the combo of XFS and SLOB could be > > broken in other ways. If syzbot doesn't complain with a patched kernel to > > allow SLOB to be used with XFS, then we could potentially allow them to be > > used together. > > > > (I'm not sure that this freeing issue is the *only* thing that is broken, > > nor that we have sufficient information to make that determination right > > now..) > > I audited the entire xfs (kernel) codebase and didn't find any other > usage errors. Thanks for the patch; I'll apply it to for-next. Also, the obligatory Reviewed-by: Darrick J. Wong <djwong@kernel.org> --D > > --D
On 10/12/2021 10:43 PM, Darrick J. Wong wrote: > On Tue, Oct 12, 2021 at 01:43:20PM -0700, Darrick J. Wong wrote: >> On Sun, Oct 03, 2021 at 06:07:20PM -0700, David Rientjes wrote: >>> On Thu, 30 Sep 2021, Rustam Kovhaev wrote: >>> >>>>>>> I think it's fair if something like XFS (not meant for tiny systems AFAIK?) >>>>>>> excludes SLOB (meant for tiny systems). Clearly nobody tried to use these >>>>>>> two together last 5 years anyway. >>>>>> >>>>>> +1 for adding Kconfig option, it seems like some things are not meant to >>>>>> be together. >>>>> >>>>> But if we patch SLOB, we won't need it. >>>> >>>> OK, so we consider XFS on SLOB a supported configuration that might be >>>> used and should be tested. >>>> I'll look into maybe adding a config with CONFIG_SLOB and CONFIG_XFS_FS >>>> to syzbot. >>>> >>>> It seems that we need to patch SLOB anyway, because any other code can >>>> hit the very same issue. >>>> >>> >>> It's probably best to introduce both (SLOB fix and Kconfig change for >>> XFS), at least in the interim because the combo of XFS and SLOB could be >>> broken in other ways. If syzbot doesn't complain with a patched kernel to >>> allow SLOB to be used with XFS, then we could potentially allow them to be >>> used together. >>> >>> (I'm not sure that this freeing issue is the *only* thing that is broken, >>> nor that we have sufficient information to make that determination right >>> now..) >> >> I audited the entire xfs (kernel) codebase and didn't find any other >> usage errors. Thanks for the patch; I'll apply it to for-next. Which patch, the one that started this thread and uses kmem_cache_free() instead of kfree()? I thought we said it's not the best way? > Also, the obligatory > > Reviewed-by: Darrick J. Wong <djwong@kernel.org> > > --D > >> >> --D
On Tue, Oct 12, 2021 at 11:32:25PM +0200, Vlastimil Babka wrote: > On 10/12/2021 10:43 PM, Darrick J. Wong wrote: > > On Tue, Oct 12, 2021 at 01:43:20PM -0700, Darrick J. Wong wrote: > >> On Sun, Oct 03, 2021 at 06:07:20PM -0700, David Rientjes wrote: > >>> On Thu, 30 Sep 2021, Rustam Kovhaev wrote: > >>> > >>>>>>> I think it's fair if something like XFS (not meant for tiny systems AFAIK?) > >>>>>>> excludes SLOB (meant for tiny systems). Clearly nobody tried to use these > >>>>>>> two together last 5 years anyway. > >>>>>> > >>>>>> +1 for adding Kconfig option, it seems like some things are not meant to > >>>>>> be together. > >>>>> > >>>>> But if we patch SLOB, we won't need it. > >>>> > >>>> OK, so we consider XFS on SLOB a supported configuration that might be > >>>> used and should be tested. > >>>> I'll look into maybe adding a config with CONFIG_SLOB and CONFIG_XFS_FS > >>>> to syzbot. > >>>> > >>>> It seems that we need to patch SLOB anyway, because any other code can > >>>> hit the very same issue. > >>>> > >>> > >>> It's probably best to introduce both (SLOB fix and Kconfig change for > >>> XFS), at least in the interim because the combo of XFS and SLOB could be > >>> broken in other ways. If syzbot doesn't complain with a patched kernel to > >>> allow SLOB to be used with XFS, then we could potentially allow them to be > >>> used together. > >>> > >>> (I'm not sure that this freeing issue is the *only* thing that is broken, > >>> nor that we have sufficient information to make that determination right > >>> now..) > >> > >> I audited the entire xfs (kernel) codebase and didn't find any other > >> usage errors. Thanks for the patch; I'll apply it to for-next. > > Which patch, the one that started this thread and uses kmem_cache_free() instead > of kfree()? I thought we said it's not the best way? It's probably better to fix slob to be able to tell that a kmem_free'd object actually belongs to a cache and should get freed that way, just like its larger sl[ua]b cousins. However, even if that does come to pass, anybody /else/ who wants to start(?) using XFS on a SLOB system will need this patch to fix the minor papercut. Now that I've checked the rest of the codebase, I don't find it reasonable to make XFS mutually exclusive with SLOB over two instances of slab cache misuse. Hence the RVB. :) --D > > Also, the obligatory > > > > Reviewed-by: Darrick J. Wong <djwong@kernel.org> > > > > --D > > > >> > >> --D >
On 10/13/21 01:22, Darrick J. Wong wrote: > On Tue, Oct 12, 2021 at 11:32:25PM +0200, Vlastimil Babka wrote: >> On 10/12/2021 10:43 PM, Darrick J. Wong wrote: >> > On Tue, Oct 12, 2021 at 01:43:20PM -0700, Darrick J. Wong wrote: >> >> On Sun, Oct 03, 2021 at 06:07:20PM -0700, David Rientjes wrote: >> >> >> >> I audited the entire xfs (kernel) codebase and didn't find any other >> >> usage errors. Thanks for the patch; I'll apply it to for-next. >> >> Which patch, the one that started this thread and uses kmem_cache_free() instead >> of kfree()? I thought we said it's not the best way? > > It's probably better to fix slob to be able to tell that a kmem_free'd > object actually belongs to a cache and should get freed that way, just > like its larger sl[ua]b cousins. Agreed. Rustam, do you still plan to do that? > However, even if that does come to pass, anybody /else/ who wants to > start(?) using XFS on a SLOB system will need this patch to fix the > minor papercut. Now that I've checked the rest of the codebase, I don't > find it reasonable to make XFS mutually exclusive with SLOB over two > instances of slab cache misuse. Hence the RVB. :) Ok. I was just wondering because Dave's first reply was that actually you'll need to expand the use of kfree() instead of kmem_cache_free().
On Wed, Oct 13, 2021 at 09:38:31AM +0200, Vlastimil Babka wrote: > On 10/13/21 01:22, Darrick J. Wong wrote: > > On Tue, Oct 12, 2021 at 11:32:25PM +0200, Vlastimil Babka wrote: > >> On 10/12/2021 10:43 PM, Darrick J. Wong wrote: > >> > On Tue, Oct 12, 2021 at 01:43:20PM -0700, Darrick J. Wong wrote: > >> >> On Sun, Oct 03, 2021 at 06:07:20PM -0700, David Rientjes wrote: > >> >> > >> >> I audited the entire xfs (kernel) codebase and didn't find any other > >> >> usage errors. Thanks for the patch; I'll apply it to for-next. > >> > >> Which patch, the one that started this thread and uses kmem_cache_free() instead > >> of kfree()? I thought we said it's not the best way? > > > > It's probably better to fix slob to be able to tell that a kmem_free'd > > object actually belongs to a cache and should get freed that way, just > > like its larger sl[ua]b cousins. > > Agreed. Rustam, do you still plan to do that? Yes, I do, thank you. > > > However, even if that does come to pass, anybody /else/ who wants to > > start(?) using XFS on a SLOB system will need this patch to fix the > > minor papercut. Now that I've checked the rest of the codebase, I don't > > find it reasonable to make XFS mutually exclusive with SLOB over two > > instances of slab cache misuse. Hence the RVB. :) > > Ok. I was just wondering because Dave's first reply was that actually you'll > need to expand the use of kfree() instead of kmem_cache_free(). >
On Wed, Oct 13, 2021 at 09:56:41AM -0700, Rustam Kovhaev wrote: > On Wed, Oct 13, 2021 at 09:38:31AM +0200, Vlastimil Babka wrote: > > On 10/13/21 01:22, Darrick J. Wong wrote: > > > On Tue, Oct 12, 2021 at 11:32:25PM +0200, Vlastimil Babka wrote: > > >> On 10/12/2021 10:43 PM, Darrick J. Wong wrote: > > >> > On Tue, Oct 12, 2021 at 01:43:20PM -0700, Darrick J. Wong wrote: > > >> >> On Sun, Oct 03, 2021 at 06:07:20PM -0700, David Rientjes wrote: > > >> >> > > >> >> I audited the entire xfs (kernel) codebase and didn't find any other > > >> >> usage errors. Thanks for the patch; I'll apply it to for-next. > > >> > > >> Which patch, the one that started this thread and uses kmem_cache_free() instead > > >> of kfree()? I thought we said it's not the best way? > > > > > > It's probably better to fix slob to be able to tell that a kmem_free'd > > > object actually belongs to a cache and should get freed that way, just > > > like its larger sl[ua]b cousins. > > > > Agreed. Rustam, do you still plan to do that? > > Yes, I do, thank you. Note that I left out the parts of the patch that changed mm/slob.c because I didn't think that was appropriate for a patch titled 'xfs:'. > > > > > > However, even if that does come to pass, anybody /else/ who wants to > > > start(?) using XFS on a SLOB system will need this patch to fix the > > > minor papercut. Now that I've checked the rest of the codebase, I don't > > > find it reasonable to make XFS mutually exclusive with SLOB over two > > > instances of slab cache misuse. Hence the RVB. :) > > > > Ok. I was just wondering because Dave's first reply was that actually you'll > > need to expand the use of kfree() instead of kmem_cache_free(). I look forward to doing this, but since XFS is a downstream consumer of the kmem apis, we'll have to wait until the slob changes land to do that. --D
diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c index 3f8a0713573a..a4b8caa2c601 100644 --- a/fs/xfs/xfs_extfree_item.c +++ b/fs/xfs/xfs_extfree_item.c @@ -482,7 +482,7 @@ xfs_extent_free_finish_item( free->xefi_startblock, free->xefi_blockcount, &free->xefi_oinfo, free->xefi_skip_discard); - kmem_free(free); + kmem_cache_free(xfs_bmap_free_item_zone, free); return error; } @@ -502,7 +502,7 @@ xfs_extent_free_cancel_item( struct xfs_extent_free_item *free; free = container_of(item, struct xfs_extent_free_item, xefi_list); - kmem_free(free); + kmem_cache_free(xfs_bmap_free_item_zone, free); } const struct xfs_defer_op_type xfs_extent_free_defer_type = { @@ -564,7 +564,7 @@ xfs_agfl_free_finish_item( extp->ext_len = free->xefi_blockcount; efdp->efd_next_extent++; - kmem_free(free); + kmem_cache_free(xfs_bmap_free_item_zone, free); return error; } diff --git a/mm/slob.c b/mm/slob.c index 74d3f6e60666..d2d859ded5f8 100644 --- a/mm/slob.c +++ b/mm/slob.c @@ -389,7 +389,6 @@ static void slob_free(void *block, int size) if (unlikely(ZERO_OR_NULL_PTR(block))) return; - BUG_ON(!size); sp = virt_to_page(block); units = SLOB_UNITS(size); @@ -556,6 +555,7 @@ void kfree(const void *block) if (PageSlab(sp)) { int align = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); unsigned int *m = (unsigned int *)(block - align); + BUG_ON(!*m || *m > (PAGE_SIZE - align)); slob_free(m, *m + align); } else { unsigned int order = compound_order(sp); @@ -649,8 +649,10 @@ EXPORT_SYMBOL(kmem_cache_alloc_node); static void __kmem_cache_free(void *b, int size) { - if (size < PAGE_SIZE) + if (size < PAGE_SIZE) { + BUG_ON(!size); slob_free(b, size); + } else slob_free_pages(b, get_order(size)); }
For kmalloc() allocations SLOB prepends the blocks with a 4-byte header, and it puts the size of the allocated blocks in that header. Blocks allocated with kmem_cache_alloc() allocations do not have that header. SLOB explodes when you allocate memory with kmem_cache_alloc() and then try to free it with kfree() instead of kmem_cache_free(). SLOB will assume that there is a header when there is none, read some garbage to size variable and corrupt the adjacent objects, which eventually leads to hang or panic. Let's make XFS work with SLOB by using proper free function. Fixes: 9749fee83f38 ("xfs: enable the xfs_defer mechanism to process extents to free") Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com> --- fs/xfs/xfs_extfree_item.c | 6 +++--- mm/slob.c | 6 ++++-- 2 files changed, 7 insertions(+), 5 deletions(-)