Message ID | 20220212071111.148575-1-ztong0001@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 2166a9974902d277cc03f15027d72c4d6ab2a256 |
Series | dax: make sure inodes are flushed before destroy cache |
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Fri, Feb 11, 2022 at 11:11:11PM -0800, Tong Zhang wrote:
> A bug can be triggered by following command
>
> $ modprobe nd_pmem && modprobe -r nd_pmem
>
> [ 10.060014] BUG dax_cache (Not tainted): Objects remaining in dax_cache on __kmem_cache_shutdown()
> [ 10.060938] Slab 0x0000000085b729ac objects=9 used=1 fp=0x000000004f5ae469 flags=0x200000000010200(slab|head|node)
> [ 10.062433] Call Trace:
> [ 10.062673] dump_stack_lvl+0x34/0x44
> [ 10.062865] slab_err+0x90/0xd0
> [ 10.063619] __kmem_cache_shutdown+0x13b/0x2f0
> [ 10.063848] kmem_cache_destroy+0x4a/0x110
> [ 10.064058] __x64_sys_delete_module+0x265/0x300
>
> This is caused by dax_fs_exit() not flushing inodes before destroy cache.
> To fix this issue, call rcu_barrier() before destroy cache.

I don't doubt that this fixes the bug. However, I can't help but think this is
hiding a bug, or perhaps a missing step, in the kmem_cache layer? As far as I
can see dax does not call call_rcu() and only uses srcu not rcu? I was tempted
to suggest srcu_barrier() but dax does not call call_srcu() either.

So I'm not clear about what is really going on and why this fixes it. I know
that dax is not using srcu in a standard way so perhaps this helps in a way I
don't quite grok? If so perhaps a comment here would be in order?

Ira

>
> Signed-off-by: Tong Zhang <ztong0001@gmail.com>
> ---
> drivers/dax/super.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index e3029389d809..6bd565fe2e63 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -476,6 +476,7 @@ static int dax_fs_init(void)
>  static void dax_fs_exit(void)
>  {
>  	kern_unmount(dax_mnt);
> +	rcu_barrier();
>  	kmem_cache_destroy(dax_cache);
>  }
>
> --
> 2.25.1
>
On Mon, Feb 14, 2022 at 9:59 AM Ira Weiny <ira.weiny@intel.com> wrote:
>
> On Fri, Feb 11, 2022 at 11:11:11PM -0800, Tong Zhang wrote:
> > A bug can be triggered by following command
> >
> > $ modprobe nd_pmem && modprobe -r nd_pmem
> >
> > [ 10.060014] BUG dax_cache (Not tainted): Objects remaining in dax_cache on __kmem_cache_shutdown()
> > [ 10.060938] Slab 0x0000000085b729ac objects=9 used=1 fp=0x000000004f5ae469 flags=0x200000000010200(slab|head|node)
> > [ 10.062433] Call Trace:
> > [ 10.062673] dump_stack_lvl+0x34/0x44
> > [ 10.062865] slab_err+0x90/0xd0
> > [ 10.063619] __kmem_cache_shutdown+0x13b/0x2f0
> > [ 10.063848] kmem_cache_destroy+0x4a/0x110
> > [ 10.064058] __x64_sys_delete_module+0x265/0x300
> >
> > This is caused by dax_fs_exit() not flushing inodes before destroy cache.
> > To fix this issue, call rcu_barrier() before destroy cache.
>
> I don't doubt that this fixes the bug. However, I can't help but think this is
> hiding a bug, or perhaps a missing step, in the kmem_cache layer? As far as I
> can see dax does not call call_rcu() and only uses srcu not rcu? I was tempted
> to suggest srcu_barrier() but dax does not call call_srcu() either.

This rcu_barrier() is associated with the call_rcu() in destroy_inode().
While kern_unmount() does a full synchronize_rcu() after clearing
->mnt_ns, any pending destroy_inode() callbacks still need to be flushed
before the kmem_cache is destroyed.

> So I'm not clear about what is really going on and why this fixes it. I know
> that dax is not using srcu in a standard way so perhaps this helps in a way I
> don't quite grok? If so perhaps a comment here would be in order?

Looks like a common pattern I missed that all filesystem exit paths implement.
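
The ordering described above can be illustrated with a small userspace toy model. This is only a sketch, not kernel code: every `toy_*` name is hypothetical, the deferred-free list merely stands in for callbacks queued by call_rcu(), and toy_barrier() stands in for rcu_barrier(). It shows why a cache can still report a live object at destroy time if the queued frees have not been run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy userspace model; all names are hypothetical, none are kernel APIs. */
struct toy_cache {
	int live_objects;		/* allocated and not yet freed */
};

struct toy_inode {
	struct toy_cache *cache;
	struct toy_inode *next_deferred;	/* link in the deferred-free list */
};

/* Stands in for callbacks queued by call_rcu() but not yet invoked. */
static struct toy_inode *deferred_head;

static struct toy_inode *toy_alloc_inode(struct toy_cache *c)
{
	struct toy_inode *inode = malloc(sizeof(*inode));

	if (!inode)
		return NULL;
	inode->cache = c;
	inode->next_deferred = NULL;
	c->live_objects++;
	return inode;
}

/* Models destroy_inode(): the free is queued, not performed right away. */
static void toy_destroy_inode(struct toy_inode *inode)
{
	inode->next_deferred = deferred_head;
	deferred_head = inode;
}

/* Models rcu_barrier(): run every queued free before returning. */
static void toy_barrier(void)
{
	while (deferred_head) {
		struct toy_inode *inode = deferred_head;

		deferred_head = inode->next_deferred;
		inode->cache->live_objects--;
		free(inode);
	}
}

/* Models kmem_cache_destroy(): returns the objects still live (0 = clean). */
static int toy_cache_destroy(const struct toy_cache *c)
{
	return c->live_objects;
}

/* Exit path without the barrier: the queued free has not run yet. */
static int toy_exit_without_barrier(void)
{
	struct toy_cache cache = { 0 };
	struct toy_inode *inode = toy_alloc_inode(&cache);
	int remaining;

	if (!inode)
		return -1;
	toy_destroy_inode(inode);
	remaining = toy_cache_destroy(&cache);
	toy_barrier();		/* only so the demo itself does not leak */
	return remaining;
}

/* Exit path with the barrier placed before the cache is destroyed. */
static int toy_exit_with_barrier(void)
{
	struct toy_cache cache = { 0 };
	struct toy_inode *inode = toy_alloc_inode(&cache);

	if (!inode)
		return -1;
	toy_destroy_inode(inode);
	toy_barrier();
	return toy_cache_destroy(&cache);
}
```

Calling toy_exit_without_barrier() reports one object remaining, which loosely mirrors the "Objects remaining in dax_cache" splat, while toy_exit_with_barrier() reports none. The real kernel mechanism involves grace periods and per-CPU callback lists that this model deliberately ignores.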
On Mon, Feb 14, 2022 at 12:09:54PM -0800, Dan Williams wrote:
> On Mon, Feb 14, 2022 at 9:59 AM Ira Weiny <ira.weiny@intel.com> wrote:
> >
> > On Fri, Feb 11, 2022 at 11:11:11PM -0800, Tong Zhang wrote:
> > > A bug can be triggered by following command
> > >
> > > $ modprobe nd_pmem && modprobe -r nd_pmem
> > >
> > > [ 10.060014] BUG dax_cache (Not tainted): Objects remaining in dax_cache on __kmem_cache_shutdown()
> > > [ 10.060938] Slab 0x0000000085b729ac objects=9 used=1 fp=0x000000004f5ae469 flags=0x200000000010200(slab|head|node)
> > > [ 10.062433] Call Trace:
> > > [ 10.062673] dump_stack_lvl+0x34/0x44
> > > [ 10.062865] slab_err+0x90/0xd0
> > > [ 10.063619] __kmem_cache_shutdown+0x13b/0x2f0
> > > [ 10.063848] kmem_cache_destroy+0x4a/0x110
> > > [ 10.064058] __x64_sys_delete_module+0x265/0x300
> > >
> > > This is caused by dax_fs_exit() not flushing inodes before destroy cache.
> > > To fix this issue, call rcu_barrier() before destroy cache.
> >
> > I don't doubt that this fixes the bug. However, I can't help but think this is
> > hiding a bug, or perhaps a missing step, in the kmem_cache layer? As far as I
> > can see dax does not call call_rcu() and only uses srcu not rcu? I was tempted
> > to suggest srcu_barrier() but dax does not call call_srcu() either.
>
> This rcu_barrier() is associated with the call_rcu() in destroy_inode().

Ok yea.

>
> While kern_unmount() does a full synchronize_rcu() after clearing
> ->mnt_ns, any pending destroy_inode() callbacks still need to be flushed
> before the kmem_cache is destroyed.
>
> > So I'm not clear about what is really going on and why this fixes it. I know
> > that dax is not using srcu in a standard way so perhaps this helps in a way I
> > don't quite grok? If so perhaps a comment here would be in order?
>
> Looks like a common pattern I missed that all filesystem exit paths implement.

I think a comment would be in order, especially since it looks like every
other FS has one:

fs/ext4/super.c:
...
	/*
	 * Make sure all delayed rcu free inodes are flushed before we
	 * destroy cache.
	 */
	rcu_barrier();
...

Anyway ok.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

Thanks for looking Dan,
Ira
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index e3029389d809..6bd565fe2e63 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -476,6 +476,7 @@ static int dax_fs_init(void)
 static void dax_fs_exit(void)
 {
 	kern_unmount(dax_mnt);
+	rcu_barrier();
 	kmem_cache_destroy(dax_cache);
 }
A bug can be triggered by following command

$ modprobe nd_pmem && modprobe -r nd_pmem

[ 10.060014] BUG dax_cache (Not tainted): Objects remaining in dax_cache on __kmem_cache_shutdown()
[ 10.060938] Slab 0x0000000085b729ac objects=9 used=1 fp=0x000000004f5ae469 flags=0x200000000010200(slab|head|node)
[ 10.062433] Call Trace:
[ 10.062673] dump_stack_lvl+0x34/0x44
[ 10.062865] slab_err+0x90/0xd0
[ 10.063619] __kmem_cache_shutdown+0x13b/0x2f0
[ 10.063848] kmem_cache_destroy+0x4a/0x110
[ 10.064058] __x64_sys_delete_module+0x265/0x300

This is caused by dax_fs_exit() not flushing inodes before destroy cache.
To fix this issue, call rcu_barrier() before destroy cache.

Signed-off-by: Tong Zhang <ztong0001@gmail.com>
---
 drivers/dax/super.c | 1 +
 1 file changed, 1 insertion(+)