Message ID | 20240622082631.2661148-1-leo.lilong@huawei.com (mailing list archive) |
---|---|
State | Queued |
Headers | show |
Series | xfs: eliminate lockdep false positives in xfs_attr_shortform_list | expand |
On Sat, Jun 22, 2024 at 04:26:31PM +0800, Long Li wrote: > xfs_attr_shortform_list() only called from a non-transactional context, it > hold ilock before alloc memory and maybe trapped in memory reclaim. Since > commit 204fae32d5f7("xfs: clean up remaining GFP_NOFS users") removed > GFP_NOFS flag, lockdep warning will be report as [1]. Eliminate lockdep > false positives by use __GFP_NOLOCKDEP to alloc memory > in xfs_attr_shortform_list(). > > [1] https://lore.kernel.org/linux-xfs/000000000000e33add0616358204@google.com/ > Reported-by: syzbot+4248e91deb3db78358a2@syzkaller.appspotmail.com > Signed-off-by: Long Li <leo.lilong@huawei.com> > --- > fs/xfs/xfs_attr_list.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c > index 5c947e5ce8b8..8cd6088e6190 100644 > --- a/fs/xfs/xfs_attr_list.c > +++ b/fs/xfs/xfs_attr_list.c > @@ -114,7 +114,8 @@ xfs_attr_shortform_list( > * It didn't all fit, so we have to sort everything on hashval. > */ > sbsize = sf->count * sizeof(*sbuf); > - sbp = sbuf = kmalloc(sbsize, GFP_KERNEL | __GFP_NOFAIL); > + sbp = sbuf = kmalloc(sbsize, > + GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL); Why wouldn't we memalloc_nofs_save any time we take an ILOCK when we're not in transaction context? Surely you'd want to NOFS /any/ allocation when the ILOCK is held, right? --D > > /* > * Scan the attribute list for the rest of the entries, storing > -- > 2.39.2 > >
On Mon, Jun 24, 2024 at 09:03:42AM -0700, Darrick J. Wong wrote: > On Sat, Jun 22, 2024 at 04:26:31PM +0800, Long Li wrote: > > xfs_attr_shortform_list() only called from a non-transactional context, it > > hold ilock before alloc memory and maybe trapped in memory reclaim. Since > > commit 204fae32d5f7("xfs: clean up remaining GFP_NOFS users") removed > > GFP_NOFS flag, lockdep warning will be report as [1]. Eliminate lockdep > > false positives by use __GFP_NOLOCKDEP to alloc memory > > in xfs_attr_shortform_list(). > > > > [1] https://lore.kernel.org/linux-xfs/000000000000e33add0616358204@google.com/ > > Reported-by: syzbot+4248e91deb3db78358a2@syzkaller.appspotmail.com > > Signed-off-by: Long Li <leo.lilong@huawei.com> > > --- > > fs/xfs/xfs_attr_list.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c > > index 5c947e5ce8b8..8cd6088e6190 100644 > > --- a/fs/xfs/xfs_attr_list.c > > +++ b/fs/xfs/xfs_attr_list.c > > @@ -114,7 +114,8 @@ xfs_attr_shortform_list( > > * It didn't all fit, so we have to sort everything on hashval. > > */ > > sbsize = sf->count * sizeof(*sbuf); > > - sbp = sbuf = kmalloc(sbsize, GFP_KERNEL | __GFP_NOFAIL); > > + sbp = sbuf = kmalloc(sbsize, > > + GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL); > > Why wouldn't we memalloc_nofs_save any time we take an ILOCK when we're > not in transaction context? Surely you'd want to NOFS /any/ allocation > when the ILOCK is held, right? > > --D > > I believe using memalloc_nofs_save could solve the problem, sometimes it may be more effective than using the __GFP_NOLOCKDEP flag. However, looking at similar functions, for example xfs_btree_alloc_cursor, it uses __GFP_NOLOCKDEP to prevent ABBA deadlock false positive warnings. xfs_attr_list_ilocked xfs_iread_extents xfs_bmbt_init_cursor xfs_btree_alloc_cursor kmem_cache_zalloc(cache, GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL) After thinking a little more, I found out that just using __GFP_NOLOCKDEP may not be enough, AA deadlock false positive warnings [1] still exist in the mainline kernel if my understanding is correct. [1] https://lore.kernel.org/linux-xfs/20240622094411.GA830005@ceph-admin/T/#m6f7ab8438bf82f0dc44c6d42d183ae08c07dcd5f thanks, Long Li
On 6/24/24 11:03 AM, Darrick J. Wong wrote: > On Sat, Jun 22, 2024 at 04:26:31PM +0800, Long Li wrote: >> xfs_attr_shortform_list() only called from a non-transactional context, it >> hold ilock before alloc memory and maybe trapped in memory reclaim. Since >> commit 204fae32d5f7("xfs: clean up remaining GFP_NOFS users") removed >> GFP_NOFS flag, lockdep warning will be report as [1]. Eliminate lockdep >> false positives by use __GFP_NOLOCKDEP to alloc memory >> in xfs_attr_shortform_list(). >> >> [1] https://lore.kernel.org/linux-xfs/000000000000e33add0616358204@google.com/ >> Reported-by: syzbot+4248e91deb3db78358a2@syzkaller.appspotmail.com >> Signed-off-by: Long Li <leo.lilong@huawei.com> >> --- >> fs/xfs/xfs_attr_list.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c >> index 5c947e5ce8b8..8cd6088e6190 100644 >> --- a/fs/xfs/xfs_attr_list.c >> +++ b/fs/xfs/xfs_attr_list.c >> @@ -114,7 +114,8 @@ xfs_attr_shortform_list( >> * It didn't all fit, so we have to sort everything on hashval. >> */ >> sbsize = sf->count * sizeof(*sbuf); >> - sbp = sbuf = kmalloc(sbsize, GFP_KERNEL | __GFP_NOFAIL); >> + sbp = sbuf = kmalloc(sbsize, >> + GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL); > > Why wouldn't we memalloc_nofs_save any time we take an ILOCK when we're > not in transaction context? Surely you'd want to NOFS /any/ allocation > when the ILOCK is held, right? I'm not sure I understand this. AFAICT, this is indeed a false positive, and can be fixed by applying exactly the same pattern used elsewhere in 94a69db2367e ("xfs: use __GFP_NOLOCKDEP instead of GFP_NOFS") Using memalloc_nofs_save implies that this really /would/ deadlock without GFP_NOFS, right? Is that the case? I was under the impression that this was simply a missed callsite in 94a69db2367e and as Long Li points out, other allocations under xfs_attr_list_ilocked() use the exact same (GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL) pattern proposed in this change. Thanks, -Eric
On Mon, Jul 08, 2024 at 12:00:05PM -0700, Darrick J. Wong wrote: > On Mon, Jul 08, 2024 at 10:40:37AM -0500, Eric Sandeen wrote: > > On 6/24/24 11:03 AM, Darrick J. Wong wrote: > > > On Sat, Jun 22, 2024 at 04:26:31PM +0800, Long Li wrote: > > >> xfs_attr_shortform_list() only called from a non-transactional context, it > > >> hold ilock before alloc memory and maybe trapped in memory reclaim. Since > > >> commit 204fae32d5f7("xfs: clean up remaining GFP_NOFS users") removed > > >> GFP_NOFS flag, lockdep warning will be report as [1]. Eliminate lockdep > > >> false positives by use __GFP_NOLOCKDEP to alloc memory > > >> in xfs_attr_shortform_list(). > > >> > > >> [1] https://lore.kernel.org/linux-xfs/000000000000e33add0616358204@google.com/ > > >> Reported-by: syzbot+4248e91deb3db78358a2@syzkaller.appspotmail.com > > >> Signed-off-by: Long Li <leo.lilong@huawei.com> > > >> --- > > >> fs/xfs/xfs_attr_list.c | 3 ++- > > >> 1 file changed, 2 insertions(+), 1 deletion(-) > > >> > > >> diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c > > >> index 5c947e5ce8b8..8cd6088e6190 100644 > > >> --- a/fs/xfs/xfs_attr_list.c > > >> +++ b/fs/xfs/xfs_attr_list.c > > >> @@ -114,7 +114,8 @@ xfs_attr_shortform_list( > > >> * It didn't all fit, so we have to sort everything on hashval. > > >> */ > > >> sbsize = sf->count * sizeof(*sbuf); > > >> - sbp = sbuf = kmalloc(sbsize, GFP_KERNEL | __GFP_NOFAIL); > > >> + sbp = sbuf = kmalloc(sbsize, > > >> + GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL); > > > > > > Why wouldn't we memalloc_nofs_save any time we take an ILOCK when we're > > > not in transaction context? Surely you'd want to NOFS /any/ allocation > > > when the ILOCK is held, right? > > > > I'm not sure I understand this. AFAICT, this is indeed a false positive, and can > > be fixed by applying exactly the same pattern used elsewhere in > > 94a69db2367e ("xfs: use __GFP_NOLOCKDEP instead of GFP_NOFS") > > > > Using memalloc_nofs_save implies that this really /would/ deadlock without > > GFP_NOFS, right? Is that the case? > > > > I was under the impression that this was simply a missed callsite in 94a69db2367e > > and as Long Li points out, other allocations under xfs_attr_list_ilocked() > > use the exact same (GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL) pattern > > proposed in this change. > > Oh, now I see that the alleged deadlock is between the ILOCK of a > directory that we're accessing, and a different inode that we're trying > to reclaim. Lockdep doesn't know that these two contexts are mutually > exclusive since reclaim cannot target an inode with an active ref. NOFS > is a big hammer, which is why the proposal is to turn off lockdep for > the allocation? Why not fix lockdep's tracking? > > <sees another thread> > https://lore.kernel.org/linux-xfs/Zou8FCgPKqqWXKyS@dread.disaster.area/ > > We can't use an ILOCK subclass for the reclaim code because we've run > out of lockdep subclasses. I guess you could abuse lockdep_set_class to > change the lockdep class of an ILOCK when the inode enters reclaim (and > change it back if the inode gets recycled) but that's a bit gross. > > What if we got rid of XFS_ILOCK_RT{BITMAP,SUMMARY} to free up subclass > bits? > > https://lore.kernel.org/linux-xfs/?q=xfs%3A+remove+XFS_ILOCK_RT Yes, that would probably work - all we need is a single subclass for the ilock to say reclaim locking is a different context. There should only be one lock site that we need that annotation for (the final xfs_ilock() in xfs_reclaim_inode() after the inode has been removed from the radix tree), and we don't need nesting because we are only locking a single inode at a time. -Dave.
[cc Carlos] This still needs to be fixed. On Sat, Jun 22, 2024 at 04:26:31PM +0800, Long Li wrote: > xfs_attr_shortform_list() only called from a non-transactional context, it > hold ilock before alloc memory and maybe trapped in memory reclaim. Since > commit 204fae32d5f7("xfs: clean up remaining GFP_NOFS users") removed > GFP_NOFS flag, lockdep warning will be report as [1]. Eliminate lockdep > false positives by use __GFP_NOLOCKDEP to alloc memory > in xfs_attr_shortform_list(). > > [1] https://lore.kernel.org/linux-xfs/000000000000e33add0616358204@google.com/ > Reported-by: syzbot+4248e91deb3db78358a2@syzkaller.appspotmail.com > Signed-off-by: Long Li <leo.lilong@huawei.com> > --- > fs/xfs/xfs_attr_list.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c > index 5c947e5ce8b8..8cd6088e6190 100644 > --- a/fs/xfs/xfs_attr_list.c > +++ b/fs/xfs/xfs_attr_list.c > @@ -114,7 +114,8 @@ xfs_attr_shortform_list( > * It didn't all fit, so we have to sort everything on hashval. > */ > sbsize = sf->count * sizeof(*sbuf); > - sbp = sbuf = kmalloc(sbsize, GFP_KERNEL | __GFP_NOFAIL); > + sbp = sbuf = kmalloc(sbsize, > + GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL); > > /* > * Scan the attribute list for the rest of the entries, storing Reviewed-by: Dave Chinner <dchinner@redhat.com> Carlos, can you please pick this patch up? We're still getting new lockdep false positives being reported from this issue, and this is the correct fix to make right now. -Dave.
diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c index 5c947e5ce8b8..8cd6088e6190 100644 --- a/fs/xfs/xfs_attr_list.c +++ b/fs/xfs/xfs_attr_list.c @@ -114,7 +114,8 @@ xfs_attr_shortform_list( * It didn't all fit, so we have to sort everything on hashval. */ sbsize = sf->count * sizeof(*sbuf); - sbp = sbuf = kmalloc(sbsize, GFP_KERNEL | __GFP_NOFAIL); + sbp = sbuf = kmalloc(sbsize, + GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL); /* * Scan the attribute list for the rest of the entries, storing
xfs_attr_shortform_list() only called from a non-transactional context, it hold ilock before alloc memory and maybe trapped in memory reclaim. Since commit 204fae32d5f7("xfs: clean up remaining GFP_NOFS users") removed GFP_NOFS flag, lockdep warning will be report as [1]. Eliminate lockdep false positives by use __GFP_NOLOCKDEP to alloc memory in xfs_attr_shortform_list(). [1] https://lore.kernel.org/linux-xfs/000000000000e33add0616358204@google.com/ Reported-by: syzbot+4248e91deb3db78358a2@syzkaller.appspotmail.com Signed-off-by: Long Li <leo.lilong@huawei.com> --- fs/xfs/xfs_attr_list.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)