Message ID | 20240429054706.1543980-1-hch@lst.de (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm,page_owner: don't remove GFP flags in add_stack_record_to_list | expand |
On 4/29/24 7:47 AM, Christoph Hellwig wrote: > This loses flags like GFP_NOFS and GFP_NOIO that are important to avoid > deadlocks as well as GFP_NOLOCKDEP that otherwise generates lockdep false > positives. GFP_NOFS and GFP_NOIO translate to GFP_KERNEL without __GFP_FS/__GFP_IO so I don't see how this patch would have helped with those. __GFP_NOLOCKDEP is likely the actual issue and stackdepot solved it like this: https://lore.kernel.org/linux-xfs/20240418141133.22950-1-ryabinin.a.a@gmail.com/ So we could just do the same here. > Fixes: 217b2119b9e2 ("mm,page_owner: implement the tracking of the stacks count") > Reported-by: Reported-by: syzbot+b7e8d799f0ab724876f9@syzkaller.appspotmail.com > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > mm/page_owner.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/mm/page_owner.c b/mm/page_owner.c > index d17d1351ec84af..d214488846fa92 100644 > --- a/mm/page_owner.c > +++ b/mm/page_owner.c > @@ -168,9 +168,7 @@ static void add_stack_record_to_list(struct stack_record *stack_record, > unsigned long flags; > struct stack *stack; > > - /* Filter gfp_mask the same way stackdepot does, for consistency */ > gfp_mask &= ~GFP_ZONEMASK; > - gfp_mask &= (GFP_ATOMIC | GFP_KERNEL); > gfp_mask |= __GFP_NOWARN; > > set_current_in_page_owner();
On Mon, Apr 29, 2024 at 09:59:43AM +0200, Vlastimil Babka wrote: > On 4/29/24 7:47 AM, Christoph Hellwig wrote: > > This loses flags like GFP_NOFS and GFP_NOIO that are important to avoid > > deadlocks as well as GFP_NOLOCKDEP that otherwise generates lockdep false > > positives. > > GFP_NOFS and GFP_NOIO translate to GFP_KERNEL without __GFP_FS/__GFP_IO so I > don't see how this patch would have helped with those. > __GFP_NOLOCKDEP is likely the actual issue and stackdepot solved it like this: > > https://lore.kernel.org/linux-xfs/20240418141133.22950-1-ryabinin.a.a@gmail.com/ > > So we could just do the same here. Yes, it is __GFP_NOLOCKDEP that is the issue here, but cargo-cult-copying of that stackdepot fix is just whack-a-mole bug fixing without addressing the technical debt that got us here in the first place. Has anyone else bothered to look to see if kmemleak has the same problem? If anyone bothered to do an audit, they would see that gfp_kmemleak_mask() handles the reclaim context masks correctly. Further, it adds NOWARN, NOMEMALLOC and NORETRY, which means the debug code is silent when it fails, it doesn't deplete emergency reserves and doesn't bog down retrying forever when there are sustained low memory situations. This also points out that the page-owner/stackdepot code that strips GFP_ZONEMASK is completely redundant. Doing: gfp_flags &= GFP_KERNEL|GFP_ATOMIC|__GFP_NOLOCKDEP; strips everything but __GFP_RECLAIM, __GFP_FS, __GFP_IO, __GFP_HIGH and __GFP_NOLOCKDEP. This already strips the zonemask info, so there's no need to do it explicitly. IOWs, the right way to fix this set of problems is to lift gfp_kmemleak_mask() to include/linux/gfp.h and then use it across all these nested allocations that occur behind the public memory allocation API. I've got a patchset under test at the moment that does this.... -Dave.
On 4/30/24 1:49 AM, Dave Chinner wrote: > On Mon, Apr 29, 2024 at 09:59:43AM +0200, Vlastimil Babka wrote: >> On 4/29/24 7:47 AM, Christoph Hellwig wrote: >> > This loses flags like GFP_NOFS and GFP_NOIO that are important to avoid >> > deadlocks as well as GFP_NOLOCKDEP that otherwise generates lockdep false >> > positives. >> >> GFP_NOFS and GFP_NOIO translate to GFP_KERNEL without __GFP_FS/__GFP_IO so I >> don't see how this patch would have helped with those. >> __GFP_NOLOCKDEP is likely the actual issue and stackdepot solved it like this: >> >> https://lore.kernel.org/linux-xfs/20240418141133.22950-1-ryabinin.a.a@gmail.com/ >> >> So we could just do the same here. > > Yes, it is __GFP_NOLOCKDEP that is the issue here, but > cargo-cult-copying of that stackdepot fix is just whack-a-mole bug > fixing without addressing the technical debt that got us here in the > first place. Has anyone else bothered to look to see if kmemleak has > the same problem? Looks like you did :) > If anyone bothered to do an audit, they would see that > gfp_kmemleak_mask() handles the reclaim context masks correctly. > Further, it adds NOWARN, NOMEMALLOC and > NORETRY, which means the debug code is silent when it fails, it > doesn't deplete emergency reserves and doesn't bog down retrying > forever when there are sustained low memory situations. So we do have NOWARN here. __GFP_RETRY_MAYFAIL might have been slightly better than __GFP_NOWARN wrt "not retrying forever" but also not giving up too soon. If we want to be really careful about reserves, it's a question whether to keep the | GFP_ATOMIC which translates to leaving __GFP_HIGH. OTOH if we don't keep it, these allocations might fail too easily from an atomic context and we could miss the debugging data. > This also points out that the page-owner/stackdepot code that strips > GFP_ZONEMASK is completely redundant. Doing: > > gfp_flags &= GFP_KERNEL|GFP_ATOMIC|__GFP_NOLOCKDEP; > > strips everything but __GFP_RECLAIM, __GFP_FS, __GFP_IO, > __GFP_HIGH and __GFP_NOLOCKDEP. This already strips the zonemask > info, so there's no need to do it explicitly. True. > IOWs, the right way to fix this set of problems is to lift > gfp_kmemleak_mask() to include/linux/gfp.h and then use it across > all these nested allocations that occur behind the public > memory allocation API. Agree. But arguably these quick fixes adding __GFP_NOLOCKDEP were appropriate for the late rc phase we're in. > I've got a patchset under test at the moment that does this.... Great! Thanks. > -Dave.
diff --git a/mm/page_owner.c b/mm/page_owner.c index d17d1351ec84af..d214488846fa92 100644 --- a/mm/page_owner.c +++ b/mm/page_owner.c @@ -168,9 +168,7 @@ static void add_stack_record_to_list(struct stack_record *stack_record, unsigned long flags; struct stack *stack; - /* Filter gfp_mask the same way stackdepot does, for consistency */ gfp_mask &= ~GFP_ZONEMASK; - gfp_mask &= (GFP_ATOMIC | GFP_KERNEL); gfp_mask |= __GFP_NOWARN; set_current_in_page_owner();
This loses flags like GFP_NOFS and GFP_NOIO that are important to avoid deadlocks as well as GFP_NOLOCKDEP that otherwise generates lockdep false positives. Fixes: 217b2119b9e2 ("mm,page_owner: implement the tracking of the stacks count") Reported-by: Reported-by: syzbot+b7e8d799f0ab724876f9@syzkaller.appspotmail.com Signed-off-by: Christoph Hellwig <hch@lst.de> --- mm/page_owner.c | 2 -- 1 file changed, 2 deletions(-)