diff mbox series

slab_pre_alloc_hook() strips __GFP_NOLOCKDEP away.

Message ID YjoJ4CzB3yfWSV1F@linutronix.de (mailing list archive)
State New
Headers show
Series slab_pre_alloc_hook() strips __GFP_NOLOCKDEP away. | expand

Commit Message

Sebastian Andrzej Siewior March 22, 2022, 5:39 p.m. UTC
Run into
| ======================================================
| WARNING: possible circular locking dependency detected
| 5.17.0-next-20220322 #19 Not tainted
| ------------------------------------------------------
| kswapd1/513 is trying to acquire lock:
| ffff888555b7e628 (&xfs_dir_ilock_class){++++}-{3:3}, at: xfs_icwalk_ag+0x36c/0x810
| 
| but task is already holding lock:
| ffffffff82a2fb20 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x600/0x740
| 
| which lock already depends on the new lock.
| 
| 
| the existing dependency chain (in reverse order) is:
| 
| -> #1 (fs_reclaim){+.+.}-{0:0}:
|        fs_reclaim_acquire+0xaa/0xe0
|        __kmalloc_node+0x65/0x3e0
|        xfs_attr_copy_value+0x70/0xa0
…

and I think this is similar to commit
   704687deaae76 ("mm: make slab and vmalloc allocators __GFP_NOLOCKDEP aware")

and maybe something like this:

?

Sebastian

Comments

Vlastimil Babka March 23, 2022, 10:09 a.m. UTC | #1
On 3/22/22 18:39, Sebastian Andrzej Siewior wrote:
> Run into
> | ======================================================
> | WARNING: possible circular locking dependency detected
> | 5.17.0-next-20220322 #19 Not tainted
> | ------------------------------------------------------
> | kswapd1/513 is trying to acquire lock:
> | ffff888555b7e628 (&xfs_dir_ilock_class){++++}-{3:3}, at: xfs_icwalk_ag+0x36c/0x810
> | 
> | but task is already holding lock:
> | ffffffff82a2fb20 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x600/0x740
> | 
> | which lock already depends on the new lock.
> | 
> | 
> | the existing dependency chain (in reverse order) is:
> | 
> | -> #1 (fs_reclaim){+.+.}-{0:0}:
> |        fs_reclaim_acquire+0xaa/0xe0
> |        __kmalloc_node+0x65/0x3e0
> |        xfs_attr_copy_value+0x70/0xa0
> …
> 
> and I think this is similar to commit
>    704687deaae76 ("mm: make slab and vmalloc allocators __GFP_NOLOCKDEP aware")
> 
> and maybe something like this:
> 
> diff --git a/mm/slab.h b/mm/slab.h
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -717,7 +717,7 @@ static inline struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
>  						     struct obj_cgroup **objcgp,
>  						     size_t size, gfp_t flags)
>  {
> -	flags &= gfp_allowed_mask;
> +	flags &= gfp_allowed_mask | __GFP_NOLOCKDEP;

Hmm but gfp_allowed_mask already should contain __GFP_NOLOCKDEP after
kernel_init_freeable() is reached. I doubt we can reach fs or reclaim
code before that?

>  
>  	might_alloc(flags);
>  
> ?
> 
> Sebastian
Sebastian Andrzej Siewior March 23, 2022, 11:30 a.m. UTC | #2
On 2022-03-23 11:09:35 [+0100], Vlastimil Babka wrote:
> > --- a/mm/slab.h
> > +++ b/mm/slab.h
> > @@ -717,7 +717,7 @@ static inline struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
> >  						     struct obj_cgroup **objcgp,
> >  						     size_t size, gfp_t flags)
> >  {
> > -	flags &= gfp_allowed_mask;
> > +	flags &= gfp_allowed_mask | __GFP_NOLOCKDEP;
> 
> Hmm but gfp_allowed_mask already should contain __GFP_NOLOCKDEP after
> kernel_init_freeable() is reached. I doubt we can reach fs or reclaim
> code before that?

That is way past init. I have
 gfp_allowed_mask = 0x1ffffff
 __GFP_NOLOCKDEP = 0x8000000

Looking at the origin of gfp_allowed_mask

| /* Room for N __GFP_FOO bits */
| #define __GFP_BITS_SHIFT (24 +                                          \
|                           3 * IS_ENABLED(CONFIG_KASAN_HW_TAGS) +        \
|                           IS_ENABLED(CONFIG_LOCKDEP))
| #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))

which in my case 24 + 1 but it would need to be something like
	27 * IS_ENABLED(CONFIG_LOCKDEP)

or a better way to come up with __GFP_BITS_MASK.

Sebastian
Vlastimil Babka March 23, 2022, 11:45 a.m. UTC | #3
On 3/23/22 12:30, Sebastian Andrzej Siewior wrote:
> On 2022-03-23 11:09:35 [+0100], Vlastimil Babka wrote:
>>> --- a/mm/slab.h
>>> +++ b/mm/slab.h
>>> @@ -717,7 +717,7 @@ static inline struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
>>>  						     struct obj_cgroup **objcgp,
>>>  						     size_t size, gfp_t flags)
>>>  {
>>> -	flags &= gfp_allowed_mask;
>>> +	flags &= gfp_allowed_mask | __GFP_NOLOCKDEP;
>>
>> Hmm but gfp_allowed_mask already should contain __GFP_NOLOCKDEP after
>> kernel_init_freeable() is reached. I doubt we can reach fs or reclaim
>> code before that?
> 
> That is way past init. I have
>  gfp_allowed_mask = 0x1ffffff
>  __GFP_NOLOCKDEP = 0x8000000
> 
> Looking at the origin of gfp_allowed_mask
> 
> | /* Room for N __GFP_FOO bits */
> | #define __GFP_BITS_SHIFT (24 +                                          \
> |                           3 * IS_ENABLED(CONFIG_KASAN_HW_TAGS) +        \
> |                           IS_ENABLED(CONFIG_LOCKDEP))
> | #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))

Mainline has this:
#define __GFP_BITS_SHIFT (25 + IS_ENABLED(CONFIG_LOCKDEP))

which looks OK.

The '24' comes from "kasan, mm: only define ___GFP_SKIP_KASAN_POISON
with HW_TAGS" in mmotm.
It makes ___GFP_SKIP_KASAN_POISON optional, that's why the decrease to
24. However this is not about the number of flags, but the highest bit,
which can be that of __GFP_NOLOCKDEP and it's then not covered by
__GFP_BITS_SHIFT if lockdep is enabled and kasan not...

> which in my case 24 + 1 but it would need to be something like
> 	27 * IS_ENABLED(CONFIG_LOCKDEP)
> 
> or a better way to come up with __GFP_BITS_MASK.
> 
> Sebastian
diff mbox series

Patch

diff --git a/mm/slab.h b/mm/slab.h
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -717,7 +717,7 @@  static inline struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
 						     struct obj_cgroup **objcgp,
 						     size_t size, gfp_t flags)
 {
-	flags &= gfp_allowed_mask;
+	flags &= gfp_allowed_mask | __GFP_NOLOCKDEP;
 
 	might_alloc(flags);