
[1/3] mm/slub: directly load freelist from cpu partial slab in the likely case

Message ID 20240117-slab-misc-v1-1-fd1c49ccbe70@bytedance.com (mailing list archive)
State New
Series mm/slub: some minor optimization and cleanup

Commit Message

Chengming Zhou Jan. 17, 2024, 11:45 a.m. UTC
The likely case is that we get a usable slab from the cpu partial list, in
which case we can load the freelist from it directly and return, instead of
taking the slower path that needs more work, like re-enabling interrupts and
rechecking.

But to reuse get_freelist() we need to remove its "VM_BUG_ON(!new.frozen)",
since a cpu partial slab is not frozen. That seems acceptable since the
check is only there for debugging purposes.

There is also a small performance improvement, as shown by:
perf bench sched messaging -g 5 -t -l 100000

            mm-stable   slub-optimize
Total time      7.473    7.209

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 mm/slub.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

Comments

Christoph Lameter (Ampere) Jan. 17, 2024, 10:41 p.m. UTC | #1
On Wed, 17 Jan 2024, Chengming Zhou wrote:

> The likely case is that we get a usable slab from the cpu partial list, in
> which case we can load the freelist from it directly and return, instead of
> taking the slower path that needs more work, like re-enabling interrupts and
> rechecking.

Ok I see that it could be useful to avoid the unlock_irq/lock_irq sequence 
in the partial cpu handling.

> But to reuse get_freelist() we need to remove its "VM_BUG_ON(!new.frozen)",
> since a cpu partial slab is not frozen. That seems acceptable since the
> check is only there for debugging purposes.

This is a test to verify that the newly acquired slab is actually in
frozen status. If that test is no longer necessary then this is a bug that
may need to be fixed independently. Maybe this test now needs to be
different depending on where the partial slab originated from? Is the check
only necessary when the slab is taken from the per-node partials?

> There is also a small performance improvement, as shown by:
> perf bench sched messaging -g 5 -t -l 100000
>
>            mm-stable   slub-optimize
> Total time      7.473    7.209

Hmm... Good, avoiding the lock/relock sequence helps.
Chengming Zhou Jan. 18, 2024, 11:37 a.m. UTC | #2
On 2024/1/18 06:41, Christoph Lameter (Ampere) wrote:
> On Wed, 17 Jan 2024, Chengming Zhou wrote:
> 
>> The likely case is that we get a usable slab from the cpu partial list, in
>> which case we can load the freelist from it directly and return, instead of
>> taking the slower path that needs more work, like re-enabling interrupts and
>> rechecking.
> 
> Ok I see that it could be useful to avoid the unlock_irq/lock_irq sequence in the partial cpu handling.

Right.

> 
>> But to reuse get_freelist() we need to remove its "VM_BUG_ON(!new.frozen)",
>> since a cpu partial slab is not frozen. That seems acceptable since the
>> check is only there for debugging purposes.
> 
> This is a test to verify that the newly acquired slab is actually in frozen status. If that test is no longer necessary then this is a bug that may need to be fixed independently. Maybe this test now needs to be different depending on where the partial slab originated from? Is the check only necessary when the slab is taken from the per-node partials?

Now there are two similar functions: get_freelist() and freeze_slab().

get_freelist() is used for the cpu slab: it transfers the slab's freelist
to the cpu freelist, and it contains "VM_BUG_ON(!new.frozen)" since the
cpu slab must already be frozen.

freeze_slab() is used for a slab taken from the node partial list: the slab
gets frozen and its freelist taken before it is used as the cpu slab. So it
contains "VM_BUG_ON(new.frozen)" since such a partial slab must NOT be
frozen yet. And it doesn't need the cpu_slab lock.

This patch handles the third case: a slab on the cpu partial list, for
which we already hold the cpu_slab lock, so it switches from freeze_slab()
to reusing get_freelist().

So get_freelist() has two cases to handle: cpu slab and cpu partial list slab.
The latter is NOT frozen, so need to remove "VM_BUG_ON(!new.frozen)" from it.

And "VM_BUG_ON(new.frozen)" in freeze_slab() is unchanged, so per node partials
are covered.
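
For context, a simplified sketch of get_freelist() as discussed here (paraphrased,
details trimmed, not the verbatim mm/slub.c code). freeze_slab() is the same
cmpxchg loop except that it sets new.frozen = 1, has VM_BUG_ON(new.frozen)
instead, and uses the locking slab_update_freelist() because it runs without
the cpu_slab lock:

static inline void *get_freelist(struct kmem_cache *s, struct slab *slab)
{
	struct slab new;
	unsigned long counters;
	void *freelist;

	do {
		/* take the slab's whole freelist for the cpu freelist */
		freelist = slab->freelist;
		counters = slab->counters;

		new.counters = counters;
		VM_BUG_ON(!new.frozen);		/* the check this patch removes */

		new.inuse = slab->objects;
		/* stay frozen only if there was something left to take */
		new.frozen = freelist != NULL;
	} while (!__slab_update_freelist(s, slab, freelist, counters,
					 NULL, new.counters, "get_freelist"));

	return freelist;
}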

Thanks!

> 
>> There is also a small performance improvement, as shown by:
>> perf bench sched messaging -g 5 -t -l 100000
>>
>>            mm-stable   slub-optimize
>> Total time      7.473    7.209
> 
> Hmm... Good, avoiding the lock/relock sequence helps.
Christoph Lameter (Ampere) Jan. 18, 2024, 10:14 p.m. UTC | #3
On Thu, 18 Jan 2024, Chengming Zhou wrote:

> So get_freelist() has two cases to handle: cpu slab and cpu partial list slab.
> The latter is NOT frozen, so need to remove "VM_BUG_ON(!new.frozen)" from it.

Right so keep the check if it is the former?
Chengming Zhou Jan. 19, 2024, 3:53 a.m. UTC | #4
On 2024/1/19 06:14, Christoph Lameter (Ampere) wrote:
> On Thu, 18 Jan 2024, Chengming Zhou wrote:
> 
>> So get_freelist() has two cases to handle: cpu slab and cpu partial list slab.
>> The latter is NOT frozen, so need to remove "VM_BUG_ON(!new.frozen)" from it.
> 
> Right so keep the check if it is the former?
> 

Ok, I get it. Maybe like this:

diff --git a/mm/slub.c b/mm/slub.c
index 2ef88bbf56a3..7fa9dbc2e938 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3313,7 +3313,7 @@ __update_cpu_freelist_fast(struct kmem_cache *s,
  *
  * If this function returns NULL then the slab has been unfrozen.
  */
-static inline void *get_freelist(struct kmem_cache *s, struct slab *slab)
+static inline void *get_freelist(struct kmem_cache *s, struct slab *slab, int frozen)
 {
        struct slab new;
        unsigned long counters;
@@ -3326,7 +3326,7 @@ static inline void *get_freelist(struct kmem_cache *s, struct slab *slab)
                counters = slab->counters;

                new.counters = counters;
-               VM_BUG_ON(!new.frozen);
+               VM_BUG_ON(frozen && !new.frozen);

                new.inuse = slab->objects;
                new.frozen = freelist != NULL;
@@ -3440,7 +3440,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
        if (freelist)
                goto load_freelist;

-       freelist = get_freelist(s, slab);
+       freelist = get_freelist(s, slab, 1);

        if (!freelist) {
                c->slab = NULL;
@@ -3498,18 +3498,19 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,

                slab = slub_percpu_partial(c);
                slub_set_percpu_partial(c, slab);
-               local_unlock_irqrestore(&s->cpu_slab->lock, flags);
-               stat(s, CPU_PARTIAL_ALLOC);

-               if (unlikely(!node_match(slab, node) ||
-                            !pfmemalloc_match(slab, gfpflags))) {
-                       slab->next = NULL;
-                       __put_partials(s, slab);
-                       continue;
+               if (likely(node_match(slab, node) &&
+                          pfmemalloc_match(slab, gfpflags))) {
+                       c->slab = slab;
+                       freelist = get_freelist(s, slab, 0);
+                       stat(s, CPU_PARTIAL_ALLOC);
+                       goto load_freelist;
                }

-               freelist = freeze_slab(s, slab);
-               goto retry_load_slab;
+               local_unlock_irqrestore(&s->cpu_slab->lock, flags);
+
+               slab->next = NULL;
+               __put_partials(s, slab);
        }
 #endif
Vlastimil Babka Jan. 22, 2024, 5:13 p.m. UTC | #5
On 1/19/24 04:53, Chengming Zhou wrote:
> On 2024/1/19 06:14, Christoph Lameter (Ampere) wrote:
>> On Thu, 18 Jan 2024, Chengming Zhou wrote:
>> 
>>> So get_freelist() has two cases to handle: cpu slab and cpu partial list slab.
>>> The latter is NOT frozen, so need to remove "VM_BUG_ON(!new.frozen)" from it.
>> 
>> Right so keep the check if it is the former?
>> 
> 
> Ok, I get it. Maybe like this:

I think that's just too ugly for a VM_BUG_ON(). I'd just remove the check
and be done with that.

I have a somewhat different point. You reused get_freelist() but in fact
it's more like freeze_slab(), but that one uses slab_update_freelist() and
we are under the local_lock so we want the cheaper __slab_update_freelist(),
which get_freelist() has and I guess that's why you reused that one.

However get_freelist() also assumes it can return NULL if the freelist is
empty. If that's possible to happen on the percpu partial list, we should
not "goto load_freelist;" but rather create a new label above that, above
the "if (!freelist) {" block that handles the case.

If that's not possible to happen (needs careful audit) and we have guarantee
that slabs on percpu partial list must have non-empty freelist, then we
probably instead want a new __freeze_slab() variant that is like
freeze_slab(), but uses __slab_update_freelist() and probably also has
VM_BUG_ON(!freelist) before returning it?
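
For illustration, such a variant might look roughly like this (the name
__freeze_slab() and the details are assumed here, modelled on freeze_slab()
as described above; not actual tree code):

/* Hypothetical: freeze a percpu partial slab while already holding the local_lock. */
static inline void *__freeze_slab(struct kmem_cache *s, struct slab *slab)
{
	struct slab new;
	unsigned long counters;
	void *freelist;

	do {
		freelist = slab->freelist;
		counters = slab->counters;

		new.counters = counters;
		VM_BUG_ON(new.frozen);		/* a percpu partial slab is not frozen yet */

		new.inuse = slab->objects;
		new.frozen = 1;
	} while (!__slab_update_freelist(s, slab, freelist, counters,
					 NULL, new.counters, "__freeze_slab"));

	/* slabs on the percpu partial list are expected to have free objects */
	VM_BUG_ON(!freelist);
	return freelist;
}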

> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 2ef88bbf56a3..7fa9dbc2e938 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3313,7 +3313,7 @@ __update_cpu_freelist_fast(struct kmem_cache *s,
>   *
>   * If this function returns NULL then the slab has been unfrozen.
>   */
> -static inline void *get_freelist(struct kmem_cache *s, struct slab *slab)
> +static inline void *get_freelist(struct kmem_cache *s, struct slab *slab, int frozen)
>  {
>         struct slab new;
>         unsigned long counters;
> @@ -3326,7 +3326,7 @@ static inline void *get_freelist(struct kmem_cache *s, struct slab *slab)
>                 counters = slab->counters;
> 
>                 new.counters = counters;
> -               VM_BUG_ON(!new.frozen);
> +               VM_BUG_ON(frozen && !new.frozen);
> 
>                 new.inuse = slab->objects;
>                 new.frozen = freelist != NULL;
> @@ -3440,7 +3440,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>         if (freelist)
>                 goto load_freelist;
> 
> -       freelist = get_freelist(s, slab);
> +       freelist = get_freelist(s, slab, 1);
> 
>         if (!freelist) {
>                 c->slab = NULL;
> @@ -3498,18 +3498,19 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
> 
>                 slab = slub_percpu_partial(c);
>                 slub_set_percpu_partial(c, slab);
> -               local_unlock_irqrestore(&s->cpu_slab->lock, flags);
> -               stat(s, CPU_PARTIAL_ALLOC);
> 
> -               if (unlikely(!node_match(slab, node) ||
> -                            !pfmemalloc_match(slab, gfpflags))) {
> -                       slab->next = NULL;
> -                       __put_partials(s, slab);
> -                       continue;
> +               if (likely(node_match(slab, node) &&
> +                          pfmemalloc_match(slab, gfpflags))) {
> +                       c->slab = slab;
> +                       freelist = get_freelist(s, slab, 0);
> +                       stat(s, CPU_PARTIAL_ALLOC);
> +                       goto load_freelist;
>                 }
> 
> -               freelist = freeze_slab(s, slab);
> -               goto retry_load_slab;
> +               local_unlock_irqrestore(&s->cpu_slab->lock, flags);
> +
> +               slab->next = NULL;
> +               __put_partials(s, slab);
>         }
>  #endif
Chengming Zhou Jan. 23, 2024, 2:51 a.m. UTC | #6
On 2024/1/23 01:13, Vlastimil Babka wrote:
> On 1/19/24 04:53, Chengming Zhou wrote:
>> On 2024/1/19 06:14, Christoph Lameter (Ampere) wrote:
>>> On Thu, 18 Jan 2024, Chengming Zhou wrote:
>>>
>>>> So get_freelist() has two cases to handle: cpu slab and cpu partial list slab.
>>>> The latter is NOT frozen, so need to remove "VM_BUG_ON(!new.frozen)" from it.
>>>
>>> Right so keep the check if it is the former?
>>>
>>
>> Ok, I get it. Maybe like this:
> 
> I think that's just too ugly for a VM_BUG_ON(). I'd just remove the check
> and be done with that.

Ok with me.

> 
> I have a somewhat different point. You reused get_freelist() but in fact
> it's more like freeze_slab(), but that one uses slab_update_freelist() and
> we are under the local_lock so we want the cheaper __slab_update_freelist(),
> which get_freelist() has and I guess that's why you reused that one.

Right, we already have the local_lock, so reuse get_freelist().

> 
> However get_freelist() also assumes it can return NULL if the freelist is
> empty. If that's possible to happen on the percpu partial list, we should
> not "goto load_freelist;" but rather create a new label above that, above
> the "if (!freelist) {" block that handles the case.
> 
> If that's not possible to happen (needs careful audit) and we have guarantee

Yes, it's not possible for now.

> that slabs on percpu partial list must have non-empty freelist, then we
> probably instead want a new __freeze_slab() variant that is like
> freeze_slab(), but uses __slab_update_freelist() and probably also has
> VM_BUG_ON(!freelist) before returning it?
> 

Instead of introducing another new function, how about still reusing get_freelist()
and VM_BUG_ON(!freelist) after calling it? I feel this is simpler.

Thanks!
Chengming Zhou Jan. 23, 2024, 7:42 a.m. UTC | #7
On 2024/1/23 10:51, Chengming Zhou wrote:
> On 2024/1/23 01:13, Vlastimil Babka wrote:
>> On 1/19/24 04:53, Chengming Zhou wrote:
>>> On 2024/1/19 06:14, Christoph Lameter (Ampere) wrote:
>>>> On Thu, 18 Jan 2024, Chengming Zhou wrote:
>>>>
>>>>> So get_freelist() has two cases to handle: cpu slab and cpu partial list slab.
>>>>> The latter is NOT frozen, so need to remove "VM_BUG_ON(!new.frozen)" from it.
>>>>
>>>> Right so keep the check if it is the former?
>>>>
>>>
>>> Ok, I get it. Maybe like this:
>>
>> I think that's just too ugly for a VM_BUG_ON(). I'd just remove the check
>> and be done with that.
> 
> Ok with me.
> 
>>
>> I have a somewhat different point. You reused get_freelist() but in fact
>> it's more like freeze_slab(), but that one uses slab_update_freelist() and
>> we are under the local_lock so we want the cheaper __slab_update_freelist(),
>> which get_freelist() has and I guess that's why you reused that one.
> 
> Right, we already have the local_lock, so reuse get_freelist().
> 
>>
>> However get_freelist() also assumes it can return NULL if the freelist is
>> empty. If that's possible to happen on the percpu partial list, we should
>> not "goto load_freelist;" but rather create a new label above that, above
>> the "if (!freelist) {" block that handles the case.
>>
>> If that's not possible to happen (needs careful audit) and we have guarantee
> 
> Yes, it's not possible for now.
> 
>> that slabs on percpu partial list must have non-empty freelist, then we
>> probably instead want a new __freeze_slab() variant that is like
>> freeze_slab(), but uses __slab_update_freelist() and probably also has
>> VM_BUG_ON(!freelist) before returning it?
>>
> 
> Instead of introducing another new function, how about still reusing get_freelist()
> and VM_BUG_ON(!freelist) after calling it? I feel this is simpler.
> 
> Thanks!

Does this look fine?

diff --git a/mm/slub.c b/mm/slub.c
index 2ef88bbf56a3..fda402b2d649 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3326,7 +3326,6 @@ static inline void *get_freelist(struct kmem_cache *s, struct slab *slab)
                counters = slab->counters;

                new.counters = counters;
-               VM_BUG_ON(!new.frozen);

                new.inuse = slab->objects;
                new.frozen = freelist != NULL;
@@ -3498,18 +3497,20 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,

                slab = slub_percpu_partial(c);
                slub_set_percpu_partial(c, slab);
-               local_unlock_irqrestore(&s->cpu_slab->lock, flags);
-               stat(s, CPU_PARTIAL_ALLOC);

-               if (unlikely(!node_match(slab, node) ||
-                            !pfmemalloc_match(slab, gfpflags))) {
-                       slab->next = NULL;
-                       __put_partials(s, slab);
-                       continue;
+               if (likely(node_match(slab, node) &&
+                          pfmemalloc_match(slab, gfpflags))) {
+                       c->slab = slab;
+                       freelist = get_freelist(s, slab);
+                       VM_BUG_ON(!freelist);
+                       stat(s, CPU_PARTIAL_ALLOC);
+                       goto load_freelist;
                }

-               freelist = freeze_slab(s, slab);
-               goto retry_load_slab;
+               local_unlock_irqrestore(&s->cpu_slab->lock, flags);
+
+               slab->next = NULL;
+               __put_partials(s, slab);
        }
 #endif
Vlastimil Babka Jan. 23, 2024, 8:24 a.m. UTC | #8
On 1/23/24 03:51, Chengming Zhou wrote:
> On 2024/1/23 01:13, Vlastimil Babka wrote:
>> On 1/19/24 04:53, Chengming Zhou wrote:
>>> On 2024/1/19 06:14, Christoph Lameter (Ampere) wrote:
>>>> On Thu, 18 Jan 2024, Chengming Zhou wrote:
>>>>
>>>>> So get_freelist() has two cases to handle: cpu slab and cpu partial list slab.
>>>>> The latter is NOT frozen, so need to remove "VM_BUG_ON(!new.frozen)" from it.
>>>>
>>>> Right so keep the check if it is the former?
>>>>
>>>
>>> Ok, I get it. Maybe like this:
>> 
>> I think that's just too ugly for a VM_BUG_ON(). I'd just remove the check
>> and be done with that.
> 
> Ok with me.
> 
>> 
>> I have a somewhat different point. You reused get_freelist() but in fact
>> it's more like freeze_slab(), but that one uses slab_update_freelist() and
>> we are under the local_lock so we want the cheaper __slab_update_freelist(),
>> which get_freelist() has and I guess that's why you reused that one.
> 
> Right, we already have the local_lock, so reuse get_freelist().
> 
>> 
>> However get_freelist() also assumes it can return NULL if the freelist is
>> empty. If that's possible to happen on the percpu partial list, we should
>> not "goto load_freelist;" but rather create a new label above that, above
>> the "if (!freelist) {" block that handles the case.
>> 
>> If that's not possible to happen (needs careful audit) and we have guarantee
> 
> Yes, it's not possible for now.
> 
>> that slabs on percpu partial list must have non-empty freelist, then we
>> probably instead want a new __freeze_slab() variant that is like
>> freeze_slab(), but uses __slab_update_freelist() and probably also has
>> VM_BUG_ON(!freelist) before returning it?
>> 
> 
> Instead of introducing another new function, how about still reusing get_freelist()
> and VM_BUG_ON(!freelist) after calling it? I feel this is simpler.

Could you measure whether introducing a new function that sets new.frozen = 1
has any performance benefit? If not, we can reuse get_freelist() as you say.
Thanks!

> Thanks!
Chengming Zhou Jan. 23, 2024, 9:17 a.m. UTC | #9
On 2024/1/23 16:24, Vlastimil Babka wrote:
> On 1/23/24 03:51, Chengming Zhou wrote:
>> On 2024/1/23 01:13, Vlastimil Babka wrote:
>>> On 1/19/24 04:53, Chengming Zhou wrote:
>>>> On 2024/1/19 06:14, Christoph Lameter (Ampere) wrote:
>>>>> On Thu, 18 Jan 2024, Chengming Zhou wrote:
>>>>>
>>>>>> So get_freelist() has two cases to handle: cpu slab and cpu partial list slab.
>>>>>> The latter is NOT frozen, so need to remove "VM_BUG_ON(!new.frozen)" from it.
>>>>>
>>>>> Right so keep the check if it is the former?
>>>>>
>>>>
>>>> Ok, I get it. Maybe like this:
>>>
>>> I think that's just too ugly for a VM_BUG_ON(). I'd just remove the check
>>> and be done with that.
>>
>> Ok with me.
>>
>>>
>>> I have a somewhat different point. You reused get_freelist() but in fact
>>> it's more like freeze_slab(), but that one uses slab_update_freelist() and
>>> we are under the local_lock so we want the cheaper __slab_update_freelist(),
>>> which get_freelist() has and I guess that's why you reused that one.
>>
>> Right, we already have the local_lock, so reuse get_freelist().
>>
>>>
>>> However get_freelist() also assumes it can return NULL if the freelist is
>>> empty. If that's possible to happen on the percpu partial list, we should
>>> not "goto load_freelist;" but rather create a new label above that, above
>>> the "if (!freelist) {" block that handles the case.
>>>
>>> If that's not possible to happen (needs careful audit) and we have guarantee
>>
>> Yes, it's not possible for now.
>>
>>> that slabs on percpu partial list must have non-empty freelist, then we
>>> probably instead want a new __freeze_slab() variant that is like
>>> freeze_slab(), but uses __slab_update_freelist() and probably also has
>>> VM_BUG_ON(!freelist) before returning it?
>>>
>>
>> Instead of introducing another new function, how about still reusing get_freelist()
>> and VM_BUG_ON(!freelist) after calling it? I feel this is simpler.
> 
> Could you measure whether introducing a new function that sets new.frozen = 1
> has any performance benefit? If not, we can reuse get_freelist() as you say.
> Thanks!
> 

I just tested a new function, __freeze_slab(), that uses __slab_update_freelist()
and sets new.frozen = 1, but found the performance is a little worse than with
reusing get_freelist().

The reason, I think, may be the larger code footprint? I haven't looked deeply
into it.

Anyway, it looks better to reuse get_freelist(); I will send an updated version later.

Thanks!

Patch

diff --git a/mm/slub.c b/mm/slub.c
index 2ef88bbf56a3..20c03555c97b 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3326,7 +3326,6 @@  static inline void *get_freelist(struct kmem_cache *s, struct slab *slab)
 		counters = slab->counters;
 
 		new.counters = counters;
-		VM_BUG_ON(!new.frozen);
 
 		new.inuse = slab->objects;
 		new.frozen = freelist != NULL;
@@ -3498,18 +3497,19 @@  static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 
 		slab = slub_percpu_partial(c);
 		slub_set_percpu_partial(c, slab);
-		local_unlock_irqrestore(&s->cpu_slab->lock, flags);
-		stat(s, CPU_PARTIAL_ALLOC);
 
-		if (unlikely(!node_match(slab, node) ||
-			     !pfmemalloc_match(slab, gfpflags))) {
-			slab->next = NULL;
-			__put_partials(s, slab);
-			continue;
+		if (likely(node_match(slab, node) &&
+			   pfmemalloc_match(slab, gfpflags))) {
+			c->slab = slab;
+			freelist = get_freelist(s, slab);
+			stat(s, CPU_PARTIAL_ALLOC);
+			goto load_freelist;
 		}
 
-		freelist = freeze_slab(s, slab);
-		goto retry_load_slab;
+		local_unlock_irqrestore(&s->cpu_slab->lock, flags);
+
+		slab->next = NULL;
+		__put_partials(s, slab);
 	}
 #endif