[RFC,25/26] mm, slub: use migrate_disable() in put_cpu_partial()

Message ID 20210524233946.20352-26-vbabka@suse.cz (mailing list archive)
State New, archived
Series SLUB: use local_lock for kmem_cache_cpu protection and reduce disabling irqs

Commit Message

Vlastimil Babka May 24, 2021, 11:39 p.m. UTC
In put_cpu_partial, we need a stable cpu, but being preempted is not an issue.
So, disable migration instead of preemption.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
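
For context, a simplified sketch of the pattern this patch changes (abbreviated, not the actual diff below): the percpu partial list is updated via a this_cpu_cmpxchg() retry loop, so a preemption between the read and the cmpxchg only causes a retry; what the loop really relies on is the task staying on one CPU.

	/*
	 * Abbreviated sketch, assuming the cmpxchg retry loop of
	 * put_cpu_partial(): only a stable CPU is needed, preemption in
	 * between is handled by retrying.
	 */
	migrate_disable();		/* pin the task to the current CPU */
	do {
		oldpage = this_cpu_read(s->cpu_slab->partial);
		/* ... link 'page' in front of oldpage, update counters ... */
	} while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page) != oldpage);
	migrate_enable();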

Comments

Jann Horn May 25, 2021, 3:33 p.m. UTC | #1
On Tue, May 25, 2021 at 1:40 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> In put_cpu_partial, we need a stable cpu, but being preempted is not an issue.
> So, disable migration instead of preemption.

I wouldn't say "not an issue", more like "you're not making it worse".

From what I can tell, the following race can already theoretically happen:

task A: put_cpu_partial() calls preempt_disable()
task A: oldpage = this_cpu_read(s->cpu_slab->partial)
interrupt: kfree() reaches unfreeze_partials() and discards the page
task B (on another CPU): reallocates page as page cache
task A: reads page->pages and page->pobjects, which are actually
halves of the pointer page->lru.prev
task B (on another CPU): frees page
interrupt: allocates page as SLUB page and places it on the percpu partial list
task A: this_cpu_cmpxchg() succeeds

which would cause page->pages and page->pobjects to end up containing
halves of pointers that would then influence when put_cpu_partial()
happens and show up in root-only sysfs files. Maybe that's acceptable,
I don't know. But there should probably at least be a comment for now
to point out that we're reading union fields of a page that might be
in a completely different state.

(Someone should probably fix that code sometime and get rid of
page->pobjects entirely, given how inaccurate it is...)
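
For reference, an abbreviated excerpt of the read Jann describes, based on the pre-series put_cpu_partial() in mm/slub.c (details trimmed):

	preempt_disable();
	do {
		pages = 0;
		pobjects = 0;
		oldpage = this_cpu_read(s->cpu_slab->partial);

		if (oldpage) {
			/*
			 * If oldpage was freed and reused after the
			 * this_cpu_read() above, these union fields may
			 * overlap unrelated data (e.g. page->lru.prev).
			 */
			pobjects = oldpage->pobjects;
			pages = oldpage->pages;
			/* ... possibly drain via unfreeze_partials() ... */
		}

		pages++;
		pobjects += page->objects - page->inuse;

		page->pages = pages;
		page->pobjects = pobjects;
		page->next = oldpage;

	} while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page)
							!= oldpage);
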
Vlastimil Babka June 9, 2021, 8:41 a.m. UTC | #2
On 5/25/21 5:33 PM, Jann Horn wrote:
> On Tue, May 25, 2021 at 1:40 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>> In put_cpu_partial, we need a stable cpu, but being preempted is not an issue.
>> So, disable migration instead of preemption.
> 
> I wouldn't say "not an issue", more like "you're not making it worse".
> 
> From what I can tell, the following race can already theoretically happen:
> 
> task A: put_cpu_partial() calls preempt_disable()
> task A: oldpage = this_cpu_read(s->cpu_slab->partial)
> interrupt: kfree() reaches unfreeze_partials() and discards the page
> task B (on another CPU): reallocates page as page cache
> task A: reads page->pages and page->pobjects, which are actually
> halves of the pointer page->lru.prev
> task B (on another CPU): frees page
> interrupt: allocates page as SLUB page and places it on the percpu partial list
> task A: this_cpu_cmpxchg() succeeds

Oops, nice find. Thanks.

> which would cause page->pages and page->pobjects to end up containing
> halves of pointers that would then influence when put_cpu_partial()
> happens and show up in root-only sysfs files. Maybe that's acceptable,
> I don't know. But there should probably at least be a comment for now
> to point out that we're reading union fields of a page that might be
> in a completely different state.
> 
> (Someone should probably fix that code sometime and get rid of
> page->pobjects entirely, given how inaccurate it is...)

I'll try to address it separately later. Probably by targeting a number of pages
on the list instead of a number of objects, and storing the count as part of
struct kmem_cache_cpu, not struct page. The inaccuracy leading to potentially
long lists is reason enough; the race scenario above is another one...
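
A rough sketch of the direction described above (hypothetical field name, not part of this series): track the length of the percpu partial list in struct kmem_cache_cpu rather than in the struct page union fields.

	struct kmem_cache_cpu {
		/* ... existing fields (freelist, tid, page, ...) ... */
		struct page *partial;		/* partially allocated frozen slabs */
		unsigned int partial_pages;	/* hypothetical: pages on the list */
	};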
Patch

diff --git a/mm/slub.c b/mm/slub.c
index bfa5e7c4da1b..8818c210cb97 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2417,7 +2417,7 @@  static void put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
 	int pages;
 	int pobjects;
 
-	preempt_disable();
+	migrate_disable();
 	do {
 		pages = 0;
 		pobjects = 0;
@@ -2451,7 +2451,7 @@  static void put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
 	if (unlikely(!slub_cpu_partial(s)))
 		unfreeze_partials(s);
 
-	preempt_enable();
+	migrate_enable();
 #endif	/* CONFIG_SLUB_CPU_PARTIAL */
 }