Message ID | 20191009164934.10166-1-urezki@gmail.com (mailing list archive) |
---|---
State | New, archived |
Series | [1/1] mm/vmalloc: remove preempt_disable/enable when do preloading |
On Wed, 9 Oct 2019 18:49:34 +0200 "Uladzislau Rezki (Sony)" <urezki@gmail.com> wrote: > Get rid of preempt_disable() and preempt_enable() when the > preload is done for splitting purpose. The reason is that > calling spin_lock() with disabled preemtion is forbidden in > CONFIG_PREEMPT_RT kernel. > > Therefore, we do not guarantee that a CPU is preloaded, instead > we minimize the case when it is not with this change. > > For example i run the special test case that follows the preload > pattern and path. 20 "unbind" threads run it and each does > 1000000 allocations. Only 3.5 times among 1000000 a CPU was > not preloaded thus. So it can happen but the number is rather > negligible. Thanks for the analysis. > > Fixes: 82dd23e84be3 ("mm/vmalloc.c: preload a CPU with one object for split purpose") > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> > --- > mm/vmalloc.c | 17 ++++++++--------- > 1 file changed, 8 insertions(+), 9 deletions(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index e92ff5f7dd8b..2ed6fef86950 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -1078,9 +1078,12 @@ static struct vmap_area *alloc_vmap_area(unsigned long size, > > retry: > /* > - * Preload this CPU with one extra vmap_area object to ensure > - * that we have it available when fit type of free area is > - * NE_FIT_TYPE. > + * Preload this CPU with one extra vmap_area object. It is used > + * when fit type of free area is NE_FIT_TYPE. Please note, it > + * does not guarantee that an allocation occurs on a CPU that > + * is preloaded, instead we minimize the case when it is not. > + * It can happen because of migration, because there is a race > + * until the below spinlock is taken. > * > * The preload is done in non-atomic context, thus it allows us > * to use more permissive allocation masks to be more stable under > @@ -1089,20 +1092,16 @@ static struct vmap_area *alloc_vmap_area(unsigned long size, > * Even if it fails we do not really care about that. Just proceed > * as it is. "overflow" path will refill the cache we allocate from. > */ > - preempt_disable(); > - if (!__this_cpu_read(ne_fit_preload_node)) { > - preempt_enable(); As the original code enables preemption here regardless, there's no guarantee that the original patch would allocate the pva to the CPU in question. I agree with this patch, the preempt_disable() here only narrows an already narrow window, with no real help in what it was doing. > + if (!this_cpu_read(ne_fit_preload_node)) { > pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node); If the memory allocation failed here, we still may not have a pva for the current CPU's ne_fit_preload_node, rare as that may be. > - preempt_disable(); > > - if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) { > + if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) { Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> -- Steve > if (pva) > kmem_cache_free(vmap_area_cachep, pva); > } > } > > spin_lock(&vmap_area_lock); > - preempt_enable(); > > /* > * If an allocation fails, the "vend" address is
On Wed, 9 Oct 2019 18:49:34 +0200 "Uladzislau Rezki (Sony)" <urezki@gmail.com> wrote: > Get rid of preempt_disable() and preempt_enable() when the > preload is done for splitting purpose. The reason is that > calling spin_lock() with disabled preemtion is forbidden in > CONFIG_PREEMPT_RT kernel. > > Therefore, we do not guarantee that a CPU is preloaded, instead > we minimize the case when it is not with this change. > > For example i run the special test case that follows the preload > pattern and path. 20 "unbind" threads run it and each does > 1000000 allocations. Only 3.5 times among 1000000 a CPU was > not preloaded thus. So it can happen but the number is rather > negligible. > > ... > A few questions about the resulting alloc_vmap_area(): : static struct vmap_area *alloc_vmap_area(unsigned long size, : unsigned long align, : unsigned long vstart, unsigned long vend, : int node, gfp_t gfp_mask) : { : struct vmap_area *va, *pva; : unsigned long addr; : int purged = 0; : : BUG_ON(!size); : BUG_ON(offset_in_page(size)); : BUG_ON(!is_power_of_2(align)); : : if (unlikely(!vmap_initialized)) : return ERR_PTR(-EBUSY); : : might_sleep(); : : va = kmem_cache_alloc_node(vmap_area_cachep, : gfp_mask & GFP_RECLAIM_MASK, node); Why does this use GFP_RECLAIM_MASK? Please add a comment explaining this. : if (unlikely(!va)) : return ERR_PTR(-ENOMEM); : : /* : * Only scan the relevant parts containing pointers to other objects : * to avoid false negatives. : */ : kmemleak_scan_area(&va->rb_node, SIZE_MAX, gfp_mask & GFP_RECLAIM_MASK); : : retry: : /* : * Preload this CPU with one extra vmap_area object. It is used : * when fit type of free area is NE_FIT_TYPE. Please note, it : * does not guarantee that an allocation occurs on a CPU that : * is preloaded, instead we minimize the case when it is not. : * It can happen because of migration, because there is a race : * until the below spinlock is taken. : * : * The preload is done in non-atomic context, thus it allows us : * to use more permissive allocation masks to be more stable under : * low memory condition and high memory pressure. : * : * Even if it fails we do not really care about that. Just proceed : * as it is. "overflow" path will refill the cache we allocate from. : */ : if (!this_cpu_read(ne_fit_preload_node)) { Readability nit: local `pva' should be defined here, rather than having function-wide scope. : pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node); Why doesn't this honour gfp_mask? If it's not a bug, please add comment explaining this. The kmem_cache_alloc() in adjust_va_to_fit_type() omits the caller's gfp_mask also. If not a bug, please document the unexpected behaviour. : : if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) { : if (pva) : kmem_cache_free(vmap_area_cachep, pva); : } : } : : spin_lock(&vmap_area_lock); : : /* : * If an allocation fails, the "vend" address is : * returned. Therefore trigger the overflow path. : */ As for the intent of this patch, why not preallocate the vmap_area outside the spinlock and use it within the spinlock? Does spin_lock() disable preemption on RT? I forget, but it doesn't matter much anyway - doing this will make the code better in the regular kernel I think? Something like this: struct vmap_area *pva = NULL; ... if (!this_cpu_read(ne_fit_preload_node)) pva = kmem_cache_alloc_node(vmap_area_cachep, ...); spin_lock(&vmap_area_lock); if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) kmem_cache_free(vmap_area_cachep, pva);
On Wed, 9 Oct 2019 15:19:01 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:

> On Wed, 9 Oct 2019 18:49:34 +0200 "Uladzislau Rezki (Sony)" <urezki@gmail.com> wrote:
>
> > Get rid of preempt_disable() and preempt_enable() when the
> > preload is done for splitting purpose. The reason is that
> > calling spin_lock() with disabled preemtion is forbidden in
> > CONFIG_PREEMPT_RT kernel.
> >
> > Therefore, we do not guarantee that a CPU is preloaded, instead
> > we minimize the case when it is not with this change.
> >
> > For example i run the special test case that follows the preload
> > pattern and path. 20 "unbind" threads run it and each does
> > 1000000 allocations. Only 3.5 times among 1000000 a CPU was
> > not preloaded thus. So it can happen but the number is rather
> > negligible.
> >
> > ...
> >
>
> A few questions about the resulting alloc_vmap_area():
>
> : static struct vmap_area *alloc_vmap_area(unsigned long size,
> : 				unsigned long align,
> : 				unsigned long vstart, unsigned long vend,
> : 				int node, gfp_t gfp_mask)
> : {
> : 	struct vmap_area *va, *pva;
> : 	unsigned long addr;
> : 	int purged = 0;
> :
> : 	BUG_ON(!size);
> : 	BUG_ON(offset_in_page(size));
> : 	BUG_ON(!is_power_of_2(align));
> :
> : 	if (unlikely(!vmap_initialized))
> : 		return ERR_PTR(-EBUSY);
> :
> : 	might_sleep();
> :
> : 	va = kmem_cache_alloc_node(vmap_area_cachep,
> : 			gfp_mask & GFP_RECLAIM_MASK, node);
>
> Why does this use GFP_RECLAIM_MASK?  Please add a comment explaining
> this.
>
> : 	if (unlikely(!va))
> : 		return ERR_PTR(-ENOMEM);
> :
> : 	/*
> : 	 * Only scan the relevant parts containing pointers to other objects
> : 	 * to avoid false negatives.
> : 	 */
> : 	kmemleak_scan_area(&va->rb_node, SIZE_MAX, gfp_mask & GFP_RECLAIM_MASK);
> :
> : retry:
> : 	/*
> : 	 * Preload this CPU with one extra vmap_area object. It is used
> : 	 * when fit type of free area is NE_FIT_TYPE. Please note, it
> : 	 * does not guarantee that an allocation occurs on a CPU that
> : 	 * is preloaded, instead we minimize the case when it is not.
> : 	 * It can happen because of migration, because there is a race
> : 	 * until the below spinlock is taken.
> : 	 *
> : 	 * The preload is done in non-atomic context, thus it allows us
> : 	 * to use more permissive allocation masks to be more stable under
> : 	 * low memory condition and high memory pressure.
> : 	 *
> : 	 * Even if it fails we do not really care about that. Just proceed
> : 	 * as it is. "overflow" path will refill the cache we allocate from.
> : 	 */
> : 	if (!this_cpu_read(ne_fit_preload_node)) {
>
> Readability nit: local `pva' should be defined here, rather than having
> function-wide scope.
>
> : 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
>
> Why doesn't this honour gfp_mask?  If it's not a bug, please add
> comment explaining this.
>
> The kmem_cache_alloc() in adjust_va_to_fit_type() omits the caller's
> gfp_mask also.  If not a bug, please document the unexpected behaviour.
>

These questions appear to be for the code that this patch touches, not
for the patch itself.

> :
> : 		if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
> : 			if (pva)
> : 				kmem_cache_free(vmap_area_cachep, pva);
> : 		}
> : 	}
> :
> : 	spin_lock(&vmap_area_lock);
> :
> : 	/*
> : 	 * If an allocation fails, the "vend" address is
> : 	 * returned. Therefore trigger the overflow path.
> : 	 */
>
> As for the intent of this patch, why not preallocate the vmap_area
> outside the spinlock and use it within the spinlock?  Does spin_lock()
> disable preemption on RT?  I forget, but it doesn't matter much anyway

spin_lock() does not disable preemption on RT. But it does disable
migration (thus the task should remain on the current CPU).

> - doing this will make the code better in the regular kernel I think?
> Something like this:
>
> 	struct vmap_area *pva = NULL;
>
> 	...
>
> 	if (!this_cpu_read(ne_fit_preload_node))
> 		pva = kmem_cache_alloc_node(vmap_area_cachep, ...);
>
> 	spin_lock(&vmap_area_lock);
>
> 	if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
> 		kmem_cache_free(vmap_area_cachep, pva);
>

This looks fine to me.

-- Steve
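Putting Andrew's suggestion together with the retry path from the
patch, the hoisted preallocation could look roughly like the sketch
below. This is only a sketch of the idea under discussion, not the
change that was eventually merged; it reuses the existing
ne_fit_preload_node, vmap_area_cachep and vmap_area_lock names and
keeps GFP_KERNEL for the preload, as in the patch under review.

	struct vmap_area *pva;

retry:
	/*
	 * Allocate the spare object before taking vmap_area_lock, so the
	 * sleeping GFP_KERNEL allocation is still done in non-atomic
	 * context. If the task migrates before the cmpxchg below, some
	 * other CPU simply ends up preloaded instead, which is harmless.
	 */
	pva = NULL;
	if (!this_cpu_read(ne_fit_preload_node))
		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);

	spin_lock(&vmap_area_lock);

	/*
	 * Install the spare object for this CPU. If the slot got filled
	 * in the meantime (e.g. after a migration), free the now-unneeded
	 * spare instead.
	 */
	if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
		kmem_cache_free(vmap_area_cachep, pva);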
> >
> > A few questions about the resulting alloc_vmap_area():
> >
> > : static struct vmap_area *alloc_vmap_area(unsigned long size,
> > : 				unsigned long align,
> > : 				unsigned long vstart, unsigned long vend,
> > : 				int node, gfp_t gfp_mask)
> > : {
> > : 	struct vmap_area *va, *pva;
> > : 	unsigned long addr;
> > : 	int purged = 0;
> > :
> > : 	BUG_ON(!size);
> > : 	BUG_ON(offset_in_page(size));
> > : 	BUG_ON(!is_power_of_2(align));
> > :
> > : 	if (unlikely(!vmap_initialized))
> > : 		return ERR_PTR(-EBUSY);
> > :
> > : 	might_sleep();
> > :
> > : 	va = kmem_cache_alloc_node(vmap_area_cachep,
> > : 			gfp_mask & GFP_RECLAIM_MASK, node);
> >
> > Why does this use GFP_RECLAIM_MASK?  Please add a comment explaining
> > this.
>
I need to think about it. Initially it was like that.

> > : 	if (unlikely(!va))
> > : 		return ERR_PTR(-ENOMEM);
> > :
> > : 	/*
> > : 	 * Only scan the relevant parts containing pointers to other objects
> > : 	 * to avoid false negatives.
> > : 	 */
> > : 	kmemleak_scan_area(&va->rb_node, SIZE_MAX, gfp_mask & GFP_RECLAIM_MASK);
> > :
> > : retry:
> > : 	/*
> > : 	 * Preload this CPU with one extra vmap_area object. It is used
> > : 	 * when fit type of free area is NE_FIT_TYPE. Please note, it
> > : 	 * does not guarantee that an allocation occurs on a CPU that
> > : 	 * is preloaded, instead we minimize the case when it is not.
> > : 	 * It can happen because of migration, because there is a race
> > : 	 * until the below spinlock is taken.
> > : 	 *
> > : 	 * The preload is done in non-atomic context, thus it allows us
> > : 	 * to use more permissive allocation masks to be more stable under
> > : 	 * low memory condition and high memory pressure.
> > : 	 *
> > : 	 * Even if it fails we do not really care about that. Just proceed
> > : 	 * as it is. "overflow" path will refill the cache we allocate from.
> > : 	 */
> > : 	if (!this_cpu_read(ne_fit_preload_node)) {
> >
> > Readability nit: local `pva' should be defined here, rather than having
> > function-wide scope.
> >
> > : 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> >
> > Why doesn't this honour gfp_mask?  If it's not a bug, please add
> > comment explaining this.
>
But there is a comment, if understand you correctly:

<snip>
 * Even if it fails we do not really care about that. Just proceed
 * as it is. "overflow" path will refill the cache we allocate from.
<snip>

> > The kmem_cache_alloc() in adjust_va_to_fit_type() omits the caller's
> > gfp_mask also.  If not a bug, please document the unexpected behaviour.
>
I will add a comment there.

> These questions appear to be for the code that this patch touches, not
> for the patch itself.
>
> > :
> > : 		if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
> > : 			if (pva)
> > : 				kmem_cache_free(vmap_area_cachep, pva);
> > : 		}
> > : 	}
> > :
> > : 	spin_lock(&vmap_area_lock);
> > :
> > : 	/*
> > : 	 * If an allocation fails, the "vend" address is
> > : 	 * returned. Therefore trigger the overflow path.
> > : 	 */
> >
> > As for the intent of this patch, why not preallocate the vmap_area
> > outside the spinlock and use it within the spinlock?  Does spin_lock()
> > disable preemption on RT?  I forget, but it doesn't matter much anyway
>
> spin_lock() does not disable preemption on RT. But it does disable
> migration (thus the task should remain on the current CPU).
>
> > - doing this will make the code better in the regular kernel I think?
> > Something like this:
> >
> > 	struct vmap_area *pva = NULL;
> >
> > 	...
> >
> > 	if (!this_cpu_read(ne_fit_preload_node))
> > 		pva = kmem_cache_alloc_node(vmap_area_cachep, ...);
> >
> > 	spin_lock(&vmap_area_lock);
> >
> > 	if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
> > 		kmem_cache_free(vmap_area_cachep, pva);
> >
> This looks fine to me.
>
Yes, i agree that is better. I was thinking about doing so, but decided
to keep as it is, because of low number of "corner cases" anyway.

I will upload the v2. Thanks for the comments!

-- Vlad Rezki
On Thu, 10 Oct 2019 17:17:49 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:

> > > : 	 * The preload is done in non-atomic context, thus it allows us
> > > : 	 * to use more permissive allocation masks to be more stable under
> > > : 	 * low memory condition and high memory pressure.
> > > : 	 *
> > > : 	 * Even if it fails we do not really care about that. Just proceed
> > > : 	 * as it is. "overflow" path will refill the cache we allocate from.
> > > : 	 */
> > > : 	if (!this_cpu_read(ne_fit_preload_node)) {
> > >
> > > Readability nit: local `pva' should be defined here, rather than having
> > > function-wide scope.
> > >
> > > : 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> > >
> > > Why doesn't this honour gfp_mask?  If it's not a bug, please add
> > > comment explaining this.
> >
> But there is a comment, if understand you correctly:
>
> <snip>
>  * Even if it fails we do not really care about that. Just proceed
>  * as it is. "overflow" path will refill the cache we allocate from.
> <snip>

My point is that the alloc_vmap_area() caller passed us a gfp_t but
this code ignores it, as does adjust_va_to_fit_type().  These *look*
like potential bugs.  If not, they should be commented so they don't
look like bugs any more ;)
On Fri, Oct 11, 2019 at 04:55:15PM -0700, Andrew Morton wrote:
> On Thu, 10 Oct 2019 17:17:49 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
>
> > > > : 	 * The preload is done in non-atomic context, thus it allows us
> > > > : 	 * to use more permissive allocation masks to be more stable under
> > > > : 	 * low memory condition and high memory pressure.
> > > > : 	 *
> > > > : 	 * Even if it fails we do not really care about that. Just proceed
> > > > : 	 * as it is. "overflow" path will refill the cache we allocate from.
> > > > : 	 */
> > > > : 	if (!this_cpu_read(ne_fit_preload_node)) {
> > > >
> > > > Readability nit: local `pva' should be defined here, rather than having
> > > > function-wide scope.
> > > >
> > > > : 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> > > >
> > > > Why doesn't this honour gfp_mask?  If it's not a bug, please add
> > > > comment explaining this.
> > >
> > But there is a comment, if understand you correctly:
> >
> > <snip>
> >  * Even if it fails we do not really care about that. Just proceed
> >  * as it is. "overflow" path will refill the cache we allocate from.
> > <snip>
>
> My point is that the alloc_vmap_area() caller passed us a gfp_t but
> this code ignores it, as does adjust_va_to_fit_type().  These *look*
> like potential bugs.  If not, they should be commented so they don't
> look like bugs any more ;)
>
I got it, there was misunderstanding from my side :) I agree.

In the first case i should have used and respect the passed "gfp_mask",
like below:

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index f48cd0711478..880b6e8cdeae 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1113,7 +1113,8 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	 * Just proceed as it is. If needed "overflow" path
 	 * will refill the cache we allocate from.
 	 */
-	pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
+	pva = kmem_cache_alloc_node(vmap_area_cachep,
+			gfp_mask & GFP_RECLAIM_MASK, node);
 
 	spin_lock(&vmap_area_lock);

It should be sent as a separate patch, i think.

As for adjust_va_to_fit_type(), i can add a comment, since we can not
sleep there and the case is one per 1000000 or even lower with your
proposal. Does it sound good?

Thank you!

-- Vlad Rezki
On Mon 14-10-19 16:27:19, Uladzislau Rezki wrote:
> On Fri, Oct 11, 2019 at 04:55:15PM -0700, Andrew Morton wrote:
> > On Thu, 10 Oct 2019 17:17:49 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> >
> > > > > : 	 * The preload is done in non-atomic context, thus it allows us
> > > > > : 	 * to use more permissive allocation masks to be more stable under
> > > > > : 	 * low memory condition and high memory pressure.
> > > > > : 	 *
> > > > > : 	 * Even if it fails we do not really care about that. Just proceed
> > > > > : 	 * as it is. "overflow" path will refill the cache we allocate from.
> > > > > : 	 */
> > > > > : 	if (!this_cpu_read(ne_fit_preload_node)) {
> > > > >
> > > > > Readability nit: local `pva' should be defined here, rather than having
> > > > > function-wide scope.
> > > > >
> > > > > : 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> > > > >
> > > > > Why doesn't this honour gfp_mask?  If it's not a bug, please add
> > > > > comment explaining this.
> > > >
> > > But there is a comment, if understand you correctly:
> > >
> > > <snip>
> > >  * Even if it fails we do not really care about that. Just proceed
> > >  * as it is. "overflow" path will refill the cache we allocate from.
> > > <snip>
> >
> > My point is that the alloc_vmap_area() caller passed us a gfp_t but
> > this code ignores it, as does adjust_va_to_fit_type().  These *look*
> > like potential bugs.  If not, they should be commented so they don't
> > look like bugs any more ;)
> >
> I got it, there was misunderstanding from my side :) I agree.
>
> In the first case i should have used and respect the passed "gfp_mask",
> like below:
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index f48cd0711478..880b6e8cdeae 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1113,7 +1113,8 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  	 * Just proceed as it is. If needed "overflow" path
>  	 * will refill the cache we allocate from.
>  	 */
> -	pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> +	pva = kmem_cache_alloc_node(vmap_area_cachep,
> +			gfp_mask & GFP_RECLAIM_MASK, node);
>
>  	spin_lock(&vmap_area_lock);
>
> It should be sent as a separate patch, i think.

Yes. I do not think this would make any real difference because that
battle is lost long ago. vmalloc is simply not gfp mask friendly. There
are places like page table allocation which are hardcoded GFP_KERNEL so
GFP_NOWAIT semantic is not going to work, really. The above makes sense
from a pure aesthetic POV, though, I would say.
> > > > > > : 	 * The preload is done in non-atomic context, thus it allows us
> > > > > > : 	 * to use more permissive allocation masks to be more stable under
> > > > > > : 	 * low memory condition and high memory pressure.
> > > > > > : 	 *
> > > > > > : 	 * Even if it fails we do not really care about that. Just proceed
> > > > > > : 	 * as it is. "overflow" path will refill the cache we allocate from.
> > > > > > : 	 */
> > > > > > : 	if (!this_cpu_read(ne_fit_preload_node)) {
> > > > > >
> > > > > > Readability nit: local `pva' should be defined here, rather than having
> > > > > > function-wide scope.
> > > > > >
> > > > > > : 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> > > > > >
> > > > > > Why doesn't this honour gfp_mask?  If it's not a bug, please add
> > > > > > comment explaining this.
> > > > >
> > > > But there is a comment, if understand you correctly:
> > > >
> > > > <snip>
> > > >  * Even if it fails we do not really care about that. Just proceed
> > > >  * as it is. "overflow" path will refill the cache we allocate from.
> > > > <snip>
> > >
> > > My point is that the alloc_vmap_area() caller passed us a gfp_t but
> > > this code ignores it, as does adjust_va_to_fit_type().  These *look*
> > > like potential bugs.  If not, they should be commented so they don't
> > > look like bugs any more ;)
> > >
> > I got it, there was misunderstanding from my side :) I agree.
> >
> > In the first case i should have used and respect the passed "gfp_mask",
> > like below:
> >
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index f48cd0711478..880b6e8cdeae 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -1113,7 +1113,8 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
> >  	 * Just proceed as it is. If needed "overflow" path
> >  	 * will refill the cache we allocate from.
> >  	 */
> > -	pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> > +	pva = kmem_cache_alloc_node(vmap_area_cachep,
> > +			gfp_mask & GFP_RECLAIM_MASK, node);
> >
> >  	spin_lock(&vmap_area_lock);
> >
> > It should be sent as a separate patch, i think.
>
> Yes. I do not think this would make any real difference because that
> battle is lost long ago. vmalloc is simply not gfp mask friendly. There
> are places like page table allocation which are hardcoded GFP_KERNEL so
> GFP_NOWAIT semantic is not going to work, really. The above makes sense
> from a pure aesthetic POV, though, I would say.

I agree. Then i will create a patch.

Thank you!

-- Vlad Rezki
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e92ff5f7dd8b..2ed6fef86950 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1078,9 +1078,12 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 
 retry:
 	/*
-	 * Preload this CPU with one extra vmap_area object to ensure
-	 * that we have it available when fit type of free area is
-	 * NE_FIT_TYPE.
+	 * Preload this CPU with one extra vmap_area object. It is used
+	 * when fit type of free area is NE_FIT_TYPE. Please note, it
+	 * does not guarantee that an allocation occurs on a CPU that
+	 * is preloaded, instead we minimize the case when it is not.
+	 * It can happen because of migration, because there is a race
+	 * until the below spinlock is taken.
 	 *
 	 * The preload is done in non-atomic context, thus it allows us
 	 * to use more permissive allocation masks to be more stable under
@@ -1089,20 +1092,16 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	 * Even if it fails we do not really care about that. Just proceed
 	 * as it is. "overflow" path will refill the cache we allocate from.
 	 */
-	preempt_disable();
-	if (!__this_cpu_read(ne_fit_preload_node)) {
-		preempt_enable();
+	if (!this_cpu_read(ne_fit_preload_node)) {
 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
-		preempt_disable();
 
-		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
+		if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
 			if (pva)
 				kmem_cache_free(vmap_area_cachep, pva);
 		}
 	}
 
 	spin_lock(&vmap_area_lock);
-	preempt_enable();
 
 	/*
 	 * If an allocation fails, the "vend" address is
Get rid of preempt_disable() and preempt_enable() when the
preload is done for splitting purposes. The reason is that
calling spin_lock() with preemption disabled is forbidden in a
CONFIG_PREEMPT_RT kernel.

Therefore, with this change we no longer guarantee that a CPU is
preloaded; instead we minimize the cases when it is not.

For example, I ran a special test case that follows the preload
pattern and path. 20 "unbind" threads run it and each does
1000000 allocations. Only about 3.5 times per 1000000 allocations
a CPU was not preloaded. So it can happen, but the number is
negligible.

Fixes: 82dd23e84be3 ("mm/vmalloc.c: preload a CPU with one object for split purpose")
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/vmalloc.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)
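The changelog's numbers come from a synthetic test that is not included
in this thread. A rough, hypothetical sketch of that kind of stress
module is shown below; the names, allocation sizes and thread handling
are illustrative only and this is not the test that produced the quoted
figures.

/*
 * Hypothetical vmalloc stress module: NR_THREADS unbound kernel
 * threads each perform NR_ALLOCS vmalloc()/vfree() pairs, so that
 * alloc_vmap_area() keeps exercising the per-CPU preload path.
 */
#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/vmalloc.h>
#include <linux/sched.h>
#include <linux/delay.h>
#include <linux/err.h>

#define NR_THREADS	20
#define NR_ALLOCS	1000000

static struct task_struct *workers[NR_THREADS];

static int stress_fn(void *unused)
{
	unsigned long i;
	void *p;

	for (i = 0; i < NR_ALLOCS && !kthread_should_stop(); i++) {
		p = vmalloc(3 * PAGE_SIZE);
		if (p)
			vfree(p);
		cond_resched();
	}

	/* Park until the module is removed, so kthread_stop() is safe. */
	while (!kthread_should_stop())
		msleep(100);

	return 0;
}

static int __init stress_init(void)
{
	int i;

	for (i = 0; i < NR_THREADS; i++)
		workers[i] = kthread_run(stress_fn, NULL, "vmalloc-stress/%d", i);

	return 0;
}

static void __exit stress_exit(void)
{
	int i;

	for (i = 0; i < NR_THREADS; i++)
		if (!IS_ERR_OR_NULL(workers[i]))
			kthread_stop(workers[i]);
}

module_init(stress_init);
module_exit(stress_exit);
MODULE_LICENSE("GPL");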