[RESEND] mm/page_alloc: skip setting nodemask when we are in interrupt

Message ID 20200703061350.94474-1-songmuchun@bytedance.com (mailing list archive)
State New, archived

Commit Message

Muchun Song July 3, 2020, 6:13 a.m. UTC
When we are in interrupt context, the allocation is unrelated to the
current task context. If we use the current task's mems_allowed, the
fast-path allocation can fail and fall back to the slow path whenever
the nodes in the current task's mems_allowed do not have enough free
memory. That needlessly slows down memory allocation in interrupt
context. So skip setting the nodemask there, allowing allocation from
any node, so that the fast-path allocation can succeed.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 mm/page_alloc.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
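
For context: the "fast path" above is the zonelist walk in
get_page_from_freelist(). A condensed, paraphrased sketch of the cpuset
check there (not a verbatim quote of mm/page_alloc.c) shows why a NULL
nodemask with ALLOC_CPUSET clear lets the fast path consider any node:

        /* inside get_page_from_freelist()'s zonelist walk, roughly: */
        for_next_zone_zonelist_nodemask(zone, z, ac->zonelist,
                                        ac->highest_zoneidx, ac->nodemask) {
                /* with ALLOC_CPUSET clear, no cpuset filtering happens */
                if (cpusets_enabled() &&
                        (alloc_flags & ALLOC_CPUSET) &&
                        !__cpuset_zone_allowed(zone, gfp_mask))
                                continue;
                /* ... watermark checks and rmqueue() follow ... */
        }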

Comments

Pekka Enberg July 3, 2020, 6:34 a.m. UTC | #1
On Fri, Jul 3, 2020 at 9:14 AM Muchun Song <songmuchun@bytedance.com> wrote:
>
> When we are in interrupt context, the allocation is unrelated to the
> current task context. If we use the current task's mems_allowed, the
> fast-path allocation can fail and fall back to the slow path whenever
> the nodes in the current task's mems_allowed do not have enough free
> memory. That needlessly slows down memory allocation in interrupt
> context. So skip setting the nodemask there, allowing allocation from
> any node, so that the fast-path allocation can succeed.
>
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  mm/page_alloc.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index b48336e20bdcd..a6c36cd557d1d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4726,10 +4726,12 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
>
>         if (cpusets_enabled()) {
>                 *alloc_mask |= __GFP_HARDWALL;
> -               if (!ac->nodemask)
> -                       ac->nodemask = &cpuset_current_mems_allowed;
> -               else
> +               if (!ac->nodemask) {
> +                       if (!in_interrupt())
> +                               ac->nodemask = &cpuset_current_mems_allowed;

If !ac->nodemask and in_interrupt(), the ALLOC_CPUSET flag is not set,
which bypasses the __cpuset_zone_allowed() check for allocations. This
works fine because, in the in_interrupt() case, that function allows
allocation on any zone/node anyway.
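
For reference, a condensed sketch of why that bypass is harmless:
__cpuset_zone_allowed() resolves to __cpuset_node_allowed(), which
returns true immediately in interrupt context (paraphrased from
include/linux/cpuset.h and kernel/cgroup/cpuset.c of this era, with
most of the body elided):

        static inline bool __cpuset_zone_allowed(struct zone *z, gfp_t gfp_mask)
        {
                return __cpuset_node_allowed(zone_to_nid(z), gfp_mask);
        }

        bool __cpuset_node_allowed(int node, gfp_t gfp_mask)
        {
                if (in_interrupt())
                        return true;    /* any node is fine in IRQ context */
                if (node_isset(node, current->mems_allowed))
                        return true;
                /* ... OOM-victim, __GFP_HARDWALL, and ancestor checks ... */
        }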

> +               } else {
>                         *alloc_flags |= ALLOC_CPUSET;
> +               }
>         }

However, if you write the condition as follows:

        if (cpusets_enabled()) {
                *alloc_mask |= __GFP_HARDWALL;
                if (!in_interrupt() && !ac->nodemask)
                        ac->nodemask = &cpuset_current_mems_allowed;
                else
                        *alloc_flags |= ALLOC_CPUSET;
        }

then the code is future-proof in case __cpuset_zone_allowed() is one
day extended to support IRQ context too (it probably should eventually
respect IRQ SMP affinity).

>
>         fs_reclaim_acquire(gfp_mask);
> --
> 2.11.0
>
>
David Hildenbrand July 3, 2020, 7:20 a.m. UTC | #2
On 03.07.20 08:34, Pekka Enberg wrote:
> On Fri, Jul 3, 2020 at 9:14 AM Muchun Song <songmuchun@bytedance.com> wrote:
>>
>> When we are in interrupt context, the allocation is unrelated to the
>> current task context. If we use the current task's mems_allowed, the
>> fast-path allocation can fail and fall back to the slow path whenever
>> the nodes in the current task's mems_allowed do not have enough free
>> memory. That needlessly slows down memory allocation in interrupt
>> context. So skip setting the nodemask there, allowing allocation from
>> any node, so that the fast-path allocation can succeed.
>>
>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>> ---
>>  mm/page_alloc.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index b48336e20bdcd..a6c36cd557d1d 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -4726,10 +4726,12 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
>>
>>         if (cpusets_enabled()) {
>>                 *alloc_mask |= __GFP_HARDWALL;
>> -               if (!ac->nodemask)
>> -                       ac->nodemask = &cpuset_current_mems_allowed;
>> -               else
>> +               if (!ac->nodemask) {
>> +                       if (!in_interrupt())
>> +                               ac->nodemask = &cpuset_current_mems_allowed;
> 
> If !ac->nodemask and in_interrupt(), the ALLOC_CPUSET flag is not set,
> which bypasses the __cpuset_zone_allowed() check for allocations. This
> works fine because, in the in_interrupt() case, that function allows
> allocation on any zone/node anyway.
> 
>> +               } else {
>>                         *alloc_flags |= ALLOC_CPUSET;
>> +               }
>>         }
> 
> However, if you write the condition as follows:
> 
>         if (cpusets_enabled()) {
>                 *alloc_mask |= __GFP_HARDWALL;
>                 if (!in_interrupt() && !ac->nodemask)
>                         ac->nodemask = &cpuset_current_mems_allowed;
>                 else
>                         *alloc_flags |= ALLOC_CPUSET;
>         }

^ looks much cleaner as well. Do we want to add a summarizing comment?

> 
> then the code is future-proof in case __cpuset_zone_allowed() is one
> day extended to support IRQ context too (it probably should eventually
> respect IRQ SMP affinity).
Pekka Enberg July 3, 2020, 7:47 a.m. UTC | #3
On 03.07.20 08:34, Pekka Enberg wrote:
> >         if (cpusets_enabled()) {
> >                 *alloc_mask |= __GFP_HARDWALL;
> >                 if (!in_interrupt() && !ac->nodemask)
> >                         ac->nodemask = &cpuset_current_mems_allowed;
> >                 else
> >                         *alloc_flags |= ALLOC_CPUSET;
> >         }

On Fri, Jul 3, 2020 at 10:20 AM David Hildenbrand <david@redhat.com> wrote:
> ^ looks much cleaner as well. Do we want to add a summarizing comment?

I see no harm in adding one. I'm sure the next person starting a
journey in the maze some call the page allocator will appreciate it.

- Pekka
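
For illustration, one possible shape for the suggested block with a
summarizing comment added (the comment wording here is hypothetical,
not taken from the thread):

        if (cpusets_enabled()) {
                *alloc_mask |= __GFP_HARDWALL;
                /*
                 * In interrupt context the allocation is unrelated to
                 * the current task, so do not pin it to the task's
                 * cpuset_current_mems_allowed; __cpuset_zone_allowed()
                 * already returns true in IRQ context, so any node is
                 * acceptable.
                 */
                if (!in_interrupt() && !ac->nodemask)
                        ac->nodemask = &cpuset_current_mems_allowed;
                else
                        *alloc_flags |= ALLOC_CPUSET;
        }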

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b48336e20bdcd..a6c36cd557d1d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4726,10 +4726,12 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
 
 	if (cpusets_enabled()) {
 		*alloc_mask |= __GFP_HARDWALL;
-		if (!ac->nodemask)
-			ac->nodemask = &cpuset_current_mems_allowed;
-		else
+		if (!ac->nodemask) {
+			if (!in_interrupt())
+				ac->nodemask = &cpuset_current_mems_allowed;
+		} else {
 			*alloc_flags |= ALLOC_CPUSET;
+		}
 	}
 
 	fs_reclaim_acquire(gfp_mask);