Message ID | 9c3f7b743477560d1c5b12b8c111a584a2cc92ee.1708097962.git.donettom@linux.ibm.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [1/3] mm/mempolicy: Use the already fetched local variable | expand |
On Sat, 17 Feb 2024 01:31:33 -0600 Donet Tom <donettom@linux.ibm.com> wrote: > Avoid doing a per cpu read and use the local variable thisnid. IMHO > this also makes the code more readable. > > ... > > --- a/mm/mempolicy.c > +++ b/mm/mempolicy.c > @@ -2526,7 +2526,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, > if (node_isset(curnid, pol->nodes)) > goto out; > z = first_zones_zonelist( > - node_zonelist(numa_node_id(), GFP_HIGHUSER), > + node_zonelist(thisnid, GFP_HIGHUSER), > gfp_zone(GFP_HIGHUSER), > &pol->nodes); > polnid = zone_to_nid(z->zone); int thisnid = cpu_to_node(thiscpu); Is there any dofference between numa_node_id() and cpu_to_node(raw_smp_processor_id())? And it it explicable that we're using one here and not the other?
On 2/19/24 03:08, Andrew Morton wrote: > On Sat, 17 Feb 2024 01:31:33 -0600 Donet Tom <donettom@linux.ibm.com> wrote: > >> Avoid doing a per cpu read and use the local variable thisnid. IMHO >> this also makes the code more readable. >> >> ... >> >> --- a/mm/mempolicy.c >> +++ b/mm/mempolicy.c >> @@ -2526,7 +2526,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, >> if (node_isset(curnid, pol->nodes)) >> goto out; >> z = first_zones_zonelist( >> - node_zonelist(numa_node_id(), GFP_HIGHUSER), >> + node_zonelist(thisnid, GFP_HIGHUSER), >> gfp_zone(GFP_HIGHUSER), >> &pol->nodes); >> polnid = zone_to_nid(z->zone); > int thisnid = cpu_to_node(thiscpu); > > Is there any dofference between numa_node_id() and > cpu_to_node(raw_smp_processor_id())? And it it explicable that we're > using one here and not the other? Hi Andrew Both numa_node_id() and cpu_to_node(raw_smp_processor_id()) return the current execution node id, Since the current execution node is already fetched at the beginning (thisnid) we can reuse it instead of getting it again. Thanks Donet Tom >
On Sat 17-02-24 01:31:34, Donet Tom wrote: > We will update MPOL_PREFERRED_MANY in the follow up patch. This change > is required for that. Why is it a separate patch then? Does it make review of the next patch easier? If so make it explicit in the changelog. > > Signed-off-by: Aneesh Kumar K.V (IBM) <aneesh.kumar@kernel.org> > Signed-off-by: Donet Tom <donettom@linux.ibm.com> > --- > mm/mempolicy.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/mm/mempolicy.c b/mm/mempolicy.c > index 8478574c000c..73d698e21dae 100644 > --- a/mm/mempolicy.c > +++ b/mm/mempolicy.c > @@ -2515,7 +2515,15 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, > break; > goto out; > } > - fallthrough; > + > + if (node_isset(curnid, pol->nodes)) > + goto out; > + z = first_zones_zonelist( > + node_zonelist(thisnid, GFP_HIGHUSER), > + gfp_zone(GFP_HIGHUSER), > + &pol->nodes); > + polnid = zone_to_nid(z->zone); > + break; > > case MPOL_PREFERRED_MANY: > /* > -- > 2.39.3
On 2/19/24 17:32, Michal Hocko wrote: > On Sat 17-02-24 01:31:34, Donet Tom wrote: >> We will update MPOL_PREFERRED_MANY in the follow up patch. This change >> is required for that. > Why is it a separate patch then? Does it make review of the next patch > easier? If so make it explicit in the changelog. Hi Michal In this patch there is no functional changes. This is just re-arrangement of code. Patch 3 is the actual fix .It will not look nice if we mix these patches. As you said it is easy for reviewing also. That's why we kept it as a separate patch. Thanks Donet Tom > >> Signed-off-by: Aneesh Kumar K.V (IBM) <aneesh.kumar@kernel.org> >> Signed-off-by: Donet Tom <donettom@linux.ibm.com> >> --- >> mm/mempolicy.c | 10 +++++++++- >> 1 file changed, 9 insertions(+), 1 deletion(-) >> >> diff --git a/mm/mempolicy.c b/mm/mempolicy.c >> index 8478574c000c..73d698e21dae 100644 >> --- a/mm/mempolicy.c >> +++ b/mm/mempolicy.c >> @@ -2515,7 +2515,15 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, >> break; >> goto out; >> } >> - fallthrough; >> + >> + if (node_isset(curnid, pol->nodes)) >> + goto out; >> + z = first_zones_zonelist( >> + node_zonelist(thisnid, GFP_HIGHUSER), >> + gfp_zone(GFP_HIGHUSER), >> + &pol->nodes); >> + polnid = zone_to_nid(z->zone); >> + break; >> >> case MPOL_PREFERRED_MANY: >> /* >> -- >> 2.39.3
On Mon, 19 Feb 2024 14:04:23 +0530 Donet Tom <donettom@linux.ibm.com> wrote: > >> --- a/mm/mempolicy.c > >> +++ b/mm/mempolicy.c > >> @@ -2526,7 +2526,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, > >> if (node_isset(curnid, pol->nodes)) > >> goto out; > >> z = first_zones_zonelist( > >> - node_zonelist(numa_node_id(), GFP_HIGHUSER), > >> + node_zonelist(thisnid, GFP_HIGHUSER), > >> gfp_zone(GFP_HIGHUSER), > >> &pol->nodes); > >> polnid = zone_to_nid(z->zone); > > int thisnid = cpu_to_node(thiscpu); > > > > Is there any dofference between numa_node_id() and > > cpu_to_node(raw_smp_processor_id())? And it it explicable that we're > > using one here and not the other? > > Hi Andrew > > Both numa_node_id() and cpu_to_node(raw_smp_processor_id()) return the current execution node id, > Since the current execution node is already fetched at the beginning (thisnid) we can reuse it instead of getting it again. Sure, but mine was a broader thought: why do we have both? Is one preferable and if so why?
On 2/20/24 6:51 AM, Andrew Morton wrote: > On Mon, 19 Feb 2024 14:04:23 +0530 Donet Tom <donettom@linux.ibm.com> wrote: > >>>> --- a/mm/mempolicy.c >>>> +++ b/mm/mempolicy.c >>>> @@ -2526,7 +2526,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, >>>> if (node_isset(curnid, pol->nodes)) >>>> goto out; >>>> z = first_zones_zonelist( >>>> - node_zonelist(numa_node_id(), GFP_HIGHUSER), >>>> + node_zonelist(thisnid, GFP_HIGHUSER), >>>> gfp_zone(GFP_HIGHUSER), >>>> &pol->nodes); >>>> polnid = zone_to_nid(z->zone); >>> int thisnid = cpu_to_node(thiscpu); >>> >>> Is there any dofference between numa_node_id() and >>> cpu_to_node(raw_smp_processor_id())? And it it explicable that we're >>> using one here and not the other? >> >> Hi Andrew >> >> Both numa_node_id() and cpu_to_node(raw_smp_processor_id()) return the current execution node id, >> Since the current execution node is already fetched at the beginning (thisnid) we can reuse it instead of getting it again. > > Sure, but mine was a broader thought: why do we have both? Is one > preferable and if so why? IIUC these are two helpers to fetch current numa node id. and either of them can be used based on need. The default implementation shows the details. (One small difference is numa_node_id() can use optimized per cpu reader because it is fetching the per cpu variable of the currently running cpu.) #ifndef numa_node_id /* Returns the number of the current Node. */ static inline int numa_node_id(void) { return raw_cpu_read(numa_node); } #endif #ifndef cpu_to_node static inline int cpu_to_node(int cpu) { return per_cpu(numa_node, cpu); } #endif In mpol_misplaced function, we need the cpu details because we are using that in other place (should_numa_migreate_memory()). So it makes it easy to use cpu_to_node(thiscpu) instead of numa_node_id(). -aneesh
"Aneesh Kumar K.V" <aneesh.kumar@kernel.org> writes: > On 2/20/24 6:51 AM, Andrew Morton wrote: >> On Mon, 19 Feb 2024 14:04:23 +0530 Donet Tom <donettom@linux.ibm.com> wrote: >> >>>>> --- a/mm/mempolicy.c >>>>> +++ b/mm/mempolicy.c >>>>> @@ -2526,7 +2526,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, >>>>> if (node_isset(curnid, pol->nodes)) >>>>> goto out; >>>>> z = first_zones_zonelist( >>>>> - node_zonelist(numa_node_id(), GFP_HIGHUSER), >>>>> + node_zonelist(thisnid, GFP_HIGHUSER), >>>>> gfp_zone(GFP_HIGHUSER), >>>>> &pol->nodes); >>>>> polnid = zone_to_nid(z->zone); >>>> int thisnid = cpu_to_node(thiscpu); >>>> >>>> Is there any dofference between numa_node_id() and >>>> cpu_to_node(raw_smp_processor_id())? And it it explicable that we're >>>> using one here and not the other? >>> >>> Hi Andrew >>> >>> Both numa_node_id() and cpu_to_node(raw_smp_processor_id()) return the current execution node id, >>> Since the current execution node is already fetched at the beginning (thisnid) we can reuse it instead of getting it again. >> >> Sure, but mine was a broader thought: why do we have both? Is one >> preferable and if so why? > > IIUC these are two helpers to fetch current numa node id. and either of them can be used based on need. The default implementation shows the details. > (One small difference is numa_node_id() can use optimized per cpu reader because it is fetching the per cpu variable of the currently running cpu.) > > #ifndef numa_node_id > /* Returns the number of the current Node. */ > static inline int numa_node_id(void) > { > return raw_cpu_read(numa_node); > } > #endif > > #ifndef cpu_to_node > static inline int cpu_to_node(int cpu) > { > return per_cpu(numa_node, cpu); > } > #endif > > In mpol_misplaced function, we need the cpu details because we are using that in other place (should_numa_migreate_memory()). So it makes it easy > to use cpu_to_node(thiscpu) instead of numa_node_id(). IIUC, numa_node_id() is faster than cpu_to_node(thiscpu), even if we have thiscpu already. cpu_to_node() is mainly used to get the node of NOT current CPU. So, IMHO, we should only use numa_node_id() in this function. -- Best Regards, Huang, Ying
On 2/20/24 11:55 AM, Huang, Ying wrote: > "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> writes: > >> On 2/20/24 6:51 AM, Andrew Morton wrote: >>> On Mon, 19 Feb 2024 14:04:23 +0530 Donet Tom <donettom@linux.ibm.com> wrote: >>> >>>>>> --- a/mm/mempolicy.c >>>>>> +++ b/mm/mempolicy.c >>>>>> @@ -2526,7 +2526,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, >>>>>> if (node_isset(curnid, pol->nodes)) >>>>>> goto out; >>>>>> z = first_zones_zonelist( >>>>>> - node_zonelist(numa_node_id(), GFP_HIGHUSER), >>>>>> + node_zonelist(thisnid, GFP_HIGHUSER), >>>>>> gfp_zone(GFP_HIGHUSER), >>>>>> &pol->nodes); >>>>>> polnid = zone_to_nid(z->zone); >>>>> int thisnid = cpu_to_node(thiscpu); >>>>> >>>>> Is there any dofference between numa_node_id() and >>>>> cpu_to_node(raw_smp_processor_id())? And it it explicable that we're >>>>> using one here and not the other? >>>> >>>> Hi Andrew >>>> >>>> Both numa_node_id() and cpu_to_node(raw_smp_processor_id()) return the current execution node id, >>>> Since the current execution node is already fetched at the beginning (thisnid) we can reuse it instead of getting it again. >>> >>> Sure, but mine was a broader thought: why do we have both? Is one >>> preferable and if so why? >> >> IIUC these are two helpers to fetch current numa node id. and either of them can be used based on need. The default implementation shows the details. >> (One small difference is numa_node_id() can use optimized per cpu reader because it is fetching the per cpu variable of the currently running cpu.) >> >> #ifndef numa_node_id >> /* Returns the number of the current Node. */ >> static inline int numa_node_id(void) >> { >> return raw_cpu_read(numa_node); >> } >> #endif >> >> #ifndef cpu_to_node >> static inline int cpu_to_node(int cpu) >> { >> return per_cpu(numa_node, cpu); >> } >> #endif >> >> In mpol_misplaced function, we need the cpu details because we are using that in other place (should_numa_migreate_memory()). So it makes it easy >> to use cpu_to_node(thiscpu) instead of numa_node_id(). > > IIUC, numa_node_id() is faster than cpu_to_node(thiscpu), even if we > have thiscpu already. cpu_to_node() is mainly used to get the node of > NOT current CPU. So, IMHO, we should only use numa_node_id() in this > function. > This change? modified mm/mempolicy.c @@ -2502,8 +2502,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, pgoff_t ilx; struct zoneref *z; int curnid = folio_nid(folio); - int thiscpu = raw_smp_processor_id(); - int thisnid = cpu_to_node(thiscpu); + int thisnid = numa_node_id(); int polnid = NUMA_NO_NODE; int ret = NUMA_NO_NODE; @@ -2573,7 +2572,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, polnid = thisnid; if (!should_numa_migrate_memory(current, folio, curnid, - thiscpu)) + raw_smp_processor_id())) goto out; } -aneesh
On 2/20/24 12:02 PM, Aneesh Kumar K.V wrote: > On 2/20/24 11:55 AM, Huang, Ying wrote: >> "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> writes: >> >>> On 2/20/24 6:51 AM, Andrew Morton wrote: >>>> On Mon, 19 Feb 2024 14:04:23 +0530 Donet Tom <donettom@linux.ibm.com> wrote: >>>> >>>>>>> --- a/mm/mempolicy.c >>>>>>> +++ b/mm/mempolicy.c >>>>>>> @@ -2526,7 +2526,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, >>>>>>> if (node_isset(curnid, pol->nodes)) >>>>>>> goto out; >>>>>>> z = first_zones_zonelist( >>>>>>> - node_zonelist(numa_node_id(), GFP_HIGHUSER), >>>>>>> + node_zonelist(thisnid, GFP_HIGHUSER), >>>>>>> gfp_zone(GFP_HIGHUSER), >>>>>>> &pol->nodes); >>>>>>> polnid = zone_to_nid(z->zone); >>>>>> int thisnid = cpu_to_node(thiscpu); >>>>>> >>>>>> Is there any dofference between numa_node_id() and >>>>>> cpu_to_node(raw_smp_processor_id())? And it it explicable that we're >>>>>> using one here and not the other? >>>>> >>>>> Hi Andrew >>>>> >>>>> Both numa_node_id() and cpu_to_node(raw_smp_processor_id()) return the current execution node id, >>>>> Since the current execution node is already fetched at the beginning (thisnid) we can reuse it instead of getting it again. >>>> >>>> Sure, but mine was a broader thought: why do we have both? Is one >>>> preferable and if so why? >>> >>> IIUC these are two helpers to fetch current numa node id. and either of them can be used based on need. The default implementation shows the details. >>> (One small difference is numa_node_id() can use optimized per cpu reader because it is fetching the per cpu variable of the currently running cpu.) >>> >>> #ifndef numa_node_id >>> /* Returns the number of the current Node. */ >>> static inline int numa_node_id(void) >>> { >>> return raw_cpu_read(numa_node); >>> } >>> #endif >>> >>> #ifndef cpu_to_node >>> static inline int cpu_to_node(int cpu) >>> { >>> return per_cpu(numa_node, cpu); >>> } >>> #endif >>> >>> In mpol_misplaced function, we need the cpu details because we are using that in other place (should_numa_migreate_memory()). So it makes it easy >>> to use cpu_to_node(thiscpu) instead of numa_node_id(). >> >> IIUC, numa_node_id() is faster than cpu_to_node(thiscpu), even if we >> have thiscpu already. cpu_to_node() is mainly used to get the node of >> NOT current CPU. So, IMHO, we should only use numa_node_id() in this >> function. >> > > This change? > > modified mm/mempolicy.c > @@ -2502,8 +2502,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, > pgoff_t ilx; > struct zoneref *z; > int curnid = folio_nid(folio); > - int thiscpu = raw_smp_processor_id(); > - int thisnid = cpu_to_node(thiscpu); > + int thisnid = numa_node_id(); > int polnid = NUMA_NO_NODE; > int ret = NUMA_NO_NODE; > > @@ -2573,7 +2572,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, > polnid = thisnid; > > if (!should_numa_migrate_memory(current, folio, curnid, > - thiscpu)) > + raw_smp_processor_id())) > goto out; > } > > One of the problem with the above change will be the need to make sure smp processor id remaining stable, which I am not sure we want in this function. With that we can end up with processor id not related to the numa node id we are using. -aneesh
"Aneesh Kumar K.V" <aneesh.kumar@kernel.org> writes: > On 2/20/24 12:02 PM, Aneesh Kumar K.V wrote: >> On 2/20/24 11:55 AM, Huang, Ying wrote: >>> "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> writes: >>> >>>> On 2/20/24 6:51 AM, Andrew Morton wrote: >>>>> On Mon, 19 Feb 2024 14:04:23 +0530 Donet Tom <donettom@linux.ibm.com> wrote: >>>>> >>>>>>>> --- a/mm/mempolicy.c >>>>>>>> +++ b/mm/mempolicy.c >>>>>>>> @@ -2526,7 +2526,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, >>>>>>>> if (node_isset(curnid, pol->nodes)) >>>>>>>> goto out; >>>>>>>> z = first_zones_zonelist( >>>>>>>> - node_zonelist(numa_node_id(), GFP_HIGHUSER), >>>>>>>> + node_zonelist(thisnid, GFP_HIGHUSER), >>>>>>>> gfp_zone(GFP_HIGHUSER), >>>>>>>> &pol->nodes); >>>>>>>> polnid = zone_to_nid(z->zone); >>>>>>> int thisnid = cpu_to_node(thiscpu); >>>>>>> >>>>>>> Is there any dofference between numa_node_id() and >>>>>>> cpu_to_node(raw_smp_processor_id())? And it it explicable that we're >>>>>>> using one here and not the other? >>>>>> >>>>>> Hi Andrew >>>>>> >>>>>> Both numa_node_id() and cpu_to_node(raw_smp_processor_id()) return the current execution node id, >>>>>> Since the current execution node is already fetched at the beginning (thisnid) we can reuse it instead of getting it again. >>>>> >>>>> Sure, but mine was a broader thought: why do we have both? Is one >>>>> preferable and if so why? >>>> >>>> IIUC these are two helpers to fetch current numa node id. and either of them can be used based on need. The default implementation shows the details. >>>> (One small difference is numa_node_id() can use optimized per cpu reader because it is fetching the per cpu variable of the currently running cpu.) >>>> >>>> #ifndef numa_node_id >>>> /* Returns the number of the current Node. */ >>>> static inline int numa_node_id(void) >>>> { >>>> return raw_cpu_read(numa_node); >>>> } >>>> #endif >>>> >>>> #ifndef cpu_to_node >>>> static inline int cpu_to_node(int cpu) >>>> { >>>> return per_cpu(numa_node, cpu); >>>> } >>>> #endif >>>> >>>> In mpol_misplaced function, we need the cpu details because we are using that in other place (should_numa_migreate_memory()). So it makes it easy >>>> to use cpu_to_node(thiscpu) instead of numa_node_id(). >>> >>> IIUC, numa_node_id() is faster than cpu_to_node(thiscpu), even if we >>> have thiscpu already. cpu_to_node() is mainly used to get the node of >>> NOT current CPU. So, IMHO, we should only use numa_node_id() in this >>> function. >>> >> >> This change? >> >> modified mm/mempolicy.c >> @@ -2502,8 +2502,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, >> pgoff_t ilx; >> struct zoneref *z; >> int curnid = folio_nid(folio); >> - int thiscpu = raw_smp_processor_id(); >> - int thisnid = cpu_to_node(thiscpu); >> + int thisnid = numa_node_id(); >> int polnid = NUMA_NO_NODE; >> int ret = NUMA_NO_NODE; >> >> @@ -2573,7 +2572,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, >> polnid = thisnid; >> >> if (!should_numa_migrate_memory(current, folio, curnid, >> - thiscpu)) >> + raw_smp_processor_id())) >> goto out; >> } >> >> > > One of the problem with the above change will be the need to make sure smp processor id remaining stable, which > I am not sure we want in this function. With that we can end up with processor id not related to the numa node id > we are using. This isn't an issue now, because mpol_misplaced() are always called with PTL held. And, we can still keep thiscpu local variable. -- Best Regards, Huang, Ying
On Tue 20-02-24 15:22:07, Huang, Ying wrote: [...] > This isn't an issue now, because mpol_misplaced() are always called with > PTL held. And, we can still keep thiscpu local variable. yes, this is the case but it would be better if we made that assumption official by lockdep_assert_held
Michal Hocko <mhocko@suse.com> writes: > On Tue 20-02-24 15:22:07, Huang, Ying wrote: > [...] >> This isn't an issue now, because mpol_misplaced() are always called with >> PTL held. And, we can still keep thiscpu local variable. > > yes, this is the case but it would be better if we made that assumption > official by lockdep_assert_held > How about this folded into this patch? 2 files changed, 12 insertions(+), 4 deletions(-) mm/memory.c | 6 ++++-- mm/mempolicy.c | 10 ++++++++-- modified mm/memory.c @@ -4879,9 +4879,11 @@ static vm_fault_t do_fault(struct vm_fault *vmf) return ret; } -int numa_migrate_prep(struct folio *folio, struct vm_area_struct *vma, +int numa_migrate_prep(struct folio *folio, struct vm_fault *vmf, unsigned long addr, int page_nid, int *flags) { + struct vm_area_struct *vma = vmf->vma; + folio_get(folio); /* Record the current PID acceesing VMA */ @@ -4893,7 +4895,7 @@ int numa_migrate_prep(struct folio *folio, struct vm_area_struct *vma, *flags |= TNF_FAULT_LOCAL; } - return mpol_misplaced(folio, vma, addr); + return mpol_misplaced(folio, vmf, addr); } static vm_fault_t do_numa_page(struct vm_fault *vmf) modified mm/mempolicy.c @@ -2495,18 +2495,24 @@ static inline bool mpol_preferred_should_numa_migrate(int exec_node, int folio_n * Return: NUMA_NO_NODE if the page is in a node that is valid for this * policy, or a suitable node ID to allocate a replacement folio from. */ -int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, +int mpol_misplaced(struct folio *folio, struct vm_fault *vmf, unsigned long addr) { struct mempolicy *pol; pgoff_t ilx; struct zoneref *z; int curnid = folio_nid(folio); + struct vm_area_struct *vma = vmf->vma; int thiscpu = raw_smp_processor_id(); - int thisnid = cpu_to_node(thiscpu); + int thisnid = numa_node_id(); int polnid = NUMA_NO_NODE; int ret = NUMA_NO_NODE; + /* + * Make sure ptl is held so that we don't preempt and we + * have a stable smp processor id + */ + lockdep_assert_held(vmf->ptl); pol = get_vma_policy(vma, addr, folio_order(folio), &ilx); if (!(pol->flags & MPOL_F_MOF)) goto out; [back]
Aneesh Kumar K.V <aneesh.kumar@kernel.org> writes: > Michal Hocko <mhocko@suse.com> writes: > >> On Tue 20-02-24 15:22:07, Huang, Ying wrote: >> [...] >>> This isn't an issue now, because mpol_misplaced() are always called with >>> PTL held. And, we can still keep thiscpu local variable. >> >> yes, this is the case but it would be better if we made that assumption >> official by lockdep_assert_held >> > > How about this folded into this patch? > > 2 files changed, 12 insertions(+), 4 deletions(-) > mm/memory.c | 6 ++++-- > mm/mempolicy.c | 10 ++++++++-- > > modified mm/memory.c > @@ -4879,9 +4879,11 @@ static vm_fault_t do_fault(struct vm_fault *vmf) > return ret; > } > > -int numa_migrate_prep(struct folio *folio, struct vm_area_struct *vma, > +int numa_migrate_prep(struct folio *folio, struct vm_fault *vmf, > unsigned long addr, int page_nid, int *flags) > { > + struct vm_area_struct *vma = vmf->vma; > + > folio_get(folio); > > /* Record the current PID acceesing VMA */ > @@ -4893,7 +4895,7 @@ int numa_migrate_prep(struct folio *folio, struct vm_area_struct *vma, > *flags |= TNF_FAULT_LOCAL; > } > > - return mpol_misplaced(folio, vma, addr); > + return mpol_misplaced(folio, vmf, addr); > } > > static vm_fault_t do_numa_page(struct vm_fault *vmf) > modified mm/mempolicy.c > @@ -2495,18 +2495,24 @@ static inline bool mpol_preferred_should_numa_migrate(int exec_node, int folio_n > * Return: NUMA_NO_NODE if the page is in a node that is valid for this > * policy, or a suitable node ID to allocate a replacement folio from. > */ > -int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, > +int mpol_misplaced(struct folio *folio, struct vm_fault *vmf, > unsigned long addr) > { > struct mempolicy *pol; > pgoff_t ilx; > struct zoneref *z; > int curnid = folio_nid(folio); > + struct vm_area_struct *vma = vmf->vma; > int thiscpu = raw_smp_processor_id(); > - int thisnid = cpu_to_node(thiscpu); > + int thisnid = numa_node_id(); > int polnid = NUMA_NO_NODE; > int ret = NUMA_NO_NODE; > > + /* > + * Make sure ptl is held so that we don't preempt and we > + * have a stable smp processor id > + */ > + lockdep_assert_held(vmf->ptl); > pol = get_vma_policy(vma, addr, folio_order(folio), &ilx); > if (!(pol->flags & MPOL_F_MOF)) > goto out; > > [back] > LGTM, Thanks! -- Best Regards, Huang, Ying
diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 10a590ee1c89..8478574c000c 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -2526,7 +2526,7 @@ int mpol_misplaced(struct folio *folio, struct vm_area_struct *vma, if (node_isset(curnid, pol->nodes)) goto out; z = first_zones_zonelist( - node_zonelist(numa_node_id(), GFP_HIGHUSER), + node_zonelist(thisnid, GFP_HIGHUSER), gfp_zone(GFP_HIGHUSER), &pol->nodes); polnid = zone_to_nid(z->zone);