| Message ID | 20240506155120.83105-6-libang.li@antgroup.com (mailing list archive) |
|---|---|
| State | New |
| Series | Add update_mmu_tlb_range() to simplify code |
On Mon, May 6, 2024 at 11:52 PM Bang Li <libang.li@antgroup.com> wrote:
>
> After the commit 19eaf44954df ("mm: thp: support allocation of anonymous
> multi-size THP"), it may need to batch update tlb of an address range
> through the update_mmu_tlb function. We can simplify this operation by
> adding the update_mmu_tlb_range function, which may also reduce the
> execution of some unnecessary code in some architectures.
>
> Signed-off-by: Bang Li <libang.li@antgroup.com>
> ---
>  include/linux/pgtable.h | 8 ++++++++
>  mm/memory.c             | 4 +---
>  2 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 18019f037bae..869bfe6054f1 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -737,6 +737,14 @@ static inline void update_mmu_tlb(struct vm_area_struct *vma,
>  #define __HAVE_ARCH_UPDATE_MMU_TLB
>  #endif
>
> +#ifndef __HAVE_ARCH_UPDATE_MMU_TLB_RANGE

IIRC, the contemporary practice is to define a macro with the same name
as the function if it is being overridden.

Thanks,
Lance

> +static inline void update_mmu_tlb_range(struct vm_area_struct *vma,
> +		unsigned long address, pte_t *ptep, unsigned int nr)
> +{
> +}
> +#define __HAVE_ARCH_UPDATE_MMU_TLB_RANGE
> +#endif
> +
>  /*
>   * Some architectures may be able to avoid expensive synchronization
>   * primitives when modifications are made to PTE's which are already
> diff --git a/mm/memory.c b/mm/memory.c
> index eea6e4984eae..2d53e29cf76e 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4421,7 +4421,6 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>  	vm_fault_t ret = 0;
>  	int nr_pages = 1;
>  	pte_t entry;
> -	int i;
>
>  	/* File mapping without ->vm_ops ? */
>  	if (vma->vm_flags & VM_SHARED)
> @@ -4491,8 +4490,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>  			update_mmu_tlb(vma, addr, vmf->pte);
>  			goto release;
>  		} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
> -			for (i = 0; i < nr_pages; i++)
> -				update_mmu_tlb(vma, addr + PAGE_SIZE * i, vmf->pte + i);
> +			update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages);
>  			goto release;
>  		}
>
> --
> 2.19.1.6.gb485710b
>
Hey Lance,

Thanks for taking the time to review!

On 2024/5/7 0:07, Lance Yang wrote:
> On Mon, May 6, 2024 at 11:52 PM Bang Li <libang.li@antgroup.com> wrote:
>>
>> After the commit 19eaf44954df ("mm: thp: support allocation of anonymous
>> multi-size THP"), it may need to batch update tlb of an address range
>> through the update_mmu_tlb function. We can simplify this operation by
>> adding the update_mmu_tlb_range function, which may also reduce the
>> execution of some unnecessary code in some architectures.
>>
>> Signed-off-by: Bang Li <libang.li@antgroup.com>
>> ---
>>  include/linux/pgtable.h | 8 ++++++++
>>  mm/memory.c             | 4 +---
>>  2 files changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>> index 18019f037bae..869bfe6054f1 100644
>> --- a/include/linux/pgtable.h
>> +++ b/include/linux/pgtable.h
>> @@ -737,6 +737,14 @@ static inline void update_mmu_tlb(struct vm_area_struct *vma,
>>  #define __HAVE_ARCH_UPDATE_MMU_TLB
>>  #endif
>>
>> +#ifndef __HAVE_ARCH_UPDATE_MMU_TLB_RANGE
>
> IIRC, the contemporary practice is to define a macro with the same name
> as the function if it is being overridden.

The macro __HAVE_ARCH_UPDATE_MMU_TLB_RANGE defined here is aligned with
the macro __HAVE_ARCH_UPDATE_MMU_TLB corresponding to the update_mmu_tlb
function. IMO, it would be better to use my method in this case.

Thanks,
Bang

>
> Thanks,
> Lance
>
>> +static inline void update_mmu_tlb_range(struct vm_area_struct *vma,
>> +		unsigned long address, pte_t *ptep, unsigned int nr)
>> +{
>> +}
>> +#define __HAVE_ARCH_UPDATE_MMU_TLB_RANGE
>> +#endif
>> +
>>  /*
>>   * Some architectures may be able to avoid expensive synchronization
>>   * primitives when modifications are made to PTE's which are already
>> diff --git a/mm/memory.c b/mm/memory.c
>> index eea6e4984eae..2d53e29cf76e 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4421,7 +4421,6 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>>  	vm_fault_t ret = 0;
>>  	int nr_pages = 1;
>>  	pte_t entry;
>> -	int i;
>>
>>  	/* File mapping without ->vm_ops ? */
>>  	if (vma->vm_flags & VM_SHARED)
>> @@ -4491,8 +4490,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>>  			update_mmu_tlb(vma, addr, vmf->pte);
>>  			goto release;
>>  		} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
>> -			for (i = 0; i < nr_pages; i++)
>> -				update_mmu_tlb(vma, addr + PAGE_SIZE * i, vmf->pte + i);
>> +			update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages);
>>  			goto release;
>>  		}
>>
>> --
>> 2.19.1.6.gb485710b
>>
On 06/05/2024 16:51, Bang Li wrote:
> After the commit 19eaf44954df ("mm: thp: support allocation of anonymous
> multi-size THP"), it may need to batch update tlb of an address range
> through the update_mmu_tlb function. We can simplify this operation by
> adding the update_mmu_tlb_range function, which may also reduce the
> execution of some unnecessary code in some architectures.
>
> Signed-off-by: Bang Li <libang.li@antgroup.com>
> ---
>  include/linux/pgtable.h | 8 ++++++++
>  mm/memory.c             | 4 +---
>  2 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 18019f037bae..869bfe6054f1 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -737,6 +737,14 @@ static inline void update_mmu_tlb(struct vm_area_struct *vma,
>  #define __HAVE_ARCH_UPDATE_MMU_TLB
>  #endif

Given you are implementing update_mmu_tlb_range() in all the arches that
currently override update_mmu_tlb() I wonder if it would be cleaner to
remove update_mmu_tlb() from all those arches, and define generically,
removing the ability for arches to override it:

static inline void update_mmu_tlb(struct vm_area_struct *vma,
		unsigned long address, pte_t *ptep)
{
	update_mmu_tlb_range(vma, address, ptep, 1);
}

>
> +#ifndef __HAVE_ARCH_UPDATE_MMU_TLB_RANGE
> +static inline void update_mmu_tlb_range(struct vm_area_struct *vma,
> +		unsigned long address, pte_t *ptep, unsigned int nr)
> +{
> +}
> +#define __HAVE_ARCH_UPDATE_MMU_TLB_RANGE
> +#endif

Then you could use the modern override scheme as Lance suggested and you
won't have any confusion with __HAVE_ARCH_UPDATE_MMU_TLB because it
won't exist anymore.

> +
>  /*
>   * Some architectures may be able to avoid expensive synchronization
>   * primitives when modifications are made to PTE's which are already
> diff --git a/mm/memory.c b/mm/memory.c
> index eea6e4984eae..2d53e29cf76e 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4421,7 +4421,6 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>  	vm_fault_t ret = 0;
>  	int nr_pages = 1;
>  	pte_t entry;
> -	int i;
>
>  	/* File mapping without ->vm_ops ? */
>  	if (vma->vm_flags & VM_SHARED)
> @@ -4491,8 +4490,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>  			update_mmu_tlb(vma, addr, vmf->pte);
>  			goto release;
>  		} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
> -			for (i = 0; i < nr_pages; i++)
> -				update_mmu_tlb(vma, addr + PAGE_SIZE * i, vmf->pte + i);
> +			update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages);

I certainly agree that this will be a useful helper to have. I expect
there will be more users in future.

>  			goto release;
>  		}
>
On Fri, May 10, 2024 at 5:05 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 06/05/2024 16:51, Bang Li wrote:
> > After the commit 19eaf44954df ("mm: thp: support allocation of anonymous
> > multi-size THP"), it may need to batch update tlb of an address range
> > through the update_mmu_tlb function. We can simplify this operation by
> > adding the update_mmu_tlb_range function, which may also reduce the
> > execution of some unnecessary code in some architectures.
> >
> > Signed-off-by: Bang Li <libang.li@antgroup.com>
> > ---
> >  include/linux/pgtable.h | 8 ++++++++
> >  mm/memory.c             | 4 +---
> >  2 files changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> > index 18019f037bae..869bfe6054f1 100644
> > --- a/include/linux/pgtable.h
> > +++ b/include/linux/pgtable.h
> > @@ -737,6 +737,14 @@ static inline void update_mmu_tlb(struct vm_area_struct *vma,
> >  #define __HAVE_ARCH_UPDATE_MMU_TLB
> >  #endif
>
> Given you are implementing update_mmu_tlb_range() in all the arches that
> currently override update_mmu_tlb() I wonder if it would be cleaner to remove
> update_mmu_tlb() from all those arches, and define generically, removing the
> ability for arches to override it:

Sounds great! Let's get it done.

>
> static inline void update_mmu_tlb(struct vm_area_struct *vma,
> 		unsigned long address, pte_t *ptep)
> {
> 	update_mmu_tlb_range(vma, address, ptep, 1);
> }
>
> >
> > +#ifndef __HAVE_ARCH_UPDATE_MMU_TLB_RANGE
> > +static inline void update_mmu_tlb_range(struct vm_area_struct *vma,
> > +		unsigned long address, pte_t *ptep, unsigned int nr)
> > +{
> > +}
> > +#define __HAVE_ARCH_UPDATE_MMU_TLB_RANGE
> > +#endif
>
> Then you could use the modern override scheme as Lance suggested and you won't
> have any confusion with __HAVE_ARCH_UPDATE_MMU_TLB because it won't exist anymore.

+1.

It might be better to use the modern override scheme :)

Thanks,
Lance

>
> > +
> >  /*
> >   * Some architectures may be able to avoid expensive synchronization
> >   * primitives when modifications are made to PTE's which are already
> > diff --git a/mm/memory.c b/mm/memory.c
> > index eea6e4984eae..2d53e29cf76e 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -4421,7 +4421,6 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
> >  	vm_fault_t ret = 0;
> >  	int nr_pages = 1;
> >  	pte_t entry;
> > -	int i;
> >
> >  	/* File mapping without ->vm_ops ? */
> >  	if (vma->vm_flags & VM_SHARED)
> > @@ -4491,8 +4490,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
> >  			update_mmu_tlb(vma, addr, vmf->pte);
> >  			goto release;
> >  		} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
> > -			for (i = 0; i < nr_pages; i++)
> > -				update_mmu_tlb(vma, addr + PAGE_SIZE * i, vmf->pte + i);
> > +			update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages);
>
> I certainly agree that this will be a useful helper to have. I expect there will
> be more users in future.
>
> >  			goto release;
> >  		}
> >
Hi Ryan,

Thanks for your review!

On 2024/5/10 17:05, Ryan Roberts wrote:
> On 06/05/2024 16:51, Bang Li wrote:
>> After the commit 19eaf44954df ("mm: thp: support allocation of anonymous
>> multi-size THP"), it may need to batch update tlb of an address range
>> through the update_mmu_tlb function. We can simplify this operation by
>> adding the update_mmu_tlb_range function, which may also reduce the
>> execution of some unnecessary code in some architectures.
>>
>> Signed-off-by: Bang Li <libang.li@antgroup.com>
>> ---
>>  include/linux/pgtable.h | 8 ++++++++
>>  mm/memory.c             | 4 +---
>>  2 files changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>> index 18019f037bae..869bfe6054f1 100644
>> --- a/include/linux/pgtable.h
>> +++ b/include/linux/pgtable.h
>> @@ -737,6 +737,14 @@ static inline void update_mmu_tlb(struct vm_area_struct *vma,
>>  #define __HAVE_ARCH_UPDATE_MMU_TLB
>>  #endif
>
> Given you are implementing update_mmu_tlb_range() in all the arches that
> currently override update_mmu_tlb() I wonder if it would be cleaner to remove
> update_mmu_tlb() from all those arches, and define generically, removing the
> ability for arches to override it:
>
> static inline void update_mmu_tlb(struct vm_area_struct *vma,
> 		unsigned long address, pte_t *ptep)
> {
> 	update_mmu_tlb_range(vma, address, ptep, 1);
> }

Agreed! Thank you for your suggestion, I will modify it in the next
version.

>
>>
>> +#ifndef __HAVE_ARCH_UPDATE_MMU_TLB_RANGE
>> +static inline void update_mmu_tlb_range(struct vm_area_struct *vma,
>> +		unsigned long address, pte_t *ptep, unsigned int nr)
>> +{
>> +}
>> +#define __HAVE_ARCH_UPDATE_MMU_TLB_RANGE
>> +#endif
>
> Then you could use the modern override scheme as Lance suggested and you won't
> have any confusion with __HAVE_ARCH_UPDATE_MMU_TLB because it won't exist anymore.

Yes, using update_mmu_tlb_range to implement update_mmu_tlb, we only
need to define the update_mmu_tlb_range macro.

>
>> +
>>  /*
>>   * Some architectures may be able to avoid expensive synchronization
>>   * primitives when modifications are made to PTE's which are already
>> diff --git a/mm/memory.c b/mm/memory.c
>> index eea6e4984eae..2d53e29cf76e 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4421,7 +4421,6 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>>  	vm_fault_t ret = 0;
>>  	int nr_pages = 1;
>>  	pte_t entry;
>> -	int i;
>>
>>  	/* File mapping without ->vm_ops ? */
>>  	if (vma->vm_flags & VM_SHARED)
>> @@ -4491,8 +4490,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>>  			update_mmu_tlb(vma, addr, vmf->pte);
>>  			goto release;
>>  		} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
>> -			for (i = 0; i < nr_pages; i++)
>> -				update_mmu_tlb(vma, addr + PAGE_SIZE * i, vmf->pte + i);
>> +			update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages);
>
> I certainly agree that this will be a useful helper to have. I expect there will
> be more users in future.

Thank you for your affirmation. Baolin's "add mTHP support for anonymous
shmem" series[1] can also use this function to simplify the code.

[1] https://lore.kernel.org/linux-mm/cover.1714978902.git.baolin.wang@linux.alibaba.com/

Thanks,
Bang

>
>>  			goto release;
>>  		}
>>
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 18019f037bae..869bfe6054f1 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -737,6 +737,14 @@ static inline void update_mmu_tlb(struct vm_area_struct *vma,
 #define __HAVE_ARCH_UPDATE_MMU_TLB
 #endif
 
+#ifndef __HAVE_ARCH_UPDATE_MMU_TLB_RANGE
+static inline void update_mmu_tlb_range(struct vm_area_struct *vma,
+		unsigned long address, pte_t *ptep, unsigned int nr)
+{
+}
+#define __HAVE_ARCH_UPDATE_MMU_TLB_RANGE
+#endif
+
 /*
  * Some architectures may be able to avoid expensive synchronization
  * primitives when modifications are made to PTE's which are already
diff --git a/mm/memory.c b/mm/memory.c
index eea6e4984eae..2d53e29cf76e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4421,7 +4421,6 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	vm_fault_t ret = 0;
 	int nr_pages = 1;
 	pte_t entry;
-	int i;
 
 	/* File mapping without ->vm_ops ? */
 	if (vma->vm_flags & VM_SHARED)
@@ -4491,8 +4490,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 			update_mmu_tlb(vma, addr, vmf->pte);
 			goto release;
 		} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
-			for (i = 0; i < nr_pages; i++)
-				update_mmu_tlb(vma, addr + PAGE_SIZE * i, vmf->pte + i);
+			update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages);
 			goto release;
 		}
 
After the commit 19eaf44954df ("mm: thp: support allocation of anonymous
multi-size THP"), it may need to batch update tlb of an address range
through the update_mmu_tlb function. We can simplify this operation by
adding the update_mmu_tlb_range function, which may also reduce the
execution of some unnecessary code in some architectures.

Signed-off-by: Bang Li <libang.li@antgroup.com>
---
 include/linux/pgtable.h | 8 ++++++++
 mm/memory.c             | 4 +---
 2 files changed, 9 insertions(+), 3 deletions(-)