Message ID | 20230714161733.4144503-2-ryan.roberts@arm.com (mailing list archive) |
---|---|
State | New |
Series | variable-order, large folios for anonymous memory |

On Fri, Jul 14, 2023 at 10:17 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> arch_wants_pte_order() can be overridden by the arch to return the
> preferred folio order for pte-mapped memory. This is useful as some
> architectures (e.g. arm64) can coalesce TLB entries when the physical
> memory is suitably contiguous.
>
> The first user for this hint will be FLEXIBLE_THP, which aims to
> allocate large folios for anonymous memory to reduce page faults and
> other per-page operation costs.
>
> Here we add the default implementation of the function, used when the
> architecture does not define it, which returns -1, implying that the HW
> has no preference. In this case, mm will choose its own default order.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>

Reviewed-by: Yu Zhao <yuzhao@google.com>

Thanks: -1 actually is better than 0 (what I suggested) for the obvious
reason.

On 7/15/23 00:17, Ryan Roberts wrote:
> arch_wants_pte_order() can be overridden by the arch to return the
> preferred folio order for pte-mapped memory. This is useful as some
> architectures (e.g. arm64) can coalesce TLB entries when the physical
> memory is suitably contiguous.
>
> The first user for this hint will be FLEXIBLE_THP, which aims to
> allocate large folios for anonymous memory to reduce page faults and
> other per-page operation costs.
>
> Here we add the default implementation of the function, used when the
> architecture does not define it, which returns -1, implying that the HW
> has no preference. In this case, mm will choose its own default order.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>

Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>

Regards
Yin, Fengwei

> ---
>  include/linux/pgtable.h | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 5063b482e34f..2a1d83775837 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -313,6 +313,19 @@ static inline bool arch_has_hw_pte_young(void)
>  }
>  #endif
>
> +#ifndef arch_wants_pte_order
> +/*
> + * Returns preferred folio order for pte-mapped memory. Must be in range [0,
> + * PMD_SHIFT-PAGE_SHIFT) and must not be order-1 since THP requires large folios
> + * to be at least order-2. Negative value implies that the HW has no preference
> + * and mm will choose its own default order.
> + */
> +static inline int arch_wants_pte_order(void)
> +{
> +	return -1;
> +}
> +#endif
> +
>  #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
>  static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>  				       unsigned long address,

On 14.07.23 18:17, Ryan Roberts wrote:
> arch_wants_pte_order() can be overridden by the arch to return the
> preferred folio order for pte-mapped memory. This is useful as some
> architectures (e.g. arm64) can coalesce TLB entries when the physical
> memory is suitably contiguous.
>
> The first user for this hint will be FLEXIBLE_THP, which aims to
> allocate large folios for anonymous memory to reduce page faults and
> other per-page operation costs.
>
> Here we add the default implementation of the function, used when the
> architecture does not define it, which returns -1, implying that the HW
> has no preference. In this case, mm will choose its own default order.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
>  include/linux/pgtable.h | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 5063b482e34f..2a1d83775837 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -313,6 +313,19 @@ static inline bool arch_has_hw_pte_young(void)
>  }
>  #endif
>
> +#ifndef arch_wants_pte_order
> +/*
> + * Returns preferred folio order for pte-mapped memory. Must be in range [0,
> + * PMD_SHIFT-PAGE_SHIFT) and must not be order-1 since THP requires large folios
> + * to be at least order-2. Negative value implies that the HW has no preference
> + * and mm will choose its own default order.
> + */
> +static inline int arch_wants_pte_order(void)
> +{
> +	return -1;
> +}
> +#endif
> +
>  #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
>  static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>  				       unsigned long address,

What is the reason to have this in a separate patch? That should simply be
squashed into the actual user -- patch #3.

On 17/07/2023 14:01, David Hildenbrand wrote:
> On 14.07.23 18:17, Ryan Roberts wrote:
>> arch_wants_pte_order() can be overridden by the arch to return the
>> preferred folio order for pte-mapped memory. This is useful as some
>> architectures (e.g. arm64) can coalesce TLB entries when the physical
>> memory is suitably contiguous.
>>
>> The first user for this hint will be FLEXIBLE_THP, which aims to
>> allocate large folios for anonymous memory to reduce page faults and
>> other per-page operation costs.
>>
>> Here we add the default implementation of the function, used when the
>> architecture does not define it, which returns -1, implying that the HW
>> has no preference. In this case, mm will choose its own default order.
>>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>>  include/linux/pgtable.h | 13 +++++++++++++
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>> index 5063b482e34f..2a1d83775837 100644
>> --- a/include/linux/pgtable.h
>> +++ b/include/linux/pgtable.h
>> @@ -313,6 +313,19 @@ static inline bool arch_has_hw_pte_young(void)
>>  }
>>  #endif
>> +#ifndef arch_wants_pte_order
>> +/*
>> + * Returns preferred folio order for pte-mapped memory. Must be in range [0,
>> + * PMD_SHIFT-PAGE_SHIFT) and must not be order-1 since THP requires large folios
>> + * to be at least order-2. Negative value implies that the HW has no preference
>> + * and mm will choose its own default order.
>> + */
>> +static inline int arch_wants_pte_order(void)
>> +{
>> +	return -1;
>> +}
>> +#endif
>> +
>>  #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
>>  static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>>  				       unsigned long address,
>
> What is the reason to have this into a separate patch? That should simply be
> squashed into the actual user -- patch #3.

There was a lot more in this at v1 IIRC, so it made more sense as a
standalone patch. I agree it can be squashed into the next patch now.

Will do for next version.

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 5063b482e34f..2a1d83775837 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -313,6 +313,19 @@ static inline bool arch_has_hw_pte_young(void)
 }
 #endif
 
+#ifndef arch_wants_pte_order
+/*
+ * Returns preferred folio order for pte-mapped memory. Must be in range [0,
+ * PMD_SHIFT-PAGE_SHIFT) and must not be order-1 since THP requires large folios
+ * to be at least order-2. Negative value implies that the HW has no preference
+ * and mm will choose its own default order.
+ */
+static inline int arch_wants_pte_order(void)
+{
+	return -1;
+}
+#endif
+
 #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address,

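As a reading aid, the contract stated in the comment of the hunk above can be sketched in user-space C. This is not kernel code: PAGE_SHIFT and PMD_SHIFT are hard-coded to values assumed for a 4K-page, 2M-PMD configuration, and pte_order_hint_valid() and resolve_pte_order() are hypothetical helpers that do not appear in the series. The mm default is passed in as a parameter because the patch does not say what it is.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values (4K pages, 2M PMD); real kernels take these from the arch. */
#define PAGE_SHIFT 12
#define PMD_SHIFT  21

/*
 * Hypothetical helper: does @order satisfy the contract in the comment?
 *   - negative: allowed, means "no HW preference"
 *   - 1:        not allowed (large folios must be at least order-2)
 *   - else:     must lie in [0, PMD_SHIFT - PAGE_SHIFT)
 */
static bool pte_order_hint_valid(int order)
{
	if (order < 0)
		return true;
	if (order == 1)
		return false;
	return order < PMD_SHIFT - PAGE_SHIFT;
}

/*
 * Hypothetical helper: pick the order to use. A negative hint falls back
 * to @mm_default, which the caller supplies.
 */
static int resolve_pte_order(int hint, int mm_default)
{
	assert(pte_order_hint_valid(hint));
	return hint < 0 ? mm_default : hint;
}
```

With these assumed shifts, orders 0 and 2 through 8 are accepted, orders 1 and 9 are rejected, and a negative hint resolves to whatever default the caller supplies.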
arch_wants_pte_order() can be overridden by the arch to return the
preferred folio order for pte-mapped memory. This is useful as some
architectures (e.g. arm64) can coalesce TLB entries when the physical
memory is suitably contiguous.

The first user for this hint will be FLEXIBLE_THP, which aims to
allocate large folios for anonymous memory to reduce page faults and
other per-page operation costs.

Here we add the default implementation of the function, used when the
architecture does not define it, which returns -1, implying that the HW
has no preference. In this case, mm will choose its own default order.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 include/linux/pgtable.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)
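
The override mechanism described in the commit message relies on the usual kernel idiom of defining a macro with the same name as the function, so that the generic #ifndef fallback is skipped. A minimal user-space sketch follows. HYPOTHETICAL_ARCH_OVERRIDE, the returned order of 4, and hw_has_order_preference() are illustrative assumptions, not part of the patch or of any architecture's actual code.

```c
#include <assert.h>

/* --- What a hypothetical arch header might provide (assumption) --- */
#ifdef HYPOTHETICAL_ARCH_OVERRIDE
static inline int arch_wants_pte_order(void)
{
	return 4;	/* assumed preference: 16 contiguous base pages */
}
/* Same-name macro tells the generic header that an override exists. */
#define arch_wants_pte_order arch_wants_pte_order
#endif

/* --- Generic fallback, mirroring the default added by the patch --- */
#ifndef arch_wants_pte_order
static inline int arch_wants_pte_order(void)
{
	return -1;	/* HW has no preference */
}
#endif

/* Hypothetical caller: returns 1 if the HW stated a preference, else 0. */
static inline int hw_has_order_preference(void)
{
	int order = arch_wants_pte_order();

	/* Order-1 is excluded by the comment in the patch. */
	assert(order != 1);
	return order >= 0;
}
```

Built without -DHYPOTHETICAL_ARCH_OVERRIDE, the fallback is compiled in and the hint is negative. Built with it, the fallback is skipped and the caller sees the arch-supplied order instead.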