Message ID | 20230414130303.2345383-4-ryan.roberts@arm.com (mailing list archive) |
---|---|
State | New, archived |
Series | variable-order, large folios for anonymous memory |
On 4/14/2023 9:02 PM, Ryan Roberts wrote:
> Opportunistically attempt to allocate high-order folios in highmem,
> optionally zeroed. Retry with lower orders all the way to order-0, until
> success. Although, of note, order-1 allocations are skipped since a
> large folio must be at least order-2 to work with the THP machinery. The
> user must check what they got with folio_order().
>
> This will be used to opportunistically allocate large folios for
> anonymous memory with a sensible fallback under memory pressure.
>
> For attempts to allocate non-0 orders, we set __GFP_NORETRY to prevent
> high latency due to reclaim, instead preferring to just try for a lower
> order. The same approach is used by the readahead code when allocating
> large folios.
I am not sure whether anonymous pages can share the same approach as the
page cache. The latency of a new page cache page is dominated by IO, so it
may be no big deal to retry with different orders a few times.

Retrying too many times could add latency to anonymous page allocation.

Regards
Yin, Fengwei

>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
>  mm/memory.c | 33 +++++++++++++++++++++++++++++++++
>  1 file changed, 33 insertions(+)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 9d5e8be49f3b..ca32f59acef2 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2989,6 +2989,39 @@ static vm_fault_t fault_dirty_shared_page(struct vm_fault *vmf)
>  	return 0;
>  }
>  
> +static inline struct folio *vma_alloc_movable_folio(struct vm_area_struct *vma,
> +		unsigned long vaddr, int order, bool zeroed)
> +{
> +	gfp_t gfp = order > 0 ? __GFP_NORETRY | __GFP_NOWARN : 0;
> +
> +	if (zeroed)
> +		return vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order);
> +	else
> +		return vma_alloc_folio(GFP_HIGHUSER_MOVABLE | gfp, order, vma,
> +				vaddr, false);
> +}
> +
> +/*
> + * Opportunistically attempt to allocate high-order folios, retrying with lower
> + * orders all the way to order-0, until success. order-1 allocations are skipped
> + * since a folio must be at least order-2 to work with the THP machinery. The
> + * user must check what they got with folio_order(). vaddr can be any virtual
> + * address that will be mapped by the allocated folio.
> + */
> +static struct folio *try_vma_alloc_movable_folio(struct vm_area_struct *vma,
> +		unsigned long vaddr, int order, bool zeroed)
> +{
> +	struct folio *folio;
> +
> +	for (; order > 1; order--) {
> +		folio = vma_alloc_movable_folio(vma, vaddr, order, zeroed);
> +		if (folio)
> +			return folio;
> +	}
> +
> +	return vma_alloc_movable_folio(vma, vaddr, 0, zeroed);
> +}
> +
>  /*
>   * Handle write page faults for pages that can be reused in the current vma
>   *
> --
> 2.25.1
>
On 17/04/2023 09:49, Yin, Fengwei wrote:
>
> On 4/14/2023 9:02 PM, Ryan Roberts wrote:
>> Opportunistically attempt to allocate high-order folios in highmem,
>> optionally zeroed. Retry with lower orders all the way to order-0, until
>> success. Although, of note, order-1 allocations are skipped since a
>> large folio must be at least order-2 to work with the THP machinery. The
>> user must check what they got with folio_order().
>>
>> This will be used to opportunistically allocate large folios for
>> anonymous memory with a sensible fallback under memory pressure.
>>
>> For attempts to allocate non-0 orders, we set __GFP_NORETRY to prevent
>> high latency due to reclaim, instead preferring to just try for a lower
>> order. The same approach is used by the readahead code when allocating
>> large folios.
> I am not sure whether anonymous pages can share the same approach as the
> page cache. The latency of a new page cache page is dominated by IO, so it
> may be no big deal to retry with different orders a few times.
>
> Retrying too many times could add latency to anonymous page allocation.

Perhaps I'm better off just using vma_thp_gfp_mask(), or at least taking
inspiration from it?

>
> Regards
> Yin, Fengwei
>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>>  mm/memory.c | 33 +++++++++++++++++++++++++++++++++
>>  1 file changed, 33 insertions(+)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 9d5e8be49f3b..ca32f59acef2 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -2989,6 +2989,39 @@ static vm_fault_t fault_dirty_shared_page(struct vm_fault *vmf)
>>  	return 0;
>>  }
>>  
>> +static inline struct folio *vma_alloc_movable_folio(struct vm_area_struct *vma,
>> +		unsigned long vaddr, int order, bool zeroed)
>> +{
>> +	gfp_t gfp = order > 0 ? __GFP_NORETRY | __GFP_NOWARN : 0;
>> +
>> +	if (zeroed)
>> +		return vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order);
>> +	else
>> +		return vma_alloc_folio(GFP_HIGHUSER_MOVABLE | gfp, order, vma,
>> +				vaddr, false);
>> +}
>> +
>> +/*
>> + * Opportunistically attempt to allocate high-order folios, retrying with lower
>> + * orders all the way to order-0, until success. order-1 allocations are skipped
>> + * since a folio must be at least order-2 to work with the THP machinery. The
>> + * user must check what they got with folio_order(). vaddr can be any virtual
>> + * address that will be mapped by the allocated folio.
>> + */
>> +static struct folio *try_vma_alloc_movable_folio(struct vm_area_struct *vma,
>> +		unsigned long vaddr, int order, bool zeroed)
>> +{
>> +	struct folio *folio;
>> +
>> +	for (; order > 1; order--) {
>> +		folio = vma_alloc_movable_folio(vma, vaddr, order, zeroed);
>> +		if (folio)
>> +			return folio;
>> +	}
>> +
>> +	return vma_alloc_movable_folio(vma, vaddr, 0, zeroed);
>> +}
>> +
>>  /*
>>   * Handle write page faults for pages that can be reused in the current vma
>>   *
>> --
>> 2.25.1
>>
diff --git a/mm/memory.c b/mm/memory.c
index 9d5e8be49f3b..ca32f59acef2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2989,6 +2989,39 @@ static vm_fault_t fault_dirty_shared_page(struct vm_fault *vmf)
 	return 0;
 }
 
+static inline struct folio *vma_alloc_movable_folio(struct vm_area_struct *vma,
+		unsigned long vaddr, int order, bool zeroed)
+{
+	gfp_t gfp = order > 0 ? __GFP_NORETRY | __GFP_NOWARN : 0;
+
+	if (zeroed)
+		return vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order);
+	else
+		return vma_alloc_folio(GFP_HIGHUSER_MOVABLE | gfp, order, vma,
+				vaddr, false);
+}
+
+/*
+ * Opportunistically attempt to allocate high-order folios, retrying with lower
+ * orders all the way to order-0, until success. order-1 allocations are skipped
+ * since a folio must be at least order-2 to work with the THP machinery. The
+ * user must check what they got with folio_order(). vaddr can be any virtual
+ * address that will be mapped by the allocated folio.
+ */
+static struct folio *try_vma_alloc_movable_folio(struct vm_area_struct *vma,
+		unsigned long vaddr, int order, bool zeroed)
+{
+	struct folio *folio;
+
+	for (; order > 1; order--) {
+		folio = vma_alloc_movable_folio(vma, vaddr, order, zeroed);
+		if (folio)
+			return folio;
+	}
+
+	return vma_alloc_movable_folio(vma, vaddr, 0, zeroed);
+}
+
 /*
  * Handle write page faults for pages that can be reused in the current vma
  *
Opportunistically attempt to allocate high-order folios in highmem,
optionally zeroed. Retry with lower orders all the way to order-0, until
success. Although, of note, order-1 allocations are skipped since a
large folio must be at least order-2 to work with the THP machinery. The
user must check what they got with folio_order().

This will be used to opportunistically allocate large folios for
anonymous memory with a sensible fallback under memory pressure.

For attempts to allocate non-0 orders, we set __GFP_NORETRY to prevent
high latency due to reclaim, instead preferring to just try for a lower
order. The same approach is used by the readahead code when allocating
large folios.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 mm/memory.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

-- 
2.25.1