Message ID | 20240327144537.4165578-6-ryan.roberts@arm.com (mailing list archive) |
---|---|
State | New |
Series | Swap-out mTHP without splitting |
On Thu, Mar 28, 2024 at 3:45 AM Ryan Roberts <ryan.roberts@arm.com> wrote: > > Now that swap supports storing all mTHP sizes, avoid splitting large > folios before swap-out. This benefits performance of the swap-out path > by eliding split_folio_to_list(), which is expensive, and also sets us > up for swapping in large folios in a future series. > > If the folio is partially mapped, we continue to split it since we want > to avoid the extra IO overhead and storage of writing out pages > uneccessarily. > > Reviewed-by: David Hildenbrand <david@redhat.com> > Reviewed-by: Barry Song <v-songbaohua@oppo.com> > Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> > --- > mm/vmscan.c | 9 +++++---- > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 00adaf1cb2c3..293120fe54f3 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -1223,11 +1223,12 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, > if (!can_split_folio(folio, NULL)) > goto activate_locked; > /* > - * Split folios without a PMD map right > - * away. Chances are some or all of the > - * tail pages can be freed without IO. > + * Split partially mapped folios right > + * away. We can free the unmapped pages > + * without IO. > */ > - if (!folio_entire_mapcount(folio) && > + if (data_race(!list_empty( > + &folio->_deferred_list)) && > split_folio_to_list(folio, > folio_list)) > goto activate_locked; Hi Ryan, Sorry for bringing up another minor issue at this late stage. During the debugging of thp counter patch v2, I noticed the discrepancy between THP_SWPOUT_FALLBACK and THP_SWPOUT. Should we make adjustments to the counter? diff --git a/mm/vmscan.c b/mm/vmscan.c index 293120fe54f3..d7856603f689 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1241,8 +1241,10 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, folio_list)) goto activate_locked; #ifdef CONFIG_TRANSPARENT_HUGEPAGE - count_memcg_folio_events(folio, THP_SWPOUT_FALLBACK, 1); - count_vm_event(THP_SWPOUT_FALLBACK); + if (folio_test_pmd_mappable(folio)) { + count_memcg_folio_events(folio, THP_SWPOUT_FALLBACK, 1); + count_vm_event(THP_SWPOUT_FALLBACK); + } #endif if (!add_to_swap(folio)) goto activate_locked_split; Because THP_SWPOUT is only for pmd: static inline void count_swpout_vm_event(struct folio *folio) { #ifdef CONFIG_TRANSPARENT_HUGEPAGE if (unlikely(folio_test_pmd_mappable(folio))) { count_memcg_folio_events(folio, THP_SWPOUT, 1); count_vm_event(THP_SWPOUT); } #endif count_vm_events(PSWPOUT, folio_nr_pages(folio)); } I can provide per-order counters for this in my THP counter patch. > -- > 2.25.1 > Thanks Barry
On 28/03/2024 08:18, Barry Song wrote: > On Thu, Mar 28, 2024 at 3:45 AM Ryan Roberts <ryan.roberts@arm.com> wrote: >> >> Now that swap supports storing all mTHP sizes, avoid splitting large >> folios before swap-out. This benefits performance of the swap-out path >> by eliding split_folio_to_list(), which is expensive, and also sets us >> up for swapping in large folios in a future series. >> >> If the folio is partially mapped, we continue to split it since we want >> to avoid the extra IO overhead and storage of writing out pages >> uneccessarily. >> >> Reviewed-by: David Hildenbrand <david@redhat.com> >> Reviewed-by: Barry Song <v-songbaohua@oppo.com> >> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> >> --- >> mm/vmscan.c | 9 +++++---- >> 1 file changed, 5 insertions(+), 4 deletions(-) >> >> diff --git a/mm/vmscan.c b/mm/vmscan.c >> index 00adaf1cb2c3..293120fe54f3 100644 >> --- a/mm/vmscan.c >> +++ b/mm/vmscan.c >> @@ -1223,11 +1223,12 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, >> if (!can_split_folio(folio, NULL)) >> goto activate_locked; >> /* >> - * Split folios without a PMD map right >> - * away. Chances are some or all of the >> - * tail pages can be freed without IO. >> + * Split partially mapped folios right >> + * away. We can free the unmapped pages >> + * without IO. >> */ >> - if (!folio_entire_mapcount(folio) && >> + if (data_race(!list_empty( >> + &folio->_deferred_list)) && >> split_folio_to_list(folio, >> folio_list)) >> goto activate_locked; > > Hi Ryan, > > Sorry for bringing up another minor issue at this late stage. > > During the debugging of thp counter patch v2, I noticed the discrepancy between > THP_SWPOUT_FALLBACK and THP_SWPOUT. Ahh good spot! I had noticed this previously and clearly forgot all about it. I'm on holiday today and over the long weekend in the UK. I'll take a proper look next week and send a fix. > > Should we make adjustments to the counter? > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 293120fe54f3..d7856603f689 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -1241,8 +1241,10 @@ static unsigned int shrink_folio_list(struct > list_head *folio_list, > folio_list)) > goto activate_locked; > #ifdef CONFIG_TRANSPARENT_HUGEPAGE > - > count_memcg_folio_events(folio, THP_SWPOUT_FALLBACK, 1); > - count_vm_event(THP_SWPOUT_FALLBACK); > + if (folio_test_pmd_mappable(folio)) { > + > count_memcg_folio_events(folio, THP_SWPOUT_FALLBACK, 1); > + > count_vm_event(THP_SWPOUT_FALLBACK); > + } > #endif > if (!add_to_swap(folio)) > goto activate_locked_split; > > > Because THP_SWPOUT is only for pmd: > > static inline void count_swpout_vm_event(struct folio *folio) > { > #ifdef CONFIG_TRANSPARENT_HUGEPAGE > if (unlikely(folio_test_pmd_mappable(folio))) { > count_memcg_folio_events(folio, THP_SWPOUT, 1); > count_vm_event(THP_SWPOUT); > } > #endif > count_vm_events(PSWPOUT, folio_nr_pages(folio)); > } > > I can provide per-order counters for this in my THP counter patch. > >> -- >> 2.25.1 >> > > Thanks > Barry
On 28/03/2024 08:18, Barry Song wrote: > On Thu, Mar 28, 2024 at 3:45 AM Ryan Roberts <ryan.roberts@arm.com> wrote: >> >> Now that swap supports storing all mTHP sizes, avoid splitting large >> folios before swap-out. This benefits performance of the swap-out path >> by eliding split_folio_to_list(), which is expensive, and also sets us >> up for swapping in large folios in a future series. >> >> If the folio is partially mapped, we continue to split it since we want >> to avoid the extra IO overhead and storage of writing out pages >> uneccessarily. >> >> Reviewed-by: David Hildenbrand <david@redhat.com> >> Reviewed-by: Barry Song <v-songbaohua@oppo.com> >> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> >> --- >> mm/vmscan.c | 9 +++++---- >> 1 file changed, 5 insertions(+), 4 deletions(-) >> >> diff --git a/mm/vmscan.c b/mm/vmscan.c >> index 00adaf1cb2c3..293120fe54f3 100644 >> --- a/mm/vmscan.c >> +++ b/mm/vmscan.c >> @@ -1223,11 +1223,12 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, >> if (!can_split_folio(folio, NULL)) >> goto activate_locked; >> /* >> - * Split folios without a PMD map right >> - * away. Chances are some or all of the >> - * tail pages can be freed without IO. >> + * Split partially mapped folios right >> + * away. We can free the unmapped pages >> + * without IO. >> */ >> - if (!folio_entire_mapcount(folio) && >> + if (data_race(!list_empty( >> + &folio->_deferred_list)) && >> split_folio_to_list(folio, >> folio_list)) >> goto activate_locked; > > Hi Ryan, > > Sorry for bringing up another minor issue at this late stage. No problem - I'd rather take a bit longer and get it right, rather than rush it and get it wrong! > > During the debugging of thp counter patch v2, I noticed the discrepancy between > THP_SWPOUT_FALLBACK and THP_SWPOUT. > > Should we make adjustments to the counter? Yes, agreed; we want to be consistent here with all the other existing THP counters; they only refer to PMD-sized THP. I'll make the change for the next version. I guess we will eventually want equivalent counters for per-size mTHP using the framework you are adding. > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 293120fe54f3..d7856603f689 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -1241,8 +1241,10 @@ static unsigned int shrink_folio_list(struct > list_head *folio_list, > folio_list)) > goto activate_locked; > #ifdef CONFIG_TRANSPARENT_HUGEPAGE > - > count_memcg_folio_events(folio, THP_SWPOUT_FALLBACK, 1); > - count_vm_event(THP_SWPOUT_FALLBACK); > + if (folio_test_pmd_mappable(folio)) { > + > count_memcg_folio_events(folio, THP_SWPOUT_FALLBACK, 1); > + > count_vm_event(THP_SWPOUT_FALLBACK); > + } > #endif > if (!add_to_swap(folio)) > goto activate_locked_split; > > > Because THP_SWPOUT is only for pmd: > > static inline void count_swpout_vm_event(struct folio *folio) > { > #ifdef CONFIG_TRANSPARENT_HUGEPAGE > if (unlikely(folio_test_pmd_mappable(folio))) { > count_memcg_folio_events(folio, THP_SWPOUT, 1); > count_vm_event(THP_SWPOUT); > } > #endif > count_vm_events(PSWPOUT, folio_nr_pages(folio)); > } > > I can provide per-order counters for this in my THP counter patch. > >> -- >> 2.25.1 >> > > Thanks > Barry
Baolin Wang's patch[1] has avoided confusion with PMD mapped THP related statistics. So, these three counters (THP_SPLIT_PAGE, THP_SPLIT_PAGE_FAILED, and THP_DEFERRED_SPLIT_PAGE) no longer include mTHP. [1] https://lore.kernel.org/linux-mm/a5341defeef27c9ac7b85c97f030f93e4368bbc1.1711694852.git.baolin.wang@linux.alibaba.com/ On Tue, Apr 2, 2024 at 9:10 PM Ryan Roberts <ryan.roberts@arm.com> wrote: > > On 28/03/2024 08:18, Barry Song wrote: > > On Thu, Mar 28, 2024 at 3:45 AM Ryan Roberts <ryan.roberts@arm.com> wrote: > >> > >> Now that swap supports storing all mTHP sizes, avoid splitting large > >> folios before swap-out. This benefits performance of the swap-out path > >> by eliding split_folio_to_list(), which is expensive, and also sets us > >> up for swapping in large folios in a future series. > >> > >> If the folio is partially mapped, we continue to split it since we want > >> to avoid the extra IO overhead and storage of writing out pages > >> uneccessarily. > >> > >> Reviewed-by: David Hildenbrand <david@redhat.com> > >> Reviewed-by: Barry Song <v-songbaohua@oppo.com> > >> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> > >> --- > >> mm/vmscan.c | 9 +++++---- > >> 1 file changed, 5 insertions(+), 4 deletions(-) > >> > >> diff --git a/mm/vmscan.c b/mm/vmscan.c > >> index 00adaf1cb2c3..293120fe54f3 100644 > >> --- a/mm/vmscan.c > >> +++ b/mm/vmscan.c > >> @@ -1223,11 +1223,12 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, > >> if (!can_split_folio(folio, NULL)) > >> goto activate_locked; > >> /* > >> - * Split folios without a PMD map right > >> - * away. Chances are some or all of the > >> - * tail pages can be freed without IO. > >> + * Split partially mapped folios right > >> + * away. We can free the unmapped pages > >> + * without IO. > >> */ > >> - if (!folio_entire_mapcount(folio) && > >> + if (data_race(!list_empty( > >> + &folio->_deferred_list)) && > >> split_folio_to_list(folio, > >> folio_list)) > >> goto activate_locked; > > > > Hi Ryan, > > > > Sorry for bringing up another minor issue at this late stage. > > No problem - I'd rather take a bit longer and get it right, rather than rush it > and get it wrong! > > > > > During the debugging of thp counter patch v2, I noticed the discrepancy between > > THP_SWPOUT_FALLBACK and THP_SWPOUT. > > > > Should we make adjustments to the counter? > > Yes, agreed; we want to be consistent here with all the other existing THP > counters; they only refer to PMD-sized THP. I'll make the change for the next > version. > > I guess we will eventually want equivalent counters for per-size mTHP using the > framework you are adding. 
> > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c > > index 293120fe54f3..d7856603f689 100644 > > --- a/mm/vmscan.c > > +++ b/mm/vmscan.c > > @@ -1241,8 +1241,10 @@ static unsigned int shrink_folio_list(struct > > list_head *folio_list, > > folio_list)) > > goto activate_locked; > > #ifdef CONFIG_TRANSPARENT_HUGEPAGE > > - > > count_memcg_folio_events(folio, THP_SWPOUT_FALLBACK, 1); > > - count_vm_event(THP_SWPOUT_FALLBACK); > > + if (folio_test_pmd_mappable(folio)) { > > + > > count_memcg_folio_events(folio, THP_SWPOUT_FALLBACK, 1); > > + > > count_vm_event(THP_SWPOUT_FALLBACK); > > + } > > #endif > > if (!add_to_swap(folio)) > > goto activate_locked_split; > > > > > > Because THP_SWPOUT is only for pmd: > > > > static inline void count_swpout_vm_event(struct folio *folio) > > { > > #ifdef CONFIG_TRANSPARENT_HUGEPAGE > > if (unlikely(folio_test_pmd_mappable(folio))) { > > count_memcg_folio_events(folio, THP_SWPOUT, 1); > > count_vm_event(THP_SWPOUT); > > } > > #endif > > count_vm_events(PSWPOUT, folio_nr_pages(folio)); > > } > > > > I can provide per-order counters for this in my THP counter patch. > > > >> -- > >> 2.25.1 > >> > > > > Thanks > > Barry >
On 02/04/2024 14:10, Ryan Roberts wrote: > On 28/03/2024 08:18, Barry Song wrote: >> On Thu, Mar 28, 2024 at 3:45 AM Ryan Roberts <ryan.roberts@arm.com> wrote: >>> >>> Now that swap supports storing all mTHP sizes, avoid splitting large >>> folios before swap-out. This benefits performance of the swap-out path >>> by eliding split_folio_to_list(), which is expensive, and also sets us >>> up for swapping in large folios in a future series. >>> >>> If the folio is partially mapped, we continue to split it since we want >>> to avoid the extra IO overhead and storage of writing out pages >>> uneccessarily. >>> >>> Reviewed-by: David Hildenbrand <david@redhat.com> >>> Reviewed-by: Barry Song <v-songbaohua@oppo.com> >>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> >>> --- >>> mm/vmscan.c | 9 +++++---- >>> 1 file changed, 5 insertions(+), 4 deletions(-) >>> >>> diff --git a/mm/vmscan.c b/mm/vmscan.c >>> index 00adaf1cb2c3..293120fe54f3 100644 >>> --- a/mm/vmscan.c >>> +++ b/mm/vmscan.c >>> @@ -1223,11 +1223,12 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, >>> if (!can_split_folio(folio, NULL)) >>> goto activate_locked; >>> /* >>> - * Split folios without a PMD map right >>> - * away. Chances are some or all of the >>> - * tail pages can be freed without IO. >>> + * Split partially mapped folios right >>> + * away. We can free the unmapped pages >>> + * without IO. >>> */ >>> - if (!folio_entire_mapcount(folio) && >>> + if (data_race(!list_empty( >>> + &folio->_deferred_list)) && >>> split_folio_to_list(folio, >>> folio_list)) >>> goto activate_locked; >> >> Hi Ryan, >> >> Sorry for bringing up another minor issue at this late stage. > > No problem - I'd rather take a bit longer and get it right, rather than rush it > and get it wrong! > >> >> During the debugging of thp counter patch v2, I noticed the discrepancy between >> THP_SWPOUT_FALLBACK and THP_SWPOUT. >> >> Should we make adjustments to the counter? > > Yes, agreed; we want to be consistent here with all the other existing THP > counters; they only refer to PMD-sized THP. I'll make the change for the next > version. > > I guess we will eventually want equivalent counters for per-size mTHP using the > framework you are adding. > >> >> diff --git a/mm/vmscan.c b/mm/vmscan.c >> index 293120fe54f3..d7856603f689 100644 >> --- a/mm/vmscan.c >> +++ b/mm/vmscan.c >> @@ -1241,8 +1241,10 @@ static unsigned int shrink_folio_list(struct >> list_head *folio_list, >> folio_list)) >> goto activate_locked; >> #ifdef CONFIG_TRANSPARENT_HUGEPAGE >> - >> count_memcg_folio_events(folio, THP_SWPOUT_FALLBACK, 1); >> - count_vm_event(THP_SWPOUT_FALLBACK); >> + if (folio_test_pmd_mappable(folio)) { This doesn't quite work because we have already split the folio here, so this will always return false. I've changed it to: if (nr_pages >= HPAGE_PMD_NR) { >> + >> count_memcg_folio_events(folio, THP_SWPOUT_FALLBACK, 1); >> + >> count_vm_event(THP_SWPOUT_FALLBACK); >> + } >> #endif >> if (!add_to_swap(folio)) >> goto activate_locked_split; >> >> >> Because THP_SWPOUT is only for pmd: >> >> static inline void count_swpout_vm_event(struct folio *folio) >> { >> #ifdef CONFIG_TRANSPARENT_HUGEPAGE >> if (unlikely(folio_test_pmd_mappable(folio))) { >> count_memcg_folio_events(folio, THP_SWPOUT, 1); >> count_vm_event(THP_SWPOUT); >> } >> #endif >> count_vm_events(PSWPOUT, folio_nr_pages(folio)); >> } >> >> I can provide per-order counters for this in my THP counter patch. >> >>> -- >>> 2.25.1 >>> >> >> Thanks >> Barry >
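For reference, with that change the fallback-accounting hunk would look roughly as below. This is only a sketch of the fix being discussed, not the posted patch: it assumes nr_pages was captured with folio_nr_pages() earlier in shrink_folio_list(), before the fallback split, so it still reflects the original folio size — which is why the nr_pages comparison works here where folio_test_pmd_mappable() does not.

#ifdef CONFIG_TRANSPARENT_HUGEPAGE
					/*
					 * Sketch only: nr_pages is assumed to
					 * have been sampled before the fallback
					 * split, so it still says whether the
					 * original folio was PMD-sized.
					 */
					if (nr_pages >= HPAGE_PMD_NR) {
						count_memcg_folio_events(folio,
								THP_SWPOUT_FALLBACK, 1);
						count_vm_event(THP_SWPOUT_FALLBACK);
					}
#endif
					if (!add_to_swap(folio))
						goto activate_locked_split;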
On Wed, Apr 3, 2024 at 2:22 AM Ryan Roberts <ryan.roberts@arm.com> wrote: > > On 02/04/2024 14:10, Ryan Roberts wrote: > > On 28/03/2024 08:18, Barry Song wrote: > >> On Thu, Mar 28, 2024 at 3:45 AM Ryan Roberts <ryan.roberts@arm.com> wrote: > >>> > >>> Now that swap supports storing all mTHP sizes, avoid splitting large > >>> folios before swap-out. This benefits performance of the swap-out path > >>> by eliding split_folio_to_list(), which is expensive, and also sets us > >>> up for swapping in large folios in a future series. > >>> > >>> If the folio is partially mapped, we continue to split it since we want > >>> to avoid the extra IO overhead and storage of writing out pages > >>> uneccessarily. > >>> > >>> Reviewed-by: David Hildenbrand <david@redhat.com> > >>> Reviewed-by: Barry Song <v-songbaohua@oppo.com> > >>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> > >>> --- > >>> mm/vmscan.c | 9 +++++---- > >>> 1 file changed, 5 insertions(+), 4 deletions(-) > >>> > >>> diff --git a/mm/vmscan.c b/mm/vmscan.c > >>> index 00adaf1cb2c3..293120fe54f3 100644 > >>> --- a/mm/vmscan.c > >>> +++ b/mm/vmscan.c > >>> @@ -1223,11 +1223,12 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, > >>> if (!can_split_folio(folio, NULL)) > >>> goto activate_locked; > >>> /* > >>> - * Split folios without a PMD map right > >>> - * away. Chances are some or all of the > >>> - * tail pages can be freed without IO. > >>> + * Split partially mapped folios right > >>> + * away. We can free the unmapped pages > >>> + * without IO. > >>> */ > >>> - if (!folio_entire_mapcount(folio) && > >>> + if (data_race(!list_empty( > >>> + &folio->_deferred_list)) && > >>> split_folio_to_list(folio, > >>> folio_list)) > >>> goto activate_locked; > >> > >> Hi Ryan, > >> > >> Sorry for bringing up another minor issue at this late stage. > > > > No problem - I'd rather take a bit longer and get it right, rather than rush it > > and get it wrong! > > > >> > >> During the debugging of thp counter patch v2, I noticed the discrepancy between > >> THP_SWPOUT_FALLBACK and THP_SWPOUT. > >> > >> Should we make adjustments to the counter? > > > > Yes, agreed; we want to be consistent here with all the other existing THP > > counters; they only refer to PMD-sized THP. I'll make the change for the next > > version. > > > > I guess we will eventually want equivalent counters for per-size mTHP using the > > framework you are adding. > > > >> > >> diff --git a/mm/vmscan.c b/mm/vmscan.c > >> index 293120fe54f3..d7856603f689 100644 > >> --- a/mm/vmscan.c > >> +++ b/mm/vmscan.c > >> @@ -1241,8 +1241,10 @@ static unsigned int shrink_folio_list(struct > >> list_head *folio_list, > >> folio_list)) > >> goto activate_locked; > >> #ifdef CONFIG_TRANSPARENT_HUGEPAGE > >> - > >> count_memcg_folio_events(folio, THP_SWPOUT_FALLBACK, 1); > >> - count_vm_event(THP_SWPOUT_FALLBACK); > >> + if (folio_test_pmd_mappable(folio)) { > > This doesn't quite work because we have already split the folio here, so this > will always return false. I've changed it to: > > if (nr_pages >= HPAGE_PMD_NR) { make sense to me. 
> > > >> + > >> count_memcg_folio_events(folio, THP_SWPOUT_FALLBACK, 1); > >> + > >> count_vm_event(THP_SWPOUT_FALLBACK); > >> + } > >> #endif > >> if (!add_to_swap(folio)) > >> goto activate_locked_split; > >> > >> > >> Because THP_SWPOUT is only for pmd: > >> > >> static inline void count_swpout_vm_event(struct folio *folio) > >> { > >> #ifdef CONFIG_TRANSPARENT_HUGEPAGE > >> if (unlikely(folio_test_pmd_mappable(folio))) { > >> count_memcg_folio_events(folio, THP_SWPOUT, 1); > >> count_vm_event(THP_SWPOUT); > >> } > >> #endif > >> count_vm_events(PSWPOUT, folio_nr_pages(folio)); > >> } > >> > >> I can provide per-order counters for this in my THP counter patch. > >> > >>> -- > >>> 2.25.1 > >>> > >> Thanks Barry
On Wed, Apr 3, 2024 at 2:10 AM Ryan Roberts <ryan.roberts@arm.com> wrote: > > On 28/03/2024 08:18, Barry Song wrote: > > On Thu, Mar 28, 2024 at 3:45 AM Ryan Roberts <ryan.roberts@arm.com> wrote: > >> > >> Now that swap supports storing all mTHP sizes, avoid splitting large > >> folios before swap-out. This benefits performance of the swap-out path > >> by eliding split_folio_to_list(), which is expensive, and also sets us > >> up for swapping in large folios in a future series. > >> > >> If the folio is partially mapped, we continue to split it since we want > >> to avoid the extra IO overhead and storage of writing out pages > >> uneccessarily. > >> > >> Reviewed-by: David Hildenbrand <david@redhat.com> > >> Reviewed-by: Barry Song <v-songbaohua@oppo.com> > >> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> > >> --- > >> mm/vmscan.c | 9 +++++---- > >> 1 file changed, 5 insertions(+), 4 deletions(-) > >> > >> diff --git a/mm/vmscan.c b/mm/vmscan.c > >> index 00adaf1cb2c3..293120fe54f3 100644 > >> --- a/mm/vmscan.c > >> +++ b/mm/vmscan.c > >> @@ -1223,11 +1223,12 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, > >> if (!can_split_folio(folio, NULL)) > >> goto activate_locked; > >> /* > >> - * Split folios without a PMD map right > >> - * away. Chances are some or all of the > >> - * tail pages can be freed without IO. > >> + * Split partially mapped folios right > >> + * away. We can free the unmapped pages > >> + * without IO. > >> */ > >> - if (!folio_entire_mapcount(folio) && > >> + if (data_race(!list_empty( > >> + &folio->_deferred_list)) && > >> split_folio_to_list(folio, > >> folio_list)) > >> goto activate_locked; > > > > Hi Ryan, > > > > Sorry for bringing up another minor issue at this late stage. > > No problem - I'd rather take a bit longer and get it right, rather than rush it > and get it wrong! > > > > > During the debugging of thp counter patch v2, I noticed the discrepancy between > > THP_SWPOUT_FALLBACK and THP_SWPOUT. > > > > Should we make adjustments to the counter? > > Yes, agreed; we want to be consistent here with all the other existing THP > counters; they only refer to PMD-sized THP. I'll make the change for the next > version. > > I guess we will eventually want equivalent counters for per-size mTHP using the > framework you are adding. Hi Ryan, Today, I created counters for per-order SWPOUT and SWPOUT_FALLBACK. I'd appreciate any suggestions you might have before I submit this as patch 2/2 of my mTHP counters series. 
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index cc13fa14aa32..762a6d8759b9 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -267,6 +267,8 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma, enum thp_stat_item { THP_STAT_ANON_ALLOC, THP_STAT_ANON_ALLOC_FALLBACK, + THP_STAT_ANON_SWPOUT, + THP_STAT_ANON_SWPOUT_FALLBACK, __THP_STAT_COUNT }; diff --git a/mm/huge_memory.c b/mm/huge_memory.c index e704b4408181..7f2b5d2852cc 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -554,10 +554,14 @@ static struct kobj_attribute _name##_attr = __ATTR_RO(_name) THP_STATE_ATTR(anon_alloc, THP_STAT_ANON_ALLOC); THP_STATE_ATTR(anon_alloc_fallback, THP_STAT_ANON_ALLOC_FALLBACK); +THP_STATE_ATTR(anon_swpout, THP_STAT_ANON_SWPOUT); +THP_STATE_ATTR(anon_swpout_fallback, THP_STAT_ANON_SWPOUT_FALLBACK); static struct attribute *stats_attrs[] = { &anon_alloc_attr.attr, &anon_alloc_fallback_attr.attr, + &anon_swpout_attr.attr, + &anon_swpout_fallback_attr.attr, NULL, }; diff --git a/mm/page_io.c b/mm/page_io.c index a9a7c236aecc..be4f822b39f8 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -212,13 +212,16 @@ int swap_writepage(struct page *page, struct writeback_control *wbc) static inline void count_swpout_vm_event(struct folio *folio) { + long nr_pages = folio_nr_pages(folio); #ifdef CONFIG_TRANSPARENT_HUGEPAGE if (unlikely(folio_test_pmd_mappable(folio))) { count_memcg_folio_events(folio, THP_SWPOUT, 1); count_vm_event(THP_SWPOUT); } + if (nr_pages > 0 && nr_pages <= HPAGE_PMD_NR) + count_thp_state(folio_order(folio), THP_STAT_ANON_SWPOUT); #endif - count_vm_events(PSWPOUT, folio_nr_pages(folio)); + count_vm_events(PSWPOUT, nr_pages); } #if defined(CONFIG_MEMCG) && defined(CONFIG_BLK_CGROUP) diff --git a/mm/vmscan.c b/mm/vmscan.c index ffc4553c8615..b7c5fbd830b6 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1247,6 +1247,10 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, count_vm_event( THP_SWPOUT_FALLBACK); } + if (nr_pages > 0 && nr_pages <= HPAGE_PMD_NR) + count_thp_state(folio_order(folio), + THP_STAT_ANON_SWPOUT_FALLBACK); + #endif if (!add_to_swap(folio)) goto activate_locked_split; Thanks Barry
On 05/04/2024 05:06, Barry Song wrote: > On Wed, Apr 3, 2024 at 2:10 AM Ryan Roberts <ryan.roberts@arm.com> wrote: >> >> On 28/03/2024 08:18, Barry Song wrote: >>> On Thu, Mar 28, 2024 at 3:45 AM Ryan Roberts <ryan.roberts@arm.com> wrote: >>>> >>>> Now that swap supports storing all mTHP sizes, avoid splitting large >>>> folios before swap-out. This benefits performance of the swap-out path >>>> by eliding split_folio_to_list(), which is expensive, and also sets us >>>> up for swapping in large folios in a future series. >>>> >>>> If the folio is partially mapped, we continue to split it since we want >>>> to avoid the extra IO overhead and storage of writing out pages >>>> uneccessarily. >>>> >>>> Reviewed-by: David Hildenbrand <david@redhat.com> >>>> Reviewed-by: Barry Song <v-songbaohua@oppo.com> >>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> >>>> --- >>>> mm/vmscan.c | 9 +++++---- >>>> 1 file changed, 5 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/mm/vmscan.c b/mm/vmscan.c >>>> index 00adaf1cb2c3..293120fe54f3 100644 >>>> --- a/mm/vmscan.c >>>> +++ b/mm/vmscan.c >>>> @@ -1223,11 +1223,12 @@ static unsigned int shrink_folio_list(struct list_head *folio_list, >>>> if (!can_split_folio(folio, NULL)) >>>> goto activate_locked; >>>> /* >>>> - * Split folios without a PMD map right >>>> - * away. Chances are some or all of the >>>> - * tail pages can be freed without IO. >>>> + * Split partially mapped folios right >>>> + * away. We can free the unmapped pages >>>> + * without IO. >>>> */ >>>> - if (!folio_entire_mapcount(folio) && >>>> + if (data_race(!list_empty( >>>> + &folio->_deferred_list)) && >>>> split_folio_to_list(folio, >>>> folio_list)) >>>> goto activate_locked; >>> >>> Hi Ryan, >>> >>> Sorry for bringing up another minor issue at this late stage. >> >> No problem - I'd rather take a bit longer and get it right, rather than rush it >> and get it wrong! >> >>> >>> During the debugging of thp counter patch v2, I noticed the discrepancy between >>> THP_SWPOUT_FALLBACK and THP_SWPOUT. >>> >>> Should we make adjustments to the counter? >> >> Yes, agreed; we want to be consistent here with all the other existing THP >> counters; they only refer to PMD-sized THP. I'll make the change for the next >> version. >> >> I guess we will eventually want equivalent counters for per-size mTHP using the >> framework you are adding. > > Hi Ryan, > > Today, I created counters for per-order SWPOUT and SWPOUT_FALLBACK. > I'd appreciate any > suggestions you might have before I submit this as patch 2/2 of my > mTHP counters series. Amazing - this is going to be very useful! 
> > diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h > index cc13fa14aa32..762a6d8759b9 100644 > --- a/include/linux/huge_mm.h > +++ b/include/linux/huge_mm.h > @@ -267,6 +267,8 @@ unsigned long thp_vma_allowable_orders(struct > vm_area_struct *vma, > enum thp_stat_item { > THP_STAT_ANON_ALLOC, > THP_STAT_ANON_ALLOC_FALLBACK, > + THP_STAT_ANON_SWPOUT, > + THP_STAT_ANON_SWPOUT_FALLBACK, > __THP_STAT_COUNT > }; > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index e704b4408181..7f2b5d2852cc 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -554,10 +554,14 @@ static struct kobj_attribute _name##_attr = > __ATTR_RO(_name) > > THP_STATE_ATTR(anon_alloc, THP_STAT_ANON_ALLOC); > THP_STATE_ATTR(anon_alloc_fallback, THP_STAT_ANON_ALLOC_FALLBACK); > +THP_STATE_ATTR(anon_swpout, THP_STAT_ANON_SWPOUT); > +THP_STATE_ATTR(anon_swpout_fallback, THP_STAT_ANON_SWPOUT_FALLBACK); > > static struct attribute *stats_attrs[] = { > &anon_alloc_attr.attr, > &anon_alloc_fallback_attr.attr, > + &anon_swpout_attr.attr, > + &anon_swpout_fallback_attr.attr, > NULL, > }; > > diff --git a/mm/page_io.c b/mm/page_io.c > index a9a7c236aecc..be4f822b39f8 100644 > --- a/mm/page_io.c > +++ b/mm/page_io.c > @@ -212,13 +212,16 @@ int swap_writepage(struct page *page, struct > writeback_control *wbc) > > static inline void count_swpout_vm_event(struct folio *folio) > { > + long nr_pages = folio_nr_pages(folio); > #ifdef CONFIG_TRANSPARENT_HUGEPAGE > if (unlikely(folio_test_pmd_mappable(folio))) { > count_memcg_folio_events(folio, THP_SWPOUT, 1); > count_vm_event(THP_SWPOUT); > } > + if (nr_pages > 0 && nr_pages <= HPAGE_PMD_NR) The guard is a bit ugly; I wonder if we should at least check that order is in bounds in count_thp_state(), since all callers could benefit? Then we only have to care about the nr_pages > 0 condition here. Just a thought... > + count_thp_state(folio_order(folio), THP_STAT_ANON_SWPOUT); So you're counting THPs, not pages; I agree with that approach. > #endif > - count_vm_events(PSWPOUT, folio_nr_pages(folio)); > + count_vm_events(PSWPOUT, nr_pages); > } > > #if defined(CONFIG_MEMCG) && defined(CONFIG_BLK_CGROUP) > diff --git a/mm/vmscan.c b/mm/vmscan.c > index ffc4553c8615..b7c5fbd830b6 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -1247,6 +1247,10 @@ static unsigned int shrink_folio_list(struct > list_head *folio_list, > count_vm_event( > THP_SWPOUT_FALLBACK); > } > + if (nr_pages > 0 && nr_pages > <= HPAGE_PMD_NR) > + > count_thp_state(folio_order(folio), > + > THP_STAT_ANON_SWPOUT_FALLBACK); > + > #endif > if (!add_to_swap(folio)) > goto activate_locked_split; > > > Thanks > Barry
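One possible shape for the bounds check suggested above, folding the range test into the helper so call sites no longer need the "nr_pages > 0 && nr_pages <= HPAGE_PMD_NR" guard. This is a sketch against the draft interface only: count_thp_state(), the per-CPU thp_stats layout, and HPAGE_PMD_ORDER as the upper bound are assumptions based on the quoted series, not code that exists upstream, and it goes slightly further than the comment by also filtering order-0 folios inside the helper.

static inline void count_thp_state(int order, enum thp_stat_item item)
{
	/*
	 * Sketch only: reject order-0 folios and anything larger than PMD
	 * size here, so callers such as count_swpout_vm_event() can simply
	 * pass folio_order(folio) without an open-coded range check.
	 */
	if (order <= 0 || order > HPAGE_PMD_ORDER)
		return;

	this_cpu_inc(thp_stats.stats[order][item]);
}

A call site would then reduce to something like count_thp_state(folio_order(folio), THP_STAT_ANON_SWPOUT);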
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 00adaf1cb2c3..293120fe54f3 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1223,11 +1223,12 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 					if (!can_split_folio(folio, NULL))
 						goto activate_locked;
 					/*
-					 * Split folios without a PMD map right
-					 * away. Chances are some or all of the
-					 * tail pages can be freed without IO.
+					 * Split partially mapped folios right
+					 * away. We can free the unmapped pages
+					 * without IO.
 					 */
-					if (!folio_entire_mapcount(folio) &&
+					if (data_race(!list_empty(
+							&folio->_deferred_list)) &&
 					    split_folio_to_list(folio,
 								folio_list))
 						goto activate_locked;