Message ID | 20240617231137.80726-3-21cnbao@gmail.com (mailing list archive)
---|---
State | New
Series | mm: clarify folio_add_new_anon_rmap() and __folio_add_anon_rmap()
On 18.06.24 01:11, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
>
> For the !folio_test_anon(folio) case, we can now invoke folio_add_new_anon_rmap()
> with the rmap flags set to either EXCLUSIVE or non-EXCLUSIVE. This action will
> suppress the VM_WARN_ON_FOLIO check within __folio_add_anon_rmap() while initiating
> the process of bringing up mTHP swapin.
>
>  static __always_inline void __folio_add_anon_rmap(struct folio *folio,
>                  struct page *page, int nr_pages, struct vm_area_struct *vma,
>                  unsigned long address, rmap_t flags, enum rmap_level level)
>  {
>          ...
>          if (unlikely(!folio_test_anon(folio))) {
>                  VM_WARN_ON_FOLIO(folio_test_large(folio) &&
>                                   level != RMAP_LEVEL_PMD, folio);
>          }
>          ...
>  }
>
> It also improves the code’s readability. Currently, all new anonymous
> folios calling folio_add_anon_rmap_ptes() are order-0. This ensures
> that new folios cannot be partially exclusive; they are either entirely
> exclusive or entirely shared.
>
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> Tested-by: Shuai Yuan <yuanshuai@oppo.com>
> ---
>  mm/memory.c   |  8 ++++++++
>  mm/swapfile.c | 13 +++++++++++--
>  2 files changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 1f24ecdafe05..620654c13b2f 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4339,6 +4339,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  	if (unlikely(folio != swapcache && swapcache)) {
>  		folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE);
>  		folio_add_lru_vma(folio, vma);
> +	} else if (!folio_test_anon(folio)) {
> +		/*
> +		 * We currently only expect small !anon folios, for which we now

s/now/know/

> +		 * that they are either fully exclusive or fully shared. If we
> +		 * ever get large folios here, we have to be careful.
> +		 */
> +		VM_WARN_ON_ONCE(folio_test_large(folio));
> +		folio_add_new_anon_rmap(folio, vma, address, rmap_flags);
>  	} else {
>  		folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, address,
>  					 rmap_flags);
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index ae1d2700f6a3..69efa1a57087 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1908,8 +1908,17 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
>  		VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
>  		if (pte_swp_exclusive(old_pte))
>  			rmap_flags |= RMAP_EXCLUSIVE;
> -
> -		folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags);
> +		/*
> +		 * We currently only expect small !anon folios, for which we now that

s/now/know/

> +		 * they are either fully exclusive or fully shared. If we ever get
> +		 * large folios here, we have to be careful.
> +		 */
> +		if (!folio_test_anon(folio)) {
> +			VM_WARN_ON_ONCE(folio_test_large(folio));
> +			folio_add_new_anon_rmap(folio, vma, addr, rmap_flags);
> +		} else {
> +			folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags);
> +		}
>  	} else { /* ksm created a completely new copy */
>  		folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
>  		folio_add_lru_vma(folio, vma);

Thanks!

Acked-by: David Hildenbrand <david@redhat.com>
On 18.06.24 01:11, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
>
> For the !folio_test_anon(folio) case, we can now invoke folio_add_new_anon_rmap()
> with the rmap flags set to either EXCLUSIVE or non-EXCLUSIVE. This action will
> suppress the VM_WARN_ON_FOLIO check within __folio_add_anon_rmap() while initiating
> the process of bringing up mTHP swapin.

[...]

> +		/*
> +		 * We currently only expect small !anon folios, for which we now that
> +		 * they are either fully exclusive or fully shared. If we ever get
> +		 * large folios here, we have to be careful.
> +		 */
> +		if (!folio_test_anon(folio)) {
> +			VM_WARN_ON_ONCE(folio_test_large(folio));

(comment applies to both cases)

Thinking about Hugh's comment, we should likely add here:

	VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);

[the check we are removing from __folio_add_anon_rmap()]

and document for folio_add_new_anon_rmap() in patch #1, that when
dealing with folios that might be mapped concurrently by others, the
folio lock must be held.

> +			folio_add_new_anon_rmap(folio, vma, addr, rmap_flags);
> +		} else {
> +			folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags);
> +		}
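For readers without the tree at hand: the folio-lock check David refers to
sits directly next to the large-folio warning quoted in the commit message.
The !anon path of __folio_add_anon_rmap() looks roughly like this (a sketch
recalled from mm/rmap.c of that period, so comment wording and surrounding
details may differ):

	if (unlikely(!folio_test_anon(folio))) {
		/* The check being discussed: the first (anon) mapping of a
		 * !anon folio requires the folio lock. */
		VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
		/* For a PTE-mapped large folio, only the single PTE is known
		 * to be exclusive, hence the warning for sub-PMD levels. */
		VM_WARN_ON_FOLIO(folio_test_large(folio) &&
				 level != RMAP_LEVEL_PMD, folio);
		__folio_set_anon(folio, vma, address,
				 !!(flags & RMAP_EXCLUSIVE));
	}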
On Thu, Jun 20, 2024 at 7:46 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 18.06.24 01:11, Barry Song wrote:

[...]

> > +		if (!folio_test_anon(folio)) {
> > +			VM_WARN_ON_ONCE(folio_test_large(folio));
>
> (comment applies to both cases)
>
> Thinking about Hugh's comment, we should likely add here:
>
> 	VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
>
> [the check we are removing from __folio_add_anon_rmap()]
>
> and document for folio_add_new_anon_rmap() in patch #1, that when
> dealing with folios that might be mapped concurrently by others, the
> folio lock must be held.

I assume you mean something like the following for patch #1?

diff --git a/mm/rmap.c b/mm/rmap.c
index df1a43295c85..20986b25f1b2 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1394,7 +1394,8 @@ void folio_add_anon_rmap_pmd(struct folio *folio, struct page *page,
  *
  * Like folio_add_anon_rmap_*() but must only be called on *new* folios.
  * This means the inc-and-test can be bypassed.
- * The folio does not have to be locked.
+ * The folio doesn't necessarily need to be locked while it's exclusive unless two threads
+ * map it concurrently. However, the folio must be locked if it's shared.
  *
  * If the folio is pmd-mappable, it is accounted as a THP.
  */
@@ -1406,6 +1407,7 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
 	int nr_pmdmapped = 0;
 
 	VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
+	VM_WARN_ON_FOLIO(!exclusive && !folio_test_locked(folio), folio);
 	VM_BUG_ON_VMA(address < vma->vm_start ||
 			address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
 	__folio_set_swapbacked(folio);
On 20.06.24 10:33, Barry Song wrote:
> On Thu, Jun 20, 2024 at 7:46 PM David Hildenbrand <david@redhat.com> wrote:

[...]

>> Thinking about Hugh's comment, we should likely add here:
>>
>> 	VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
>>
>> [the check we are removing from __folio_add_anon_rmap()]
>>
>> and document for folio_add_new_anon_rmap() in patch #1, that when
>> dealing with folios that might be mapped concurrently by others, the
>> folio lock must be held.
>
> I assume you mean something like the following for patch #1?
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index df1a43295c85..20986b25f1b2 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c

[...]

> @@ -1406,6 +1407,7 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
>  	int nr_pmdmapped = 0;
>
>  	VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
> +	VM_WARN_ON_FOLIO(!exclusive && !folio_test_locked(folio), folio);

For now this would likely do. I was concerned about a concurrent
scenario in the exclusive case, but that shouldn't really happen I guess.
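For context on the hunk above: the `exclusive` tested by the new assertion
is presumably a local derived from the rmap flags argument that patch #1
adds to folio_add_new_anon_rmap(). Schematically, the assertion would slot
in like this (a sketch; the variable names are assumed from the quoted hunk,
not copied from patch #1):

	void folio_add_new_anon_rmap(struct folio *folio,
			struct vm_area_struct *vma, unsigned long address,
			rmap_t flags)
	{
		const bool exclusive = flags & RMAP_EXCLUSIVE;
		const int nr = folio_nr_pages(folio);
		int nr_pmdmapped = 0;

		VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
		/* A shared (!exclusive) new folio may be mapped concurrently
		 * by others, so the caller must hold the folio lock. */
		VM_WARN_ON_FOLIO(!exclusive && !folio_test_locked(folio), folio);
		VM_BUG_ON_VMA(address < vma->vm_start ||
				address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
		__folio_set_swapbacked(folio);
		/* ... set folio anon, map all nr pages, update counters ... */
	}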
On Thu, Jun 20, 2024 at 8:49 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 20.06.24 10:33, Barry Song wrote:

[...]

> >  	VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
> > +	VM_WARN_ON_FOLIO(!exclusive && !folio_test_locked(folio), folio);
>
> For now this would likely do. I was concerned about a concurrent
> scenario in the exclusive case, but that shouldn't really happen I guess.

Since this is primarily a documentation update, I'll wait two or three
days to see whether any more linux-next reports come in before sending
v3, which will combine these fixes (I've already fixed another doc
warning reported by lkp), and I'll ask Andrew to drop v2 and apply v3.

> --
> Cheers,
>
> David / dhildenb

Thanks
Barry
On 20.06.24 11:59, Barry Song wrote:
> On Thu, Jun 20, 2024 at 8:49 PM David Hildenbrand <david@redhat.com> wrote:

[...]

> Since this is primarily a documentation update, I'll wait two or three
> days to see whether any more linux-next reports come in before sending
> v3, which will combine these fixes (I've already fixed another doc
> warning reported by lkp), and I'll ask Andrew to drop v2 and apply v3.

Feel free to send fixup patches for such small stuff (for example, as a
reply to this mail). Usually there is no need for a new series; Andrew
will squash all fixups before merging them into mm-stable.
> > Since this is primarily a documentation update, I'll wait two or three
> > days to see whether any more linux-next reports come in before sending
> > v3, which will combine these fixes (I've already fixed another doc
> > warning reported by lkp), and I'll ask Andrew to drop v2 and apply v3.
>
> Feel free to send fixup patches for such small stuff (for example, as a
> reply to this mail). Usually there is no need for a new series; Andrew
> will squash all fixups before merging them into mm-stable.

Hi Andrew,

Can you please squash this change (another one suggested by David)?

From: Barry Song <v-songbaohua@oppo.com>
Date: Sat, 22 Jun 2024 15:14:53 +1200
Subject: [PATCH] enhance doc- mm: use folio_add_new_anon_rmap() if
 folio_test_anon(folio)==false

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 mm/memory.c   | 1 +
 mm/swapfile.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 00728ea95583..982d81c83d49 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4346,6 +4346,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 		 * here, we have to be careful.
 		 */
 		VM_WARN_ON_ONCE(folio_test_large(folio));
+		VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
 		folio_add_new_anon_rmap(folio, vma, address, rmap_flags);
 	} else {
 		folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, address,
diff --git a/mm/swapfile.c b/mm/swapfile.c
index b99b9f397c1c..ace2440ec0b7 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1923,6 +1923,7 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 		 */
 		if (!folio_test_anon(folio)) {
 			VM_WARN_ON_ONCE(folio_test_large(folio));
+			VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
 			folio_add_new_anon_rmap(folio, vma, addr, rmap_flags);
 		} else {
 			folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags);
On Sat, 22 Jun 2024 15:20:02 +1200 Barry Song <21cnbao@gmail.com> wrote:

[...]

> Hi Andrew,
>
> Can you please squash this change (another one suggested by David)?

Sure, but...

> From: Barry Song <v-songbaohua@oppo.com>
> Date: Sat, 22 Jun 2024 15:14:53 +1200
> Subject: [PATCH] enhance doc- mm: use folio_add_new_anon_rmap() if
>  folio_test_anon(folio)==false

The only description we have here is "enhance doc".

> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4346,6 +4346,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  		 * here, we have to be careful.
>  		 */
>  		VM_WARN_ON_ONCE(folio_test_large(folio));
> +		VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);

And these aren't documentation changes. Please send along a small
changelog for this patch.
On Tue, Jun 25, 2024 at 11:25 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Sat, 22 Jun 2024 15:20:02 +1200 Barry Song <21cnbao@gmail.com> wrote:

[...]

> > From: Barry Song <v-songbaohua@oppo.com>
> > Date: Sat, 22 Jun 2024 15:14:53 +1200
> > Subject: [PATCH] enhance doc- mm: use folio_add_new_anon_rmap() if
> >  folio_test_anon(folio)==false
>
> The only description we have here is "enhance doc".

[...]

> > @@ -4346,6 +4346,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> >  		 * here, we have to be careful.
> >  		 */
> >  		VM_WARN_ON_ONCE(folio_test_large(folio));
> > +		VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
>
> And these aren't documentation changes. Please send along a small
> changelog for this patch.

Thanks for the suggestion. Could we have this in the changelog?

For a new anon folio (!folio_test_anon(folio)), there's a possibility
that multiple concurrent threads execute the "if (!anon)
folio_add_new_anon_rmap()" sequence in parallel. In such cases, the
threads should lock the folio before executing this sequence. We use
VM_WARN_ON_FOLIO() to verify that this condition holds.

Thanks
Barry
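The race that changelog describes can be spelled out as an interleaving (an
illustrative sketch only; the call sequence is simplified):

	/*
	 * Two tasks fault on swap PTEs that resolve to the same shared,
	 * not-yet-anon swapcache folio:
	 *
	 *   Task A                          Task B
	 *   ------                          ------
	 *   folio_lock(folio)               folio_lock(folio)  <- blocks
	 *   folio_test_anon() == false
	 *   folio_add_new_anon_rmap()
	 *   folio_unlock(folio)             lock acquired
	 *                                   folio_test_anon() == true
	 *                                   folio_add_anon_rmap_pte()
	 *
	 * Without the folio lock, both tasks could observe
	 * folio_test_anon() == false and run the "new anon" setup twice;
	 * VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio) flags callers
	 * that reach this path unlocked.
	 */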
diff --git a/mm/memory.c b/mm/memory.c
index 1f24ecdafe05..620654c13b2f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4339,6 +4339,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	if (unlikely(folio != swapcache && swapcache)) {
 		folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE);
 		folio_add_lru_vma(folio, vma);
+	} else if (!folio_test_anon(folio)) {
+		/*
+		 * We currently only expect small !anon folios, for which we know
+		 * that they are either fully exclusive or fully shared. If we
+		 * ever get large folios here, we have to be careful.
+		 */
+		VM_WARN_ON_ONCE(folio_test_large(folio));
+		folio_add_new_anon_rmap(folio, vma, address, rmap_flags);
 	} else {
 		folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, address,
 					 rmap_flags);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index ae1d2700f6a3..69efa1a57087 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1908,8 +1908,17 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 		VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
 		if (pte_swp_exclusive(old_pte))
 			rmap_flags |= RMAP_EXCLUSIVE;
-
-		folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags);
+		/*
+		 * We currently only expect small !anon folios, for which we know that
+		 * they are either fully exclusive or fully shared. If we ever get
+		 * large folios here, we have to be careful.
+		 */
+		if (!folio_test_anon(folio)) {
+			VM_WARN_ON_ONCE(folio_test_large(folio));
+			folio_add_new_anon_rmap(folio, vma, addr, rmap_flags);
+		} else {
+			folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags);
+		}
 	} else { /* ksm created a completely new copy */
 		folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
 		folio_add_lru_vma(folio, vma);
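With the fixup from this thread squashed in, the net result in unuse_pte()
should read roughly as follows (a sketch of the assumed end state; the
do_swap_page() branch ends up analogous):

	/*
	 * We currently only expect small !anon folios, for which we know
	 * that they are either fully exclusive or fully shared. If we
	 * ever get large folios here, we have to be careful.
	 */
	if (!folio_test_anon(folio)) {
		VM_WARN_ON_ONCE(folio_test_large(folio));
		/* Added by the fixup: a !anon folio reached here must be locked. */
		VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
		folio_add_new_anon_rmap(folio, vma, addr, rmap_flags);
	} else {
		folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags);
	}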