Message ID | 20250224165603.1434404-18-david@redhat.com (mailing list archive) |
---|---|
State | New |
Series | mm: MM owner tracking for large folios (!hugetlb) + CONFIG_NO_PAGE_MAPCOUNT |
On 24.02.25 17:55, David Hildenbrand wrote:
> Let's implement an alternative when per-page mapcounts in large folios are
> no longer maintained -- soon with CONFIG_NO_PAGE_MAPCOUNT.
>
> PM_MMAP_EXCLUSIVE will now be set only if folio_maybe_mapped_shared() is
> false. It stays clear when the folio is considered "mapped shared",
> including when it once was "mapped shared" but no longer is, as documented.
>
> This might result in an under-indication of "exclusively mapped", which
> is considered better than over-indicating it: under-estimating the USS
> (Unique Set Size) is better than over-estimating it.
>
> As an alternative, we could simply remove that flag with
> CONFIG_NO_PAGE_MAPCOUNT completely, but there might be value to it. So,
> let's keep it like that and document the behavior.
>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  Documentation/admin-guide/mm/pagemap.rst |  9 +++++++++
>  fs/proc/task_mmu.c                       | 11 +++++++++--
>  2 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/mm/pagemap.rst b/Documentation/admin-guide/mm/pagemap.rst
> index 49590306c61a0..131c86574c39a 100644
> --- a/Documentation/admin-guide/mm/pagemap.rst
> +++ b/Documentation/admin-guide/mm/pagemap.rst
> @@ -37,6 +37,15 @@ There are four components to pagemap:
>      precisely which pages are mapped (or in swap) and comparing mapped
>      pages between processes.
>
> +   Note that in some kernel configurations, all pages part of a larger
> +   allocation (e.g., THP) might be considered "mapped shared" if the large
> +   allocation is considered "mapped shared": if not all pages are exclusive to
> +   the same process. Further, some kernel configurations might consider larger
> +   allocations "mapped shared", if they were at one point considered
> +   "mapped shared", even if they would now be considered "exclusively mapped".
> +   Consequently, in these kernel configurations, bit 56 might be set although
> +   the page is actually "exclusively mapped"

I rewrote this yet another time to maybe make it clearer ...

+   Traditionally, bit 56 indicates that a page is mapped exactly once and bit
+   56 is clear when a page is mapped multiple times, even when mapped in the
+   same process multiple times. In some kernel configurations, the semantics
+   for pages part of a larger allocation (e.g., THP) differ: bit 56 is set if
+   all pages part of the corresponding large allocation are *certainly* mapped
+   in the same process, even if the page is mapped multiple times in that
+   process. Bit 56 is clear when any page of the larger allocation
+   is *maybe* mapped in a different process. In some cases, a large allocation
+   might be treated as "maybe mapped by multiple processes" even though this
+   is no longer the case.

(talking about "process" is not completely correct, it's actually "MMs"; but
that might add more confusion here)
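For readers who want to see what userspace actually observes, here is a minimal sketch of decoding bit 56 from a pagemap entry. The bit positions (63 for "present", 56 for "exclusively mapped") come from pagemap.rst; the helper names pm_present(), pm_exclusive() and pm_read_entry() are made up for illustration and are not part of any kernel or libc API. The sketch assumes a Linux system with a readable /proc/self/pagemap.

```c
#define _XOPEN_SOURCE 700	/* for pread() under strict -std= modes */
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bit positions as documented in Documentation/admin-guide/mm/pagemap.rst. */
#define PM_BIT_MMAP_EXCLUSIVE	56
#define PM_BIT_PRESENT		63

/* Hypothetical helper: is the page present in RAM? */
static inline bool pm_present(uint64_t entry)
{
	return (entry >> PM_BIT_PRESENT) & 1;
}

/*
 * Hypothetical helper: does the kernel report the page as exclusively
 * mapped? The kernel only sets bit 56 for present pages, so check both.
 * With the semantics discussed above, a clear bit does NOT prove that the
 * page is shared: it may only mean "maybe mapped by another MM".
 */
static inline bool pm_exclusive(uint64_t entry)
{
	return pm_present(entry) && ((entry >> PM_BIT_MMAP_EXCLUSIVE) & 1);
}

/*
 * Hypothetical helper: read the 64-bit pagemap entry that covers @addr in
 * the calling process. Returns 0 on success, -1 on failure.
 */
static inline int pm_read_entry(const void *addr, uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t offset;
	ssize_t n;
	int fd;

	if (page_size <= 0)
		return -1;

	/* One 8-byte entry per virtual page, indexed by virtual page number. */
	offset = (off_t)((uintptr_t)addr / (uintptr_t)page_size) *
		 (off_t)sizeof(*entry);

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	n = pread(fd, entry, sizeof(*entry), offset);
	close(fd);

	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

A caller would touch a page, pass its address to pm_read_entry(), and test the result with pm_exclusive(). For a page that is part of a THP, the answer may then differ between kernel configurations as described above.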
diff --git a/Documentation/admin-guide/mm/pagemap.rst b/Documentation/admin-guide/mm/pagemap.rst
index 49590306c61a0..131c86574c39a 100644
--- a/Documentation/admin-guide/mm/pagemap.rst
+++ b/Documentation/admin-guide/mm/pagemap.rst
@@ -37,6 +37,15 @@ There are four components to pagemap:
    precisely which pages are mapped (or in swap) and comparing mapped
    pages between processes.
 
+   Note that in some kernel configurations, all pages part of a larger
+   allocation (e.g., THP) might be considered "mapped shared" if the large
+   allocation is considered "mapped shared": if not all pages are exclusive to
+   the same process. Further, some kernel configurations might consider larger
+   allocations "mapped shared", if they were at one point considered
+   "mapped shared", even if they would now be considered "exclusively mapped".
+   Consequently, in these kernel configurations, bit 56 might be set although
+   the page is actually "exclusively mapped"
+
    Efficient users of this interface will use ``/proc/pid/maps`` to
    determine which areas of memory are actually mapped and llseek to
    skip over unmapped regions.
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 2bddcea65cbf1..80839bbf9657f 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1651,6 +1651,13 @@ static int add_to_pagemap(pagemap_entry_t *pme, struct pagemapread *pm)
 	return 0;
 }
 
+static bool __folio_page_mapped_exclusively(struct folio *folio, struct page *page)
+{
+	if (IS_ENABLED(CONFIG_PAGE_MAPCOUNT))
+		return folio_precise_page_mapcount(folio, page) == 1;
+	return !folio_maybe_mapped_shared(folio);
+}
+
 static int pagemap_pte_hole(unsigned long start, unsigned long end,
 			    __always_unused int depth, struct mm_walk *walk)
 {
@@ -1739,7 +1746,7 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
 		if (!folio_test_anon(folio))
 			flags |= PM_FILE;
 		if ((flags & PM_PRESENT) &&
-		    folio_precise_page_mapcount(folio, page) == 1)
+		    __folio_page_mapped_exclusively(folio, page))
 			flags |= PM_MMAP_EXCLUSIVE;
 	}
 	if (vma->vm_flags & VM_SOFTDIRTY)
@@ -1814,7 +1821,7 @@ static int pagemap_pmd_range(pmd_t *pmdp, unsigned long addr, unsigned long end,
 			pagemap_entry_t pme;
 
 			if (folio && (flags & PM_PRESENT) &&
-			    folio_precise_page_mapcount(folio, page + idx) == 1)
+			    __folio_page_mapped_exclusively(folio, page))
 				cur_flags |= PM_MMAP_EXCLUSIVE;
 
 			pme = make_pme(frame, cur_flags);
Let's implement an alternative when per-page mapcounts in large folios are
no longer maintained -- soon with CONFIG_NO_PAGE_MAPCOUNT.

PM_MMAP_EXCLUSIVE will now be set only if folio_maybe_mapped_shared() is
false. It stays clear when the folio is considered "mapped shared",
including when it once was "mapped shared" but no longer is, as documented.

This might result in an under-indication of "exclusively mapped", which
is considered better than over-indicating it: under-estimating the USS
(Unique Set Size) is better than over-estimating it.

As an alternative, we could simply remove that flag with
CONFIG_NO_PAGE_MAPCOUNT completely, but there might be value to it. So,
let's keep it like that and document the behavior.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 Documentation/admin-guide/mm/pagemap.rst |  9 +++++++++
 fs/proc/task_mmu.c                       | 11 +++++++++--
 2 files changed, 18 insertions(+), 2 deletions(-)
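To illustrate why under-indicating "exclusively mapped" leads to an under-estimated USS, here is a rough sketch of how a tool might derive a USS figure from pagemap entries. It is an assumption that a tool works this way; the function name uss_estimate_bytes() and the mask names are hypothetical, and only the bit positions (63 and 56) come from pagemap.rst.

```c
#include <stddef.h>
#include <stdint.h>

/* Masks for the documented pagemap bits 63 (present) and 56 (exclusive). */
#define PM_PRESENT_MASK		(1ULL << 63)
#define PM_MMAP_EXCLUSIVE_MASK	(1ULL << 56)

/*
 * Hypothetical helper: estimate USS in bytes from @nr pagemap entries by
 * counting pages that are both present and reported as exclusively mapped.
 * A page whose bit 56 is clear only because its large folio is "maybe
 * mapped shared" is not counted, so the estimate can only be too low,
 * never too high.
 */
static inline size_t uss_estimate_bytes(const uint64_t *entries, size_t nr,
					size_t page_size)
{
	const uint64_t want = PM_PRESENT_MASK | PM_MMAP_EXCLUSIVE_MASK;
	size_t exclusive_pages = 0;

	for (size_t i = 0; i < nr; i++) {
		if ((entries[i] & want) == want)
			exclusive_pages++;
	}

	return exclusive_pages * page_size;
}
```

With four entries of which two have both bits set, and a 4096-byte page size, the estimate is 8192 bytes. If one of those two pages sat in a folio that was once shared, its bit 56 would be clear and the estimate would drop to 4096 bytes.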