diff mbox series

[078/262] Documentation: update pagemap with shmem exceptions

Message ID 20211105203844.xnHgY8CWB%akpm@linux-foundation.org (mailing list archive)
State New
Headers show
Series [001/262] scripts/spelling.txt: add more spellings to spelling.txt | expand

Commit Message

Andrew Morton Nov. 5, 2021, 8:38 p.m. UTC
From: Tiberiu A Georgescu <tiberiu.georgescu@nutanix.com>
Subject: Documentation: update pagemap with shmem exceptions

This patch follows the discussions on previous documentation patch threads
[1][2].  It presents the exception case of shared memory management from
the pagemap's point of view.  It briefly describes what is missing, why it
is missing and alternatives to the pagemap for page info retrieval in user
space.

In short, the kernel does not keep track of PTEs for swapped out shared
pages within the processes that references them.  Thus, the
proc/pid/pagemap tool cannot print the swap destination of the shared
memory pages, instead setting the pagemap entry to zero for both
non-allocated and swapped out pages.  This can create confusion for users
who need information on swapped out pages.

The reasons why maintaining the PTEs of all swapped out shared pages among
all processes while maintaining similar performance is not a trivial task,
or a desirable change, have been discussed extensively [1][3][4][5]. 
There are also arguments for why this arguably missing information should
eventually be exposed to the user in either a future pagemap patch, or by
an alternative tool.

[1]: https://marc.info/?m=162878395426774
[2]: https://lore.kernel.org/lkml/20210920164931.175411-1-tiberiu.georgescu@nutanix.com/
[3]: https://lore.kernel.org/lkml/20210730160826.63785-1-tiberiu.georgescu@nutanix.com/
[4]: https://lore.kernel.org/lkml/20210807032521.7591-1-peterx@redhat.com/
[5]: https://lore.kernel.org/lkml/20210715201651.212134-1-peterx@redhat.com/


Mention the current missing information in the pagemap and alternatives on
how to retrieve it, in case someone stumbles upon unexpected behaviour.

Link: https://lkml.kernel.org/r/20210923064618.157046-1-tiberiu.georgescu@nutanix.com
Link: https://lkml.kernel.org/r/20210923064618.157046-2-tiberiu.georgescu@nutanix.com
Signed-off-by: Tiberiu A Georgescu <tiberiu.georgescu@nutanix.com>
Reviewed-by: Ivan Teterevkov <ivan.teterevkov@nutanix.com>
Reviewed-by: Florian Schmidt <florian.schmidt@nutanix.com>
Reviewed-by: Carl Waldspurger <carl.waldspurger@nutanix.com>
Reviewed-by: Jonathan Davies <jonathan.davies@nutanix.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/pagemap.rst |   22 +++++++++++++++++++++
 1 file changed, 22 insertions(+)
diff mbox series

Patch

--- a/Documentation/admin-guide/mm/pagemap.rst~documentation-update-pagemap-with-shmem-exceptions
+++ a/Documentation/admin-guide/mm/pagemap.rst
@@ -196,6 +196,28 @@  you can go through every map in the proc
 in kpagecount, and tally up the number of pages that are only referenced
 once.
 
+Exceptions for Shared Memory
+============================
+
+Page table entries for shared pages are cleared when the pages are zapped or
+swapped out. This makes swapped out pages indistinguishable from never-allocated
+ones.
+
+In kernel space, the swap location can still be retrieved from the page cache.
+However, values stored only on the normal PTE get lost irretrievably when the
+page is swapped out (i.e. SOFT_DIRTY).
+
+In user space, whether the page is present, swapped or none can be deduced with
+the help of lseek and/or mincore system calls.
+
+lseek() can differentiate between accessed pages (present or swapped out) and
+holes (none/non-allocated) by specifying the SEEK_DATA flag on the file where
+the pages are backed. For anonymous shared pages, the file can be found in
+``/proc/pid/map_files/``.
+
+mincore() can differentiate between pages in memory (present, including swap
+cache) and out of memory (swapped out or none/non-allocated).
+
 Other notes
 ===========