[132/192] mm/rmap: split try_to_munlock from try_to_unmap

Message ID: 20210701015412.snaTuyrO6%akpm@linux-foundation.org
State: New

Headers:

DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org BE2126105A
Date: Wed, 30 Jun 2021 18:54:12 -0700
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, apopple@nvidia.com, bskeggs@redhat.com,
 hch@lst.de, hughd@google.com, jgg@nvidia.com, jhubbard@nvidia.com,
 linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com,
 rcampbell@nvidia.com, shakeelb@google.com,
 torvalds@linux-foundation.org, willy@infradead.org
Subject: [patch 132/192] mm/rmap: split try_to_munlock from
 try_to_unmap
Message-ID: <20210701015412.snaTuyrO6%akpm@linux-foundation.org>
In-Reply-To: <20210630184624.9ca1937310b0dd5ce66b30e7@linux-foundation.org>
User-Agent: s-nail v14.8.16
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Commit Message

Andrew Morton July 1, 2021, 1:54 a.m. UTC

From: Alistair Popple <apopple@nvidia.com>
Subject: mm/rmap: split try_to_munlock from try_to_unmap

The behaviour of try_to_unmap_one() is difficult to follow because it
performs different operations based on a fairly large set of flags used in
different combinations.

TTU_MUNLOCK is one such flag.  However, it is used exclusively by
try_to_munlock(), which specifies no other flags.  Therefore, rather than
overloading try_to_unmap_one() with unrelated behaviour, split this out into
its own function and remove the flag.

Link: https://lkml.kernel.org/r/20210616105937.23201-4-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
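
Illustrative note, not part of the patch: the rmap walker treats the
callback's return value as "keep scanning", so the new page_mlock_one()
returns true for VMAs that cannot keep the page mlocked and false once a
VM_LOCKED mapping has been found.  The sketch below is a minimal userspace
model of that contract.  All toy_* names and TOY_VM_LOCKED are hypothetical
stand-ins; the VMA list is a plain array rather than the real anon_vma or
address_space structures, and the page-table walk, the recheck under the ptl
and the PTE-mapped THP exception are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_VM_LOCKED 0x1UL	/* hypothetical stand-in for VM_LOCKED */

struct toy_page {
	bool mlocked;		/* models PG_mlocked */
};

struct toy_vma {
	unsigned long vm_flags;
	bool maps_page;		/* models page_vma_mapped_walk() finding a mapping */
};

/* Callback contract: return true to keep walking, false to stop. */
typedef bool (*toy_rmap_one_t)(struct toy_page *page, struct toy_vma *vma);

/* Models page_mlock_one(): mlock the page if a locked VMA maps it. */
static bool toy_page_mlock_one(struct toy_page *page, struct toy_vma *vma)
{
	/* An unlocked VMA has nothing to lock: continue the scan. */
	if (!(vma->vm_flags & TOY_VM_LOCKED))
		return true;

	/* A locked VMA that does not map the page: continue the scan. */
	if (!vma->maps_page)
		return true;

	page->mlocked = true;	/* models mlock_vma_page() */
	return false;		/* one locked mapping is enough: stop */
}

/* Models rmap_walk(): visit VMAs until the callback asks to stop. */
static size_t toy_rmap_walk(struct toy_page *page, struct toy_vma *vmas,
			    size_t nr, toy_rmap_one_t rmap_one)
{
	size_t visited = 0;

	for (size_t i = 0; i < nr; i++) {
		visited++;
		if (!rmap_one(page, &vmas[i]))
			break;
	}
	return visited;
}

/*
 * Models the munlock path: PG_mlocked is cleared up front, then the
 * walk restores it if another locked VMA still maps the page.
 * Returns the number of VMAs visited.
 */
static size_t toy_munlock_page(struct toy_page *page, struct toy_vma *vmas,
			       size_t nr)
{
	assert(page != NULL && vmas != NULL);
	page->mlocked = false;
	return toy_rmap_walk(page, vmas, nr, toy_page_mlock_one);
}
```

In this model the walk must visit every VMA before concluding that the page
is not mlocked, but it can stop at the first locked VMA that maps the page,
which mirrors the behaviour described in the documentation hunk of this patch.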

 Documentation/vm/unevictable-lru.rst |   33 ++++--------
 include/linux/rmap.h                 |    3 -
 mm/mlock.c                           |   12 ++--
 mm/rmap.c                            |   66 ++++++++++++++++++-------
 4 files changed, 69 insertions(+), 45 deletions(-)
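
Illustrative note, not part of the patch: the unevictable-lru.rst text
touched here describes the ordering in munlock_vma_page(): PG_mlocked is
cleared first, the page is then isolated from the LRU, and only if isolation
succeeds (and the page has more than one mapping) is page_mlock() called to
restore the flag.  A rough userspace sketch of that ordering is shown below.
Every demo_* name is hypothetical, the rmap walk is reduced to a single
boolean, and accounting, locking and the early return for pages that were
not mlocked to begin with are ignored.

```c
#include <stdbool.h>

struct demo_page {
	bool mlocked;		/* models PG_mlocked */
	bool on_lru;		/* isolation only succeeds if still on the LRU */
	int mapcount;		/* models page_mapcount() */
	bool other_locked_vma;	/* what the rmap walk would discover */
};

enum demo_outcome {
	DEMO_STAYS_MLOCKED,	/* another VM_LOCKED VMA re-mlocked the page */
	DEMO_MADE_EVICTABLE,	/* no locked VMA left, page is evictable */
	DEMO_LEFT_FOR_VMSCAN,	/* isolation failed, vmscan sorts it out later */
};

/* Models isolate_lru_page(): fails if something else took the page. */
static bool demo_isolate_lru_page(struct demo_page *page)
{
	if (!page->on_lru)
		return false;
	page->on_lru = false;
	return true;
}

/* Models page_mlock(): restore PG_mlocked if a locked VMA maps the page. */
static void demo_page_mlock(struct demo_page *page)
{
	if (page->other_locked_vma)
		page->mlocked = true;
}

/* Models the ordering in munlock_vma_page(). */
static enum demo_outcome demo_munlock_vma_page(struct demo_page *page)
{
	/* Clear the flag up front: this may be the only chance to do so. */
	page->mlocked = false;

	if (!demo_isolate_lru_page(page))
		return DEMO_LEFT_FOR_VMSCAN;

	/* With a single mapping there are no other VMAs to check. */
	if (page->mapcount > 1)
		demo_page_mlock(page);

	page->on_lru = true;	/* models putback_lru_page() */
	return page->mlocked ? DEMO_STAYS_MLOCKED : DEMO_MADE_EVICTABLE;
}
```

The third outcome is the case the documentation calls out: a potentially
mlocked page is left on the LRU and is caught later if vmscan tries to
reclaim it.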

--- a/Documentation/vm/unevictable-lru.rst~mm-rmap-split-try_to_munlock-from-try_to_unmap
+++ a/Documentation/vm/unevictable-lru.rst
@@ -389,14 +389,14 @@  mlocked, munlock_vma_page() updates that
 mlocked pages.  Note, however, that at this point we haven't checked whether
 the page is mapped by other VM_LOCKED VMAs.
 
-We can't call try_to_munlock(), the function that walks the reverse map to
+We can't call page_mlock(), the function that walks the reverse map to
 check for other VM_LOCKED VMAs, without first isolating the page from the LRU.
-try_to_munlock() is a variant of try_to_unmap() and thus requires that the page
+page_mlock() is a variant of try_to_unmap() and thus requires that the page
 not be on an LRU list [more on these below].  However, the call to
-isolate_lru_page() could fail, in which case we couldn't try_to_munlock().  So,
+isolate_lru_page() could fail, in which case we can't call page_mlock().  So,
 we go ahead and clear PG_mlocked up front, as this might be the only chance we
-have.  If we can successfully isolate the page, we go ahead and
-try_to_munlock(), which will restore the PG_mlocked flag and update the zone
+have.  If we can successfully isolate the page, we go ahead and call
+page_mlock(), which will restore the PG_mlocked flag and update the zone
 page statistics if it finds another VMA holding the page mlocked.  If we fail
 to isolate the page, we'll have left a potentially mlocked page on the LRU.
 This is fine, because we'll catch it later if and if vmscan tries to reclaim
@@ -545,31 +545,24 @@  munlock or munmap system calls, mm teard
 holepunching, and truncation of file pages and their anonymous COWed pages.
 
 
-try_to_munlock() Reverse Map Scan
+page_mlock() Reverse Map Scan
 ---------------------------------
 
-.. warning::
-   [!] TODO/FIXME: a better name might be page_mlocked() - analogous to the
-   page_referenced() reverse map walker.
-
 When munlock_vma_page() [see section :ref:`munlock()/munlockall() System Call
 Handling <munlock_munlockall_handling>` above] tries to munlock a
 page, it needs to determine whether or not the page is mapped by any
 VM_LOCKED VMA without actually attempting to unmap all PTEs from the
 page.  For this purpose, the unevictable/mlock infrastructure
-introduced a variant of try_to_unmap() called try_to_munlock().
+introduced a variant of try_to_unmap() called page_mlock().
 
-try_to_munlock() calls the same functions as try_to_unmap() for anonymous and
-mapped file and KSM pages with a flag argument specifying unlock versus unmap
-processing.  Again, these functions walk the respective reverse maps looking
-for VM_LOCKED VMAs.  When such a VMA is found, as in the try_to_unmap() case,
-the functions mlock the page via mlock_vma_page() and return SWAP_MLOCK.  This
-undoes the pre-clearing of the page's PG_mlocked done by munlock_vma_page.
+page_mlock() walks the respective reverse maps looking for VM_LOCKED VMAs. When
+such a VMA is found the page is mlocked via mlock_vma_page(). This undoes the
+pre-clearing of the page's PG_mlocked done by munlock_vma_page.
 
-Note that try_to_munlock()'s reverse map walk must visit every VMA in a page's
+Note that page_mlock()'s reverse map walk must visit every VMA in a page's
 reverse map to determine that a page is NOT mapped into any VM_LOCKED VMA.
 However, the scan can terminate when it encounters a VM_LOCKED VMA.
-Although try_to_munlock() might be called a great many times when munlocking a
+Although page_mlock() might be called a great many times when munlocking a
 large region or tearing down a large address space that has been mlocked via
 mlockall(), overall this is a fairly rare event.
 
@@ -602,7 +595,7 @@  inactive lists to the appropriate node's
 shrink_inactive_list() should only see SHM_LOCK'd pages that became SHM_LOCK'd
 after shrink_active_list() had moved them to the inactive list, or pages mapped
 into VM_LOCKED VMAs that munlock_vma_page() couldn't isolate from the LRU to
-recheck via try_to_munlock().  shrink_inactive_list() won't notice the latter,
+recheck via page_mlock().  shrink_inactive_list() won't notice the latter,
 but will pass on to shrink_page_list().
 
 shrink_page_list() again culls obviously unevictable pages that it could
--- a/include/linux/rmap.h~mm-rmap-split-try_to_munlock-from-try_to_unmap
+++ a/include/linux/rmap.h
@@ -87,7 +87,6 @@  struct anon_vma_chain {
 
 enum ttu_flags {
 	TTU_MIGRATION		= 0x1,	/* migration mode */
-	TTU_MUNLOCK		= 0x2,	/* munlock mode */
 
 	TTU_SPLIT_HUGE_PMD	= 0x4,	/* split huge PMD if any */
 	TTU_IGNORE_MLOCK	= 0x8,	/* ignore mlock */
@@ -240,7 +239,7 @@  int page_mkclean(struct page *);
  * called in munlock()/munmap() path to check for other vmas holding
  * the page mlocked.
  */
-void try_to_munlock(struct page *);
+void page_mlock(struct page *page);
 
 void remove_migration_ptes(struct page *old, struct page *new, bool locked);
 
--- a/mm/mlock.c~mm-rmap-split-try_to_munlock-from-try_to_unmap
+++ a/mm/mlock.c
@@ -108,7 +108,7 @@  void mlock_vma_page(struct page *page)
 /*
  * Finish munlock after successful page isolation
  *
- * Page must be locked. This is a wrapper for try_to_munlock()
+ * Page must be locked. This is a wrapper for page_mlock()
  * and putback_lru_page() with munlock accounting.
  */
 static void __munlock_isolated_page(struct page *page)
@@ -118,7 +118,7 @@  static void __munlock_isolated_page(stru
 	 * and we don't need to check all the other vmas.
 	 */
 	if (page_mapcount(page) > 1)
-		try_to_munlock(page);
+		page_mlock(page);
 
 	/* Did try_to_unlock() succeed or punt? */
 	if (!PageMlocked(page))
@@ -158,7 +158,7 @@  static void __munlock_isolation_failed(s
  * munlock()ed or munmap()ed, we want to check whether other vmas hold the
  * page locked so that we can leave it on the unevictable lru list and not
  * bother vmscan with it.  However, to walk the page's rmap list in
- * try_to_munlock() we must isolate the page from the LRU.  If some other
+ * page_mlock() we must isolate the page from the LRU.  If some other
  * task has removed the page from the LRU, we won't be able to do that.
  * So we clear the PageMlocked as we might not get another chance.  If we
  * can't isolate the page, we leave it for putback_lru_page() and vmscan
@@ -168,7 +168,7 @@  unsigned int munlock_vma_page(struct pag
 {
 	int nr_pages;
 
-	/* For try_to_munlock() and to serialize with page migration */
+	/* For page_mlock() and to serialize with page migration */
 	BUG_ON(!PageLocked(page));
 	VM_BUG_ON_PAGE(PageTail(page), page);
 
@@ -205,7 +205,7 @@  static int __mlock_posix_error_return(lo
  *
  * The fast path is available only for evictable pages with single mapping.
  * Then we can bypass the per-cpu pvec and get better performance.
- * when mapcount > 1 we need try_to_munlock() which can fail.
+ * when mapcount > 1 we need page_mlock() which can fail.
  * when !page_evictable(), we need the full redo logic of putback_lru_page to
  * avoid leaving evictable page in unevictable list.
  *
@@ -414,7 +414,7 @@  static unsigned long __munlock_pagevec_f
  *
  * We don't save and restore VM_LOCKED here because pages are
  * still on lru.  In unmap path, pages might be scanned by reclaim
- * and re-mlocked by try_to_{munlock|unmap} before we unmap and
+ * and re-mlocked by page_mlock/try_to_unmap before we unmap and
  * free them.  This will result in freeing mlocked pages.
  */
 void munlock_vma_pages_range(struct vm_area_struct *vma,
--- a/mm/rmap.c~mm-rmap-split-try_to_munlock-from-try_to_unmap
+++ a/mm/rmap.c
@@ -1411,10 +1411,6 @@  static bool try_to_unmap_one(struct page
 	if (flags & TTU_SYNC)
 		pvmw.flags = PVMW_SYNC;
 
-	/* munlock has nothing to gain from examining un-locked vmas */
-	if ((flags & TTU_MUNLOCK) && !(vma->vm_flags & VM_LOCKED))
-		return true;
-
 	if (IS_ENABLED(CONFIG_MIGRATION) && (flags & TTU_MIGRATION) &&
 	    is_zone_device_page(page) && !is_device_private_page(page))
 		return true;
@@ -1476,8 +1472,6 @@  static bool try_to_unmap_one(struct page
 				page_vma_mapped_walk_done(&pvmw);
 				break;
 			}
-			if (flags & TTU_MUNLOCK)
-				continue;
 		}
 
 		/* Unexpected PMD-mapped THP? */
@@ -1790,20 +1784,58 @@  void try_to_unmap(struct page *page, enu
 		rmap_walk(page, &rwc);
 }
 
+/*
+ * Walks the vma's mapping a page and mlocks the page if any locked vma's are
+ * found. Once one is found the page is locked and the scan can be terminated.
+ */
+static bool page_mlock_one(struct page *page, struct vm_area_struct *vma,
+				 unsigned long address, void *unused)
+{
+	struct page_vma_mapped_walk pvmw = {
+		.page = page,
+		.vma = vma,
+		.address = address,
+	};
+
+	/* An un-locked vma doesn't have any pages to lock, continue the scan */
+	if (!(vma->vm_flags & VM_LOCKED))
+		return true;
+
+	while (page_vma_mapped_walk(&pvmw)) {
+		/*
+		 * Need to recheck under the ptl to serialise with
+		 * __munlock_pagevec_fill() after VM_LOCKED is cleared in
+		 * munlock_vma_pages_range().
+		 */
+		if (vma->vm_flags & VM_LOCKED) {
+			/* PTE-mapped THP are never mlocked */
+			if (!PageTransCompound(page))
+				mlock_vma_page(page);
+			page_vma_mapped_walk_done(&pvmw);
+		}
+
+		/*
+		 * no need to continue scanning other vma's if the page has
+		 * been locked.
+		 */
+		return false;
+	}
+
+	return true;
+}
+
 /**
- * try_to_munlock - try to munlock a page
- * @page: the page to be munlocked
+ * page_mlock - try to mlock a page
+ * @page: the page to be mlocked
  *
- * Called from munlock code.  Checks all of the VMAs mapping the page
- * to make sure nobody else has this page mlocked. The page will be
- * returned with PG_mlocked cleared if no other vmas have it mlocked.
+ * Called from munlock code. Checks all of the VMAs mapping the page and mlocks
+ * the page if any are found. The page will be returned with PG_mlocked cleared
+ * if it is not mapped by any locked vmas.
  */
-
-void try_to_munlock(struct page *page)
+void page_mlock(struct page *page)
 {
 	struct rmap_walk_control rwc = {
-		.rmap_one = try_to_unmap_one,
-		.arg = (void *)TTU_MUNLOCK,
+		.rmap_one = page_mlock_one,
 		.done = page_not_mapped,
 		.anon_lock = page_lock_anon_vma_read,
 
@@ -1855,7 +1887,7 @@  static struct anon_vma *rmap_walk_anon_l
  * Find all the mappings of a page using the mapping pointer and the vma chains
  * contained in the anon_vma struct it points to.
  *
- * When called from try_to_munlock(), the mmap_lock of the mm containing the vma
+ * When called from page_mlock(), the mmap_lock of the mm containing the vma
  * where the page was found will be held for write.  So, we won't recheck
  * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
  * LOCKED.
@@ -1908,7 +1940,7 @@  static void rmap_walk_anon(struct page *
  * Find all the mappings of a page using the mapping pointer and the vma chains
  * contained in the address_space struct it points to.
  *
- * When called from try_to_munlock(), the mmap_lock of the mm containing the vma
+ * When called from page_mlock(), the mmap_lock of the mm containing the vma
  * where the page was found will be held for write.  So, we won't recheck
  * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
  * LOCKED.
