From patchwork Mon Sep 17 13:38:16 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 10602771
From: "Kirill A. Shutemov"
To: Andrew Morton
Cc: Vegard Nossum, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 "Kirill A. Shutemov", stable@vger.kernel.org, Zi Yan, Naoya Horiguchi,
 Vlastimil Babka, Andrea Arcangeli
Subject: [PATCHv2] mm, thp: Fix mlocking THP page with migration enabled
Date: Mon, 17 Sep 2018 16:38:16 +0300
Message-Id: <20180917133816.43995-1-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.18.0

A transparent huge page is represented by a single entry on an LRU list.
Therefore, we can only make unevictable an entire compound page, not
individual subpages.

If a user tries to mlock() part of a huge page, we want the rest of the
page to be reclaimable.

We handle this by keeping PTE-mapped huge pages on normal LRU lists: the
PMD on the border of a VM_LOCKED VMA will be split into a PTE table.

The introduction of THP migration breaks[1] the rules around mlocking THP
pages. If we had a single PMD mapping of the page in an mlocked VMA, the
page will get mlocked, regardless of PTE mappings of the page.

For tmpfs/shmem it's easy to fix by checking PageDoubleMap() in
remove_migration_pmd().

Anon THP pages can only be shared between processes via fork(). An
mlocked page can only be shared if the parent mlocked it before forking,
otherwise CoW will be triggered on mlock().

For Anon-THP, we can fix the issue by munlocking the page on removing the
PTE migration entry for the page. PTEs for the page will always come
after the mlocked PMD: rmap walks VMAs from oldest to newest.
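For illustration only (this is not part of the patch): the rules above can
be sketched as a small userspace toy model. Every name in it (toy_thp,
toy_vma, toy_restore_pmd(), toy_restore_pte(),
toy_mlocked_after_migration()) is hypothetical and only mimics the
decisions taken in remove_migration_pmd() and remove_migration_pte(); it
assumes a single THP, VMAs visited oldest first, and ignores rmap
accounting and locking entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a compound page; NOT the kernel's struct page. */
struct toy_thp {
	bool double_map;	/* page has both PMD and PTE mappings */
	bool mlocked;		/* page would sit on the unevictable LRU */
};

/* Toy stand-in for one VMA that maps the page. */
struct toy_vma {
	bool vm_locked;		/* VM_LOCKED is set on the VMA */
	bool pmd_mapped;	/* true: PMD mapping, false: PTE mapping */
};

/* Mimics the patched check in remove_migration_pmd(). */
static void toy_restore_pmd(struct toy_thp *page, const struct toy_vma *vma)
{
	assert(vma->pmd_mapped);
	if (vma->vm_locked && !page->double_map)
		page->mlocked = true;
}

/*
 * Mimics remove_migration_pte() for a THP: a PTE mapping never mlocks
 * the page (the !PageTransCompound() check), and the patch clears an
 * mlock left behind by an earlier PMD mapping.
 */
static void toy_restore_pte(struct toy_thp *page, const struct toy_vma *vma)
{
	assert(!vma->pmd_mapped);
	(void)vma;
	page->mlocked = false;
}

/* Visits the VMAs oldest first and reports whether the page ends up mlocked. */
static bool toy_mlocked_after_migration(struct toy_thp *page,
					const struct toy_vma *vmas, size_t n)
{
	page->mlocked = false;	/* the migration target starts out not mlocked */
	for (size_t i = 0; i < n; i++) {
		if (vmas[i].pmd_mapped)
			toy_restore_pmd(page, &vmas[i]);
		else
			toy_restore_pte(page, &vmas[i]);
	}
	return page->mlocked;
}
```

In this model, a parent with a locked PMD mapping followed by a child with
a PTE mapping leaves the page not mlocked, while a lone locked PMD mapping
still mlocks it.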

Test-case:

#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <linux/mempolicy.h>
#include <numaif.h>

int main(void)
{
	unsigned long nodemask = 4;
	void *addr;

	addr = mmap((void *)0x20000000UL, 2UL << 20, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);

	if (fork()) {
		wait(NULL);
		return 0;
	}

	mlock(addr, 4UL << 10);
	mbind(addr, 2UL << 20, MPOL_PREFERRED | MPOL_F_RELATIVE_NODES,
	      &nodemask, 4, MPOL_MF_MOVE);

	return 0;
}

[1] https://lkml.kernel.org/r/CAOMGZ=G52R-30rZvhGxEbkTw7rLLwBGadVYeo--iizcD3upL3A@mail.gmail.com

Signed-off-by: Kirill A. Shutemov
Reported-by: Vegard Nossum
Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path")
Cc: [v4.14+]
Cc: Zi Yan
Cc: Naoya Horiguchi
Cc: Vlastimil Babka
Cc: Andrea Arcangeli
Reviewed-by: Zi Yan
---
 mm/huge_memory.c | 2 +-
 mm/migrate.c     | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 533f9b00147d..00704060b7f7 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2931,7 +2931,7 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
 	else
 		page_add_file_rmap(new, true);
 	set_pmd_at(mm, mmun_start, pvmw->pmd, pmde);
-	if (vma->vm_flags & VM_LOCKED)
+	if ((vma->vm_flags & VM_LOCKED) && !PageDoubleMap(new))
 		mlock_vma_page(new);
 	update_mmu_cache_pmd(vma, address, pvmw->pmd);
 }
diff --git a/mm/migrate.c b/mm/migrate.c
index d6a2e89b086a..9d374011c244 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -275,6 +275,9 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 		if (vma->vm_flags & VM_LOCKED && !PageTransCompound(new))
 			mlock_vma_page(new);
 
+		if (PageTransHuge(page) && PageMlocked(page))
+			clear_page_mlock(page);
+
 		/* No need to invalidate - it was non-present before */
 		update_mmu_cache(vma, pvmw.address, pvmw.pte);
 	}