From patchwork Sun Dec 1 01:51:32 2019
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 11268267
Date: Sat, 30 Nov 2019 17:51:32 -0800
From: akpm@linux-foundation.org
To: akpm@linux-foundation.org, arnd@arndb.de, kirill.shutemov@linux.intel.com,
 linux-mm@kvack.org, mm-commits@vger.kernel.org, stable@vger.kernel.org,
 thellstrom@vmware.com, torvalds@linux-foundation.org, willy@infradead.org
Subject: [patch 045/158] mm/memory.c: fix a huge pud insertion race during faulting
Message-ID: <20191201015132.LxbEPLRfg%akpm@linux-foundation.org>

From: Thomas Hellstrom
Subject: mm/memory.c: fix a huge pud insertion race during faulting

A huge pud page can theoretically be faulted in racing with pmd_alloc()
in __handle_mm_fault().  That will lead to pmd_alloc() returning an
invalid pmd pointer.

Fix this by adding a pud_trans_unstable() function similar to
pmd_trans_unstable() and checking whether the pud is really stable before
using the pmd pointer.

Race:

  Thread 1:             Thread 2:             Comment
  create_huge_pud()                           Fallback - not taken.
                        create_huge_pud()     Taken.
  pmd_alloc()                                 Returns an invalid pointer.

This will result in user-visible huge page data corruption.

Note that this was caught during a code audit rather than as a problem
experienced in practice.  It looks to me like the only implementation
that currently creates huge pud pagetable entries is dev_dax_huge_fault(),
which doesn't appear to care much about private (COW) mappings or
write-tracking, which is, I believe, a prerequisite for create_huge_pud()
falling back in thread 1, but not in thread 2.

Link: http://lkml.kernel.org/r/20191115115808.21181-2-thomas_os@shipmail.org
Fixes: a00cc7d9dd93 ("mm, x86: add support for PUD-sized transparent hugepages")
Signed-off-by: Thomas Hellstrom
Acked-by: Kirill A. Shutemov
Cc: Arnd Bergmann
Cc: Matthew Wilcox
Cc:
Signed-off-by: Andrew Morton
---

 include/asm-generic/pgtable.h |   25 +++++++++++++++++++++++++
 mm/memory.c                   |    6 ++++++
 2 files changed, 31 insertions(+)

--- a/include/asm-generic/pgtable.h~mm-fix-a-huge-pud-insertion-race-during-faulting
+++ a/include/asm-generic/pgtable.h
@@ -938,6 +938,31 @@ static inline int pud_trans_huge(pud_t p
 }
 #endif
 
+/* See pmd_none_or_trans_huge_or_clear_bad for discussion. */
+static inline int pud_none_or_trans_huge_or_dev_or_clear_bad(pud_t *pud)
+{
+	pud_t pudval = READ_ONCE(*pud);
+
+	if (pud_none(pudval) || pud_trans_huge(pudval) || pud_devmap(pudval))
+		return 1;
+	if (unlikely(pud_bad(pudval))) {
+		pud_clear_bad(pud);
+		return 1;
+	}
+	return 0;
+}
+
+/* See pmd_trans_unstable for discussion. */
+static inline int pud_trans_unstable(pud_t *pud)
+{
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) &&			\
+	defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
+	return pud_none_or_trans_huge_or_dev_or_clear_bad(pud);
+#else
+	return 0;
+#endif
+}
+
 #ifndef pmd_read_atomic
 static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
 {
--- a/mm/memory.c~mm-fix-a-huge-pud-insertion-race-during-faulting
+++ a/mm/memory.c
@@ -4010,6 +4010,7 @@ static vm_fault_t __handle_mm_fault(stru
 	vmf.pud = pud_alloc(mm, p4d, address);
 	if (!vmf.pud)
 		return VM_FAULT_OOM;
+retry_pud:
 	if (pud_none(*vmf.pud) && __transparent_hugepage_enabled(vma)) {
 		ret = create_huge_pud(&vmf);
 		if (!(ret & VM_FAULT_FALLBACK))
@@ -4036,6 +4037,11 @@ static vm_fault_t __handle_mm_fault(stru
 	vmf.pmd = pmd_alloc(mm, vmf.pud, address);
 	if (!vmf.pmd)
 		return VM_FAULT_OOM;
+
+	/* Huge pud page fault raced with pmd_alloc? */
+	if (pud_trans_unstable(vmf.pud))
+		goto retry_pud;
+
 	if (pmd_none(*vmf.pmd) && __transparent_hugepage_enabled(vma)) {
 		ret = create_huge_pmd(&vmf);
 		if (!(ret & VM_FAULT_FALLBACK))