From patchwork Tue Dec 6 14:47:04 2022
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13065893
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: Andrew Morton, Hugh Dickins, John Hubbard, Jason Gunthorpe,
 Mike Rapoport, Yang Shi, Vlastimil Babka, Nadav Amit, Andrea Arcangeli,
 Peter Xu, linux-mm@kvack.org, x86@kernel.org,
 linux-alpha@vger.kernel.org, linux-snps-arc@lists.infradead.org,
 linux-arm-kernel@lists.infradead.org, linux-csky@vger.kernel.org,
 linux-hexagon@vger.kernel.org, linux-ia64@vger.kernel.org,
 loongarch@lists.linux.dev, linux-m68k@lists.linux-m68k.org,
 linux-mips@vger.kernel.org, openrisc@lists.librecores.org,
 linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
 linux-sh@vger.kernel.org, sparclinux@vger.kernel.org,
 linux-um@lists.infradead.org, linux-xtensa@linux-xtensa.org,
 David Hildenbrand, Albert Ou, Anton Ivanov, Borislav Petkov, Brian Cain,
 Christophe Leroy, Chris Zankel, Dave Hansen, "David S. Miller",
 Dinh Nguyen, Geert Uytterhoeven, Greg Ungerer, Guo Ren, Helge Deller,
 "H. Peter Anvin", Huacai Chen, Ingo Molnar, Ivan Kokshaysky,
 "James E.J. Bottomley", Johannes Berg, Matt Turner, Max Filippov,
 Michael Ellerman, Michal Simek, Nicholas Piggin, Palmer Dabbelt,
 Paul Walmsley, Richard Henderson, Richard Weinberger, Rich Felker,
 Russell King, Stafford Horne, Stefan Kristiansson, Thomas Bogendoerfer,
 Thomas Gleixner, Vineet Gupta, WANG Xuerui, Yoshinori Sato
Subject: [PATCH mm-unstable RFC 00/26] mm: support
 __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs
Date: Tue, 6 Dec 2022 15:47:04 +0100
Message-Id: <20221206144730.163732-1-david@redhat.com>

This is the follow-up on [1]:
	[PATCH v2 0/8] mm: COW fixes part 3: reliable GUP R/W FOLL_GET of
	anonymous pages

After we implemented __HAVE_ARCH_PTE_SWP_EXCLUSIVE on most prominent
enterprise architectures, implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all
remaining architectures that support swap PTEs.

This makes sure that exclusive anonymous pages will stay exclusive, even
after they were swapped out -- for example, making GUP R/W FOLL_GET of
anonymous pages reliable. Details can be found in [1].

This primarily fixes remaining known O_DIRECT memory corruptions that can
happen on concurrent swapout, whereby we can lose DMA reads to a page
(modifying the user page by writing to it).

To verify, there are two test cases (requiring swap space, obviously):
(1) The O_DIRECT+swapout test case [2] from Andrea. This test case tries
    triggering a race condition.
(2) My vmsplice() test case [3] that tries to detect if the exclusive
    marker was lost during swapout, not relying on a race condition.

For example, on 32bit x86 (with and without PAE), my test case fails
without these patches:
	$ ./test_swp_exclusive
	FAIL: page was replaced during COW
But succeeds with these patches:
	$ ./test_swp_exclusive
	PASS: page was not replaced during COW

Why implement __HAVE_ARCH_PTE_SWP_EXCLUSIVE for all architectures, even
the ones where swap support might be in a questionable state? This is the
first step towards removing "readable_exclusive" migration entries, and
using pte_swp_exclusive() also with (readable) migration entries instead
(as suggested by Peter). The only missing piece for that is supporting
pmd_swp_exclusive() on relevant architectures with THP migration support.

As all relevant architectures now implement
__HAVE_ARCH_PTE_SWP_EXCLUSIVE, we can drop __HAVE_ARCH_PTE_SWP_EXCLUSIVE
in the last patch.

RFC because some of the swap PTE layouts are really tricky and I really
need some feedback related to deciphering these layouts and "using yet
unused PTE bits in swap PTEs". I tried cross-compiling all relevant setups
(phew, I might only miss some power/nohash variants), but only tested on
x86 so far.

CCing arch maintainers only on this cover letter and on the respective
patch(es).
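
To illustrate what "using yet unused PTE bits in swap PTEs" roughly means,
here is a minimal userspace sketch. It is not taken from any of the
patches: the 32bit layout, the demo_* names and the bit positions are all
made up for illustration, and every real architecture has its own
constraints on which bits are free. The helpers are only modeled on
pte_swp_mkexclusive(), pte_swp_exclusive() and pte_swp_clear_exclusive(),
and the self-test only loosely follows the idea of sanity-checking that
the marker does not disturb the swap type and offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical 32bit swap PTE layout (illustration only, no real arch):
 *   bit  0     : present bit, must be 0 for a swap PTE
 *   bit  1     : exclusive marker (the previously unused bit)
 *   bits 2-6   : swap type (5 bits)
 *   bits 7-31  : swap offset (25 bits)
 */
typedef struct { uint32_t val; } demo_pte_t;

#define DEMO_PTE_PRESENT        (UINT32_C(1) << 0)
#define DEMO_PTE_SWP_EXCLUSIVE  (UINT32_C(1) << 1)
#define DEMO_SWP_TYPE_SHIFT     2
#define DEMO_SWP_TYPE_BITS      5
#define DEMO_SWP_TYPE_MASK      ((UINT32_C(1) << DEMO_SWP_TYPE_BITS) - 1)
#define DEMO_SWP_OFFSET_SHIFT   (DEMO_SWP_TYPE_SHIFT + DEMO_SWP_TYPE_BITS)

/* Encode type and offset; present and exclusive bits start out clear. */
static demo_pte_t demo_mk_swap_pte(uint32_t type, uint32_t offset)
{
	demo_pte_t pte = {
		((type & DEMO_SWP_TYPE_MASK) << DEMO_SWP_TYPE_SHIFT) |
		(offset << DEMO_SWP_OFFSET_SHIFT)
	};
	return pte;
}

static uint32_t demo_swp_type(demo_pte_t pte)
{
	return (pte.val >> DEMO_SWP_TYPE_SHIFT) & DEMO_SWP_TYPE_MASK;
}

static uint32_t demo_swp_offset(demo_pte_t pte)
{
	return pte.val >> DEMO_SWP_OFFSET_SHIFT;
}

static bool demo_pte_swp_exclusive(demo_pte_t pte)
{
	return (pte.val & DEMO_PTE_SWP_EXCLUSIVE) != 0;
}

static demo_pte_t demo_pte_swp_mkexclusive(demo_pte_t pte)
{
	pte.val |= DEMO_PTE_SWP_EXCLUSIVE;
	return pte;
}

static demo_pte_t demo_pte_swp_clear_exclusive(demo_pte_t pte)
{
	pte.val &= ~DEMO_PTE_SWP_EXCLUSIVE;
	return pte;
}

/* The marker must not leak into, or be clobbered by, type/offset. */
static void demo_swp_exclusive_selftest(void)
{
	demo_pte_t pte = demo_mk_swap_pte(3, 0x123456);

	assert(!(pte.val & DEMO_PTE_PRESENT));
	assert(!demo_pte_swp_exclusive(pte));

	pte = demo_pte_swp_mkexclusive(pte);
	assert(demo_pte_swp_exclusive(pte));
	assert(demo_swp_type(pte) == 3);
	assert(demo_swp_offset(pte) == 0x123456);

	pte = demo_pte_swp_clear_exclusive(pte);
	assert(!demo_pte_swp_exclusive(pte));
	assert(demo_swp_type(pte) == 3);
	assert(demo_swp_offset(pte) == 0x123456);
}
```

Calling demo_swp_exclusive_selftest() from a small main() is enough to
exercise the sketch. The tricky part on real hardware is not these
helpers, but finding a bit that neither the hardware nor other software
PTE bits already use in the swap PTE encoding.
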

[1] https://lkml.kernel.org/r/20220329164329.208407-1-david@redhat.com
[2] https://gitlab.com/aarcange/kernel-testcases-for-v5.11/-/blob/main/page_count_do_wp_page-swap.c
[3] https://gitlab.com/davidhildenbrand/scratchspace/-/blob/main/test_swp_exclusive.c

David Hildenbrand (26):
  mm/debug_vm_pgtable: more pte_swp_exclusive() sanity checks
  alpha/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  arc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  arm/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  csky/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  hexagon/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  ia64/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  loongarch/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  m68k/mm: remove dummy __swp definitions for nommu
  m68k/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  microblaze/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  mips/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  nios2/mm: refactor swap PTE layout
  nios2/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  openrisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3s
  powerpc/nohash/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  riscv/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  sh/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit
  sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 64bit
  um/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  x86/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also on 32bit
  xtensa/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
  mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVE

 arch/alpha/include/asm/pgtable.h              | 40 ++++++++-
 arch/arc/include/asm/pgtable-bits-arcv2.h     | 26 +++++-
 arch/arm/include/asm/pgtable-2level.h         |  3 +
 arch/arm/include/asm/pgtable-3level.h         |  3 +
 arch/arm/include/asm/pgtable.h                | 34 ++++++--
 arch/arm64/include/asm/pgtable.h              |  1 -
 arch/csky/abiv1/inc/abi/pgtable-bits.h        | 13 ++-
 arch/csky/abiv2/inc/abi/pgtable-bits.h        | 19 ++--
 arch/csky/include/asm/pgtable.h               | 17 ++++
 arch/hexagon/include/asm/pgtable.h            | 36 ++++++--
 arch/ia64/include/asm/pgtable.h               | 31 ++++++-
 arch/loongarch/include/asm/pgtable-bits.h     |  4 +
 arch/loongarch/include/asm/pgtable.h          | 38 +++++++-
 arch/m68k/include/asm/mcf_pgtable.h           | 35 +++++++-
 arch/m68k/include/asm/motorola_pgtable.h      | 37 +++++++-
 arch/m68k/include/asm/pgtable_no.h            |  6 --
 arch/m68k/include/asm/sun3_pgtable.h          | 38 +++++++-
 arch/microblaze/include/asm/pgtable.h         | 44 +++++++---
 arch/mips/include/asm/pgtable-32.h            | 86 ++++++++++++++++---
 arch/mips/include/asm/pgtable-64.h            | 23 ++++-
 arch/mips/include/asm/pgtable.h               | 35 ++++++++
 arch/nios2/include/asm/pgtable-bits.h         |  3 +
 arch/nios2/include/asm/pgtable.h              | 37 ++++++--
 arch/openrisc/include/asm/pgtable.h           | 40 +++++++--
 arch/parisc/include/asm/pgtable.h             | 40 ++++++++-
 arch/powerpc/include/asm/book3s/32/pgtable.h  | 37 ++++++--
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  1 -
 arch/powerpc/include/asm/nohash/32/pgtable.h  | 22 +++--
 arch/powerpc/include/asm/nohash/32/pte-40x.h  |  6 +-
 arch/powerpc/include/asm/nohash/32/pte-44x.h  | 18 +---
 arch/powerpc/include/asm/nohash/32/pte-85xx.h |  4 +-
 arch/powerpc/include/asm/nohash/64/pgtable.h  | 24 +++++-
 arch/powerpc/include/asm/nohash/pgtable.h     | 15 ++++
 arch/powerpc/include/asm/nohash/pte-e500.h    |  1 -
 arch/riscv/include/asm/pgtable-bits.h         |  3 +
 arch/riscv/include/asm/pgtable.h              | 28 ++++--
 arch/s390/include/asm/pgtable.h               |  1 -
 arch/sh/include/asm/pgtable_32.h              | 53 +++++++++---
 arch/sparc/include/asm/pgtable_32.h           | 26 +++++-
 arch/sparc/include/asm/pgtable_64.h           | 37 +++++++-
 arch/sparc/include/asm/pgtsrmmu.h             | 14 +--
 arch/um/include/asm/pgtable.h                 | 36 +++++++-
 arch/x86/include/asm/pgtable-2level.h         | 26 ++++--
 arch/x86/include/asm/pgtable-3level.h         | 26 +++++-
 arch/x86/include/asm/pgtable.h                |  3 -
 arch/xtensa/include/asm/pgtable.h             | 31 +++++--
 include/linux/pgtable.h                       | 29 -------
 mm/debug_vm_pgtable.c                         | 25 +++++-
 mm/memory.c                                   |  4 -
 mm/rmap.c                                     | 11 ---
 50 files changed, 943 insertions(+), 227 deletions(-)