From patchwork Fri Jan 10 18:40:27 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935230
Date: Fri, 10 Jan 2025 18:40:27 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-1-8419288bc805@google.com>
Subject: [PATCH RFC v2 01/29] mm: asi: Make some utility functions noinstr compatible
From: Brendan Jackman
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Richard Henderson,
 Matt Turner, Vineet Gupta, Russell King, Catalin Marinas, Will Deacon,
 Guo Ren, Brian Cain, Huacai Chen, WANG Xuerui, Geert Uytterhoeven,
 Michal Simek, Thomas Bogendoerfer, Dinh Nguyen, Jonas Bonn,
 Stefan Kristiansson, Stafford Horne, "James E.J. Bottomley", Helge Deller,
 Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
 Madhavan Srinivasan, Paul Walmsley, Palmer Dabbelt, Albert Ou,
 Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger,
 Sven Schnelle, Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
 "David S. Miller", Andreas Larsson, Richard Weinberger, Anton Ivanov,
 Johannes Berg, Chris Zankel, Max Filippov, Arnd Bergmann, Andrew Morton,
 Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
 Mel Gorman, Valentin Schneider, Uladzislau Rezki, Christoph Hellwig,
 Masami Hiramatsu, Mathieu Desnoyers, Mike Rapoport,
 Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin,
 Jiri Olsa, Ian Rogers, Adrian Hunter, Dennis Zhou, Tejun Heo,
 Christoph Lameter, Sean Christopherson, Paolo Bonzini, Ard Biesheuvel,
 Josh Poimboeuf, Pawan Gupta
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org,
 linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org,
 linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org,
 loongarch@lists.linux.dev, linux-m68k@lists.linux-m68k.org,
 linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org,
 linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
 linux-sh@vger.kernel.org, sparclinux@vger.kernel.org,
 linux-um@lists.infradead.org, linux-arch@vger.kernel.org,
 linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
 linux-perf-users@vger.kernel.org, kvm@vger.kernel.org,
 linux-efi@vger.kernel.org, Brendan Jackman

Some existing utility functions would need to
be called from a noinstr context in the later patches. So mark these as
either noinstr or __always_inline.

An earlier version of this by Junaid had a macro that was intended to
tell the compiler "either inline this function, or call it in the
noinstr section", which basically boiled down to:

  #define inline_or_noinstr noinline __section(".noinstr.text")

Unfortunately Thomas pointed out this will prevent the function from
being inlined at call sites in .text.

So far I haven't been able[1] to find a formulation that lets us:
1. avoid calls from .noinstr.text -> .text,
2. while also letting the compiler freely decide what to inline.

1 is a functional requirement so here I'm just giving up on 2. Existing
callsites of this code are just forced inline. For the incoming code
that needs to call it from noinstr, they will be out-of-line calls.

[1] https://lore.kernel.org/lkml/CA+i-1C1z35M8wA_4AwMq7--c1OgjNoLGTkn4+Td5gKg7QQAzWw@mail.gmail.com/

Checkpatch-args: --ignore=COMMIT_LOG_LONG_LINE
Signed-off-by: Brendan Jackman
---
 arch/x86/include/asm/processor.h     |  2 +-
 arch/x86/include/asm/special_insns.h |  8 ++++----
 arch/x86/include/asm/tlbflush.h      |  3 +++
 arch/x86/mm/tlb.c                    | 13 +++++++++----
 4 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 4a686f0e5dbf6d906ed38276148b186e920927b3..1a1b7ea5d7d32a47d783d9d62cd2a53672addd6f 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -220,7 +220,7 @@ void print_cpu_msr(struct cpuinfo_x86 *);
 /*
  * Friendlier CR3 helpers.
  */
-static inline unsigned long read_cr3_pa(void)
+static __always_inline unsigned long read_cr3_pa(void)
 {
 	return __read_cr3() & CR3_ADDR_MASK;
 }

diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index aec6e2d3aa1d52e5c8f513e188015a45e9eeaeb2..6e103358966f6f1333aa07be97aec5f8af794120 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -42,14 +42,14 @@ static __always_inline void native_write_cr2(unsigned long val)
 	asm volatile("mov %0,%%cr2": : "r" (val) : "memory");
 }

-static inline unsigned long __native_read_cr3(void)
+static __always_inline unsigned long __native_read_cr3(void)
 {
 	unsigned long val;
 	asm volatile("mov %%cr3,%0\n\t" : "=r" (val) : __FORCE_ORDER);
 	return val;
 }

-static inline void native_write_cr3(unsigned long val)
+static __always_inline void native_write_cr3(unsigned long val)
 {
 	asm volatile("mov %0,%%cr3": : "r" (val) : "memory");
 }
@@ -153,12 +153,12 @@ static __always_inline void write_cr2(unsigned long x)
  * Careful! CR3 contains more than just an address. You probably want
  * read_cr3_pa() instead.
  */
-static inline unsigned long __read_cr3(void)
+static __always_inline unsigned long __read_cr3(void)
 {
 	return __native_read_cr3();
 }

-static inline void write_cr3(unsigned long x)
+static __always_inline void write_cr3(unsigned long x)
 {
 	native_write_cr3(x);
 }

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 69e79fff41b800a0a138bcbf548dde9d72993105..c884174a44e119a3c027c44ada6c5cdba14d1282 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -423,4 +423,7 @@ static inline void __native_tlb_flush_global(unsigned long cr4)
 	native_write_cr4(cr4 ^ X86_CR4_PGE);
 	native_write_cr4(cr4);
 }
+
+unsigned long build_cr3_noinstr(pgd_t *pgd, u16 asid, unsigned long lam);
+
 #endif /* _ASM_X86_TLBFLUSH_H */

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 86593d1b787d8a5b9fa4bd492356898ec8870938..f0428e5e1f1947903ee87c4c6444844ee11b45c3 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -108,7 +108,7 @@
 /*
  * Given @asid, compute kPCID
  */
-static inline u16 kern_pcid(u16 asid)
+static __always_inline u16 kern_pcid(u16 asid)
 {
 	VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE);
@@ -153,9 +153,9 @@ static inline u16 user_pcid(u16 asid)
 	return ret;
 }

-static inline unsigned long build_cr3(pgd_t *pgd, u16 asid, unsigned long lam)
+static __always_inline unsigned long build_cr3(pgd_t *pgd, u16 asid, unsigned long lam)
 {
-	unsigned long cr3 = __sme_pa(pgd) | lam;
+	unsigned long cr3 = __sme_pa_nodebug(pgd) | lam;

 	if (static_cpu_has(X86_FEATURE_PCID)) {
 		cr3 |= kern_pcid(asid);
@@ -166,6 +166,11 @@ static inline unsigned long build_cr3(pgd_t *pgd, u16 asid, unsigned long lam)
 	return cr3;
 }

+noinstr unsigned long build_cr3_noinstr(pgd_t *pgd, u16 asid, unsigned long lam)
+{
+	return build_cr3(pgd, asid, lam);
+}
+
 static inline unsigned long build_cr3_noflush(pgd_t *pgd, u16 asid, unsigned long lam)
 {
@@ -1084,7 +1089,7 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end)
  * It's intended to be used for code like KVM that sneakily changes CR3
  * and needs to restore it. It needs to be used very carefully.
  */
-unsigned long __get_current_cr3_fast(void)
+noinstr unsigned long __get_current_cr3_fast(void)
 {
 	unsigned long cr3 = build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd,

From patchwork Fri Jan 10 18:40:28 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935231
Date: Fri, 10 Jan 2025 18:40:28 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-2-8419288bc805@google.com>
Subject: [PATCH RFC v2 02/29] x86: Create CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
From: Brendan Jackman
To: (same recipients as [PATCH RFC v2 01/29])
Cc: (same lists as [PATCH RFC v2 01/29]), Brendan Jackman, Junaid Shahid
Currently a nop config. Keeping as a separate commit for easy review of
the boring bits. Later commits will use and enable this new config.

This config is only added for non-UML x86_64, as other architectures do
not yet have pending implementations. It also has somewhat artificial
dependencies on !PARAVIRT and !KASAN, which are explained in the Kconfig
file.

Co-developed-by: Junaid Shahid
Signed-off-by: Junaid Shahid
Signed-off-by: Brendan Jackman
---
 arch/alpha/include/asm/Kbuild      |  1 +
 arch/arc/include/asm/Kbuild        |  1 +
 arch/arm/include/asm/Kbuild        |  1 +
 arch/arm64/include/asm/Kbuild      |  1 +
 arch/csky/include/asm/Kbuild       |  1 +
 arch/hexagon/include/asm/Kbuild    |  1 +
 arch/loongarch/include/asm/Kbuild  |  3 +++
 arch/m68k/include/asm/Kbuild       |  1 +
 arch/microblaze/include/asm/Kbuild |  1 +
 arch/mips/include/asm/Kbuild       |  1 +
 arch/nios2/include/asm/Kbuild      |  1 +
 arch/openrisc/include/asm/Kbuild   |  1 +
 arch/parisc/include/asm/Kbuild     |  1 +
 arch/powerpc/include/asm/Kbuild    |  1 +
 arch/riscv/include/asm/Kbuild      |  1 +
 arch/s390/include/asm/Kbuild       |  1 +
 arch/sh/include/asm/Kbuild         |  1 +
 arch/sparc/include/asm/Kbuild      |  1 +
 arch/um/include/asm/Kbuild         |  2 +-
 arch/x86/Kconfig                   | 14 ++++++++++++++
 arch/xtensa/include/asm/Kbuild     |  1 +
 include/asm-generic/asi.h          |  5 +++++
 22 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/arch/alpha/include/asm/Kbuild b/arch/alpha/include/asm/Kbuild
index 396caece6d6d99c7a428f439322a0a18452e1a42..ca72ce3baca13a32913ac9e01a8f86ef42180b1c 100644
--- a/arch/alpha/include/asm/Kbuild
+++ b/arch/alpha/include/asm/Kbuild
@@ -5,3 +5,4 @@ generic-y += agp.h
 generic-y += asm-offsets.h
 generic-y += kvm_para.h
 generic-y += mcs_spinlock.h
+generic-y += asi.h

diff --git a/arch/arc/include/asm/Kbuild b/arch/arc/include/asm/Kbuild
index 49285a3ce2398cc7442bc44172de76367dc33dda..68604480864bbcb58d896da6bdf71591006ab2f6 100644
--- a/arch/arc/include/asm/Kbuild
+++ b/arch/arc/include/asm/Kbuild
@@ -6,3 +6,4 @@ generic-y += kvm_para.h
 generic-y += mcs_spinlock.h
 generic-y += parport.h
 generic-y += user.h
+generic-y += asi.h

diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild
index 03657ff8fbe3d202563184b8902aa181e7474a5e..1e2c3d8dbbd99bdf95dbc6b47c2c78092c68b808 100644
--- a/arch/arm/include/asm/Kbuild
+++ b/arch/arm/include/asm/Kbuild
@@ -6,3 +6,4 @@ generic-y += parport.h
 generated-y += mach-types.h
 generated-y += unistd-nr.h
+generic-y += asi.h

diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild
index 4e350df9a02dd8de387b912740af69035da93e34..15f8aaaa96b80b5657b789ecf3529b1f18d16d80 100644
--- a/arch/arm64/include/asm/Kbuild
+++ b/arch/arm64/include/asm/Kbuild
@@ -14,6 +14,7 @@ generic-y += qrwlock.h
 generic-y += qspinlock.h
 generic-y += parport.h
 generic-y += user.h
+generic-y += asi.h
 generated-y += cpucap-defs.h
 generated-y += sysreg-defs.h

diff --git a/arch/csky/include/asm/Kbuild b/arch/csky/include/asm/Kbuild
index 9a9bc65b57a9d73dadc9d597700d7229f8554ddf..4f497118fb172d1f2bf0f9e472479f24227f42f4 100644
--- a/arch/csky/include/asm/Kbuild
+++ b/arch/csky/include/asm/Kbuild
@@ -11,3 +11,4 @@ generic-y += qspinlock.h
 generic-y += parport.h
 generic-y += user.h
 generic-y += vmlinux.lds.h
+generic-y += asi.h

diff --git a/arch/hexagon/include/asm/Kbuild b/arch/hexagon/include/asm/Kbuild
index 8c1a78c8f5271ebd47f1baad7b85e87220d1bbe8..b26f186bc03c2e135f8d125a4805b95a41513655 100644
--- a/arch/hexagon/include/asm/Kbuild
+++ b/arch/hexagon/include/asm/Kbuild
@@ -5,3 +5,4 @@ generic-y += extable.h
 generic-y += iomap.h
 generic-y += kvm_para.h
 generic-y += mcs_spinlock.h
+generic-y += asi.h

diff --git a/arch/loongarch/include/asm/Kbuild b/arch/loongarch/include/asm/Kbuild
index 5b5a6c90e6e20771b1074a6262230861cc51bcb4..dd3d0c6891369a9dfa35ccfb8b81c8697c2a3e90 100644
--- a/arch/loongarch/include/asm/Kbuild
+++ b/arch/loongarch/include/asm/Kbuild
@@ -11,3 +11,6 @@ generic-y += ioctl.h
 generic-y += mmzone.h
 generic-y += statfs.h
generic-y += param.h +generic-y += asi.h +generic-y += posix_types.h +generic-y += resource.h diff --git a/arch/m68k/include/asm/Kbuild b/arch/m68k/include/asm/Kbuild index 0dbf9c5c6faeb30eeb38bea52ab7fade99bbd44a..faf0f135df4ab946ef115f3a2fc363f370fc7491 100644 --- a/arch/m68k/include/asm/Kbuild +++ b/arch/m68k/include/asm/Kbuild @@ -4,3 +4,4 @@ generic-y += extable.h generic-y += kvm_para.h generic-y += mcs_spinlock.h generic-y += spinlock.h +generic-y += asi.h diff --git a/arch/microblaze/include/asm/Kbuild b/arch/microblaze/include/asm/Kbuild index a055f5dbe00a31616592c3a848b49bbf9ead5d17..012e4bf83c13497dc296b66cd5e0fd519274306b 100644 --- a/arch/microblaze/include/asm/Kbuild +++ b/arch/microblaze/include/asm/Kbuild @@ -8,3 +8,4 @@ generic-y += parport.h generic-y += syscalls.h generic-y += tlb.h generic-y += user.h +generic-y += asi.h diff --git a/arch/mips/include/asm/Kbuild b/arch/mips/include/asm/Kbuild index 7ba67a0d6c97b2879fb710aca05ae1e2d47c8ce2..3191699298d80735920481eecc64dd2d1dbd2e54 100644 --- a/arch/mips/include/asm/Kbuild +++ b/arch/mips/include/asm/Kbuild @@ -13,3 +13,4 @@ generic-y += parport.h generic-y += qrwlock.h generic-y += qspinlock.h generic-y += user.h +generic-y += asi.h diff --git a/arch/nios2/include/asm/Kbuild b/arch/nios2/include/asm/Kbuild index 0d09829ed14454f2f15a32bf713fa1eb213e85ea..03a5ec74e28b3679a5ef7271606af3c07bb7a198 100644 --- a/arch/nios2/include/asm/Kbuild +++ b/arch/nios2/include/asm/Kbuild @@ -7,3 +7,4 @@ generic-y += kvm_para.h generic-y += mcs_spinlock.h generic-y += spinlock.h generic-y += user.h +generic-y += asi.h diff --git a/arch/openrisc/include/asm/Kbuild b/arch/openrisc/include/asm/Kbuild index cef49d60d74c0f46f01cf46cc35e1e52404185f3..6a81a58bf59e20cafa563c422df4dfa6f9f791ec 100644 --- a/arch/openrisc/include/asm/Kbuild +++ b/arch/openrisc/include/asm/Kbuild @@ -9,3 +9,4 @@ generic-y += spinlock.h generic-y += qrwlock_types.h generic-y += qrwlock.h generic-y += user.h +generic-y += asi.h diff --git 
a/arch/parisc/include/asm/Kbuild b/arch/parisc/include/asm/Kbuild index 4fb596d94c8932dd1e12a765a21af5b5099fbafd..3cbb4eb14712c7bd6c248dd26ab91cc41da01825 100644 --- a/arch/parisc/include/asm/Kbuild +++ b/arch/parisc/include/asm/Kbuild @@ -5,3 +5,4 @@ generic-y += agp.h generic-y += kvm_para.h generic-y += mcs_spinlock.h generic-y += user.h +generic-y += asi.h diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild index e5fdc336c9b22527f824ed30d06b5e8c0fa8a1ef..e86cc027f35564c7b301c283043bde0e5d2d3b6a 100644 --- a/arch/powerpc/include/asm/Kbuild +++ b/arch/powerpc/include/asm/Kbuild @@ -7,3 +7,4 @@ generic-y += kvm_types.h generic-y += mcs_spinlock.h generic-y += qrwlock.h generic-y += early_ioremap.h +generic-y += asi.h diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild index 1461af12da6e2bbbff6cf737a7babf33bd298cdd..82060ed50d9beb1ea72d3570ad236d1e08d9d8c6 100644 --- a/arch/riscv/include/asm/Kbuild +++ b/arch/riscv/include/asm/Kbuild @@ -13,3 +13,4 @@ generic-y += qrwlock.h generic-y += qrwlock_types.h generic-y += user.h generic-y += vmlinux.lds.h +generic-y += asi.h diff --git a/arch/s390/include/asm/Kbuild b/arch/s390/include/asm/Kbuild index 297bf7157968907d6e4c4ff8b65deeef02dbd630..e15c2a138392b57b186633738ddda913474aa8c4 100644 --- a/arch/s390/include/asm/Kbuild +++ b/arch/s390/include/asm/Kbuild @@ -8,3 +8,4 @@ generic-y += asm-offsets.h generic-y += kvm_types.h generic-y += mcs_spinlock.h generic-y += mmzone.h +generic-y += asi.h diff --git a/arch/sh/include/asm/Kbuild b/arch/sh/include/asm/Kbuild index fc44d9c88b41915a7021042eb8b462517cfdbd2c..ea19e4515828552f436d67f764607dd5d15cb19f 100644 --- a/arch/sh/include/asm/Kbuild +++ b/arch/sh/include/asm/Kbuild @@ -3,3 +3,4 @@ generated-y += syscall_table.h generic-y += kvm_para.h generic-y += mcs_spinlock.h generic-y += parport.h +generic-y += asi.h diff --git a/arch/sparc/include/asm/Kbuild b/arch/sparc/include/asm/Kbuild index 
43b0ae4c2c2112d4d4d3cb3c60e787b175172dea..cb9062c9be17fe276cc92d2ac99d8b165f6297bf 100644 --- a/arch/sparc/include/asm/Kbuild +++ b/arch/sparc/include/asm/Kbuild @@ -4,3 +4,4 @@ generated-y += syscall_table_64.h generic-y += agp.h generic-y += kvm_para.h generic-y += mcs_spinlock.h +generic-y += asi.h diff --git a/arch/um/include/asm/Kbuild b/arch/um/include/asm/Kbuild index 18f902da8e99769da857d34af43141ea97a0ca63..6054972f1babdaebae64040b05ab48893915cb04 100644 --- a/arch/um/include/asm/Kbuild +++ b/arch/um/include/asm/Kbuild @@ -27,4 +27,5 @@ generic-y += trace_clock.h generic-y += kprobes.h generic-y += mm_hooks.h generic-y += vga.h generic-y += video.h +generic-y += asi.h diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 7b9a7e8f39acc8e9aeb7d4213e87d71047865f5c..5a50582eb210e9d1309856a737d32b76fa1bfc85 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2519,6 +2519,20 @@ config MITIGATION_PAGE_TABLE_ISOLATION See Documentation/arch/x86/pti.rst for more details. +config MITIGATION_ADDRESS_SPACE_ISOLATION + bool "Allow code to run with a reduced kernel address space" + default n + depends on X86_64 && !PARAVIRT && !UML + help + This feature provides the ability to run some kernel code + with a reduced kernel address space. This can be used to + mitigate some speculative execution attacks. + + The !PARAVIRT dependency is only because of lack of testing; in theory + the code is written to work under paravirtualization. In practice + there are likely to be unhandled cases, in particular concerning TLB + flushes.
+ config MITIGATION_RETPOLINE bool "Avoid speculative indirect branches in kernel" select OBJTOOL if HAVE_OBJTOOL diff --git a/arch/xtensa/include/asm/Kbuild b/arch/xtensa/include/asm/Kbuild index fa07c686cbcc2153776a478ac4093846f01eddab..07cea6902f98053be244d026ed594fe7246755a6 100644 --- a/arch/xtensa/include/asm/Kbuild +++ b/arch/xtensa/include/asm/Kbuild @@ -8,3 +8,4 @@ generic-y += parport.h generic-y += qrwlock.h generic-y += qspinlock.h generic-y += user.h +generic-y += asi.h diff --git a/include/asm-generic/asi.h b/include/asm-generic/asi.h new file mode 100644 index 0000000000000000000000000000000000000000..c4d9a5ff860a96428422a15000c622aeecc2d664 --- /dev/null +++ b/include/asm-generic/asi.h @@ -0,0 +1,5 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __ASM_GENERIC_ASI_H +#define __ASM_GENERIC_ASI_H + +#endif From patchwork Fri Jan 10 18:40:29 2025 X-Patchwork-Submitter: Brendan Jackman X-Patchwork-Id: 13935232
Date: Fri, 10 Jan 2025 18:40:29 +0000 In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> Mime-Version: 1.0 References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> X-Mailer: b4 0.15-dev Message-ID: <20250110-asi-rfc-v2-v2-3-8419288bc805@google.com> Subject: [PATCH RFC v2 03/29] mm: asi: Introduce ASI core API From: Brendan Jackman To: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Richard Henderson , Matt Turner , Vineet Gupta , Russell King , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Huacai Chen , WANG Xuerui , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Jonas Bonn , Stefan Kristiansson , Stafford Horne , "James E.J. Bottomley" , Helge Deller , Michael Ellerman , Nicholas Piggin , Christophe Leroy , Naveen N Rao , Madhavan Srinivasan , Paul Walmsley , Palmer Dabbelt , Albert Ou , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Yoshinori Sato , Rich Felker , John Paul Adrian Glaubitz , "David S.
Miller" , Andreas Larsson , Richard Weinberger , Anton Ivanov , Johannes Berg , Chris Zankel , Max Filippov , Arnd Bergmann , Andrew Morton , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Uladzislau Rezki , Christoph Hellwig , Masami Hiramatsu , Mathieu Desnoyers , Mike Rapoport , Arnaldo Carvalho de Melo , Namhyung Kim , Mark Rutland , Alexander Shishkin , Jiri Olsa , Ian Rogers , Adrian Hunter , Dennis Zhou , Tejun Heo , Christoph Lameter , Sean Christopherson , Paolo Bonzini , Ard Biesheuvel , Josh Poimboeuf , Pawan Gupta Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, loongarch@lists.linux.dev, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-um@lists.infradead.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, kvm@vger.kernel.org, linux-efi@vger.kernel.org, Brendan Jackman , Ofir Weisse , Junaid Shahid
Introduce core API for Address Space Isolation (ASI).
Kernel address space isolation provides the ability to run some kernel code with a restricted kernel address space. There can be multiple classes of such restricted kernel address spaces (e.g. KPTI, KVM-PTI etc.). Each ASI class is identified by an index. The ASI class can register some hooks to be called when entering/exiting the restricted address space.

Currently there is a fixed maximum number of ASI classes supported. In addition, each process can have at most one restricted address space from each ASI class. Neither of these is an inherent limitation; both are merely simplifying assumptions for the time being.

To keep things simpler for now, we disallow context switches within the restricted address space. In the future we should be able to relax this limitation for the case of context switches to different threads within the same process (or to the idle thread and back).

Note that this doesn't really support protecting sibling VM guests within the same VMM process from one another. From first principles it seems unlikely that anyone who cares about VM isolation would do that, but there could be a use-case to think about. In that case something like the OTHER_MM logic might be needed, but specific to intra-process guest separation.

[0]: https://lore.kernel.org/kvm/1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com

Notes about RFC-quality implementation details:

- Ignoring checkpatch.pl AVOID_BUG.
- The dynamic registration of classes might be pointless complexity. This was kept from RFCv1 without much thought.
- The other-mm logic is also perhaps overly complex; suggestions are welcome for how best to tackle this (or we could just forget about it for the moment, and rely on asi_exit() happening in process switch).
- The taint flag definitions would probably be clearer with an enum or something.
Checkpatch-args: --ignore=AVOID_BUG,COMMIT_LOG_LONG_LINE,EXPORT_SYMBOL Co-developed-by: Ofir Weisse Signed-off-by: Ofir Weisse Co-developed-by: Junaid Shahid Signed-off-by: Junaid Shahid Signed-off-by: Brendan Jackman --- arch/x86/include/asm/asi.h | 208 +++++++++++++++++++++++ arch/x86/include/asm/processor.h | 8 + arch/x86/mm/Makefile | 1 + arch/x86/mm/asi.c | 350 +++++++++++++++++++++++++++++++++++++++ arch/x86/mm/init.c | 3 +- arch/x86/mm/tlb.c | 1 + include/asm-generic/asi.h | 67 ++++++++ include/linux/mm_types.h | 7 + kernel/fork.c | 3 + kernel/sched/core.c | 9 + mm/init-mm.c | 4 + 11 files changed, 660 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h new file mode 100644 index 0000000000000000000000000000000000000000..7cc635b6653a3970ba9dbfdc9c828a470e27bd44 --- /dev/null +++ b/arch/x86/include/asm/asi.h @@ -0,0 +1,208 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_ASI_H +#define _ASM_X86_ASI_H + +#include + +#include + +#include +#include +#include + +#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION + +/* + * Overview of API usage by ASI clients: + * + * Setup: First call asi_init() to create a domain. At present only one domain + * can be created per mm per class, but it's safe to asi_init() this domain + * multiple times. For each asi_init() call you must call asi_destroy() AFTER + * you are certain all CPUs have exited the restricted address space (by + * calling asi_exit()). + * + * Runtime usage: + * + * 1. Call asi_enter() to switch to the restricted address space. This can't be + * from an interrupt or exception handler and preemption must be disabled. + * + * 2. Execute untrusted code. + * + * 3. Call asi_relax() to inform the ASI subsystem that untrusted code execution + * is finished. This doesn't cause any address space change. This can't be + * from an interrupt or exception handler and preemption must be disabled. + * + * 4. Either: + * + * a. Go back to 1. + * + * b. 
Call asi_exit() before returning to userspace. This immediately + * switches to the unrestricted address space. + * + * The region between 1 and 3 is called the "ASI critical section". During the + * critical section, it is a bug to access any sensitive data, and you mustn't + * sleep. + * + * The restriction on sleeping is not really a fundamental property of ASI. + * However for performance reasons it's important that the critical section is + * absolutely as short as possible. So the ability to do sleepy things like + * taking mutexes oughtn't to confer any convenience on API users. + * + * Similarly to the issue of sleeping, the need to asi_exit in case 4b is not a + * fundamental property of the system but a limitation of the current + * implementation. With further work it is possible to context switch + * from and/or to the restricted address space, and to return to userspace + * directly from the restricted address space, or _in_ it. + * + * Note that the critical section only refers to the direct execution path from + * asi_enter to asi_relax: it's fine to access sensitive data from exceptions + * and interrupt handlers that occur during that time. ASI will re-enter the + * restricted address space before returning from the outermost + * exception/interrupt. + * + * Note: ASI does not modify KPTI behaviour; when ASI and KPTI run together + * there are 2+N address spaces per task: the unrestricted kernel address space, + * the user address space, and one restricted (kernel) address space for each of + * the N ASI classes. + */ + +/* + * ASI uses a per-CPU tainting model to track what mitigation actions are + * required on domain transitions. Taints exist along two dimensions: + * + * - Who touched the CPU (guest, unprotected kernel, userspace). + * + * - What kind of state might remain: "data" means there might be data owned by + * a victim domain left behind in a sidechannel. 
"Control" means there might + * be state controlled by an attacker domain left behind in the branch + * predictor. + * + * In principle the same domain can be both attacker and victim, thus we have + * both data and control taints for userspace, although there's no point in + * trying to protect against attacks from the kernel itself, so there's no + * ASI_TAINT_KERNEL_CONTROL. + */ +#define ASI_TAINT_KERNEL_DATA ((asi_taints_t)BIT(0)) +#define ASI_TAINT_USER_DATA ((asi_taints_t)BIT(1)) +#define ASI_TAINT_GUEST_DATA ((asi_taints_t)BIT(2)) +#define ASI_TAINT_OTHER_MM_DATA ((asi_taints_t)BIT(3)) +#define ASI_TAINT_USER_CONTROL ((asi_taints_t)BIT(4)) +#define ASI_TAINT_GUEST_CONTROL ((asi_taints_t)BIT(5)) +#define ASI_TAINT_OTHER_MM_CONTROL ((asi_taints_t)BIT(6)) +#define ASI_NUM_TAINTS 7 +static_assert(BITS_PER_BYTE * sizeof(asi_taints_t) >= ASI_NUM_TAINTS); + +#define ASI_TAINTS_CONTROL_MASK \ + (ASI_TAINT_USER_CONTROL | ASI_TAINT_GUEST_CONTROL | ASI_TAINT_OTHER_MM_CONTROL) + +#define ASI_TAINTS_DATA_MASK \ + (ASI_TAINT_KERNEL_DATA | ASI_TAINT_USER_DATA | ASI_TAINT_GUEST_DATA | ASI_TAINT_OTHER_MM_DATA) + +#define ASI_TAINTS_GUEST_MASK (ASI_TAINT_GUEST_DATA | ASI_TAINT_GUEST_CONTROL) +#define ASI_TAINTS_USER_MASK (ASI_TAINT_USER_DATA | ASI_TAINT_USER_CONTROL) + +/* The taint policy tells ASI how a class interacts with the CPU taints */ +struct asi_taint_policy { + /* + * What taints would necessitate a flush when entering the domain, to + * protect it from attack by prior domains? + */ + asi_taints_t prevent_control; + /* + * What taints would necessitate a flush when entering the domain, to + * protect former domains from attack by this domain? + */ + asi_taints_t protect_data; + /* What taints should be set when entering the domain? */ + asi_taints_t set; +}; + +/* + * An ASI domain (struct asi) represents a restricted address space. The + * unrestricted address space (and user address space under PTI) are not + * represented as a domain.
+ */ +struct asi { + pgd_t *pgd; + struct mm_struct *mm; + int64_t ref_count; + enum asi_class_id class_id; +}; + +DECLARE_PER_CPU_ALIGNED(struct asi *, curr_asi); + +void asi_init_mm_state(struct mm_struct *mm); + +int asi_init_class(enum asi_class_id class_id, struct asi_taint_policy *taint_policy); +void asi_uninit_class(enum asi_class_id class_id); +const char *asi_class_name(enum asi_class_id class_id); + +int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi **out_asi); +void asi_destroy(struct asi *asi); + +/* Enter an ASI domain (restricted address space) and begin the critical section. */ +void asi_enter(struct asi *asi); + +/* + * Leave the "tense" state if we are in it, i.e. end the critical section. We + * will stay relaxed until the next asi_enter. + */ +void asi_relax(void); + +/* Immediately exit the restricted address space if in it */ +void asi_exit(void); + +/* The target is the domain we'll enter when returning to process context. */ +static __always_inline struct asi *asi_get_target(struct task_struct *p) +{ + return p->thread.asi_state.target; +} + +static __always_inline void asi_set_target(struct task_struct *p, + struct asi *target) +{ + p->thread.asi_state.target = target; +} + +static __always_inline struct asi *asi_get_current(void) +{ + return this_cpu_read(curr_asi); +} + +/* Are we currently in a restricted address space? */ +static __always_inline bool asi_is_restricted(void) +{ + return (bool)asi_get_current(); +} + +/* If we exit/have exited, can we stay that way until the next asi_enter? */ +static __always_inline bool asi_is_relaxed(void) +{ + return !asi_get_target(current); +} + +/* + * Is the current task in the critical section? + * + * This is just the inverse of asi_is_relaxed(). We have both functions in order to + * help write intuitive client code. In particular, asi_is_tense returns false + * when ASI is disabled, which is judged to make user code more obvious.
+ */ +static __always_inline bool asi_is_tense(void) +{ + return !asi_is_relaxed(); +} + +static __always_inline pgd_t *asi_pgd(struct asi *asi) +{ + return asi ? asi->pgd : NULL; +} + +#define INIT_MM_ASI(init_mm) \ + .asi_init_lock = __MUTEX_INITIALIZER(init_mm.asi_init_lock), + +void asi_handle_switch_mm(void); + +#endif /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */ + +#endif diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 1a1b7ea5d7d32a47d783d9d62cd2a53672addd6f..f02220e6b4df911d87e2fee4b497eade61a27161 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -5,6 +5,7 @@ #include /* Forward declaration, a strange C thing */ +struct asi; struct task_struct; struct mm_struct; struct io_bitmap; @@ -503,6 +504,13 @@ struct thread_struct { struct thread_shstk shstk; #endif +#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION + struct { + /* Domain to enter when returning to process context. */ + struct asi *target; + } asi_state; +#endif + /* Floating point and extended processor state */ struct fpu fpu; /* diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 690fbf48e8538b62a176ce838820e363575b7897..89ade7363798cc20d5e5643526eba7378174baa0 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -61,6 +61,7 @@ obj-$(CONFIG_ACPI_NUMA) += srat.o obj-$(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) += pkeys.o obj-$(CONFIG_RANDOMIZE_MEMORY) += kaslr.o obj-$(CONFIG_MITIGATION_PAGE_TABLE_ISOLATION) += pti.o +obj-$(CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION) += asi.o obj-$(CONFIG_X86_MEM_ENCRYPT) += mem_encrypt.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_amd.o diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c new file mode 100644 index 0000000000000000000000000000000000000000..105cd8b43eaf5c20acc80d4916b761559fb95d74 --- /dev/null +++ b/arch/x86/mm/asi.c @@ -0,0 +1,350 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include + +#include +#include +#include +#include 
+#include +#include + +static struct asi_taint_policy *taint_policies[ASI_MAX_NUM_CLASSES]; + +const char *asi_class_names[] = { +#if IS_ENABLED(CONFIG_KVM) + [ASI_CLASS_KVM] = "KVM", +#endif +}; + +DEFINE_PER_CPU_ALIGNED(struct asi *, curr_asi); +EXPORT_SYMBOL(curr_asi); + +static inline bool asi_class_id_valid(enum asi_class_id class_id) +{ + return class_id >= 0 && class_id < ASI_MAX_NUM_CLASSES; +} + +static inline bool asi_class_initialized(enum asi_class_id class_id) +{ + if (WARN_ON(!asi_class_id_valid(class_id))) + return false; + + return !!(taint_policies[class_id]); +} + +int asi_init_class(enum asi_class_id class_id, struct asi_taint_policy *taint_policy) +{ + if (asi_class_initialized(class_id)) + return -EEXIST; + + WARN_ON(!(taint_policy->prevent_control & ASI_TAINTS_CONTROL_MASK)); + WARN_ON(!(taint_policy->protect_data & ASI_TAINTS_DATA_MASK)); + + taint_policies[class_id] = taint_policy; + + return 0; +} +EXPORT_SYMBOL_GPL(asi_init_class); + +void asi_uninit_class(enum asi_class_id class_id) +{ + if (!asi_class_initialized(class_id)) + return; + + taint_policies[class_id] = NULL; +} +EXPORT_SYMBOL_GPL(asi_uninit_class); + +const char *asi_class_name(enum asi_class_id class_id) +{ + if (WARN_ON_ONCE(!asi_class_id_valid(class_id))) + return ""; + + return asi_class_names[class_id]; +} + +static void __asi_destroy(struct asi *asi) +{ + lockdep_assert_held(&asi->mm->asi_init_lock); + +} + +int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi **out_asi) +{ + struct asi *asi; + int err = 0; + + *out_asi = NULL; + + if (WARN_ON(!asi_class_initialized(class_id))) + return -EINVAL; + + asi = &mm->asi[class_id]; + + mutex_lock(&mm->asi_init_lock); + + if (asi->ref_count++ > 0) + goto exit_unlock; /* err is 0 */ + + BUG_ON(asi->pgd != NULL); + + /* + * For now, we allocate 2 pages to avoid any potential problems with + * KPTI code. This won't be needed once KPTI is folded into the ASI + * framework. 
+ */ + asi->pgd = (pgd_t *)__get_free_pages( + GFP_KERNEL_ACCOUNT | __GFP_ZERO, PGD_ALLOCATION_ORDER); + if (!asi->pgd) { + err = -ENOMEM; + goto exit_unlock; + } + + asi->mm = mm; + asi->class_id = class_id; + +exit_unlock: + if (err) + __asi_destroy(asi); + else + *out_asi = asi; + + mutex_unlock(&mm->asi_init_lock); + + return err; +} +EXPORT_SYMBOL_GPL(asi_init); + +void asi_destroy(struct asi *asi) +{ + struct mm_struct *mm; + + if (!asi) + return; + + if (WARN_ON(!asi_class_initialized(asi->class_id))) + return; + + mm = asi->mm; + /* + * We would need this mutex even if the refcount was atomic as we need + * to block concurrent asi_init calls. + */ + mutex_lock(&mm->asi_init_lock); + WARN_ON_ONCE(asi->ref_count <= 0); + if (--(asi->ref_count) == 0) { + free_pages((ulong)asi->pgd, PGD_ALLOCATION_ORDER); + memset(asi, 0, sizeof(struct asi)); + } + mutex_unlock(&mm->asi_init_lock); +} +EXPORT_SYMBOL_GPL(asi_destroy); + +DEFINE_PER_CPU_ALIGNED(asi_taints_t, asi_taints); + +/* + * Flush out any potentially malicious speculative control flow (e.g. branch + * predictor) state if necessary when we are entering a new domain (which may be + * NULL when we are exiting to the unrestricted address space). + * + * This is a "backwards-looking" mitigation, the attacker is in the past: we want + * this when logically transitioning from A to B and B doesn't trust A. + * + * This function must tolerate reentrancy. + */ +static __always_inline void maybe_flush_control(struct asi *next_asi) +{ + asi_taints_t taints = this_cpu_read(asi_taints); + + if (next_asi) { + taints &= taint_policies[next_asi->class_id]->prevent_control; + } else { + /* + * Going to the unrestricted address space, this has an implicit + * policy of flushing all taints. + */ + taints &= ASI_TAINTS_CONTROL_MASK; + } + + if (!taints) + return; + + /* + * This is where we'll do the actual dirty work of clearing uarch state. + * For now we just pretend: clear the taints.
+ */ + this_cpu_and(asi_taints, ~ASI_TAINTS_CONTROL_MASK); +} + +/* + * Flush out any data that might be hanging around in uarch state that can be + * leaked through sidechannels if necessary when we are entering a new domain. + * + * This is a "forwards-looking" mitigation, the attacker is in the future: we want + * this when logically transitioning from A to B and A doesn't trust B. + * + * This function must tolerate reentrancy. + */ +static __always_inline void maybe_flush_data(struct asi *next_asi) +{ + asi_taints_t taints = this_cpu_read(asi_taints) + & taint_policies[next_asi->class_id]->protect_data; + + if (!taints) + return; + + /* + * This is where we'll do the actual dirty work of clearing uarch state. + * For now we just pretend: clear the taints. + */ + this_cpu_and(asi_taints, ~ASI_TAINTS_DATA_MASK); +} + +static noinstr void __asi_enter(void) +{ + u64 asi_cr3; + struct asi *target = asi_get_target(current); + + /* + * This is actually a false restriction: it should be fine to be + * preemptible during the critical section, but we haven't tested it. We + * will also need to disable preemption during this function itself and + * perhaps elsewhere. This false restriction shouldn't create any + * additional burden for ASI clients anyway: the critical section has + * to be as short as possible to avoid unnecessary ASI transitions so + * disabling preemption should be fine. + */ + VM_BUG_ON(preemptible()); + + if (!target || target == this_cpu_read(curr_asi)) + return; + + VM_BUG_ON(this_cpu_read(cpu_tlbstate.loaded_mm) == + LOADED_MM_SWITCHING); + + /* + * Must update curr_asi before writing CR3 to ensure an interrupting + * asi_exit sees that it may need to switch address spaces. + * This is the real beginning of the ASI critical section.
+ */ + this_cpu_write(curr_asi, target); + maybe_flush_control(target); + + asi_cr3 = build_cr3_noinstr(target->pgd, + this_cpu_read(cpu_tlbstate.loaded_mm_asid), + tlbstate_lam_cr3_mask()); + write_cr3(asi_cr3); + + maybe_flush_data(target); + /* + * It's fine to set the control taints late like this, since we haven't + * actually got to the untrusted code yet. Waiting until now to set the + * data taints is less obviously correct: we've mapped in the incoming + * domain's secrets now so we can't guarantee they haven't already got + * into a sidechannel. However, preemption is off so the only way we can + * reach another asi_enter() is in the return from an interrupt - in + * that case the reentrant asi_enter() call is entering the same domain + * that we're entering at the moment, it doesn't need to flush those + * secrets. + */ + this_cpu_or(asi_taints, taint_policies[target->class_id]->set); +} + +noinstr void asi_enter(struct asi *asi) +{ + VM_WARN_ON_ONCE(!asi); + + /* Should not have an asi_enter() without a prior asi_relax(). */ + VM_WARN_ON_ONCE(asi_get_target(current)); + + asi_set_target(current, asi); + barrier(); + + __asi_enter(); +} +EXPORT_SYMBOL_GPL(asi_enter); + +noinstr void asi_relax(void) +{ + barrier(); + asi_set_target(current, NULL); +} +EXPORT_SYMBOL_GPL(asi_relax); + +noinstr void asi_exit(void) +{ + u64 unrestricted_cr3; + struct asi *asi; + + preempt_disable_notrace(); + + VM_BUG_ON(this_cpu_read(cpu_tlbstate.loaded_mm) == + LOADED_MM_SWITCHING); + + asi = this_cpu_read(curr_asi); + if (asi) { + maybe_flush_control(NULL); + + unrestricted_cr3 = + build_cr3_noinstr(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd, + this_cpu_read(cpu_tlbstate.loaded_mm_asid), + tlbstate_lam_cr3_mask()); + + /* Tainting first makes reentrancy easier to reason about. 
*/ + this_cpu_or(asi_taints, ASI_TAINT_KERNEL_DATA); + write_cr3(unrestricted_cr3); + /* + * Must not update curr_asi until after CR3 write, otherwise a + * re-entrant call might not enter this branch. (This means we + * might do unnecessary CR3 writes). + */ + this_cpu_write(curr_asi, NULL); + } + + preempt_enable_notrace(); +} +EXPORT_SYMBOL_GPL(asi_exit); + +void asi_init_mm_state(struct mm_struct *mm) +{ + memset(mm->asi, 0, sizeof(mm->asi)); + mutex_init(&mm->asi_init_lock); +} + +void asi_handle_switch_mm(void) +{ + /* + * We can't handle context switching in the restricted address space yet + * so this is pointless in practice (we asi_exit() in this path, which + * doesn't care about the fine details of who exactly got at the branch + * predictor), but just to illustrate how the tainting model is supposed + * to work, here we squash the per-domain (guest/userspace) taints into + * a general "other MM" taint. Other processes don't care if their peers + * are attacking them from a guest or from bare metal. + */ + asi_taints_t taints = this_cpu_read(asi_taints); + asi_taints_t new_taints = 0; + + if (taints & ASI_TAINTS_CONTROL_MASK) + new_taints |= ASI_TAINT_OTHER_MM_CONTROL; + if (taints & ASI_TAINTS_DATA_MASK) + new_taints |= ASI_TAINT_OTHER_MM_DATA; + + /* + * We can't race with asi_enter() or we'd clobber the taint it sets. + * That would be odd given this function has "switch_mm" in the name, but + * just to be sure... + */ + lockdep_assert_preemption_disabled(); + + /* + * Can't just this_cpu_write() here as we could be racing with asi_exit() + * (at least, in the future where this function is actually necessary), + * and we mustn't clobber ASI_TAINT_KERNEL_DATA.
+ */ + this_cpu_or(asi_taints, new_taints); + this_cpu_and(asi_taints, ~(ASI_TAINTS_GUEST_MASK | ASI_TAINTS_USER_MASK)); +} diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index eb503f53c3195ca4f299593c0112dab0fb09e7dd..de4227ed5169ff84d0ce80b677caffc475198fa6 100644 --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c @@ -250,7 +250,8 @@ static void __init probe_page_size_mask(void) /* By the default is everything supported: */ __default_kernel_pte_mask = __supported_pte_mask; /* Except when with PTI where the kernel is mostly non-Global: */ - if (cpu_feature_enabled(X86_FEATURE_PTI)) + if (cpu_feature_enabled(X86_FEATURE_PTI) || + IS_ENABLED(CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION)) __default_kernel_pte_mask &= ~_PAGE_GLOBAL; /* Enable 1 GB linear kernel mappings if available: */ diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index f0428e5e1f1947903ee87c4c6444844ee11b45c3..7c2309996d1d5a7cac23bd122f7b56a869d67d6a 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -608,6 +608,7 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, * Apply process to process speculation vulnerability * mitigations if applicable. */ + asi_handle_switch_mm(); cond_mitigation(tsk); /* diff --git a/include/asm-generic/asi.h b/include/asm-generic/asi.h index c4d9a5ff860a96428422a15000c622aeecc2d664..6b84202837605fa57e4a910318c8353b3f816f06 100644 --- a/include/asm-generic/asi.h +++ b/include/asm-generic/asi.h @@ -2,4 +2,71 @@ #ifndef __ASM_GENERIC_ASI_H #define __ASM_GENERIC_ASI_H +#include + +#ifndef _ASSEMBLY_ + +/* + * An ASI class is a type of isolation that can be applied to a process. A + * process may have a domain for each class. 
+ */ +enum asi_class_id { +#if IS_ENABLED(CONFIG_KVM) + ASI_CLASS_KVM, +#endif + ASI_MAX_NUM_CLASSES, +}; + +typedef u8 asi_taints_t; + +#ifndef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION + +struct asi_hooks {}; +struct asi {}; + +static inline +int asi_init_class(enum asi_class_id class_id, + asi_taints_t control_taints, asi_taints_t data_taints) +{ + return 0; +} + +static inline void asi_uninit_class(enum asi_class_id class_id) { } + +struct mm_struct; +static inline void asi_init_mm_state(struct mm_struct *mm) { } + +static inline int asi_init(struct mm_struct *mm, enum asi_class_id class_id, + struct asi **out_asi) +{ + return 0; +} + +static inline void asi_destroy(struct asi *asi) { } + +static inline void asi_enter(struct asi *asi) { } + +static inline void asi_relax(void) { } + +static inline bool asi_is_relaxed(void) { return true; } + +static inline bool asi_is_tense(void) { return false; } + +static inline void asi_exit(void) { } + +static inline bool asi_is_restricted(void) { return false; } + +static inline struct asi *asi_get_current(void) { return NULL; } + +struct task_struct; +static inline struct asi *asi_get_target(struct task_struct *p) { return NULL; } + +static inline pgd_t *asi_pgd(struct asi *asi) { return NULL; } + +static inline void asi_handle_switch_mm(void) { } + +#endif /* !CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */ + +#endif /* !_ASSEMBLY_ */ + #endif diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 6e3bdf8e38bcaee66a71f5566ac7debb94c0ee78..391e32a41ca3df84a619f3ee8ea45d3729a43023 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -19,8 +19,10 @@ #include #include #include +#include #include +#include #ifndef AT_VECTOR_SIZE_ARCH #define AT_VECTOR_SIZE_ARCH 0 @@ -826,6 +828,11 @@ struct mm_struct { atomic_t membarrier_state; #endif +#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION + struct asi asi[ASI_MAX_NUM_CLASSES]; + struct mutex asi_init_lock; +#endif + /** * @mm_users: The number of 
users including userspace. * diff --git a/kernel/fork.c b/kernel/fork.c index 22f43721d031d48fd5be2606e86642334be9735f..bb73758790d08112265d398b16902ff9a4c2b8fe 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -112,6 +112,7 @@ #include #include #include +#include #include @@ -1296,6 +1297,8 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p, if (mm_alloc_pgd(mm)) goto fail_nopgd; + asi_init_mm_state(mm); + if (init_new_context(p, mm)) goto fail_nocontext; diff --git a/kernel/sched/core.c b/kernel/sched/core.c index a1c353a62c5684e3e773dd100afbddb818c480be..b1f7f73730c1e56f700cd3611a8093f177184842 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -78,6 +78,7 @@ #include #include #include +#include #define CREATE_TRACE_POINTS #include @@ -5272,6 +5273,14 @@ static __always_inline struct rq * context_switch(struct rq *rq, struct task_struct *prev, struct task_struct *next, struct rq_flags *rf) { + /* + * It's possible to avoid this by tweaking ASI's domain management code + * and updating code that modifies CR3 to be ASI-aware. Even without + * that, it's probably possible to get rid of this in certain cases just + * by fiddling with the context switch path itself. 
+ */ + asi_exit(); + prepare_task_switch(rq, prev, next); /* diff --git a/mm/init-mm.c b/mm/init-mm.c index 24c809379274503ac4f261fe7cfdbab3cb1ed1e7..e820e1c6edd48836a0ebe58e777046498d6a89ee 100644 --- a/mm/init-mm.c +++ b/mm/init-mm.c @@ -12,6 +12,7 @@ #include #include #include +#include #ifndef INIT_MM_CONTEXT #define INIT_MM_CONTEXT(name) @@ -44,6 +45,9 @@ struct mm_struct init_mm = { #endif .user_ns = &init_user_ns, .cpu_bitmap = CPU_BITS_NONE, +#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION + INIT_MM_ASI(init_mm) +#endif INIT_MM_CONTEXT(init_mm) };

From patchwork Fri Jan 10 18:40:30 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935233
Date: Fri, 10 Jan 2025 18:40:30 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-4-8419288bc805@google.com>
Subject: [PATCH RFC v2 04/29] mm: asi: Add infrastructure for boot-time enablement
From: Brendan Jackman

Add a boot time parameter to control the newly added X86_FEATURE_ASI. "asi=on" or "asi=off" can be used in the kernel command line to enable or disable ASI at boot time. If not specified, ASI enablement depends on CONFIG_ADDRESS_SPACE_ISOLATION_DEFAULT_ON, which is off by default. asi_check_boottime_disable() is modeled after pti_check_boottime_disable(). The boot parameter is currently ignored until ASI is fully functional. Once we have a set of ASI features checked in that we have actually tested, we will stop ignoring the flag. But for now let's just add the infrastructure so we can implement the usage code. Ignoring checkpatch.pl CONFIG_DESCRIPTION because the _DEFAULT_ON Kconfig is trivial to explain.
Checkpatch-args: --ignore CONFIG_DESCRIPTION Co-developed-by: Junaid Shahid Signed-off-by: Junaid Shahid Co-developed-by: Yosry Ahmed Signed-off-by: Yosry Ahmed Signed-off-by: Brendan Jackman --- arch/x86/Kconfig | 9 +++++ arch/x86/include/asm/asi.h | 19 ++++++++-- arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/disabled-features.h | 8 ++++- arch/x86/mm/asi.c | 61 +++++++++++++++++++++++++++----- arch/x86/mm/init.c | 4 ++- include/asm-generic/asi.h | 4 +++ 7 files changed, 92 insertions(+), 14 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 5a50582eb210e9d1309856a737d32b76fa1bfc85..1fcb52cb8cd5084ac3cef04af61b7d1653162bdb 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2533,6 +2533,15 @@ config MITIGATION_ADDRESS_SPACE_ISOLATION there are likely to be unhandled cases, in particular concerning TLB flushes. + +config ADDRESS_SPACE_ISOLATION_DEFAULT_ON + bool "Enable address space isolation by default" + default n + depends on MITIGATION_ADDRESS_SPACE_ISOLATION + help + If selected, ASI is enabled by default at boot unless asi=on or + asi=off is specified on the kernel command line. + config MITIGATION_RETPOLINE bool "Avoid speculative indirect branches in kernel" select OBJTOOL if HAVE_OBJTOOL diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index 7cc635b6653a3970ba9dbfdc9c828a470e27bd44..b9671ef2dd3278adceed18507fd260e21954d574 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -8,6 +8,7 @@ #include #include +#include #include #ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION @@ -66,6 +67,8 @@ * the N ASI classes. */ +#define static_asi_enabled() cpu_feature_enabled(X86_FEATURE_ASI) + /* * ASI uses a per-CPU tainting model to track what mitigation actions are * required on domain transitions.
Taints exist along two dimensions: @@ -131,6 +134,8 @@ struct asi { DECLARE_PER_CPU_ALIGNED(struct asi *, curr_asi); +void asi_check_boottime_disable(void); + void asi_init_mm_state(struct mm_struct *mm); int asi_init_class(enum asi_class_id class_id, struct asi_taint_policy *taint_policy); @@ -155,7 +160,9 @@ void asi_exit(void); /* The target is the domain we'll enter when returning to process context. */ static __always_inline struct asi *asi_get_target(struct task_struct *p) { - return p->thread.asi_state.target; + return static_asi_enabled() + ? p->thread.asi_state.target + : NULL; } static __always_inline void asi_set_target(struct task_struct *p, @@ -166,7 +173,9 @@ static __always_inline void asi_set_target(struct task_struct *p, static __always_inline struct asi *asi_get_current(void) { - return this_cpu_read(curr_asi); + return static_asi_enabled() + ? this_cpu_read(curr_asi) + : NULL; } /* Are we currently in a restricted address space? */ @@ -175,7 +184,11 @@ static __always_inline bool asi_is_restricted(void) return (bool)asi_get_current(); } -/* If we exit/have exited, can we stay that way until the next asi_enter? */ +/* + * If we exit/have exited, can we stay that way until the next asi_enter? + * + * When ASI is disabled, this returns true. 
+ */ static __always_inline bool asi_is_relaxed(void) { return !asi_get_target(current); diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 913fd3a7bac6506141de65f33b9ee61c615c7d7d..d6a808d10c3b8900d190ea01c66fc248863f05e2 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -474,6 +474,7 @@ #define X86_FEATURE_CLEAR_BHB_HW (21*32+ 3) /* BHI_DIS_S HW control enabled */ #define X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT (21*32+ 4) /* Clear branch history at vmexit using SW loop */ #define X86_FEATURE_FAST_CPPC (21*32 + 5) /* AMD Fast CPPC */ +#define X86_FEATURE_ASI (21*32+6) /* Kernel Address Space Isolation */ /* * BUG word(s) diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index c492bdc97b0595ec77f89dc9b0cefe5e3e64be41..c7964ed4fef8b9441e1c0453da587787d8008d9d 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -50,6 +50,12 @@ # define DISABLE_PTI (1 << (X86_FEATURE_PTI & 31)) #endif +#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION +# define DISABLE_ASI 0 +#else +# define DISABLE_ASI (1 << (X86_FEATURE_ASI & 31)) +#endif + #ifdef CONFIG_MITIGATION_RETPOLINE # define DISABLE_RETPOLINE 0 #else @@ -154,7 +160,7 @@ #define DISABLED_MASK17 0 #define DISABLED_MASK18 (DISABLE_IBT) #define DISABLED_MASK19 (DISABLE_SEV_SNP) -#define DISABLED_MASK20 0 +#define DISABLED_MASK20 (DISABLE_ASI) #define DISABLED_MASK21 0 #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 22) diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index 105cd8b43eaf5c20acc80d4916b761559fb95d74..5baf563a078f5b3a6cd4b9f5e92baaf81b0774c4 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -4,6 +4,7 @@ #include #include +#include #include #include #include @@ -29,6 +30,9 @@ static inline bool asi_class_id_valid(enum asi_class_id class_id) static inline bool asi_class_initialized(enum asi_class_id class_id) { + if 
(!boot_cpu_has(X86_FEATURE_ASI)) + return false; + if (WARN_ON(!asi_class_id_valid(class_id))) return false; @@ -51,6 +55,9 @@ EXPORT_SYMBOL_GPL(asi_init_class); void asi_uninit_class(enum asi_class_id class_id) { + if (!boot_cpu_has(X86_FEATURE_ASI)) + return; + if (!asi_class_initialized(class_id)) return; @@ -66,10 +73,36 @@ const char *asi_class_name(enum asi_class_id class_id) return asi_class_names[class_id]; } +void __init asi_check_boottime_disable(void) +{ + bool enabled = IS_ENABLED(CONFIG_ADDRESS_SPACE_ISOLATION_DEFAULT_ON); + char arg[4]; + int ret; + + ret = cmdline_find_option(boot_command_line, "asi", arg, sizeof(arg)); + if (ret == 3 && !strncmp(arg, "off", 3)) { + enabled = false; + pr_info("ASI disabled through kernel command line.\n"); + } else if (ret == 2 && !strncmp(arg, "on", 2)) { + enabled = true; + pr_info("Ignoring asi=on param while ASI implementation is incomplete.\n"); + } else { + pr_info("ASI %s by default.\n", + enabled ? "enabled" : "disabled"); + } + + if (enabled) + pr_info("ASI enablement ignored due to incomplete implementation.\n"); +} + static void __asi_destroy(struct asi *asi) { - lockdep_assert_held(&asi->mm->asi_init_lock); + WARN_ON_ONCE(asi->ref_count <= 0); + if (--(asi->ref_count) > 0) + return; + free_pages((ulong)asi->pgd, PGD_ALLOCATION_ORDER); + memset(asi, 0, sizeof(struct asi)); } int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi **out_asi) @@ -79,6 +112,9 @@ int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi **out_ *out_asi = NULL; + if (!boot_cpu_has(X86_FEATURE_ASI)) + return 0; + if (WARN_ON(!asi_class_initialized(class_id))) return -EINVAL; @@ -122,7 +158,7 @@ void asi_destroy(struct asi *asi) { struct mm_struct *mm; - if (!asi) + if (!boot_cpu_has(X86_FEATURE_ASI) || !asi) return; if (WARN_ON(!asi_class_initialized(asi->class_id))) @@ -134,11 +170,7 @@ void asi_destroy(struct asi *asi) * to block concurrent asi_init calls.
*/ mutex_lock(&mm->asi_init_lock); - WARN_ON_ONCE(asi->ref_count <= 0); - if (--(asi->ref_count) == 0) { - free_pages((ulong)asi->pgd, PGD_ALLOCATION_ORDER); - memset(asi, 0, sizeof(struct asi)); - } + __asi_destroy(asi); mutex_unlock(&mm->asi_init_lock); } EXPORT_SYMBOL_GPL(asi_destroy); @@ -255,6 +287,9 @@ static noinstr void __asi_enter(void) noinstr void asi_enter(struct asi *asi) { + if (!static_asi_enabled()) + return; + VM_WARN_ON_ONCE(!asi); /* Should not have an asi_enter() without a prior asi_relax(). */ @@ -269,8 +304,10 @@ EXPORT_SYMBOL_GPL(asi_enter); noinstr void asi_relax(void) { - barrier(); - asi_set_target(current, NULL); + if (static_asi_enabled()) { + barrier(); + asi_set_target(current, NULL); + } } EXPORT_SYMBOL_GPL(asi_relax); @@ -279,6 +316,9 @@ noinstr void asi_exit(void) u64 unrestricted_cr3; struct asi *asi; + if (!static_asi_enabled()) + return; + preempt_disable_notrace(); VM_BUG_ON(this_cpu_read(cpu_tlbstate.loaded_mm) == @@ -310,6 +350,9 @@ EXPORT_SYMBOL_GPL(asi_exit); void asi_init_mm_state(struct mm_struct *mm) { + if (!boot_cpu_has(X86_FEATURE_ASI)) + return; + memset(mm->asi, 0, sizeof(mm->asi)); mutex_init(&mm->asi_init_lock); } diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index de4227ed5169ff84d0ce80b677caffc475198fa6..ded3a47f2a9c1f554824d4ad19f3b48bce271274 100644 --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c @@ -28,6 +28,7 @@ #include #include #include +#include /* * We need to define the tracepoints somewhere, and tlb.c @@ -251,7 +252,7 @@ static void __init probe_page_size_mask(void) __default_kernel_pte_mask = __supported_pte_mask; /* Except when with PTI where the kernel is mostly non-Global: */ if (cpu_feature_enabled(X86_FEATURE_PTI) || - IS_ENABLED(CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION)) + cpu_feature_enabled(X86_FEATURE_ASI)) __default_kernel_pte_mask &= ~_PAGE_GLOBAL; /* Enable 1 GB linear kernel mappings if available: */ @@ -754,6 +755,7 @@ void __init init_mem_mapping(void) unsigned long end; 
pti_check_boottime_disable(); + asi_check_boottime_disable(); probe_page_size_mask(); setup_pcid(); diff --git a/include/asm-generic/asi.h b/include/asm-generic/asi.h index 6b84202837605fa57e4a910318c8353b3f816f06..eedc961ee916a9e1da631ca489ea4a7bc9e6089f 100644 --- a/include/asm-generic/asi.h +++ b/include/asm-generic/asi.h @@ -65,6 +65,10 @@ static inline pgd_t *asi_pgd(struct asi *asi) { return NULL; } static inline void asi_handle_switch_mm(void) { } +#define static_asi_enabled() false + +static inline void asi_check_boottime_disable(void) { } + #endif /* !CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */ #endif /* !_ASSEMBLY_ */ From patchwork Fri Jan 10 18:40:31 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brendan Jackman X-Patchwork-Id: 13935234 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2D63DE77188 for ; Fri, 10 Jan 2025 18:41:03 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 000AC6B00BD; Fri, 10 Jan 2025 13:41:01 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id EF4ED6B00BF; Fri, 10 Jan 2025 13:41:00 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CAA0E6B00C0; Fri, 10 Jan 2025 13:41:00 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 9F28D6B00BD for ; Fri, 10 Jan 2025 13:41:00 -0500 (EST) Received: from smtpin27.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 63ACFA0DEE for ; Fri, 10 Jan 2025 18:41:00 +0000 (UTC) X-FDA: 82992409080.27.C1D1F66 Received: from mail-wm1-f73.google.com (mail-wm1-f73.google.com [209.85.128.73]) by 
imf25.hostedemail.com (Postfix) with ESMTP id 6777AA000B for ; Fri, 10 Jan 2025 18:40:58 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=Q6ZZQEyx; spf=pass (imf25.hostedemail.com: domain of 3uGmBZwgKCNM8z19BzC05DD5A3.1DBA7CJM-BB9Kz19.DG5@flex--jackmanb.bounces.google.com designates 209.85.128.73 as permitted sender) smtp.mailfrom=3uGmBZwgKCNM8z19BzC05DD5A3.1DBA7CJM-BB9Kz19.DG5@flex--jackmanb.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1736534458; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=06++54m2jQMoQajJn4JgiUDifyLkkN/hJfJdF7FjHRg=; b=5Ak0F1QRR4/rXMlJ9Jd1s/EAXiW/xNxpt71XQYjfKx5/HvL7UVSbGnflOuajWK9woMF0ub iSzc7cehViZenGZrJowaoQk3RukhGDj4YhKyGAltj8Drduc69YKU7ad3gtBcJkw+4L6lub Cbokv7vuIHf8O4J2erd9gPUu9K4veaU= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=Q6ZZQEyx; spf=pass (imf25.hostedemail.com: domain of 3uGmBZwgKCNM8z19BzC05DD5A3.1DBA7CJM-BB9Kz19.DG5@flex--jackmanb.bounces.google.com designates 209.85.128.73 as permitted sender) smtp.mailfrom=3uGmBZwgKCNM8z19BzC05DD5A3.1DBA7CJM-BB9Kz19.DG5@flex--jackmanb.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1736534458; a=rsa-sha256; cv=none; b=viC3FZK0eO4aVtyLBzN7UAxSp0ldeLD5qx01WXE10wXaORfAYOD5XyB/9eziKDbxTUSi+d LUcLUVYhLH0rAegeJjmP2Ivy9M4KtU2SijJ6J03tV0xxBEM0bsOJ7clkJIGrEAOau8gIxQ 127MdPHjB/gxKt6/kTqy/s7gFiYWkgQ= Received: by mail-wm1-f73.google.com with SMTP id 5b1f17b1804b1-43631d8d9c7so11981215e9.1 for ; Fri, 10 Jan 2025 10:40:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=google.com; s=20230601; t=1736534457; x=1737139257; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=06++54m2jQMoQajJn4JgiUDifyLkkN/hJfJdF7FjHRg=; b=Q6ZZQEyx+Hyd8rjYT32sPbasvVxjNoTcVu1Zaag59WXCiGlTJ/q4VQUGSmVayAyba3 60+WaJzkk1ONV0fETbBIK1wxGfwMmTIQ01ANHU9P2ZEdRtRFLuBjIh9EGoW5xfejOj+Q jYmcbwjd2LWm+liEmGGqeHhsdUcXyYy4+ya61xZ3ejnKlavPpdk3Oj5LWITifjKoyVgd rRWGlHPwHDcOkzK9QiO72QkRsU8Aa/HwXbTFjxQ1+y/0mALSP3Vnof3U+dEt9Iv00ZC3 oOgxVtcXrE7+RT8tdTUfehnO9I6dJLNOsjlp2jPyh2Z7MEqIIo+O34q7p6aLeIOKSxQa i8dw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1736534457; x=1737139257; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=06++54m2jQMoQajJn4JgiUDifyLkkN/hJfJdF7FjHRg=; b=T5RtRo47I4kedPmcw1DoM2V0XZBjMwr52/NmTHXyjbOipK64j1nuvb1UbMMO/ERB2T /RwfqtbVWjUwV4TNNfHbTz+W5SoEA4uIYiw5awwFfJujcsfZj5EXIANA2etue+JO8mm3 rCSv9gfyvzakqULjW4yHHIh0cGuNHVSwjw41tG9dssfIIPxEjolblZL8/KY+GCh7eTKT DMEzVFM1Jw3Ea6E0gY8FfYyiQQZsjiPkKbVDyTSisP+5565Nw7myfSF6sSfy/FWU77UR OFIlj18oPLzPyC8PT9WNsGp96iJ1DW5naxQbG6w9HqbQzaOx85FieRyBlAFtikizU2cW H9fg== X-Forwarded-Encrypted: i=1; AJvYcCUb89DWSMZvnlAZ/5V9rdzY4Rb+acBG7S2sX4ykGsVDPtlZ2zBrhk3sw7fykz8wm/prWsrbbWkk0A==@kvack.org X-Gm-Message-State: AOJu0YxYYbuNqlCVkC5LP/Mb4UT4oufR6SX4ZpVyiyBv/xuQ1RssJLUf u8IbWZDBmyX2rmQZ0k84+HAv4HBuwczo+QK/yZmkojv+H7bZUOwxUmhlpNaRlUtfdAZ67Bg11/Q p5Ynzvlgnow== X-Google-Smtp-Source: AGHT+IGnlkbZgvto5V3RLsC0pJW1qd+uu81ptaoP4aB2W/3FKEzvBJWjRMlcB/euDId6QKmAnD2UbMWUf073rw== X-Received: from wmqd22.prod.google.com ([2002:a05:600c:34d6:b0:436:185e:c91d]) (user=jackmanb job=prod-delivery.src-stubby-dispatcher) by 2002:a05:600c:444a:b0:434:fe3c:c662 with SMTP id 5b1f17b1804b1-436e9d7b99cmr59996085e9.12.1736534456699; Fri, 10 Jan 2025 10:40:56 -0800 (PST) Date: Fri, 10 Jan 2025 18:40:31 +0000 In-Reply-To: 
<20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> Mime-Version: 1.0 References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> X-Mailer: b4 0.15-dev Message-ID: <20250110-asi-rfc-v2-v2-5-8419288bc805@google.com> Subject: [PATCH RFC v2 05/29] mm: asi: ASI support in interrupts/exceptions From: Brendan Jackman To: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Richard Henderson , Matt Turner , Vineet Gupta , Russell King , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Huacai Chen , WANG Xuerui , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Jonas Bonn , Stefan Kristiansson , Stafford Horne , "James E.J. Bottomley" , Helge Deller , Michael Ellerman , Nicholas Piggin , Christophe Leroy , Naveen N Rao , Madhavan Srinivasan , Paul Walmsley , Palmer Dabbelt , Albert Ou , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Yoshinori Sato , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Andreas Larsson , Richard Weinberger , Anton Ivanov , Johannes Berg , Chris Zankel , Max Filippov , Arnd Bergmann , Andrew Morton , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Uladzislau Rezki , Christoph Hellwig , Masami Hiramatsu , Mathieu Desnoyers , Mike Rapoport , Arnaldo Carvalho de Melo , Namhyung Kim , Mark Rutland , Alexander Shishkin , Jiri Olsa , Ian Rogers , Adrian Hunter , Dennis Zhou , Tejun Heo , Christoph Lameter , Sean Christopherson , Paolo Bonzini , Ard Biesheuvel , Josh Poimboeuf , Pawan Gupta Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, loongarch@lists.linux.dev, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-um@lists.infradead.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, kvm@vger.kernel.org, linux-efi@vger.kernel.org, Brendan Jackman , Junaid Shahid X-Rspamd-Queue-Id: 6777AA000B X-Stat-Signature: o1ayr78f8adr33h4rzjab7p9b54hf8oq X-Rspam-User: X-Rspamd-Server: rspam11 X-HE-Tag: 1736534458-712268 X-HE-Meta: 
Add support for potentially switching address spaces from within interrupts/exceptions/NMIs etc. An interrupt does not automatically switch to the unrestricted address space.
It can switch if needed to access some memory not available in the restricted address space, using the normal asi_exit call.

On return from the outermost interrupt, if the target address space was the restricted address space (e.g. we were in the critical code path between ASI Enter and VM Enter), the restricted address space will be automatically restored. Otherwise, execution will continue in the unrestricted address space until the next explicit ASI Enter.

In order to keep track of when to restore the restricted address space, an interrupt/exception nesting depth counter is maintained per-task. An alternative implementation that avoids this counter is also possible, but the counter unlocks an additional nice-to-have benefit: it allows detecting whether or not we are currently executing inside an exception context, which will be useful in a later patch.

Note that for KVM on SVM, this is not actually necessary, as NMIs are in fact maskable via CLGI. It's not clear to me if VMX has something equivalent, but we will need this infrastructure in place for userspace support anyway.

RFC: Once userspace ASI is implemented, this idtentry integration looks a bit heavy-handed. For example, we don't need this logic for INT 0x80 emulation, so having it in DEFINE_IDTENTRY_RAW is confusing. It could lead to a bug if the order of interrupt counter modifications and ASI transition logic gets flipped around somehow.

checkpatch.pl SPACING is a false positive. AVOID_BUG ignored for RFC.
Checkpatch-args: --ignore=SPACING,AVOID_BUG Signed-off-by: Junaid Shahid Signed-off-by: Brendan Jackman --- arch/x86/include/asm/asi.h | 68 ++++++++++++++++++++++++++++++++++++++-- arch/x86/include/asm/idtentry.h | 50 ++++++++++++++++++++++++----- arch/x86/include/asm/processor.h | 5 +++ arch/x86/kernel/process.c | 2 ++ arch/x86/kernel/traps.c | 22 +++++++++++++ arch/x86/mm/asi.c | 7 ++++- include/asm-generic/asi.h | 10 ++++++ 7 files changed, 153 insertions(+), 11 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index b9671ef2dd3278adceed18507fd260e21954d574..9a9a139518289fc65f26a4d1cd311aa52cc5357f 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -157,6 +157,11 @@ void asi_relax(void); /* Immediately exit the restricted address space if in it */ void asi_exit(void); +static inline void asi_init_thread_state(struct thread_struct *thread) +{ + thread->asi_state.intr_nest_depth = 0; +} + /* The target is the domain we'll enter when returning to process context. */ static __always_inline struct asi *asi_get_target(struct task_struct *p) { @@ -197,9 +202,10 @@ static __always_inline bool asi_is_relaxed(void) /* * Is the current task in the critical section? * - * This is just the inverse of !asi_is_relaxed(). We have both functions in order to - * help write intuitive client code. In particular, asi_is_tense returns false - * when ASI is disabled, which is judged to make user code more obvious. + * This is just the inverse of !asi_is_relaxed(). We have both functions in + * order to help write intuitive client code. In particular, asi_is_tense + * returns false when ASI is disabled, which is judged to make user code more + * obvious. */ static __always_inline bool asi_is_tense(void) { @@ -211,6 +217,62 @@ static __always_inline pgd_t *asi_pgd(struct asi *asi) return asi ? 
asi->pgd : NULL; } +static __always_inline void asi_intr_enter(void) +{ + if (static_asi_enabled() && asi_is_tense()) { + current->thread.asi_state.intr_nest_depth++; + barrier(); + } +} + +void __asi_enter(void); + +static __always_inline void asi_intr_exit(void) +{ + if (static_asi_enabled() && asi_is_tense()) { + /* + * If an access to sensitive memory got reordered after the + * decrement, the #PF handler for that access would see a value + * of 0 for the counter and re-__asi_enter before returning to + * the faulting access, triggering an infinite PF loop. + */ + barrier(); + + if (--current->thread.asi_state.intr_nest_depth == 0) { + /* + * If the decrement got reordered after __asi_enter, an + * interrupt that came between __asi_enter and the + * decrement would always see a nonzero value for the + * counter so it wouldn't call __asi_enter again and we + * would return to process context in the wrong address + * space. + */ + barrier(); + __asi_enter(); + } + } +} + +/* + * Returns the nesting depth of interrupts/exceptions that have interrupted the + * ongoing critical section. If the current task is not in a critical section + * this is 0. + */ +static __always_inline int asi_intr_nest_depth(void) +{ + return current->thread.asi_state.intr_nest_depth; +} + +/* + * Remember that interrupts/exceptions don't count as the critical section. If + * you want to know if the current task is in the critical section use + * asi_is_tense().
+ */ +static __always_inline bool asi_in_critical_section(void) +{ + return asi_is_tense() && !asi_intr_nest_depth(); +} + #define INIT_MM_ASI(init_mm) \ .asi_init_lock = __MUTEX_INITIALIZER(init_mm.asi_init_lock), diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h index ad5c68f0509d4dfd0834303c0f9dabc93ef73aa4..9e00da0a3b08f83ca5e603dc2abbfd5fa3059811 100644 --- a/arch/x86/include/asm/idtentry.h +++ b/arch/x86/include/asm/idtentry.h @@ -12,6 +12,7 @@ #include #include +#include typedef void (*idtentry_t)(struct pt_regs *regs); @@ -55,12 +56,15 @@ static __always_inline void __##func(struct pt_regs *regs); \ \ __visible noinstr void func(struct pt_regs *regs) \ { \ - irqentry_state_t state = irqentry_enter(regs); \ + irqentry_state_t state; \ \ + asi_intr_enter(); \ + state = irqentry_enter(regs); \ instrumentation_begin(); \ __##func (regs); \ instrumentation_end(); \ irqentry_exit(regs, state); \ + asi_intr_exit(); \ } \ \ static __always_inline void __##func(struct pt_regs *regs) @@ -102,12 +106,15 @@ static __always_inline void __##func(struct pt_regs *regs, \ __visible noinstr void func(struct pt_regs *regs, \ unsigned long error_code) \ { \ - irqentry_state_t state = irqentry_enter(regs); \ + irqentry_state_t state; \ \ + asi_intr_enter(); \ + state = irqentry_enter(regs); \ instrumentation_begin(); \ __##func (regs, error_code); \ instrumentation_end(); \ irqentry_exit(regs, state); \ + asi_intr_exit(); \ } \ \ static __always_inline void __##func(struct pt_regs *regs, \ @@ -139,7 +146,16 @@ static __always_inline void __##func(struct pt_regs *regs, \ * is required before the enter/exit() helpers are invoked. 
*/ #define DEFINE_IDTENTRY_RAW(func) \ -__visible noinstr void func(struct pt_regs *regs) +static __always_inline void __##func(struct pt_regs *regs); \ + \ +__visible noinstr void func(struct pt_regs *regs) \ +{ \ + asi_intr_enter(); \ + __##func (regs); \ + asi_intr_exit(); \ +} \ + \ +static __always_inline void __##func(struct pt_regs *regs) /** * DEFINE_FREDENTRY_RAW - Emit code for raw FRED entry points @@ -178,7 +194,18 @@ noinstr void fred_##func(struct pt_regs *regs) * is required before the enter/exit() helpers are invoked. */ #define DEFINE_IDTENTRY_RAW_ERRORCODE(func) \ -__visible noinstr void func(struct pt_regs *regs, unsigned long error_code) +static __always_inline void __##func(struct pt_regs *regs, \ + unsigned long error_code); \ + \ +__visible noinstr void func(struct pt_regs *regs, unsigned long error_code)\ +{ \ + asi_intr_enter(); \ + __##func (regs, error_code); \ + asi_intr_exit(); \ +} \ + \ +static __always_inline void __##func(struct pt_regs *regs, \ + unsigned long error_code) /** * DECLARE_IDTENTRY_IRQ - Declare functions for device interrupt IDT entry @@ -209,14 +236,17 @@ static void __##func(struct pt_regs *regs, u32 vector); \ __visible noinstr void func(struct pt_regs *regs, \ unsigned long error_code) \ { \ - irqentry_state_t state = irqentry_enter(regs); \ + irqentry_state_t state; \ u32 vector = (u32)(u8)error_code; \ \ + asi_intr_enter(); \ + state = irqentry_enter(regs); \ kvm_set_cpu_l1tf_flush_l1d(); \ instrumentation_begin(); \ run_irq_on_irqstack_cond(__##func, regs, vector); \ instrumentation_end(); \ irqentry_exit(regs, state); \ + asi_intr_exit(); \ } \ \ static noinline void __##func(struct pt_regs *regs, u32 vector) @@ -255,13 +285,16 @@ static __always_inline void instr_##func(struct pt_regs *regs) \ \ __visible noinstr void func(struct pt_regs *regs) \ { \ - irqentry_state_t state = irqentry_enter(regs); \ + irqentry_state_t state; \ \ + asi_intr_enter(); \ + state = irqentry_enter(regs); \ 
kvm_set_cpu_l1tf_flush_l1d(); \ instrumentation_begin(); \ instr_##func (regs); \ instrumentation_end(); \ irqentry_exit(regs, state); \ + asi_intr_exit(); \ } \ \ void fred_##func(struct pt_regs *regs) \ @@ -294,13 +327,16 @@ static __always_inline void instr_##func(struct pt_regs *regs) \ \ __visible noinstr void func(struct pt_regs *regs) \ { \ - irqentry_state_t state = irqentry_enter(regs); \ + irqentry_state_t state; \ \ + asi_intr_enter(); \ + state = irqentry_enter(regs); \ kvm_set_cpu_l1tf_flush_l1d(); \ instrumentation_begin(); \ instr_##func (regs); \ instrumentation_end(); \ irqentry_exit(regs, state); \ + asi_intr_exit(); \ } \ \ void fred_##func(struct pt_regs *regs) \ diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index f02220e6b4df911d87e2fee4b497eade61a27161..a32a53405f45e4c0473fe081e216029cf5bd0cdd 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -508,6 +508,11 @@ struct thread_struct { struct { /* Domain to enter when returning to process context. 
*/ struct asi *target; + /* + * The depth of interrupts/exceptions interrupting an ASI + * critical section + */ + int intr_nest_depth; } asi_state; #endif diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index f63f8fd00a91f3d1171f307b92179556ba2d716d..44abc161820153b7f68664b97267658b8e011101 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -96,6 +96,8 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) #ifdef CONFIG_VM86 dst->thread.vm86 = NULL; #endif + asi_init_thread_state(&dst->thread); + /* Drop the copied pointer to current's fpstate */ dst->thread.fpu.fpstate = NULL; diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 2dbadf347b5f4f66625c4f49b76c41b412270d57..beea861da8d3e9a4e2afb3a92ed5f66f11d67bd6 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -65,6 +65,7 @@ #include #include #include +#include #include #include #include @@ -463,6 +464,27 @@ DEFINE_IDTENTRY_DF(exc_double_fault) } #endif + /* + * Do an asi_exit() only here because a #DF usually indicates + * the system is in a really bad state, and we don't want to + * cause any additional issue that would prevent us from + * printing a correct stack trace. + * + * The additional issues are not related to a possible triple + * fault, which can only occur if a fault is encountered while + * invoking this handler, but here we are already executing it. + * Instead, an ASI-induced #PF here could potentially end up + * getting another #DF. For example, if there was some issue in + * invoking the #PF handler. The handler for the second #DF + * could then again cause an ASI-induced #PF leading back to the + * same recursion. + * + * This is not needed in the espfix64 case above, since that + * code is about turning a #DF into a #GP which is okay to + * handle in the restricted domain. That's also why we don't + * asi_exit() in the #GP handler.
+ */ + asi_exit(); irqentry_nmi_enter(regs); instrumentation_begin(); notify_die(DIE_TRAP, str, regs, error_code, X86_TRAP_DF, SIGSEGV); diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index 5baf563a078f5b3a6cd4b9f5e92baaf81b0774c4..054315d566c082c0925a00ce3a0877624c8b9957 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -235,7 +235,7 @@ static __always_inline void maybe_flush_data(struct asi *next_asi) this_cpu_and(asi_taints, ~ASI_TAINTS_DATA_MASK); } -static noinstr void __asi_enter(void) +noinstr void __asi_enter(void) { u64 asi_cr3; struct asi *target = asi_get_target(current); @@ -250,6 +250,7 @@ static noinstr void __asi_enter(void) * disabling preemption should be fine. */ VM_BUG_ON(preemptible()); + VM_BUG_ON(current->thread.asi_state.intr_nest_depth != 0); if (!target || target == this_cpu_read(curr_asi)) return; @@ -290,6 +291,7 @@ noinstr void asi_enter(struct asi *asi) if (!static_asi_enabled()) return; + VM_WARN_ON_ONCE(asi_intr_nest_depth()); VM_WARN_ON_ONCE(!asi); /* Should not have an asi_enter() without a prior asi_relax(). 
*/ @@ -305,6 +307,7 @@ EXPORT_SYMBOL_GPL(asi_enter); noinstr void asi_relax(void) { if (static_asi_enabled()) { + VM_WARN_ON_ONCE(asi_intr_nest_depth()); barrier(); asi_set_target(current, NULL); } @@ -326,6 +329,8 @@ noinstr void asi_exit(void) asi = this_cpu_read(curr_asi); if (asi) { + WARN_ON_ONCE(asi_in_critical_section()); + maybe_flush_control(NULL); unrestricted_cr3 = diff --git a/include/asm-generic/asi.h b/include/asm-generic/asi.h index eedc961ee916a9e1da631ca489ea4a7bc9e6089f..7f542c59c2b8a2b74432e4edb7199f9171db8a84 100644 --- a/include/asm-generic/asi.h +++ b/include/asm-generic/asi.h @@ -52,6 +52,8 @@ static inline bool asi_is_relaxed(void) { return true; } static inline bool asi_is_tense(void) { return false; } +static inline bool asi_in_critical_section(void) { return false; } + static inline void asi_exit(void) { } static inline bool asi_is_restricted(void) { return false; } @@ -65,6 +67,14 @@ static inline pgd_t *asi_pgd(struct asi *asi) { return NULL; } static inline void asi_handle_switch_mm(void) { } +static inline void asi_init_thread_state(struct thread_struct *thread) { } + +static inline void asi_intr_enter(void) { } + +static inline int asi_intr_nest_depth(void) { return 0; } + +static inline void asi_intr_exit(void) { } + #define static_asi_enabled() false static inline void asi_check_boottime_disable(void) { }

From patchwork Fri Jan 10 18:40:32 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brendan Jackman X-Patchwork-Id: 13935235
Date: Fri, 10 Jan 2025 18:40:32 +0000 In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> Mime-Version: 1.0 References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> X-Mailer: b4 0.15-dev Message-ID: <20250110-asi-rfc-v2-v2-6-8419288bc805@google.com> Subject: [PATCH RFC v2 06/29] mm: asi: Use separate PCIDs for restricted address spaces From: Brendan Jackman

From: Yosry Ahmed

Each restricted address space is assigned a separate PCID. Since currently only one ASI instance per class exists for a given process, the PCID is just derived from the class index.
This commit only sets the appropriate PCID when switching CR3, but does not actually use the NOFLUSH bit. That will be done by later patches. Co-developed-by: Junaid Shahid Signed-off-by: Junaid Shahid Signed-off-by: Yosry Ahmed Signed-off-by: Brendan Jackman --- arch/x86/include/asm/asi.h | 4 +-- arch/x86/include/asm/processor-flags.h | 24 +++++++++++++++++ arch/x86/include/asm/tlbflush.h | 3 +++ arch/x86/mm/asi.c | 10 +++---- arch/x86/mm/tlb.c | 49 +++++++++++++++++++++++++++++++--- include/asm-generic/asi.h | 2 ++ 6 files changed, 81 insertions(+), 11 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index 9a9a139518289fc65f26a4d1cd311aa52cc5357f..a55e73f1b2bc84c41b9ab25f642a4d5f1aa6ba90 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -4,13 +4,13 @@ #include -#include - #include #include #include #include +#include + #ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION /* diff --git a/arch/x86/include/asm/processor-flags.h b/arch/x86/include/asm/processor-flags.h index e5f204b9b33dfaa92ed1b05faa6b604e50d5f2f3..42c5acb67c2d2a6b03deb548fe3dd088baa88842 100644 --- a/arch/x86/include/asm/processor-flags.h +++ b/arch/x86/include/asm/processor-flags.h @@ -55,4 +55,28 @@ # define X86_CR3_PTI_PCID_USER_BIT 11 #endif +/* + * An ASI identifier is included in the higher bits of PCID to use a different + * PCID for each restricted address space, different from the PCID of the + * unrestricted address space (see asi_pcid()). We use the bits directly after + * the bit used by PTI (if any). 
+ */ +#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION + +#define X86_CR3_ASI_PCID_BITS 2 + +/* Use the highest available PCID bits after the PTI bit (if any) */ +#ifdef CONFIG_MITIGATION_PAGE_TABLE_ISOLATION +#define X86_CR3_ASI_PCID_END_BIT (X86_CR3_PTI_PCID_USER_BIT - 1) +#else +#define X86_CR3_ASI_PCID_END_BIT (X86_CR3_PCID_BITS - 1) +#endif + +#define X86_CR3_ASI_PCID_BITS_SHIFT (X86_CR3_ASI_PCID_END_BIT - X86_CR3_ASI_PCID_BITS + 1) +#define X86_CR3_ASI_PCID_MASK (((1UL << X86_CR3_ASI_PCID_BITS) - 1) << X86_CR3_ASI_PCID_BITS_SHIFT) + +#else +#define X86_CR3_ASI_PCID_BITS 0 +#endif + #endif /* _ASM_X86_PROCESSOR_FLAGS_H */ diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index c884174a44e119a3c027c44ada6c5cdba14d1282..f167feb5ebdfc7faba26b8b18ac65888cd9b0494 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -425,5 +425,8 @@ static inline void __native_tlb_flush_global(unsigned long cr4) } unsigned long build_cr3_noinstr(pgd_t *pgd, u16 asid, unsigned long lam); +unsigned long build_cr3_pcid_noinstr(pgd_t *pgd, u16 pcid, unsigned long lam, bool noflush); + +u16 asi_pcid(struct asi *asi, u16 asid); #endif /* _ASM_X86_TLBFLUSH_H */ diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index 054315d566c082c0925a00ce3a0877624c8b9957..8d060c633be68b508847e2c1c111761df1da92af 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -238,6 +238,7 @@ static __always_inline void maybe_flush_data(struct asi *next_asi) noinstr void __asi_enter(void) { u64 asi_cr3; + u16 pcid; struct asi *target = asi_get_target(current); /* @@ -266,9 +267,8 @@ noinstr void __asi_enter(void) this_cpu_write(curr_asi, target); maybe_flush_control(target); - asi_cr3 = build_cr3_noinstr(target->pgd, - this_cpu_read(cpu_tlbstate.loaded_mm_asid), - tlbstate_lam_cr3_mask()); + pcid = asi_pcid(target, this_cpu_read(cpu_tlbstate.loaded_mm_asid)); + asi_cr3 = build_cr3_pcid_noinstr(target->pgd, pcid, tlbstate_lam_cr3_mask(), false); 
write_cr3(asi_cr3); maybe_flush_data(target); @@ -335,8 +335,8 @@ noinstr void asi_exit(void) unrestricted_cr3 = build_cr3_noinstr(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd, - this_cpu_read(cpu_tlbstate.loaded_mm_asid), - tlbstate_lam_cr3_mask()); + this_cpu_read(cpu_tlbstate.loaded_mm_asid), + tlbstate_lam_cr3_mask()); /* Tainting first makes reentrancy easier to reason about. */ this_cpu_or(asi_taints, ASI_TAINT_KERNEL_DATA); diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 7c2309996d1d5a7cac23bd122f7b56a869d67d6a..2601beed83aef182d88800c09d70e4c5e95e7ed0 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -13,6 +13,7 @@ #include #include +#include #include #include #include @@ -96,7 +97,10 @@ # define PTI_CONSUMED_PCID_BITS 0 #endif -#define CR3_AVAIL_PCID_BITS (X86_CR3_PCID_BITS - PTI_CONSUMED_PCID_BITS) +#define CR3_AVAIL_PCID_BITS (X86_CR3_PCID_BITS - PTI_CONSUMED_PCID_BITS - \ + X86_CR3_ASI_PCID_BITS) + +static_assert(BIT(CR3_AVAIL_PCID_BITS) > TLB_NR_DYN_ASIDS); /* * ASIDs are zero-based: 0->MAX_AVAIL_ASID are valid. 
-1 below to account @@ -125,6 +129,11 @@ static __always_inline u16 kern_pcid(u16 asid) */ VM_WARN_ON_ONCE(asid & (1 << X86_CR3_PTI_PCID_USER_BIT)); #endif + +#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION + BUILD_BUG_ON(TLB_NR_DYN_ASIDS >= (1 << X86_CR3_ASI_PCID_BITS_SHIFT)); + VM_WARN_ON_ONCE(asid & X86_CR3_ASI_PCID_MASK); +#endif /* * The dynamically-assigned ASIDs that get passed in are small * (class_id + 1) << X86_CR3_ASI_PCID_BITS_SHIFT); + // return kern_pcid(asid) | ((asi->index + 1) << X86_CR3_ASI_PCID_BITS_SHIFT); +} + +#else /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */ + +u16 asi_pcid(struct asi *asi, u16 asid) { return kern_pcid(asid); } + +#endif /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */ + void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start, unsigned long end, unsigned int stride_shift, bool freed_tables) diff --git a/include/asm-generic/asi.h b/include/asm-generic/asi.h index 7f542c59c2b8a2b74432e4edb7199f9171db8a84..f777a6cf604b0656fb39087f6eba08f980b2cb6f 100644 --- a/include/asm-generic/asi.h +++ b/include/asm-generic/asi.h @@ -2,6 +2,7 @@ #ifndef __ASM_GENERIC_ASI_H #define __ASM_GENERIC_ASI_H +#include #include #ifndef _ASSEMBLY_ @@ -16,6 +17,7 @@ enum asi_class_id { #endif ASI_MAX_NUM_CLASSES, }; +static_assert(order_base_2(X86_CR3_ASI_PCID_BITS) <= ASI_MAX_NUM_CLASSES); typedef u8 asi_taints_t; From patchwork Fri Jan 10 18:40:33 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brendan Jackman X-Patchwork-Id: 13935236 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 575B8E77188 for ; Fri, 10 Jan 2025 18:41:09 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 005ED6B00C3; Fri, 10 Jan 2025 13:41:06 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 
From patchwork Fri Jan 10 18:40:33 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935236
Date: Fri, 10 Jan 2025 18:40:33 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
X-Mailer: b4 0.15-dev
Message-ID: <20250110-asi-rfc-v2-v2-7-8419288bc805@google.com>
Subject: [PATCH RFC v2 07/29] mm: asi: Make __get_current_cr3_fast() ASI-aware
From: Brendan Jackman
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "H. Peter Anvin",
 Andy Lutomirski, Peter Zijlstra, Richard Henderson, Matt Turner, Vineet Gupta,
 Russell King, Catalin Marinas, Will Deacon, Guo Ren, Brian Cain, Huacai Chen,
 WANG Xuerui, Geert Uytterhoeven, Michal Simek, Thomas Bogendoerfer,
 Dinh Nguyen, Jonas Bonn, Stefan Kristiansson, Stafford Horne,
 "James E.J. Bottomley", Helge Deller, Michael Ellerman, Nicholas Piggin,
 Christophe Leroy, Naveen N Rao, Madhavan Srinivasan, Paul Walmsley,
 Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
 Christian Borntraeger, Sven Schnelle, Yoshinori Sato, Rich Felker,
 John Paul Adrian Glaubitz, "David S. Miller", Andreas Larsson,
 Richard Weinberger, Anton Ivanov, Johannes Berg, Chris Zankel, Max Filippov,
 Arnd Bergmann, Andrew Morton, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
 Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Uladzislau Rezki,
 Christoph Hellwig, Masami Hiramatsu, Mathieu Desnoyers, Mike Rapoport,
 Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin,
 Jiri Olsa, Ian Rogers, Adrian Hunter, Dennis Zhou, Tejun Heo,
 Christoph Lameter, Sean Christopherson, Paolo Bonzini, Ard Biesheuvel,
 Josh Poimboeuf, Pawan Gupta
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org,
 linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org,
 linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org,
 loongarch@lists.linux.dev, linux-m68k@lists.linux-m68k.org,
 linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org,
 linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
 linux-sh@vger.kernel.org, sparclinux@vger.kernel.org,
 linux-um@lists.infradead.org, linux-arch@vger.kernel.org, linux-mm@kvack.org,
 linux-trace-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
 kvm@vger.kernel.org, linux-efi@vger.kernel.org, Brendan Jackman,
 Junaid Shahid
From: Junaid Shahid

When ASI is active, __get_current_cr3_fast() adjusts the returned CR3 value to reflect the actual ASI CR3.
Signed-off-by: Junaid Shahid
Signed-off-by: Brendan Jackman
---
 arch/x86/mm/tlb.c | 37 +++++++++++++++++++++++++++++++------
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 2601beed83aef182d88800c09d70e4c5e95e7ed0..b2a13fdab0c6454c1d9d4e3338801f3402da4191 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 #include
 #include "mm_internal.h"
@@ -197,8 +198,8 @@ static inline unsigned long build_cr3_noflush(pgd_t *pgd, u16 asid,
 	return build_cr3(pgd, asid, lam) | CR3_NOFLUSH;
 }

-noinstr unsigned long build_cr3_pcid_noinstr(pgd_t *pgd, u16 pcid,
-					     unsigned long lam, bool noflush)
+static __always_inline unsigned long build_cr3_pcid(pgd_t *pgd, u16 pcid,
+						    unsigned long lam, bool noflush)
 {
 	u64 noflush_bit = 0;
@@ -210,6 +211,12 @@ noinstr unsigned long build_cr3_pcid_noinstr(pgd_t *pgd, u16 pcid,
 	return __build_cr3(pgd, pcid, lam) | noflush_bit;
 }

+noinstr unsigned long build_cr3_pcid_noinstr(pgd_t *pgd, u16 pcid,
+					     unsigned long lam, bool noflush)
+{
+	return build_cr3_pcid(pgd, pcid, lam, noflush);
+}
+
 /*
  * We get here when we do something requiring a TLB invalidation
  * but could not go invalidate all of the contexts. We do the
@@ -1133,14 +1140,32 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end)
  */
 noinstr unsigned long __get_current_cr3_fast(void)
 {
-	unsigned long cr3 =
-		build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd,
-			  this_cpu_read(cpu_tlbstate.loaded_mm_asid),
-			  tlbstate_lam_cr3_mask());
+	unsigned long cr3;
+	pgd_t *pgd;
+	u16 asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
+	struct asi *asi = asi_get_current();
+	u16 pcid;
+
+	if (asi) {
+		pgd = asi_pgd(asi);
+		pcid = asi_pcid(asi, asid);
+	} else {
+		pgd = this_cpu_read(cpu_tlbstate.loaded_mm)->pgd;
+		pcid = kern_pcid(asid);
+	}
+
+	cr3 = build_cr3_pcid(pgd, pcid, tlbstate_lam_cr3_mask(), false);

 	/* For now, be very restrictive about when this can be called.
 	 */
 	VM_WARN_ON(in_nmi() || preemptible());

+	/*
+	 * Outside of the ASI critical section, an ASI-restricted CR3 is
+	 * unstable because an interrupt (including an inner interrupt, if we're
+	 * already in one) could cause a persistent asi_exit.
+	 */
+	VM_WARN_ON_ONCE(asi && asi_in_critical_section());
+
 	VM_BUG_ON(cr3 != __read_cr3());
 	return cr3;
 }

From patchwork Fri Jan 10 18:40:34 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935237
Date: Fri, 10 Jan 2025 18:40:34 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-8-8419288bc805@google.com>
Subject: [PATCH RFC v2 08/29] mm: asi: Avoid warning from NMI userspace accesses in ASI context
From: Brendan Jackman
nmi_uaccess_okay() emits a warning if the current CR3 != mm->pgd. Limit the warning to the case where ASI is not active.

Co-developed-by: Junaid Shahid
Signed-off-by: Junaid Shahid
Co-developed-by: Yosry Ahmed
Signed-off-by: Yosry Ahmed
Signed-off-by: Brendan Jackman
---
 arch/x86/mm/tlb.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index b2a13fdab0c6454c1d9d4e3338801f3402da4191..c41e083c5b5281684be79ad0391c1a5fc7b0c493 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1340,6 +1340,22 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 	put_cpu();
 }

+static inline bool cr3_matches_current_mm(void)
+{
+	struct asi *asi = asi_get_current();
+	pgd_t *pgd_asi = asi_pgd(asi);
+	pgd_t *pgd_cr3;
+
+	/*
+	 * Prevent read_cr3_pa -> [NMI, asi_exit] -> asi_get_current,
+	 * otherwise we might find CR3 pointing to the ASI PGD but not
+	 * find a current ASI domain.
+	 */
+	barrier();
+	pgd_cr3 = __va(read_cr3_pa());
+	return pgd_cr3 == current->mm->pgd || pgd_cr3 == pgd_asi;
+}
+
 /*
  * Blindly accessing user memory from NMI context can be dangerous
  * if we're in the middle of switching the current user task or
@@ -1355,10 +1371,10 @@ bool nmi_uaccess_okay(void)
 	VM_WARN_ON_ONCE(!loaded_mm);

 	/*
-	 * The condition we want to check is
-	 * current_mm->pgd == __va(read_cr3_pa()). This may be slow, though,
-	 * if we're running in a VM with shadow paging, and nmi_uaccess_okay()
-	 * is supposed to be reasonably fast.
+	 * The condition we want to check is that CR3 points to either
+	 * current_mm->pgd or an appropriate ASI PGD. Reading CR3 may be slow,
+	 * though, if we're running in a VM with shadow paging, and
+	 * nmi_uaccess_okay() is supposed to be reasonably fast.
 	 *
 	 * Instead, we check the almost equivalent but somewhat conservative
 	 * condition below, and we rely on the fact that switch_mm_irqs_off()
@@ -1367,7 +1383,7 @@ bool nmi_uaccess_okay(void)
 	if (loaded_mm != current_mm)
 		return false;

-	VM_WARN_ON_ONCE(current_mm->pgd != __va(read_cr3_pa()));
+	VM_WARN_ON_ONCE(!cr3_matches_current_mm());

 	return true;
 }
From patchwork Fri Jan 10 18:40:35 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935238
Date: Fri, 10 Jan 2025 18:40:35 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-9-8419288bc805@google.com>
Subject: [PATCH RFC v2 09/29] mm: asi: ASI page table allocation functions
From: Brendan Jackman
From: Junaid Shahid

This adds custom allocation and free functions for ASI page tables. The
alloc functions support allocating memory using different GFP reclaim
flags, so that non-sensitive allocations can be served from both
standard and atomic contexts. They also install the page tables
locklessly, which makes it slightly simpler to handle non-sensitive
allocations from interrupts/exceptions.

checkpatch.pl MACRO_ARG_UNUSED,SPACING are false positives.
COMPLEX_MACRO - I dunno, suggestions welcome.

Checkpatch-args: --ignore=MACRO_ARG_UNUSED,SPACING,COMPLEX_MACRO
Signed-off-by: Junaid Shahid
Signed-off-by: Brendan Jackman
---
 arch/x86/mm/asi.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c
index 8d060c633be68b508847e2c1c111761df1da92af..b15d043acedc9f459f17e86564a15061650afc3a 100644
--- a/arch/x86/mm/asi.c
+++ b/arch/x86/mm/asi.c
@@ -73,6 +73,65 @@ const char *asi_class_name(enum asi_class_id class_id)
 	return asi_class_names[class_id];
 }
 
+#ifndef mm_inc_nr_p4ds
+#define mm_inc_nr_p4ds(mm) do {} while (false)
+#endif
+
+#ifndef mm_dec_nr_p4ds
+#define mm_dec_nr_p4ds(mm) do {} while (false)
+#endif
+
+#define pte_offset pte_offset_kernel
+
+/*
+ * asi_p4d_alloc, asi_pud_alloc, asi_pmd_alloc, asi_pte_alloc.
+ *
+ * These are like the normal xxx_alloc functions, but:
+ *
+ * - They use atomic operations instead of taking a spinlock; this allows them
+ *   to be used from interrupts. This is necessary because we use the page
+ *   allocator from interrupts and the page allocator ultimately calls this
+ *   code.
+ * - They support customizing the allocation flags.
+ *
+ * On the other hand, they do not use the normal page allocation infrastructure,
+ * which means that PTE pages do not have the PageTable type nor the PagePgtable
+ * flag and we don't increment the meminfo stat (NR_PAGETABLE) as they do.
+ */
+static_assert(!IS_ENABLED(CONFIG_PARAVIRT));
+#define DEFINE_ASI_PGTBL_ALLOC(base, level)				\
+__maybe_unused								\
+static level##_t * asi_##level##_alloc(struct asi *asi,			\
+				       base##_t *base, ulong addr,	\
+				       gfp_t flags)			\
+{									\
+	if (unlikely(base##_none(*base))) {				\
+		ulong pgtbl = get_zeroed_page(flags);			\
+		phys_addr_t pgtbl_pa;					\
+									\
+		if (!pgtbl)						\
+			return NULL;					\
+									\
+		pgtbl_pa = __pa(pgtbl);					\
+									\
+		if (cmpxchg((ulong *)base, 0,				\
+			    pgtbl_pa | _PAGE_TABLE) != 0) {		\
+			free_page(pgtbl);				\
+			goto out;					\
+		}							\
+									\
+		mm_inc_nr_##level##s(asi->mm);				\
+	}								\
+out:									\
+	VM_BUG_ON(base##_leaf(*base));					\
+	return level##_offset(base, addr);				\
+}
+
+DEFINE_ASI_PGTBL_ALLOC(pgd, p4d)
+DEFINE_ASI_PGTBL_ALLOC(p4d, pud)
+DEFINE_ASI_PGTBL_ALLOC(pud, pmd)
+DEFINE_ASI_PGTBL_ALLOC(pmd, pte)
+
 void __init asi_check_boottime_disable(void)
 {
 	bool enabled = IS_ENABLED(CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION_DEFAULT_ON);

From patchwork Fri Jan 10 18:40:36 2025
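The lockless install at the heart of DEFINE_ASI_PGTBL_ALLOC can be sketched in userspace with C11 atomics. This is a minimal sketch under stated assumptions: `install_table`, `PAGE_TABLE_FLAGS` and the 4 KiB table size are illustrative stand-ins, not kernel API; the kernel's cmpxchg() on the real entry plays the role of the compare-exchange here.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-ins for a page-table entry and its low flag bits. */
typedef _Atomic uintptr_t pgtbl_entry_t;
#define PAGE_TABLE_FLAGS 0x67UL	/* like _PAGE_TABLE; fits in bits 0-11 */
#define TABLE_SIZE 4096UL

/*
 * Publish a newly zeroed table under *entry only if it is still empty,
 * mirroring the cmpxchg() in DEFINE_ASI_PGTBL_ALLOC: allocate first,
 * then install with a single atomic compare-and-swap; on a lost race,
 * free our copy and reuse the winner's. No spinlock is taken, which is
 * what makes the pattern usable from interrupt context.
 */
static void *install_table(pgtbl_entry_t *entry)
{
	if (atomic_load(entry) == 0) {
		void *tbl = aligned_alloc(TABLE_SIZE, TABLE_SIZE);
		uintptr_t expected = 0;

		if (!tbl)
			return NULL;
		memset(tbl, 0, TABLE_SIZE);
		if (!atomic_compare_exchange_strong(entry, &expected,
						    (uintptr_t)tbl | PAGE_TABLE_FLAGS))
			free(tbl);	/* another thread won the race */
	}
	/* Mask off the low flag bits to recover the table's address. */
	return (void *)(atomic_load(entry) & ~(TABLE_SIZE - 1));
}
```

Note how a second call with the same entry must return the same table: the compare-exchange makes exactly one allocation visible.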
Date: Fri, 10 Jan 2025 18:40:36 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-10-8419288bc805@google.com>
Subject: [PATCH RFC v2 10/29] mm: asi: asi_exit() on PF, skip handling if address is accessible
From: Brendan Jackman
From: Ofir Weisse

On a page fault, do asi_exit(), then check whether the address is
accessible now that we have exited. We do this by refactoring
spurious_kernel_fault() into two parts:

1. Verify that the error code value is something that could arise from a
   lazy TLB update.

2. Walk the page table and verify permissions; this part is now called
   is_address_accessible().

We also define PTE_PRESENT() and PMD_PRESENT() which are suitable for
checking userspace pages. For the sake of spurious faults, pte_present()
and pmd_present() are only good for kernelspace pages. This is because
these macros might return true even if the present bit is 0 (only
relevant for userspace).

checkpatch.pl VSPRINTF_SPECIFIER_PX - it's in a WARN that only fires in
a debug build of the kernel when we hit a disastrous bug, seems OK to
leak addresses.

RFC note: A separate refactoring/prep commit should be split out of this
patch.

Checkpatch-args: --ignore=VSPRINTF_SPECIFIER_PX
Signed-off-by: Ofir Weisse
Signed-off-by: Brendan Jackman
---
 arch/x86/mm/fault.c | 118 +++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 103 insertions(+), 15 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index e6c469b323ccb748de22adc7d9f0a16dd195edad..ee8f5417174e2956391d538f41e2475553ca4972 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -948,7 +948,7 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
 	force_sig_fault(SIGBUS, BUS_ADRERR, (void __user *)address);
 }
 
-static int spurious_kernel_fault_check(unsigned long error_code, pte_t *pte)
+static __always_inline int kernel_protection_ok(unsigned long error_code, pte_t *pte)
 {
 	if ((error_code & X86_PF_WRITE) && !pte_write(*pte))
 		return 0;
@@ -959,6 +959,8 @@ static int spurious_kernel_fault_check(unsigned long error_code, pte_t *pte)
 	return 1;
 }
 
+static int kernel_access_ok(unsigned long error_code, unsigned long address, pgd_t *pgd);
+
 /*
  * Handle a spurious fault caused by a stale TLB entry.
  *
@@ -984,11 +986,6 @@ static noinline int
 spurious_kernel_fault(unsigned long error_code, unsigned long address)
 {
 	pgd_t *pgd;
-	p4d_t *p4d;
-	pud_t *pud;
-	pmd_t *pmd;
-	pte_t *pte;
-	int ret;
 
 	/*
 	 * Only writes to RO or instruction fetches from NX may cause
@@ -1004,6 +1001,50 @@ spurious_kernel_fault(unsigned long error_code, unsigned long address)
 		return 0;
 
 	pgd = init_mm.pgd + pgd_index(address);
+	return kernel_access_ok(error_code, address, pgd);
+}
+NOKPROBE_SYMBOL(spurious_kernel_fault);
+
+/*
+ * For kernel addresses, pte_present and pmd_present are sufficient for
+ * is_address_accessible. For user addresses these functions will return true
+ * even though the pte is not actually accessible by hardware (i.e _PAGE_PRESENT
+ * is not set). This happens in cases where the pages are physically present in
+ * memory, but they are not made accessible to hardware as they need software
+ * handling first:
+ *
+ * - ptes/pmds with _PAGE_PROTNONE need autonuma balancing (see pte_protnone(),
+ *   change_prot_numa(), and do_numa_page()).
+ *
+ * - pmds with _PAGE_PSE & !_PAGE_PRESENT are undergoing splitting (see
+ *   split_huge_page()).
+ *
+ * Here, we care about whether the hardware can actually access the page right
+ * now.
+ *
+ * These issues aren't currently present for PUD but we also have a custom
+ * PUD_PRESENT for a layer of future-proofing.
+ */
+#define PUD_PRESENT(pud) (pud_flags(pud) & _PAGE_PRESENT)
+#define PMD_PRESENT(pmd) (pmd_flags(pmd) & _PAGE_PRESENT)
+#define PTE_PRESENT(pte) (pte_flags(pte) & _PAGE_PRESENT)
+
+/*
+ * Check if an access by the kernel would cause a page fault. The access is
+ * described by a page fault error code (whether it was a write/instruction
+ * fetch) and address. This doesn't check for types of faults that are not
+ * expected to affect the kernel, e.g. PKU. The address can be user or kernel
+ * space, if user then we assume the access would happen via the uaccess API.
+ */
+static noinstr int
+kernel_access_ok(unsigned long error_code, unsigned long address, pgd_t *pgd)
+{
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+	int ret;
+
 	if (!pgd_present(*pgd))
 		return 0;
 
@@ -1012,27 +1053,27 @@ spurious_kernel_fault(unsigned long error_code, unsigned long address)
 		return 0;
 
 	if (p4d_leaf(*p4d))
-		return spurious_kernel_fault_check(error_code, (pte_t *) p4d);
+		return kernel_protection_ok(error_code, (pte_t *) p4d);
 
 	pud = pud_offset(p4d, address);
-	if (!pud_present(*pud))
+	if (!PUD_PRESENT(*pud))
 		return 0;
 
 	if (pud_leaf(*pud))
-		return spurious_kernel_fault_check(error_code, (pte_t *) pud);
+		return kernel_protection_ok(error_code, (pte_t *) pud);
 
 	pmd = pmd_offset(pud, address);
-	if (!pmd_present(*pmd))
+	if (!PMD_PRESENT(*pmd))
 		return 0;
 
 	if (pmd_leaf(*pmd))
-		return spurious_kernel_fault_check(error_code, (pte_t *) pmd);
+		return kernel_protection_ok(error_code, (pte_t *) pmd);
 
 	pte = pte_offset_kernel(pmd, address);
-	if (!pte_present(*pte))
+	if (!PTE_PRESENT(*pte))
 		return 0;
 
-	ret = spurious_kernel_fault_check(error_code, pte);
+	ret = kernel_protection_ok(error_code, pte);
 	if (!ret)
 		return 0;
 
@@ -1040,12 +1081,11 @@ spurious_kernel_fault(unsigned long error_code, unsigned long address)
 	 * Make sure we have permissions in PMD.
 	 * If not, then there's a bug in the page tables:
 	 */
-	ret = spurious_kernel_fault_check(error_code, (pte_t *) pmd);
+	ret = kernel_protection_ok(error_code, (pte_t *) pmd);
 	WARN_ONCE(!ret, "PMD has incorrect permission bits\n");
 
 	return ret;
 }
-NOKPROBE_SYMBOL(spurious_kernel_fault);
 
 int show_unhandled_signals = 1;
 
@@ -1490,6 +1530,29 @@ handle_page_fault(struct pt_regs *regs, unsigned long error_code,
 	}
 }
 
+static __always_inline void warn_if_bad_asi_pf(
+	unsigned long error_code, unsigned long address)
+{
+#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
+	struct asi *target;
+
+	/*
+	 * It's a bug to access sensitive data from the "critical section", i.e.
+	 * on the path between asi_enter and asi_relax, where untrusted code
+	 * gets run. #PF in this state sees asi_intr_nest_depth() as 1 because
+	 * #PF increments it. We can't think of a better way to determine if
+	 * this has happened than to check the ASI pagetables, hence we can't
+	 * really have this check in non-debug builds unfortunately.
+	 */
+	VM_WARN_ONCE(
+		(target = asi_get_target(current)) != NULL &&
+		asi_intr_nest_depth() == 1 &&
+		!kernel_access_ok(error_code, address, asi_pgd(target)),
+		"ASI-sensitive data access from critical section, addr=%px error_code=%lx class=%s",
+		(void *) address, error_code, asi_class_name(target->class_id));
+#endif
+}
+
 DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault)
 {
 	irqentry_state_t state;
@@ -1497,6 +1560,31 @@ DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault)
 	address = cpu_feature_enabled(X86_FEATURE_FRED) ? fred_event_data(regs)
 							: read_cr2();
 
+	if (static_asi_enabled() && !user_mode(regs)) {
+		pgd_t *pgd;
+
+		/* Can be a NOP even for ASI faults, because of NMIs */
+		asi_exit();
+
+		/*
+		 * handle_page_fault() might oops if we run it for a kernel
+		 * address in kernel mode. This might be the case if we got here
+		 * due to an ASI fault. We avoid this case by checking whether
+		 * the address is now, after asi_exit(), accessible by hardware.
+		 * If it is - there's nothing to do. Note that this is a bit of
+		 * a shotgun; we can also bail early from user-address faults
+		 * here that weren't actually caused by ASI. So we might wanna
+		 * move this logic later in the handler. In particular, we might
+		 * be losing some stats here. However for now this keeps ASI
+		 * page faults nice and fast.
+		 */
+		pgd = (pgd_t *)__va(read_cr3_pa()) + pgd_index(address);
+		if (!user_mode(regs) && kernel_access_ok(error_code, address, pgd)) {
+			warn_if_bad_asi_pf(error_code, address);
+			return;
+		}
+	}
+
 	prefetchw(&current->mm->mmap_lock);
 
 	/*

From patchwork Fri Jan 10 18:40:37 2025
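The distinction the patch draws between pte_present() and its new PTE_PRESENT() comes down to which flag bits are consulted. A userspace sketch of the two checks follows; the flag values mirror x86's bit positions (`_PAGE_PROTNONE` reuses the global bit), but `sw_present`/`hw_present` are illustrative names, not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 PTE flag bit positions; _PAGE_PROTNONE reuses the global bit (8). */
#define F_PAGE_PRESENT  ((uint64_t)1 << 0)
#define F_PAGE_PROTNONE ((uint64_t)1 << 8)

/*
 * pte_present()-style check for user pages: a PROT_NONE page counts as
 * present because software considers it resident, even though hardware
 * would fault on any access to it.
 */
static bool sw_present(uint64_t flags)
{
	return flags & (F_PAGE_PRESENT | F_PAGE_PROTNONE);
}

/*
 * PTE_PRESENT()-style check from the patch: true only when hardware
 * could actually complete the access right now.
 */
static bool hw_present(uint64_t flags)
{
	return flags & F_PAGE_PRESENT;
}
```

A NUMA-balancing PROT_NONE page is exactly the case where the two disagree, which is why the fault path must use the hardware-level check.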
Date: Fri, 10 Jan 2025 18:40:37 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-11-8419288bc805@google.com>
Subject: [PATCH RFC v2 11/29] mm: asi: Functions to map/unmap a memory range into ASI page tables
From: Brendan Jackman
From: Junaid Shahid

Two functions, asi_map() and
asi_map_gfp(), are added to allow mapping memory into ASI page tables.
The mapping will be identical to the one for the same virtual address in
the unrestricted page tables. This is necessary to allow switching
between the page tables at any arbitrary point in the kernel.

Another function, asi_unmap(), is added to allow unmapping memory mapped
via asi_map*.

RFC Notes: Don't read too much into the implementation of this, lots of
it should probably be rewritten. It also needs to gain support for
partial unmappings.

Checkpatch-args: --ignore=MACRO_ARG_UNUSED
Signed-off-by: Junaid Shahid
Signed-off-by: Brendan Jackman
Signed-off-by: Kevin Cheng
---
 arch/x86/include/asm/asi.h |   5 +
 arch/x86/mm/asi.c          | 236 ++++++++++++++++++++++++++++++++++++++++++++-
 arch/x86/mm/tlb.c          |   5 +
 include/asm-generic/asi.h  |  11 +++
 include/linux/pgtable.h    |   3 +
 mm/internal.h              |   2 +
 mm/vmalloc.c               |  32 +++---
 7 files changed, 280 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h
index a55e73f1b2bc84c41b9ab25f642a4d5f1aa6ba90..33f18be0e268b3a6725196619cbb8d847c21e197 100644
--- a/arch/x86/include/asm/asi.h
+++ b/arch/x86/include/asm/asi.h
@@ -157,6 +157,11 @@ void asi_relax(void);
 /* Immediately exit the restricted address space if in it */
 void asi_exit(void);
 
+int asi_map_gfp(struct asi *asi, void *addr, size_t len, gfp_t gfp_flags);
+int asi_map(struct asi *asi, void *addr, size_t len);
+void asi_unmap(struct asi *asi, void *addr, size_t len);
+void asi_flush_tlb_range(struct asi *asi, void *addr, size_t len);
+
 static inline void asi_init_thread_state(struct thread_struct *thread)
 {
 	thread->asi_state.intr_nest_depth = 0;
diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c
index b15d043acedc9f459f17e86564a15061650afc3a..f2d8fbc0366c289891903e1c2ac6c59b9476d95f 100644
--- a/arch/x86/mm/asi.c
+++ b/arch/x86/mm/asi.c
@@ -11,6 +11,9 @@
 #include
 #include
 #include
+#include
+
+#include "../../../mm/internal.h"
 
 static struct asi_taint_policy
*taint_policies[ASI_MAX_NUM_CLASSES]; @@ -100,7 +103,6 @@ const char *asi_class_name(enum asi_class_id class_id) */ static_assert(!IS_ENABLED(CONFIG_PARAVIRT)); #define DEFINE_ASI_PGTBL_ALLOC(base, level) \ -__maybe_unused \ static level##_t * asi_##level##_alloc(struct asi *asi, \ base##_t *base, ulong addr, \ gfp_t flags) \ @@ -455,3 +457,235 @@ void asi_handle_switch_mm(void) this_cpu_or(asi_taints, new_taints); this_cpu_and(asi_taints, ~(ASI_TAINTS_GUEST_MASK | ASI_TAINTS_USER_MASK)); } + +static bool is_page_within_range(unsigned long addr, unsigned long page_size, + unsigned long range_start, unsigned long range_end) +{ + unsigned long page_start = ALIGN_DOWN(addr, page_size); + unsigned long page_end = page_start + page_size; + + return page_start >= range_start && page_end <= range_end; +} + +static bool follow_physaddr( + pgd_t *pgd_table, unsigned long virt, + phys_addr_t *phys, unsigned long *page_size, ulong *flags) +{ + pgd_t *pgd; + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; + pte_t *pte; + + /* RFC: This should be rewritten with lookup_address_in_*. 
*/ + + *page_size = PGDIR_SIZE; + pgd = pgd_offset_pgd(pgd_table, virt); + if (!pgd_present(*pgd)) + return false; + if (pgd_leaf(*pgd)) { + *phys = PFN_PHYS(pgd_pfn(*pgd)) | (virt & ~PGDIR_MASK); + *flags = pgd_flags(*pgd); + return true; + } + + *page_size = P4D_SIZE; + p4d = p4d_offset(pgd, virt); + if (!p4d_present(*p4d)) + return false; + if (p4d_leaf(*p4d)) { + *phys = PFN_PHYS(p4d_pfn(*p4d)) | (virt & ~P4D_MASK); + *flags = p4d_flags(*p4d); + return true; + } + + *page_size = PUD_SIZE; + pud = pud_offset(p4d, virt); + if (!pud_present(*pud)) + return false; + if (pud_leaf(*pud)) { + *phys = PFN_PHYS(pud_pfn(*pud)) | (virt & ~PUD_MASK); + *flags = pud_flags(*pud); + return true; + } + + *page_size = PMD_SIZE; + pmd = pmd_offset(pud, virt); + if (!pmd_present(*pmd)) + return false; + if (pmd_leaf(*pmd)) { + *phys = PFN_PHYS(pmd_pfn(*pmd)) | (virt & ~PMD_MASK); + *flags = pmd_flags(*pmd); + return true; + } + + *page_size = PAGE_SIZE; + pte = pte_offset_map(pmd, virt); + if (!pte) + return false; + + if (!pte_present(*pte)) { + pte_unmap(pte); + return false; + } + + *phys = PFN_PHYS(pte_pfn(*pte)) | (virt & ~PAGE_MASK); + *flags = pte_flags(*pte); + + pte_unmap(pte); + return true; +} + +/* + * Map the given range into the ASI page tables. The source of the mapping is + * the regular unrestricted page tables. Can be used to map any kernel memory. + * + * The caller MUST ensure that the source mapping will not change during this + * function. For dynamic kernel memory, this is generally ensured by mapping the + * memory within the allocator. + * + * If this fails, it may leave partial mappings behind. You must asi_unmap them, + * bearing in mind asi_unmap's requirements on the calling context. Part of the + * reason for this is that we don't want to unexpectedly undo mappings that + * weren't created by the present caller. 
+ * + * If the source mapping is a large page and the range being mapped spans the + * entire large page, then it will be mapped as a large page in the ASI page + * tables too. If the range does not span the entire huge page, then it will be + * mapped as smaller pages. In that case, the implementation is slightly + * inefficient, as it will walk the source page tables again for each small + * destination page, but that should be ok for now, as usually in such cases, + * the range would consist of a small-ish number of pages. + * + * RFC: * vmap_p4d_range supports huge mappings, we can probably use that now. + */ +int __must_check asi_map_gfp(struct asi *asi, void *addr, unsigned long len, gfp_t gfp_flags) +{ + unsigned long virt; + unsigned long start = (size_t)addr; + unsigned long end = start + len; + unsigned long page_size; + + if (!static_asi_enabled()) + return 0; + + VM_BUG_ON(!IS_ALIGNED(start, PAGE_SIZE)); + VM_BUG_ON(!IS_ALIGNED(len, PAGE_SIZE)); + /* RFC: fault_in_kernel_space should be renamed. 
*/ + VM_BUG_ON(!fault_in_kernel_space(start)); + + gfp_flags &= GFP_RECLAIM_MASK; + + if (asi->mm != &init_mm) + gfp_flags |= __GFP_ACCOUNT; + + for (virt = start; virt < end; virt = ALIGN(virt + 1, page_size)) { + pgd_t *pgd; + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; + pte_t *pte; + phys_addr_t phys; + ulong flags; + + if (!follow_physaddr(asi->mm->pgd, virt, &phys, &page_size, &flags)) + continue; + +#define MAP_AT_LEVEL(base, BASE, level, LEVEL) { \ + if (base##_leaf(*base)) { \ + if (WARN_ON_ONCE(PHYS_PFN(phys & BASE##_MASK) !=\ + base##_pfn(*base))) \ + return -EBUSY; \ + continue; \ + } \ + \ + level = asi_##level##_alloc(asi, base, virt, gfp_flags);\ + if (!level) \ + return -ENOMEM; \ + \ + if (page_size >= LEVEL##_SIZE && \ + (level##_none(*level) || level##_leaf(*level)) && \ + is_page_within_range(virt, LEVEL##_SIZE, \ + start, end)) { \ + page_size = LEVEL##_SIZE; \ + phys &= LEVEL##_MASK; \ + \ + if (!level##_none(*level)) { \ + if (WARN_ON_ONCE(level##_pfn(*level) != \ + PHYS_PFN(phys))) { \ + return -EBUSY; \ + } \ + } else { \ + set_##level(level, \ + __##level(phys | flags)); \ + } \ + continue; \ + } \ + } + + pgd = pgd_offset_pgd(asi->pgd, virt); + + MAP_AT_LEVEL(pgd, PGDIR, p4d, P4D); + MAP_AT_LEVEL(p4d, P4D, pud, PUD); + MAP_AT_LEVEL(pud, PUD, pmd, PMD); + /* + * If a large page is going to be partially mapped + * in 4k pages, convert the PSE/PAT bits. + */ + if (page_size >= PMD_SIZE) + flags = protval_large_2_4k(flags); + MAP_AT_LEVEL(pmd, PMD, pte, PAGE); + + VM_BUG_ON(true); /* Should never reach here. */ + } + + return 0; +#undef MAP_AT_LEVEL +} + +int __must_check asi_map(struct asi *asi, void *addr, unsigned long len) +{ + return asi_map_gfp(asi, addr, len, GFP_KERNEL); +} + +/* + * Unmap a kernel address range previously mapped into the ASI page tables. + * + * The area being unmapped must be a whole previously mapped region (or regions) + * Unmapping a partial subset of a previously mapped region is not supported. 
+ * That will work, but may end up unmapping more than what was asked for, if + * the mapping contained huge pages. A later patch will remove this limitation + * by splitting the huge mapping in the ASI page table in such a case. For now, + * vunmap_pgd_range() will just emit a warning if this situation is detected. + * + * This might sleep, and cannot be called with interrupts disabled. + */ +void asi_unmap(struct asi *asi, void *addr, size_t len) +{ + size_t start = (size_t)addr; + size_t end = start + len; + pgtbl_mod_mask mask = 0; + + if (!static_asi_enabled() || !len) + return; + + VM_BUG_ON(start & ~PAGE_MASK); + VM_BUG_ON(len & ~PAGE_MASK); + VM_BUG_ON(!fault_in_kernel_space(start)); /* Misnamed, ignore "fault_" */ + + vunmap_pgd_range(asi->pgd, start, end, &mask); + + /* We don't support partial unmappings. */ + if (mask & PGTBL_P4D_MODIFIED) { + VM_WARN_ON(!IS_ALIGNED((ulong)addr, P4D_SIZE)); + VM_WARN_ON(!IS_ALIGNED((ulong)len, P4D_SIZE)); + } else if (mask & PGTBL_PUD_MODIFIED) { + VM_WARN_ON(!IS_ALIGNED((ulong)addr, PUD_SIZE)); + VM_WARN_ON(!IS_ALIGNED((ulong)len, PUD_SIZE)); + } else if (mask & PGTBL_PMD_MODIFIED) { + VM_WARN_ON(!IS_ALIGNED((ulong)addr, PMD_SIZE)); + VM_WARN_ON(!IS_ALIGNED((ulong)len, PMD_SIZE)); + } + + asi_flush_tlb_range(asi, addr, len); +} diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index c41e083c5b5281684be79ad0391c1a5fc7b0c493..c55733e144c7538ce7f97b74ea2b1b9c22497c32 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -1040,6 +1040,11 @@ noinstr u16 asi_pcid(struct asi *asi, u16 asid) // return kern_pcid(asid) | ((asi->index + 1) << X86_CR3_ASI_PCID_BITS_SHIFT); } +void asi_flush_tlb_range(struct asi *asi, void *addr, size_t len) +{ + flush_tlb_kernel_range((ulong)addr, (ulong)addr + len); +} + #else /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */ u16 asi_pcid(struct asi *asi, u16 asid) { return kern_pcid(asid); } diff --git a/include/asm-generic/asi.h b/include/asm-generic/asi.h index 
f777a6cf604b0656fb39087f6eba08f980b2cb6f..5be8f7d657ba0bc2196e333f62b084d0c9eef7b6 100644 --- a/include/asm-generic/asi.h +++ b/include/asm-generic/asi.h @@ -77,6 +77,17 @@ static inline int asi_intr_nest_depth(void) { return 0; } static inline void asi_intr_exit(void) { } +static inline int asi_map(struct asi *asi, void *addr, size_t len) +{ + return 0; +} + +static inline +void asi_unmap(struct asi *asi, void *addr, size_t len) { } + +static inline +void asi_flush_tlb_range(struct asi *asi, void *addr, size_t len) { } + #define static_asi_enabled() false static inline void asi_check_boottime_disable(void) { } diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index e8b2ac6bd2ae3b0a768734c8411f45a7d162e12d..492a9cdee7ff3d4e562c4bf508dc14fd7fa67e36 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1900,6 +1900,9 @@ typedef unsigned int pgtbl_mod_mask; #ifndef pmd_leaf #define pmd_leaf(x) false #endif +#ifndef pte_leaf +#define pte_leaf(x) 1 +#endif #ifndef pgd_leaf_size #define pgd_leaf_size(x) (1ULL << PGDIR_SHIFT) diff --git a/mm/internal.h b/mm/internal.h index 64c2eb0b160e169ab9134e3ab618d8a1d552d92c..c0454fe019b9078a963b1ab3685bf31ccfd768b7 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -395,6 +395,8 @@ void unmap_page_range(struct mmu_gather *tlb, void page_cache_ra_order(struct readahead_control *, struct file_ra_state *, unsigned int order); void force_page_cache_ra(struct readahead_control *, unsigned long nr); +void vunmap_pgd_range(pgd_t *pgd_table, unsigned long addr, unsigned long end, + pgtbl_mod_mask *mask); static inline void force_page_cache_readahead(struct address_space *mapping, struct file *file, pgoff_t index, unsigned long nr_to_read) { diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 634162271c0045965eabd9bfe8b64f4a1135576c..8d260f2174fe664b54dcda054cb9759ae282bf03 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -427,6 +427,24 @@ static void vunmap_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end, 
} while (p4d++, addr = next, addr != end); } +void vunmap_pgd_range(pgd_t *pgd_table, unsigned long addr, unsigned long end, + pgtbl_mod_mask *mask) +{ + unsigned long next; + pgd_t *pgd = pgd_offset_pgd(pgd_table, addr); + + BUG_ON(addr >= end); + + do { + next = pgd_addr_end(addr, end); + if (pgd_bad(*pgd)) + *mask |= PGTBL_PGD_MODIFIED; + if (pgd_none_or_clear_bad(pgd)) + continue; + vunmap_p4d_range(pgd, addr, next, mask); + } while (pgd++, addr = next, addr != end); +} + /* * vunmap_range_noflush is similar to vunmap_range, but does not * flush caches or TLBs. @@ -441,21 +459,9 @@ static void vunmap_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end, */ void __vunmap_range_noflush(unsigned long start, unsigned long end) { - unsigned long next; - pgd_t *pgd; - unsigned long addr = start; pgtbl_mod_mask mask = 0; - BUG_ON(addr >= end); - pgd = pgd_offset_k(addr); - do { - next = pgd_addr_end(addr, end); - if (pgd_bad(*pgd)) - mask |= PGTBL_PGD_MODIFIED; - if (pgd_none_or_clear_bad(pgd)) - continue; - vunmap_p4d_range(pgd, addr, next, &mask); - } while (pgd++, addr = next, addr != end); + vunmap_pgd_range(init_mm.pgd, start, end, &mask); if (mask & ARCH_PAGE_TABLE_SYNC_MASK) arch_sync_kernel_mappings(start, end); From patchwork Fri Jan 10 18:40:38 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brendan Jackman X-Patchwork-Id: 13935241 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0CDD3E77188 for ; Fri, 10 Jan 2025 18:41:24 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 11A346B00CC; Fri, 10 Jan 2025 13:41:17 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 024AB6B00CD; Fri, 10 Jan 2025 13:41:16 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by 
kanga.kvack.org (Postfix, from userid 63042) id D6EEE6B00CE; Fri, 10 Jan 2025 13:41:16 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id B6FA66B00CC for ; Fri, 10 Jan 2025 13:41:16 -0500 (EST) Received: from smtpin12.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 60482120DDE for ; Fri, 10 Jan 2025 18:41:16 +0000 (UTC) X-FDA: 82992409752.12.2589D9B Received: from mail-wm1-f74.google.com (mail-wm1-f74.google.com [209.85.128.74]) by imf03.hostedemail.com (Postfix) with ESMTP id 63D702000C for ; Fri, 10 Jan 2025 18:41:14 +0000 (UTC) Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=Q+Pjg+r2; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf03.hostedemail.com: domain of 3yGmBZwgKCOMOFHPRFSGLTTLQJ.HTRQNSZc-RRPaFHP.TWL@flex--jackmanb.bounces.google.com designates 209.85.128.74 as permitted sender) smtp.mailfrom=3yGmBZwgKCOMOFHPRFSGLTTLQJ.HTRQNSZc-RRPaFHP.TWL@flex--jackmanb.bounces.google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1736534474; a=rsa-sha256; cv=none; b=DFLW4rQqTnoVL+DeGz3jU4tbHemLTYQWdfIGcOzX7ojV5UN6DGmmQyfI4yeIpU73WFNTSm Cw5gqKRw1YYrEUG70o2NmK3/VyaSNTx1ROA6owZKs5RKWJCi5nXKEvMOlevgfSUfQMaljA 7zWUzEBmg2MjmkjqpXbFKoUp9qjyoTg= ARC-Authentication-Results: i=1; imf03.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=Q+Pjg+r2; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf03.hostedemail.com: domain of 3yGmBZwgKCOMOFHPRFSGLTTLQJ.HTRQNSZc-RRPaFHP.TWL@flex--jackmanb.bounces.google.com designates 209.85.128.74 as permitted sender) smtp.mailfrom=3yGmBZwgKCOMOFHPRFSGLTTLQJ.HTRQNSZc-RRPaFHP.TWL@flex--jackmanb.bounces.google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1736534474; 
h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=UzXM9zi9X6couWShNCb4EkjJb05AI+Oo7qUi6Mvjsww=; b=nWCDUDo9Rz7oy6hG76oPdkKQbbHUoLtWQ+jG6WzZYmiCmuVN+/YzocFeK/lxY8DccC/hrW T5tGO63wx+s1oFfbqOKoFqfoeaHD4dcspRc6YxnBM/IuyNv0xgFuREXE5iAZSdC83Z1xkE RO0U1GqaXwlyPAE/HHz4cTM/Tpjonjg= Received: by mail-wm1-f74.google.com with SMTP id 5b1f17b1804b1-43624b08181so12084445e9.0 for ; Fri, 10 Jan 2025 10:41:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1736534473; x=1737139273; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=UzXM9zi9X6couWShNCb4EkjJb05AI+Oo7qUi6Mvjsww=; b=Q+Pjg+r2zEtZVY21124bQ4M8fkfs3ACL47HgYdBynh2vc2QaNjEOLVlCISRvwPD2j+ oDIRDA+Es3WhYXQdV5Qvw+Niuz/fELVfpRpzoOZZCHavhM6C/9inap6lorvWr3X8KQZp ucix+8YU7Mo/3lbV3OCu5Ub1WHWEl2UcosaUBsJX+ScbPdy8PgxDVAit+Bl0xp6beYul nmX/Cm+e+czaG6onB300yabKYLVZxAxSjEhitttR4yJMcU7IbmNsXpVLvT2HZiofBQUw ftQMGyCQaZDyHNx2sqq6RDlXScsMrGHb1ijvgVtvg+C5Ez0D8UEm7m66sHWd8G7Nu1JD hTAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1736534473; x=1737139273; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=UzXM9zi9X6couWShNCb4EkjJb05AI+Oo7qUi6Mvjsww=; b=b6fczufyKqd0uWfZY4K4a2L6nfmB78g0CN1qfEEcCSrCK1IAiiCNxY7kYNC3OUl8XX 7V0n6CU2O1nuTv4SQQbb4902i0ohnQgYSpjDMSKItgRm1/tW3nf7c1T7Ugh9RDZjsahw 3Nantxz/Sje2nAwzfgp1Bcab4g485aKqWVNQVFjdLczmUDtZQUxhlmDug5XZ1lvc/j/R wIZGW6BYU8waiOWl+EvqRODPIwNoWMmZjAFCvjbMBSO/9emtZ4nkA+CtillXoD+6irXL UV5qeYlc7MKQv7u6IfsoN8sR6+fRxrRsngxiV/lC1ZFYFTAOO0o0ayKD9FxImnjJOAL+ NJYw== X-Forwarded-Encrypted: i=1; 
AJvYcCXesWQ4o2dG4JLicr67Dz0xcFQYB7t9YkU0wnjWt9/nTgAnT1aIirZmKwIKdbhMw791Nok2v2oESA==@kvack.org X-Gm-Message-State: AOJu0YwM6eBtUDqPpsIUygZ7H3Os4auJpjXrOd4WNbwedOrKbvDQURTB Gb+jzk87//iJjGGsR5EnNHcpCO1CdHm08wKB4q8bvsUxyEEFAYqVlLWlsk+puy6bWDM9HVBKd/S xPtXweFd+Cw== X-Google-Smtp-Source: AGHT+IFDRv4zNENk63KYcLxYmQdIUcLs7zTtS/Sd8xMbR3tJqcykEbsngD0F8tGcq40UK8q2XIPTmnyPYnp+hA== X-Received: from wmqe5.prod.google.com ([2002:a05:600c:4e45:b0:435:21e:7bec]) (user=jackmanb job=prod-delivery.src-stubby-dispatcher) by 2002:a05:600c:1d2a:b0:435:edb0:5d27 with SMTP id 5b1f17b1804b1-436e8827fbcmr75837805e9.9.1736534472863; Fri, 10 Jan 2025 10:41:12 -0800 (PST) Date: Fri, 10 Jan 2025 18:40:38 +0000 In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> Mime-Version: 1.0 References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> X-Mailer: b4 0.15-dev Message-ID: <20250110-asi-rfc-v2-v2-12-8419288bc805@google.com> Subject: [PATCH RFC v2 12/29] mm: asi: Add basic infrastructure for global non-sensitive mappings From: Brendan Jackman To: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Richard Henderson , Matt Turner , Vineet Gupta , Russell King , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Huacai Chen , WANG Xuerui , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Jonas Bonn , Stefan Kristiansson , Stafford Horne , "James E.J. Bottomley" , Helge Deller , Michael Ellerman , Nicholas Piggin , Christophe Leroy , Naveen N Rao , Madhavan Srinivasan , Paul Walmsley , Palmer Dabbelt , Albert Ou , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Yoshinori Sato , Rich Felker , John Paul Adrian Glaubitz , "David S. 
Miller" , Andreas Larsson , Richard Weinberger , Anton Ivanov , Johannes Berg , Chris Zankel , Max Filippov , Arnd Bergmann , Andrew Morton , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Uladzislau Rezki , Christoph Hellwig , Masami Hiramatsu , Mathieu Desnoyers , Mike Rapoport , Arnaldo Carvalho de Melo , Namhyung Kim , Mark Rutland , Alexander Shishkin , Jiri Olsa , Ian Rogers , Adrian Hunter , Dennis Zhou , Tejun Heo , Christoph Lameter , Sean Christopherson , Paolo Bonzini , Ard Biesheuvel , Josh Poimboeuf , Pawan Gupta Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, loongarch@lists.linux.dev, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-um@lists.infradead.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, kvm@vger.kernel.org, linux-efi@vger.kernel.org, Brendan Jackman , Junaid Shahid X-Stat-Signature: 9yamd54c96i8imok4cy3yenh3o7g1np9 X-Rspam-User: X-Rspamd-Queue-Id: 63D702000C X-Rspamd-Server: rspam08 X-HE-Tag: 1736534474-498965 X-HE-Meta: 
U2FsdGVkX1/uhKExAJ8HSOxb8k9Ct9co+sD28UK79l7MB+obqy6PrMyWPxNUo7YUlD9al6bLoUlkc51WnNxFDhoCNEm3OGrVASqGqssFjBbcmNHnPHPWammlamwDXUIw3tFBOPSSBNE68g/b4atzyAqFHNHSTVqJiPMYYCSfMkXbLUkAhu6/wivL8j9JU/4JwJ/ZV7DN2L6u/oiQpzuQLXiDaHGa0oNbt9cET/tqRtVGrJMSqZSF3Z2VOSU7hQXiLjN6+uowpBEbS1jymZAMwXVnEzKztXIS/ml5LFQxBv86D+zjrk3+WT5FVm/ZCVBi/U4a7FFdRZOLlKqtPo23Wcvoi4oAqaQiJEElIG08L0FEfngPgL43TsMDEJFCfr72RoJwBua+C9YxMk1wh2Ep61xRB65rQ3HNplmANbTeMlsW9rgLQtMHDc5gl09mp9e/VXG6sUGd/e7/jeB0xEmyuITVBQ1GhjWSDL8NtDng71icVl2FIn2H1a1u6PH7UmNkB6hu0Az6hkcB5DbGLgjHatJ+6dLb/fP+kyxg/Lqdv9NL1OrLNch54sKGGEcgcQdzZigzcAVGwese7eO8LWZ77t1+DU1b2g8hCdc/md24vRmjSW9lmeeFiTtr8BE8MKr3R3SOYe/+OVey2aLiiTQlF/Fv4Ej8skrc2q2dS2uUdZDhGisKVAyBhNed4Mx7C0iamIFyn7lbblcNoOvohZHTMc3OmBW7PBGN/lsH2YZOnjB187qA4AKT4YZUDyXmdAYOmKYNAW0clh1j0S3u690aEOR0qs6Dwt7RB9mJ5Wb5DIPAPoSrJmZ5z6/us1YtdL+JznxCm5t5K0wtg2rKAgrFe+zXlTEHJc1oHrqWpxDwaYgOoimRKPHQ4Zan9M5OiFw6Vqg+BFUmtcGtst0J6IpoJ+DMc5dkuMD0Irc3TsUSIMHfSv3V8PjKQboIizX08j8vfzfF1QzooQ7Eby4J4iq PS3jDkLP xVXBTenrXWpd+6OCK6rLWCFA/UC4QpP/LN2pp5RB+j6tE8ZEIvbPVM8yMhoh997JYuwxA75gAONORbBjinT9R1JVvB1xvbGafb2SrCjbxrfKhqVilg3ngVxGxj3TX569Hwr+HdwALcRH8yjAMx5RM1nOV89a1V2qFWET7XJToTfdU6sCm0jkdWZuk10sfWF2FbPY0NqaS/guGGTKIjPavg465ysR8X9RA0I+Azu2sEFcWqmbflAfK8uA1ucA4puLCGJmPy1HNa51ghJfTx+cEUqa1BK9Tdlb08Niqe2EZWm5sCdPKj5mGg7BF2uCCw/iW9sUnF+03n7LV7G2I/48ppf74mNvVDpupj2HbgMLFDoS5zMdPnKPp7XHnOkOjG47eKjkV+uTCBn7d/P7/C44Y5lvgNPunEsk7QEYY6o/axlKRvaLP73plhq7TF+ofCWK30U/67uyaIz3qF/DLudZQq5fDbRMUQABxBJYW9oAf5C4YpaWjk/ggJoogvm+lqTrtw6qCqoCJ4pi/8AV941yuisdT5haaogaaV2NCF4YAO/Myjy+6/YjZblsB8A== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: Junaid Shahid A pseudo-PGD is added to store global non-sensitive ASI mappings. Actual ASI PGDs copy entries from this pseudo-PGD during asi_init(). 
Memory can be mapped as globally non-sensitive by calling asi_map() with ASI_GLOBAL_NONSENSITIVE. Page tables allocated for global non-sensitive mappings are never freed. These page tables are shared between all domains and init_mm, so they don't need special synchronization. RFC note: A refactoring/prep commit should be split out of this patch. Signed-off-by: Junaid Shahid Signed-off-by: Brendan Jackman --- arch/x86/include/asm/asi.h | 3 +++ arch/x86/mm/asi.c | 37 +++++++++++++++++++++++++++++++++++++ arch/x86/mm/init_64.c | 25 ++++++++++++++++--------- arch/x86/mm/mm_internal.h | 3 +++ include/asm-generic/asi.h | 2 ++ 5 files changed, 61 insertions(+), 9 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index 33f18be0e268b3a6725196619cbb8d847c21e197..555edb5f292e4d6baba782f51d014aa48dc850b6 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -120,6 +120,9 @@ struct asi_taint_policy { asi_taints_t set; }; +extern struct asi __asi_global_nonsensitive; +#define ASI_GLOBAL_NONSENSITIVE (&__asi_global_nonsensitive) + /* * An ASI domain (struct asi) represents a restricted address space. 
The * unrestricted address space (and user address space under PTI) are not diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index f2d8fbc0366c289891903e1c2ac6c59b9476d95f..17391ec8b22e3c0903cd5ca29cbb03fcc4cbacce 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -13,6 +13,7 @@ #include #include +#include "mm_internal.h" #include "../../../mm/internal.h" static struct asi_taint_policy *taint_policies[ASI_MAX_NUM_CLASSES]; @@ -26,6 +27,13 @@ const char *asi_class_names[] = { DEFINE_PER_CPU_ALIGNED(struct asi *, curr_asi); EXPORT_SYMBOL(curr_asi); +static __aligned(PAGE_SIZE) pgd_t asi_global_nonsensitive_pgd[PTRS_PER_PGD]; + +struct asi __asi_global_nonsensitive = { + .pgd = asi_global_nonsensitive_pgd, + .mm = &init_mm, +}; + static inline bool asi_class_id_valid(enum asi_class_id class_id) { return class_id >= 0 && class_id < ASI_MAX_NUM_CLASSES; @@ -156,6 +164,31 @@ void __init asi_check_boottime_disable(void) pr_info("ASI enablement ignored due to incomplete implementation.\n"); } +static int __init asi_global_init(void) +{ + if (!boot_cpu_has(X86_FEATURE_ASI)) + return 0; + + /* + * Lower-level pagetables for global nonsensitive mappings are shared, + * but the PGD has to be copied into each domain during asi_init. To + * avoid needing to synchronize new mappings into pre-existing domains + * we just pre-allocate all of the relevant level N-1 entries so that + * the global nonsensitive PGD already has pointers that can be copied + * when new domains get asi_init()ed. 
+ */ + preallocate_sub_pgd_pages(asi_global_nonsensitive_pgd, + PAGE_OFFSET, + PAGE_OFFSET + PFN_PHYS(max_pfn) - 1, + "ASI Global Non-sensitive direct map"); + preallocate_sub_pgd_pages(asi_global_nonsensitive_pgd, + VMALLOC_START, VMALLOC_END, + "ASI Global Non-sensitive vmalloc"); + + return 0; +} +subsys_initcall(asi_global_init) + static void __asi_destroy(struct asi *asi) { WARN_ON_ONCE(asi->ref_count <= 0); @@ -170,6 +203,7 @@ int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi **out_ { struct asi *asi; int err = 0; + uint i; *out_asi = NULL; @@ -203,6 +237,9 @@ int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi **out_ asi->mm = mm; asi->class_id = class_id; + for (i = KERNEL_PGD_BOUNDARY; i < PTRS_PER_PGD; i++) + set_pgd(asi->pgd + i, asi_global_nonsensitive_pgd[i]); + exit_unlock: if (err) __asi_destroy(asi); diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index ff253648706fa9cd49169a54882014a72ad540cf..9d358a05c4e18ac6d5e115de111758ea6cdd37f2 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1288,18 +1288,15 @@ static void __init register_page_bootmem_info(void) #endif } -/* - * Pre-allocates page-table pages for the vmalloc area in the kernel page-table. - * Only the level which needs to be synchronized between all page-tables is - * allocated because the synchronization can be expensive. - */ -static void __init preallocate_vmalloc_pages(void) +/* Initialize empty pagetables at the level below PGD. 
*/ +void __init preallocate_sub_pgd_pages(pgd_t *pgd_table, ulong start, + ulong end, const char *name) { unsigned long addr; const char *lvl; - for (addr = VMALLOC_START; addr <= VMEMORY_END; addr = ALIGN(addr + 1, PGDIR_SIZE)) { - pgd_t *pgd = pgd_offset_k(addr); + for (addr = start; addr <= end; addr = ALIGN(addr + 1, PGDIR_SIZE)) { + pgd_t *pgd = pgd_offset_pgd(pgd_table, addr); p4d_t *p4d; pud_t *pud; @@ -1335,7 +1332,17 @@ static void __init preallocate_vmalloc_pages(void) * The pages have to be there now or they will be missing in * process page-tables later. */ - panic("Failed to pre-allocate %s pages for vmalloc area\n", lvl); + panic("Failed to pre-allocate %s pages for %s area\n", lvl, name); +} + +/* + * Pre-allocates page-table pages for the vmalloc area in the kernel page-table. + * Only the level which needs to be synchronized between all page-tables is + * allocated because the synchronization can be expensive. + */ +static void __init preallocate_vmalloc_pages(void) +{ + preallocate_sub_pgd_pages(init_mm.pgd, VMALLOC_START, VMEMORY_END, "vmalloc"); } void __init mem_init(void) diff --git a/arch/x86/mm/mm_internal.h b/arch/x86/mm/mm_internal.h index 3f37b5c80bb32ff34656a20789449da92e853eb6..1203a977edcd523589ad88a37aab01398a10a129 100644 --- a/arch/x86/mm/mm_internal.h +++ b/arch/x86/mm/mm_internal.h @@ -25,4 +25,7 @@ void update_cache_mode_entry(unsigned entry, enum page_cache_mode cache); extern unsigned long tlb_single_page_flush_ceiling; +extern void preallocate_sub_pgd_pages(pgd_t *pgd_table, ulong start, + ulong end, const char *name); + #endif /* __X86_MM_INTERNAL_H */ diff --git a/include/asm-generic/asi.h b/include/asm-generic/asi.h index 5be8f7d657ba0bc2196e333f62b084d0c9eef7b6..7867b8c23449058a1dd06308ab5351e0d210a489 100644 --- a/include/asm-generic/asi.h +++ b/include/asm-generic/asi.h @@ -23,6 +23,8 @@ typedef u8 asi_taints_t; #ifndef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION +#define ASI_GLOBAL_NONSENSITIVE NULL + struct asi_hooks {}; 
struct asi {}; From patchwork Fri Jan 10 18:40:39 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brendan Jackman X-Patchwork-Id: 13935242 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D9AB4E7719C for ; Fri, 10 Jan 2025 18:41:26 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 027556B00CF; Fri, 10 Jan 2025 13:41:19 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id F19DE6B00D0; Fri, 10 Jan 2025 13:41:18 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CFC306B00D1; Fri, 10 Jan 2025 13:41:18 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id AFFA86B00CF for ; Fri, 10 Jan 2025 13:41:18 -0500 (EST) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 76AC2C0DB2 for ; Fri, 10 Jan 2025 18:41:18 +0000 (UTC) X-FDA: 82992409836.28.D49F9DD Received: from mail-wm1-f73.google.com (mail-wm1-f73.google.com [209.85.128.73]) by imf30.hostedemail.com (Postfix) with ESMTP id 9C9C380023 for ; Fri, 10 Jan 2025 18:41:16 +0000 (UTC) Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=IuS+C7VL; spf=pass (imf30.hostedemail.com: domain of 3ymmBZwgKCOUQHJRTHUINVVNSL.JVTSPUbe-TTRcHJR.VYN@flex--jackmanb.bounces.google.com designates 209.85.128.73 as permitted sender) smtp.mailfrom=3ymmBZwgKCOUQHJRTHUINVVNSL.JVTSPUbe-TTRcHJR.VYN@flex--jackmanb.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; 
Date: Fri, 10 Jan 2025 18:40:39 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-13-8419288bc805@google.com>
Subject: [PATCH RFC v2 13/29] mm: Add __PAGEFLAG_FALSE
From: Brendan Jackman
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Richard Henderson,
 Matt Turner, Vineet Gupta, Russell King, Catalin Marinas, Will Deacon,
 Guo Ren, Brian Cain, Huacai Chen, WANG Xuerui, Geert Uytterhoeven,
 Michal Simek, Thomas Bogendoerfer, Dinh Nguyen, Jonas Bonn,
 Stefan Kristiansson, Stafford Horne, "James E.J. Bottomley", Helge Deller,
 Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
 Madhavan Srinivasan, Paul Walmsley, Palmer Dabbelt, Albert Ou,
 Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger,
 Sven Schnelle, Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
 "David S. Miller", Andreas Larsson, Richard Weinberger, Anton Ivanov,
 Johannes Berg, Chris Zankel, Max Filippov, Arnd Bergmann, Andrew Morton,
 Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
 Mel Gorman, Valentin Schneider, Uladzislau Rezki, Christoph Hellwig,
 Masami Hiramatsu, Mathieu Desnoyers, Mike Rapoport,
 Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin,
 Jiri Olsa, Ian Rogers, Adrian Hunter, Dennis Zhou, Tejun Heo,
 Christoph Lameter, Sean Christopherson, Paolo Bonzini, Ard Biesheuvel,
 Josh Poimboeuf, Pawan Gupta
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org,
 linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org,
 linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org,
 loongarch@lists.linux.dev, linux-m68k@lists.linux-m68k.org,
 linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org,
 linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
 linux-sh@vger.kernel.org, sparclinux@vger.kernel.org,
 linux-um@lists.infradead.org, linux-arch@vger.kernel.org,
 linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
 linux-perf-users@vger.kernel.org, kvm@vger.kernel.org,
 linux-efi@vger.kernel.org, Brendan Jackman

__PAGEFLAG_FALSE is a non-atomic equivalent of
PAGEFLAG_FALSE.

Checkpatch-args: --ignore=COMPLEX_MACRO

Signed-off-by: Brendan Jackman
---
 include/linux/page-flags.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index cc839e4365c18223e68c35efd0f67e7650708e8b..7ee9a0edc6d21708fc93dfa8913dc1ae9478dee3 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -484,6 +484,10 @@ static inline int Page##uname(const struct page *page) { return 0; }
 	FOLIO_SET_FLAG_NOOP(lname)					\
 static inline void SetPage##uname(struct page *page) { }

+#define __SETPAGEFLAG_NOOP(uname, lname)				\
+static inline void __folio_set_##lname(struct folio *folio) { }		\
+static inline void __SetPage##uname(struct page *page) { }
+
 #define CLEARPAGEFLAG_NOOP(uname, lname)				\
 	FOLIO_CLEAR_FLAG_NOOP(lname)					\
 static inline void ClearPage##uname(struct page *page) { }
@@ -506,6 +510,9 @@ static inline int TestClearPage##uname(struct page *page) { return 0; }
 #define TESTSCFLAG_FALSE(uname, lname)					\
 	TESTSETFLAG_FALSE(uname, lname) TESTCLEARFLAG_FALSE(uname, lname)

+#define __PAGEFLAG_FALSE(uname, lname) TESTPAGEFLAG_FALSE(uname, lname)	\
+	__SETPAGEFLAG_NOOP(uname, lname) __CLEARPAGEFLAG_NOOP(uname, lname)
+
 __PAGEFLAG(Locked, locked, PF_NO_TAIL)
 FOLIO_FLAG(waiters, FOLIO_HEAD_PAGE)
 FOLIO_FLAG(referenced, FOLIO_HEAD_PAGE)
From patchwork Fri Jan 10 18:40:40 2025

Date: Fri, 10 Jan 2025 18:40:40 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-14-8419288bc805@google.com>
Subject: [PATCH RFC v2 14/29] mm: asi: Map non-user buddy allocations as nonsensitive
From: Brendan Jackman

This is just the simplest possible page_alloc patch I could come up with to
demonstrate ASI working in a "denylist" mode: we map the direct map into the
restricted address space, except pages allocated with GFP_USER.

Pages must be asi_unmap()'d before they can be re-allocated. This requires a
TLB flush, which can't generally be done from the free path (it requires IRQs
to be on), so pages that need unmapping are freed via a workqueue.

This solution is not ideal:

- If the async queue gets long, we'll run out of allocatable memory.
- We don't batch the TLB flushing or worker wakeups at all.
- We drop FPI flags and skip the pcplists.

Internally at Google we've found that, with some extra complexity, we're able
to make this solution work for the workloads we've tested so far. It seems
likely it won't keep working forever. So for the [PATCH] version I hope to
come up with an implementation that instead makes the allocator more deeply
aware of sensitivity; most likely this will look a bit like an extra
"dimension" like movability etc. This was discussed at LSF/MM/BPF [1]; I plan
to research it right after RFCv2. However, once that research is done we
might want to consider merging a sub-optimal solution to unblock iteration
and development.

[1] https://youtu.be/WD9-ey8LeiI

The main thing in here that is "real" and may warrant discussion is
__GFP_SENSITIVE (or at least, some sort of allocator switch to determine
sensitivity; in an "allowlist" model we would probably have the opposite, and
in future iterations we might want additional options for different "types"
of sensitivity). I think we need this as an extension to the allocation API;
the main alternative would be to infer from the context of the allocation
whether the data should be treated as sensitive. However, I think we will
have contexts where both sensitive and nonsensitive data need to be
allocatable.

If there are concerns about __GFP flags specifically, rather than the general
problem of expanding the allocator API, we could always just provide an API
like __alloc_pages_sensitive or something, implemented with ALLOC_ flags
internally.
Checkpatch-args: --ignore=SPACING,MACRO_ARG_UNUSED,COMPLEX_MACRO

Signed-off-by: Brendan Jackman
---
 arch/x86/mm/asi.c              |  33 +++++++++-
 include/linux/gfp.h            |   5 ++
 include/linux/gfp_types.h      |  15 ++++-
 include/linux/page-flags.h     |  11 ++++
 include/trace/events/mmflags.h |  12 +++-
 mm/mm_init.c                   |   1 +
 mm/page_alloc.c                | 134 ++++++++++++++++++++++++++++++++++++++++-
 tools/perf/builtin-kmem.c      |   1 +
 8 files changed, 205 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c
index 17391ec8b22e3c0903cd5ca29cbb03fcc4cbacce..b951f2100b8bdea5738ded16166255deb29faf57 100644
--- a/arch/x86/mm/asi.c
+++ b/arch/x86/mm/asi.c
@@ -5,6 +5,8 @@
 #include
 #include

+#include
+
 #include
 #include
 #include
@@ -104,10 +106,17 @@ const char *asi_class_name(enum asi_class_id class_id)
  *   allocator from interrupts and the page allocator ultimately calls this
  *   code.
  * - They support customizing the allocation flags.
+ * - They avoid infinite recursion when the page allocator calls back to
+ *   asi_map
  *
  * On the other hand, they do not use the normal page allocation infrastructure,
  * that means that PTE pages do not have the PageTable type nor the PagePgtable
  * flag and we don't increment the meminfo stat (NR_PAGETABLE) as they do.
+ *
+ * As an optimisation we attempt to map the pagetables in
+ * ASI_GLOBAL_NONSENSITIVE, but this can fail, and for simplicity we don't do
+ * anything about that. This means it's invalid to access ASI pagetables from a
+ * critical section.
  */
 static_assert(!IS_ENABLED(CONFIG_PARAVIRT));
 #define DEFINE_ASI_PGTBL_ALLOC(base, level)				\
@@ -116,8 +125,11 @@ static level##_t * asi_##level##_alloc(struct asi *asi,			\
 						gfp_t flags)		\
 {									\
 	if (unlikely(base##_none(*base))) {				\
-		ulong pgtbl = get_zeroed_page(flags);			\
+		/* Stop asi_map calls causing recursive allocation */	\
+		gfp_t pgtbl_gfp = flags | __GFP_SENSITIVE;		\
+		ulong pgtbl = get_zeroed_page(pgtbl_gfp);		\
 		phys_addr_t pgtbl_pa;					\
+		int err;						\
 									\
 		if (!pgtbl)						\
 			return NULL;					\
@@ -131,6 +143,16 @@ static level##_t * asi_##level##_alloc(struct asi *asi,			\
 		}							\
 									\
 		mm_inc_nr_##level##s(asi->mm);				\
+									\
+		err = asi_map_gfp(ASI_GLOBAL_NONSENSITIVE,		\
+				  (void *)pgtbl, PAGE_SIZE, flags);	\
+		if (err)						\
+			/* Should be rare. Spooky. */			\
+			pr_warn_ratelimited("Created sensitive ASI %s (%pK, maps %luK).\n",\
+					    #level, (void *)pgtbl, addr); \
+		else							\
+			__SetPageGlobalNonSensitive(virt_to_page(pgtbl));\
+									\
 	}								\
 out:									\
 	VM_BUG_ON(base##_leaf(*base));					\
@@ -586,6 +608,9 @@ static bool follow_physaddr(
  * reason for this is that we don't want to unexpectedly undo mappings that
  * weren't created by the present caller.
  *
+ * This must not be called from the critical section, as ASI's pagetables are
+ * not guaranteed to be mapped in the restricted address space.
+ *
  * If the source mapping is a large page and the range being mapped spans the
  * entire large page, then it will be mapped as a large page in the ASI page
  * tables too. If the range does not span the entire huge page, then it will be
@@ -606,6 +631,9 @@ int __must_check asi_map_gfp(struct asi *asi, void *addr, unsigned long len, gfp
 	if (!static_asi_enabled())
 		return 0;

+	/* ASI pagetables might be sensitive. */
+	WARN_ON_ONCE(asi_in_critical_section());
+
 	VM_BUG_ON(!IS_ALIGNED(start, PAGE_SIZE));
 	VM_BUG_ON(!IS_ALIGNED(len, PAGE_SIZE));
 	/* RFC: fault_in_kernel_space should be renamed. */
@@ -706,6 +734,9 @@ void asi_unmap(struct asi *asi, void *addr, size_t len)
 	if (!static_asi_enabled() || !len)
 		return;

+	/* ASI pagetables might be sensitive. */
+	WARN_ON_ONCE(asi_in_critical_section());
+
 	VM_BUG_ON(start & ~PAGE_MASK);
 	VM_BUG_ON(len & ~PAGE_MASK);
 	VM_BUG_ON(!fault_in_kernel_space(start)); /* Misnamed, ignore "fault_" */
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index a951de920e208991b37fb2d878d9a0e9c550548c..dd3678b5b08016ceaee2d8e1932bf4aefbc7efb0 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -396,6 +396,11 @@ extern void page_frag_free(void *addr);
 #define __free_page(page) __free_pages((page), 0)
 #define free_page(addr) free_pages((addr), 0)

+#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
+void page_alloc_init_asi(void);
+#else
+static inline void page_alloc_init_asi(void) { }
+#endif
 void page_alloc_init_cpuhp(void);
 int decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp);
 void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
index 65db9349f9053c701e24bdcf1dfe6afbf1278a2d..5147dbd53eafdccc32cfd506569b04d5c082d1b2 100644
--- a/include/linux/gfp_types.h
+++ b/include/linux/gfp_types.h
@@ -58,6 +58,7 @@ enum {
 #ifdef CONFIG_SLAB_OBJ_EXT
 	___GFP_NO_OBJ_EXT_BIT,
 #endif
+	___GFP_SENSITIVE_BIT,
 	___GFP_LAST_BIT
 };

@@ -103,6 +104,11 @@ enum {
 #else
 #define ___GFP_NO_OBJ_EXT	0
 #endif
+#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
+#define ___GFP_SENSITIVE	BIT(___GFP_SENSITIVE_BIT)
+#else
+#define ___GFP_SENSITIVE	0
+#endif

 /*
  * Physical address zone modifiers (see linux/mmzone.h - low four bits)
@@ -299,6 +305,12 @@ enum {
 /* Disable lockdep for GFP context tracking */
 #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)

+/*
+ * Allocate sensitive memory, i.e. do not map it into ASI's restricted address
+ * space.
+ */
+#define __GFP_SENSITIVE ((__force gfp_t)___GFP_SENSITIVE)
+
 /* Room for N __GFP_FOO bits */
 #define __GFP_BITS_SHIFT ___GFP_LAST_BIT
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
@@ -380,7 +392,8 @@ enum {
 #define GFP_NOWAIT	(__GFP_KSWAPD_RECLAIM | __GFP_NOWARN)
 #define GFP_NOIO	(__GFP_RECLAIM)
 #define GFP_NOFS	(__GFP_RECLAIM | __GFP_IO)
-#define GFP_USER	(__GFP_RECLAIM | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
+#define GFP_USER	(__GFP_RECLAIM | __GFP_IO | __GFP_FS | \
+			 __GFP_HARDWALL | __GFP_SENSITIVE)
 #define GFP_DMA		__GFP_DMA
 #define GFP_DMA32	__GFP_DMA32
 #define GFP_HIGHUSER	(GFP_USER | __GFP_HIGHMEM)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 7ee9a0edc6d21708fc93dfa8913dc1ae9478dee3..761b082f1885976b860196d8e69044276e8fa9ca 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -125,6 +125,9 @@ enum pageflags {
 #endif
 #ifdef CONFIG_ARCH_USES_PG_ARCH_3
 	PG_arch_3,
+#endif
+#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
+	PG_global_nonsensitive,
 #endif
 	__NR_PAGEFLAGS,

@@ -632,6 +635,14 @@ FOLIO_TEST_CLEAR_FLAG_FALSE(young)
 FOLIO_FLAG_FALSE(idle)
 #endif

+#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
+__PAGEFLAG(GlobalNonSensitive, global_nonsensitive, PF_ANY);
+#define __PG_GLOBAL_NONSENSITIVE (1UL << PG_global_nonsensitive)
+#else
+__PAGEFLAG_FALSE(GlobalNonSensitive, global_nonsensitive);
+#define __PG_GLOBAL_NONSENSITIVE 0
+#endif
+
 /*
  * PageReported() is used to track reported free pages within the Buddy
  * allocator. We can use the non-atomic version of the test and set
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index bb8a59c6caa21971862b6f200263c74cedff3882..a511a76b4310e949fd5b40b01253cf7d262f0177 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -50,7 +50,8 @@
 	gfpflag_string(__GFP_RECLAIM),		\
 	gfpflag_string(__GFP_DIRECT_RECLAIM),	\
 	gfpflag_string(__GFP_KSWAPD_RECLAIM),	\
-	gfpflag_string(__GFP_ZEROTAGS)
+	gfpflag_string(__GFP_ZEROTAGS),		\
+	gfpflag_string(__GFP_SENSITIVE)

 #ifdef CONFIG_KASAN_HW_TAGS
 #define __def_gfpflag_names_kasan , \
@@ -95,6 +96,12 @@
 #define IF_HAVE_PG_ARCH_3(_name)
 #endif

+#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
+#define IF_HAVE_ASI(_name) ,{1UL << PG_##_name, __stringify(_name)}
+#else
+#define IF_HAVE_ASI(_name)
+#endif
+
 #define DEF_PAGEFLAG_NAME(_name) { 1UL << PG_##_name, __stringify(_name) }

 #define __def_pageflag_names \
@@ -122,7 +129,8 @@ IF_HAVE_PG_HWPOISON(hwpoison)		\
 IF_HAVE_PG_IDLE(idle)			\
 IF_HAVE_PG_IDLE(young)			\
 IF_HAVE_PG_ARCH_2(arch_2)		\
-IF_HAVE_PG_ARCH_3(arch_3)
+IF_HAVE_PG_ARCH_3(arch_3)		\
+IF_HAVE_ASI(global_nonsensitive)

 #define show_page_flags(flags)						\
 	(flags) ? __print_flags(flags, "|",				\
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 4ba5607aaf1943214c7f79f2a52e17eefac2ad79..30b84c0dd8b764e91fb64b116805ebb46526dd7e 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -2639,6 +2639,7 @@ void __init mm_core_init(void)
 	BUILD_BUG_ON(MAX_ZONELISTS > 2);
 	build_all_zonelists(NULL);
 	page_alloc_init_cpuhp();
+	page_alloc_init_asi();

 	/*
 	 * page_ext requires contiguous pages,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b6958333054d06ed910f8fef863d83a7312eca9e..3e98fdfbadddb1f7d71e9e050b63255b2008d167 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1041,6 +1041,8 @@ static void kernel_init_pages(struct page *page, int numpages)
 	kasan_enable_current();
 }

+static bool asi_async_free_enqueue(struct page *page, unsigned int order);
+
 __always_inline bool free_pages_prepare(struct page *page,
 			unsigned int order)
 {
@@ -1049,6 +1051,11 @@ __always_inline bool free_pages_prepare(struct page *page,
 	bool init = want_init_on_free();
 	bool compound = PageCompound(page);
 	struct folio *folio = page_folio(page);
+	/*
+	 * __PG_GLOBAL_NONSENSITIVE needs to be kept around for the ASI async
+	 * free logic.
+	 */
+	unsigned long flags_mask = ~PAGE_FLAGS_CHECK_AT_PREP | __PG_GLOBAL_NONSENSITIVE;

 	VM_BUG_ON_PAGE(PageTail(page), page);

@@ -1107,7 +1114,7 @@ __always_inline bool free_pages_prepare(struct page *page,
 				continue;
 			}
 		}
-		(page + i)->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
+		(page + i)->flags &= flags_mask;
 		}
 	}
 	if (PageMappingFlags(page)) {
@@ -1123,7 +1130,7 @@ __always_inline bool free_pages_prepare(struct page *page,
 	}

 	page_cpupid_reset_last(page);
-	page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
+	page->flags &= flags_mask;
 	reset_page_owner(page, order);
 	page_table_check_free(page, order);
 	pgalloc_tag_sub(page, 1 << order);
@@ -1164,7 +1171,7 @@ __always_inline bool free_pages_prepare(struct page *page,

 	debug_pagealloc_unmap_pages(page, 1 << order);

-	return true;
+	return !asi_async_free_enqueue(page, order);
 }

 /*
@@ -4528,6 +4535,118 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
 	return true;
 }

+#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
+
+struct asi_async_free_cpu_state {
+	struct work_struct work;
+	struct list_head to_free;
+};
+static DEFINE_PER_CPU(struct asi_async_free_cpu_state, asi_async_free_cpu_state);
+
+static void asi_async_free_work_fn(struct work_struct *work)
+{
+	struct asi_async_free_cpu_state *cpu_state =
+		container_of(work, struct asi_async_free_cpu_state, work);
+	struct page *page, *tmp;
+	struct list_head to_free = LIST_HEAD_INIT(to_free);
+
+	local_irq_disable();
+	list_splice_init(&cpu_state->to_free, &to_free);
+	local_irq_enable(); /* IRQs must be on for asi_unmap. */
+
+	/* Use _safe because __free_the_page uses .lru */
+	list_for_each_entry_safe(page, tmp, &to_free, lru) {
+		unsigned long order = page_private(page);
+
+		asi_unmap(ASI_GLOBAL_NONSENSITIVE, page_to_virt(page),
+			  PAGE_SIZE << order);
+		for (int i = 0; i < (1 << order); i++)
+			__ClearPageGlobalNonSensitive(page + i);
+
+		free_one_page(page_zone(page), page, page_to_pfn(page), order, FPI_NONE);
+		cond_resched();
+	}
+}
+
+/* Returns true if the page was queued for asynchronous freeing. */
+static bool asi_async_free_enqueue(struct page *page, unsigned int order)
+{
+	struct asi_async_free_cpu_state *cpu_state;
+	unsigned long flags;
+
+	if (!PageGlobalNonSensitive(page))
+		return false;
+
+	local_irq_save(flags);
+	cpu_state = this_cpu_ptr(&asi_async_free_cpu_state);
+	set_page_private(page, order);
+	list_add(&page->lru, &cpu_state->to_free);
+	if (mm_percpu_wq)
+		queue_work_on(smp_processor_id(), mm_percpu_wq, &cpu_state->work);
+	local_irq_restore(flags);
+
+	return true;
+}
+
+void __init page_alloc_init_asi(void)
+{
+	int cpu;
+
+	if (!static_asi_enabled())
+		return;
+
+	for_each_possible_cpu(cpu) {
+		struct asi_async_free_cpu_state *cpu_state
+			= &per_cpu(asi_async_free_cpu_state, cpu);
+
+		INIT_WORK(&cpu_state->work, asi_async_free_work_fn);
+		INIT_LIST_HEAD(&cpu_state->to_free);
+	}
+}
+
+static int asi_map_alloced_pages(struct page *page, uint order, gfp_t gfp_mask)
+{
+
+	if (!static_asi_enabled())
+		return 0;
+
+	if (!(gfp_mask & __GFP_SENSITIVE)) {
+		int err = asi_map_gfp(
+			ASI_GLOBAL_NONSENSITIVE, page_to_virt(page),
+			PAGE_SIZE * (1 << order), gfp_mask);
+		uint i;
+
+		if (err)
+			return err;
+
+		for (i = 0; i < (1 << order); i++)
+			__SetPageGlobalNonSensitive(page + i);
+	}
+
+	return 0;
+}
+
+#else /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */
+
+static inline
+int asi_map_alloced_pages(struct page *pages, uint order, gfp_t gfp_mask)
+{
+	return 0;
+}
+
+static inline
+bool asi_unmap_freed_pages(struct page *page, unsigned int order)
+{
+	return true;
+}
+
+static bool asi_async_free_enqueue(struct page *page, unsigned int order)
+{
+	return false;
+}
+
+#endif
+
 /*
  * __alloc_pages_bulk - Allocate a number of order-0 pages to a list or array
  * @gfp: GFP flags for the allocation
@@ -4727,6 +4846,10 @@ struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
 	if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
 		return NULL;

+	/* Clear out old (maybe sensitive) data before reallocating as nonsensitive. */
+	if (!static_asi_enabled() && !(gfp & __GFP_SENSITIVE))
+		gfp |= __GFP_ZERO;
+
 	gfp &= gfp_allowed_mask;
 	/*
 	 * Apply scoped allocation constraints. This is mainly about GFP_NOFS
@@ -4773,6 +4896,11 @@ struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
 	trace_mm_page_alloc(page, order, alloc_gfp, ac.migratetype);
 	kmsan_alloc_page(page, order, alloc_gfp);

+	if (page && unlikely(asi_map_alloced_pages(page, order, gfp))) {
+		__free_pages(page, order);
+		page = NULL;
+	}
+
 	return page;
 }
 EXPORT_SYMBOL(__alloc_pages_noprof);
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index a756147e2eec7a3820e1953db346fafa8fe687ba..99f4c6632155d2573f1370af131c15c3d8baa655 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -682,6 +682,7 @@ static const struct {
 	{ "__GFP_RECLAIM",		"R" },
 	{ "__GFP_DIRECT_RECLAIM",	"DR" },
 	{ "__GFP_KSWAPD_RECLAIM",	"KR" },
+	{ "__GFP_SENSITIVE",		"S" },
 };

 static size_t max_gfp_len;

From patchwork Fri Jan 10 18:40:41 2025
Date: Fri, 10 Jan 2025 18:40:41 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-15-8419288bc805@google.com>
Subject: [PATCH TEMP WORKAROUND RFC v2 15/29] mm: asi: Workaround missing partial-unmap support
From: Brendan Jackman
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Richard Henderson, Matt Turner, Vineet Gupta, Russell King, Catalin Marinas, Will Deacon, Guo Ren, Brian Cain, Huacai Chen, WANG Xuerui, Geert Uytterhoeven, Michal Simek, Thomas Bogendoerfer, Dinh Nguyen, Jonas Bonn, Stefan Kristiansson, Stafford Horne, "James E.J. Bottomley", Helge Deller, Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao, Madhavan Srinivasan, Paul Walmsley, Palmer Dabbelt, Albert Ou, Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle, Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz, "David S. Miller", Andreas Larsson, Richard Weinberger, Anton Ivanov, Johannes Berg, Chris Zankel, Max Filippov, Arnd Bergmann, Andrew Morton, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Uladzislau Rezki, Christoph Hellwig, Masami Hiramatsu, Mathieu Desnoyers, Mike Rapoport, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Dennis Zhou, Tejun Heo, Christoph Lameter, Sean Christopherson, Paolo Bonzini, Ard Biesheuvel, Josh Poimboeuf, Pawan Gupta
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, loongarch@lists.linux.dev, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-um@lists.infradead.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, kvm@vger.kernel.org, linux-efi@vger.kernel.org, Brendan Jackman
This is a hack, no need to review it carefully.

asi_unmap() doesn't currently work unless it corresponds exactly to an
asi_map() of the exact same region.
This is mostly harmless (it's only a functional problem if you want to
touch those pages from the ASI critical section) but it's messy. For now,
work around the only practical case that appears, by moving the asi_map
call up the call stack in the page allocator to the place where we know
the actual size the mapping is supposed to end up at. This just removes
the main case where partial unmaps happen. Later, a proper solution for
partial unmaps will be needed.

Signed-off-by: Brendan Jackman
---
 mm/page_alloc.c | 40 ++++++++++++++++++++++++++--------------
 1 file changed, 26 insertions(+), 14 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3e98fdfbadddb1f7d71e9e050b63255b2008d167..f96e95032450be90b6567f67915b0b941fc431d8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4604,22 +4604,20 @@ void __init page_alloc_init_asi(void)
 	}
 }

-static int asi_map_alloced_pages(struct page *page, uint order, gfp_t gfp_mask)
+static int asi_map_alloced_pages(struct page *page, size_t size, gfp_t gfp_mask)
 {
 	if (!static_asi_enabled())
 		return 0;

 	if (!(gfp_mask & __GFP_SENSITIVE)) {
-		int err = asi_map_gfp(
-			ASI_GLOBAL_NONSENSITIVE, page_to_virt(page),
-			PAGE_SIZE * (1 << order), gfp_mask);
+		int err = asi_map_gfp(ASI_GLOBAL_NONSENSITIVE, page_to_virt(page), size, gfp_mask);
 		uint i;

 		if (err)
 			return err;

-		for (i = 0; i < (1 << order); i++)
+		for (i = 0; i < (size >> PAGE_SHIFT); i++)
 			__SetPageGlobalNonSensitive(page + i);
 	}

@@ -4629,7 +4627,7 @@ static int asi_map_alloced_pages(struct page *page, uint order, gfp_t gfp_mask)
 #else /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */

 static inline
-int asi_map_alloced_pages(struct page *pages, uint order, gfp_t gfp_mask)
+int asi_map_alloced_pages(struct page *pages, size_t size, gfp_t gfp_mask)
 {
 	return 0;
 }
@@ -4896,7 +4894,7 @@ struct page *__alloc_pages_noprof(gfp_t gfp, unsigned int order,
 	trace_mm_page_alloc(page, order, alloc_gfp, ac.migratetype);
 	kmsan_alloc_page(page, order, alloc_gfp);

-	if (page && unlikely(asi_map_alloced_pages(page, order, gfp))) {
+	if (page && unlikely(asi_map_alloced_pages(page, PAGE_SIZE << order, gfp))) {
 		__free_pages(page, order);
 		page = NULL;
 	}
@@ -5118,12 +5116,13 @@ void page_frag_free(void *addr)
 }
 EXPORT_SYMBOL(page_frag_free);

-static void *make_alloc_exact(unsigned long addr, unsigned int order,
-		size_t size)
+static void *finish_exact_alloc(unsigned long addr, unsigned int order,
+		size_t size, gfp_t gfp_mask)
 {
 	if (addr) {
 		unsigned long nr = DIV_ROUND_UP(size, PAGE_SIZE);
 		struct page *page = virt_to_page((void *)addr);
+		struct page *first = page;
 		struct page *last = page + nr;

 		split_page_owner(page, order, 0);
@@ -5132,9 +5131,22 @@ static void *make_alloc_exact(unsigned long addr, unsigned int order,
 		while (page < --last)
 			set_page_refcounted(last);

-		last = page + (1UL << order);
+		last = page + (1 << order);
 		for (page += nr; page < last; page++)
 			__free_pages_ok(page, 0, FPI_TO_TAIL);
+
+		/*
+		 * ASI doesn't support partially undoing calls to asi_map, so
+		 * we can only safely free sub-allocations if they were made
+		 * with __GFP_SENSITIVE in the first place. Users of this need
+		 * to map with forced __GFP_SENSITIVE and then here we'll make a
+		 * second asi_map_alloced_pages() call to do any mapping that's
+		 * necessary, but with the exact size.
+		 */
+		if (unlikely(asi_map_alloced_pages(first, nr << PAGE_SHIFT, gfp_mask))) {
+			free_pages_exact(first, size);
+			return NULL;
+		}
 	}
 	return (void *)addr;
 }
@@ -5162,8 +5174,8 @@ void *alloc_pages_exact_noprof(size_t size, gfp_t gfp_mask)
 	if (WARN_ON_ONCE(gfp_mask & (__GFP_COMP | __GFP_HIGHMEM)))
 		gfp_mask &= ~(__GFP_COMP | __GFP_HIGHMEM);

-	addr = get_free_pages_noprof(gfp_mask, order);
-	return make_alloc_exact(addr, order, size);
+	addr = get_free_pages_noprof(gfp_mask | __GFP_SENSITIVE, order);
+	return finish_exact_alloc(addr, order, size, gfp_mask);
 }
 EXPORT_SYMBOL(alloc_pages_exact_noprof);
@@ -5187,10 +5199,10 @@ void * __meminit alloc_pages_exact_nid_noprof(int nid, size_t size, gfp_t gfp_ma
 	if (WARN_ON_ONCE(gfp_mask & (__GFP_COMP | __GFP_HIGHMEM)))
 		gfp_mask &= ~(__GFP_COMP | __GFP_HIGHMEM);

-	p = alloc_pages_node_noprof(nid, gfp_mask, order);
+	p = alloc_pages_node_noprof(nid, gfp_mask | __GFP_SENSITIVE, order);
 	if (!p)
 		return NULL;
-	return make_alloc_exact((unsigned long)page_address(p), order, size);
+	return finish_exact_alloc((unsigned long)page_address(p), order, size, gfp_mask);
 }

 /**

From patchwork Fri Jan 10 18:40:42 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935245
Date: Fri, 10 Jan 2025 18:40:42 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-16-8419288bc805@google.com>
Subject: [PATCH RFC v2 16/29] mm: asi: Map kernel text and static data as nonsensitive
From: Brendan Jackman
Basically we need to map the kernel code and all its static variables.
Per-CPU variables need to be treated specially as described in the
comments.
The cpu_entry_area is similar - this needs to be nonsensitive so that the
CPU can access the GDT etc. when handling a page fault.

Under 5-level paging, most of the kernel memory comes under a single PGD
entry (see Documentation/x86/x86_64/mm.rst. Basically, the mapping for
this big region is the same as under 4-level, just wrapped in an outer
PGD entry). For that region, the "clone" logic is moved down one step of
the paging hierarchy.

Note that the p4d_alloc in asi_clone_p4d won't actually be used in
practice; the relevant PGD entry will always have been populated by prior
asi_map calls so this code would "work" if we just wrote p4d_offset (but
asi_clone_p4d would be broken if viewed in isolation).

The vmemmap area is not under this single PGD, it has its own 2-PGD area,
so we still use asi_clone_pgd for that one.

Signed-off-by: Brendan Jackman
---
 arch/x86/mm/asi.c                 | 105 +++++++++++++++++++++++++++++++++++++-
 include/asm-generic/vmlinux.lds.h |  11 ++++
 2 files changed, 115 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c
index b951f2100b8bdea5738ded16166255deb29faf57..bc2cf0475a0e7344a66d81453f55034b2fc77eef 100644
--- a/arch/x86/mm/asi.c
+++ b/arch/x86/mm/asi.c
@@ -7,7 +7,6 @@
 #include
 #include
-#include
 #include
 #include
 #include
@@ -186,8 +185,68 @@ void __init asi_check_boottime_disable(void)
 		pr_info("ASI enablement ignored due to incomplete implementation.\n");
 }

+/*
+ * Map data by sharing sub-PGD pagetables with the unrestricted mapping. This is
+ * more efficient than asi_map, but only works when you know the whole top-level
+ * page needs to be mapped in the restricted tables. Note that the size of the
+ * mappings this creates differs between 4 and 5-level paging.
+ */
+static void asi_clone_pgd(pgd_t *dst_table, pgd_t *src_table, size_t addr)
+{
+	pgd_t *src = pgd_offset_pgd(src_table, addr);
+	pgd_t *dst = pgd_offset_pgd(dst_table, addr);
+
+	if (!pgd_val(*dst))
+		set_pgd(dst, *src);
+	else
+		WARN_ON_ONCE(pgd_val(*dst) != pgd_val(*src));
+}
+
+/*
+ * For 4-level paging this is exactly the same as asi_clone_pgd. For 5-level
+ * paging it clones one level lower. So this always creates a mapping of the
+ * same size.
+ */
+static void asi_clone_p4d(pgd_t *dst_table, pgd_t *src_table, size_t addr)
+{
+	pgd_t *src_pgd = pgd_offset_pgd(src_table, addr);
+	pgd_t *dst_pgd = pgd_offset_pgd(dst_table, addr);
+	p4d_t *src_p4d = p4d_alloc(&init_mm, src_pgd, addr);
+	p4d_t *dst_p4d = p4d_alloc(&init_mm, dst_pgd, addr);
+
+	if (!p4d_val(*dst_p4d))
+		set_p4d(dst_p4d, *src_p4d);
+	else
+		WARN_ON_ONCE(p4d_val(*dst_p4d) != p4d_val(*src_p4d));
+}
+
+/*
+ * percpu_addr is where the linker put the percpu variable. asi_map_percpu finds
+ * the place where the percpu allocator copied the data during boot.
+ *
+ * This is necessary even when the page allocator defaults to
+ * global-nonsensitive, because the percpu allocator uses the memblock allocator
+ * for early allocations.
+ */
+static int asi_map_percpu(struct asi *asi, void *percpu_addr, size_t len)
+{
+	int cpu, err;
+	void *ptr;
+
+	for_each_possible_cpu(cpu) {
+		ptr = per_cpu_ptr(percpu_addr, cpu);
+		err = asi_map(asi, ptr, len);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 static int __init asi_global_init(void)
 {
+	int err;
+
 	if (!boot_cpu_has(X86_FEATURE_ASI))
 		return 0;
@@ -207,6 +266,46 @@ static int __init asi_global_init(void)
 			VMALLOC_START, VMALLOC_END,
 			"ASI Global Non-sensitive vmalloc");

+	/* Map all kernel text and static data */
+	err = asi_map(ASI_GLOBAL_NONSENSITIVE, (void *)__START_KERNEL,
+		      (size_t)_end - __START_KERNEL);
+	if (WARN_ON(err))
+		return err;
+	err = asi_map(ASI_GLOBAL_NONSENSITIVE, (void *)FIXADDR_START,
+		      FIXADDR_SIZE);
+	if (WARN_ON(err))
+		return err;
+	/* Map all static percpu data */
+	err = asi_map_percpu(
+		ASI_GLOBAL_NONSENSITIVE,
+		__per_cpu_start, __per_cpu_end - __per_cpu_start);
+	if (WARN_ON(err))
+		return err;
+
+	/*
+	 * The next areas are mapped using shared sub-P4D paging structures
+	 * (asi_clone_p4d instead of asi_map), since we know the whole P4D will
+	 * be mapped.
+	 */
+	asi_clone_p4d(asi_global_nonsensitive_pgd, init_mm.pgd,
+		      CPU_ENTRY_AREA_BASE);
+#ifdef CONFIG_X86_ESPFIX64
+	asi_clone_p4d(asi_global_nonsensitive_pgd, init_mm.pgd,
+		      ESPFIX_BASE_ADDR);
+#endif
+	/*
+	 * The vmemmap area actually _must_ be cloned via shared paging
+	 * structures, since mappings can potentially change dynamically when
+	 * hugetlbfs pages are created or broken down.
+	 *
+	 * We always clone 2 PGDs, this is a corollary of the sizes of struct
+	 * page, a page, and the physical address space.
+	 */
+	WARN_ON(sizeof(struct page) * MAXMEM / PAGE_SIZE != 2 * (1UL << PGDIR_SHIFT));
+	asi_clone_pgd(asi_global_nonsensitive_pgd, init_mm.pgd, VMEMMAP_START);
+	asi_clone_pgd(asi_global_nonsensitive_pgd, init_mm.pgd,
+		      VMEMMAP_START + (1UL << PGDIR_SHIFT));
+
 	return 0;
 }
 subsys_initcall(asi_global_init)
@@ -599,6 +698,10 @@ static bool follow_physaddr(
  * Map the given range into the ASI page tables. The source of the mapping is
  * the regular unrestricted page tables. Can be used to map any kernel memory.
  *
+ * In contrast to some internal ASI logic (asi_clone_pgd and asi_clone_p4d) this
+ * never shares pagetables between restricted and unrestricted address spaces,
+ * instead it creates wholly new equivalent mappings.
+ *
  * The caller MUST ensure that the source mapping will not change during this
  * function. For dynamic kernel memory, this is generally ensured by mapping the
  * memory within the allocator.
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index eeadbaeccf88b73af40efe5221760a7cb37058d2..18f6c0448baf5dfbd0721ba9a6d89000fa86f061 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -1022,6 +1022,16 @@
 	COMMON_DISCARDS						\
 }

+/*
+ * ASI maps certain sections with certain sensitivity levels, so they need to
+ * have a page-aligned size.
+ */
+#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
+#define ASI_ALIGN()	ALIGN(PAGE_SIZE)
+#else
+#define ASI_ALIGN()	.
+#endif
+
 /**
  * PERCPU_INPUT - the percpu input sections
  * @cacheline: cacheline size
@@ -1043,6 +1053,7 @@
 	*(.data..percpu)					\
 	*(.data..percpu..shared_aligned)			\
 	PERCPU_DECRYPTED_SECTION				\
+	. = ASI_ALIGN();					\
 	__per_cpu_end = .;

 /**

From patchwork Fri Jan 10 18:40:43 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935246
d=hostedemail.com; s=arc-20220608; t=1736534485; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=HFKS9He99X0lK3xnnbI99ZyULyVP6qmX2/a79HHV3Mw=; b=JHaLW3f5+ev2q/zTYbj0AF2D4AfEnp53zxF9lMQPLmBubU0HC3DC49GC2u73woe0MjBsox Qd4HJgT+9vG4ImP6TL0aJ9MAS3QMr8X1+ZFS1VayX0bWA6pjSIT6q/f6qzW+5U584K3cGD 79+sJ/WOUK8C0gEh33TD5R6Il1I/4bY= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1736534485; a=rsa-sha256; cv=none; b=k+wYUJDgF7NM2J1y4HSkjiLGhSnmdqWJgDMntriWElwEm7+kEi+XlQvVUzdmDsNPDsmypw cEb4tXGlkWnl8qeJRwaEQ7l92kOEc+z5AxNbc9ChDaO5ApWR68gWX3FMcUsmzHbHsX0F8p vy4uWUb21iezHdPfM81kf6/8W7P73M8= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=K4mxsxvs; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf22.hostedemail.com: domain of 31GmBZwgKCO8aRTbdReSXffXcV.TfdcZelo-ddbmRTb.fiX@flex--jackmanb.bounces.google.com designates 209.85.128.74 as permitted sender) smtp.mailfrom=31GmBZwgKCO8aRTbdReSXffXcV.TfdcZelo-ddbmRTb.fiX@flex--jackmanb.bounces.google.com Received: by mail-wm1-f74.google.com with SMTP id 5b1f17b1804b1-43621907030so19341115e9.1 for ; Fri, 10 Jan 2025 10:41:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1736534484; x=1737139284; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=HFKS9He99X0lK3xnnbI99ZyULyVP6qmX2/a79HHV3Mw=; b=K4mxsxvs+J8FeIHZrXHOhckMULLJVxBeT8rxf6AjDTvyyHt92mky64wnNS6QZeJwW1 ejExxbCDtQ7Jo2ZaW+FZJGvfH2La8wWhneooJQVbkg4/2ZHP/W/3O6gC2HeBQ46yeF1/ k+nVkqdRYKgKuajCr0po3KlMCu9tq0MpxV+RK4p3dqa+r2CdqPs0Isx/rxaFqtRRELog OnVQyTC5nXeKpGRgoO4lnmYPMt09D1WZ+QpuBijev5YIz4535IvIUGfNba9do72Z1dSa vJu2osMfsZgdjWdO0vmYUoXeG2tcrQtwAa9QhlQjsm0Ipl4QLl4xEHh9RBBBy2YK9f9E 3alQ== 
Date: Fri, 10 Jan 2025 18:40:43 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-17-8419288bc805@google.com>
Subject: [PATCH RFC v2 17/29] mm: asi: Map vmalloc/vmap data as nonsensitive
From: Brendan Jackman
We add new VM flags for sensitive and global-nonsensitive, parallel to the
corresponding GFP flags. __get_vm_area_node() and friends will default to
creating global-nonsensitive VM areas, and vmap() then calls asi_map() as
necessary.

__vmalloc_node_range() has additional logic to check and set defaults for
the sensitivity of the underlying page allocation. It does this via an
initial __set_asi_flags() call - note that it then calls
__get_vm_area_node(), which also calls __set_asi_flags(); this second call
is a NOP.

By default, we mark the underlying page allocation as sensitive, even if
the VM area is global-nonsensitive. This is just an optimization to avoid
unnecessary asi_map() etc., since presumably most code has no reason to
access vmalloc'd data through the direct map.

There are some details of the GFP-flag/VM-flag interaction that are not
really obvious, for example: what should happen when callers of __vmalloc()
explicitly set GFP sensitivity flags? (That function has no VM flags
argument.) For the moment, let's not block on that and focus on adding the
infrastructure.

At the moment, the high-level vmalloc APIs don't actually provide a way to
configure sensitivity; this commit just adds the infrastructure. We'll have
to decide how to expose this to allocation sites as we implement more
denylist logic. vmap() does already allow configuring vm flags.
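The defaulting rule above can be sketched as a tiny userspace model. The flag names and the helper below are illustrative stand-ins (the patch does not show its actual flag definitions); the point is that an unset sensitivity defaults to sensitive, and that repeating the call, as __get_vm_area_node does after __vmalloc_node_range, is a NOP:

```c
#include <assert.h>

/* Illustrative flag bits; the real patch defines its own GFP/VM flags. */
#define GFP_SENSITIVE           (1UL << 0)
#define GFP_GLOBAL_NONSENSITIVE (1UL << 1)

/* Model of the __set_asi_flags defaulting step (hypothetical shape). */
static unsigned long set_asi_default_flags(unsigned long gfp)
{
	/* An explicit caller choice is left untouched, so a second call
	 * on the same flags is a NOP. */
	if (gfp & (GFP_SENSITIVE | GFP_GLOBAL_NONSENSITIVE))
		return gfp;
	/* Otherwise the backing pages default to sensitive, even when
	 * the VM area itself is global-nonsensitive. */
	return gfp | GFP_SENSITIVE;
}
```

The idempotence is what makes the double call from __vmalloc_node_range and __get_vm_area_node harmless in the scheme described above.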
Signed-off-by: Brendan Jackman
---
 mm/vmalloc.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 8d260f2174fe664b54dcda054cb9759ae282bf03..00745edf0b2c5f4c769a46bdcf0872223de5299d 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3210,6 +3210,7 @@ struct vm_struct *remove_vm_area(const void *addr)
 {
 	struct vmap_area *va;
 	struct vm_struct *vm;
+	unsigned long vm_addr;
 
 	might_sleep();
 
@@ -3221,6 +3222,7 @@ struct vm_struct *remove_vm_area(const void *addr)
 	if (!va || !va->vm)
 		return NULL;
 	vm = va->vm;
+	vm_addr = (unsigned long) READ_ONCE(vm->addr);
 
 	debug_check_no_locks_freed(vm->addr, get_vm_area_size(vm));
 	debug_check_no_obj_freed(vm->addr, get_vm_area_size(vm));
@@ -3352,6 +3354,7 @@ void vfree(const void *addr)
 			addr);
 		return;
 	}
+	asi_unmap(ASI_GLOBAL_NONSENSITIVE, vm->addr, get_vm_area_size(vm));
 
 	if (unlikely(vm->flags & VM_FLUSH_RESET_PERMS))
 		vm_reset_perms(vm);
@@ -3397,6 +3400,7 @@ void vunmap(const void *addr)
 			addr);
 		return;
 	}
+	asi_unmap(ASI_GLOBAL_NONSENSITIVE, vm->addr, get_vm_area_size(vm));
 	kfree(vm);
 }
 EXPORT_SYMBOL(vunmap);
@@ -3445,16 +3449,21 @@ void *vmap(struct page **pages, unsigned int count,
 
 	addr = (unsigned long)area->addr;
 	if (vmap_pages_range(addr, addr + size, pgprot_nx(prot),
-			pages, PAGE_SHIFT) < 0) {
-		vunmap(area->addr);
-		return NULL;
-	}
+			pages, PAGE_SHIFT) < 0)
+		goto err;
+
+	if (asi_map(ASI_GLOBAL_NONSENSITIVE, area->addr,
+		    get_vm_area_size(area)))
+		goto err; /* The necessary asi_unmap() is in vunmap. */
 
 	if (flags & VM_MAP_PUT_PAGES) {
 		area->pages = pages;
 		area->nr_pages = count;
 	}
 	return area->addr;
+err:
+	vunmap(area->addr);
+	return NULL;
 }
 EXPORT_SYMBOL(vmap);
@@ -3711,6 +3720,10 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 		goto fail;
 	}
 
+	if (asi_map(ASI_GLOBAL_NONSENSITIVE, area->addr,
+		    get_vm_area_size(area)))
+		goto fail; /* The necessary asi_unmap() is in vfree. */
+
 	return area->addr;
 
 fail:

From patchwork Fri Jan 10 18:40:44 2025
Date: Fri, 10 Jan 2025 18:40:44 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-18-8419288bc805@google.com>
Subject: [PATCH RFC v2 18/29] mm: asi: Map dynamic percpu memory as nonsensitive
From: Brendan Jackman
From: Reiji Watanabe

Currently, all dynamic percpu memory is implicitly (and unintentionally)
treated as sensitive memory.

Unconditionally map pages for dynamically allocated percpu memory as global
nonsensitive memory, other than pages that are allocated for
pcpu_{first,reserved}_chunk during early boot via the memblock allocator
(these will be taken care of by the following patch).

We don't support sensitive percpu memory allocation yet.

Co-developed-by: Junaid Shahid
Signed-off-by: Junaid Shahid
Signed-off-by: Reiji Watanabe
Signed-off-by: Brendan Jackman

WIP: Drop VM_SENSITIVE checks from percpu code

---
 mm/percpu-vm.c | 50 ++++++++++++++++++++++++++++++++++++++++++------
 mm/percpu.c    |  4 ++--
 2 files changed, 46 insertions(+), 8 deletions(-)

diff --git a/mm/percpu-vm.c b/mm/percpu-vm.c
index cd69caf6aa8d8eded2395eb4bc4051b78ec6aa33..2935d7fbac41548819a94dcc60566bd18cde819a 100644
--- a/mm/percpu-vm.c
+++ b/mm/percpu-vm.c
@@ -132,11 +132,20 @@ static void pcpu_pre_unmap_flush(struct pcpu_chunk *chunk,
 		pcpu_chunk_addr(chunk, pcpu_high_unit_cpu, page_end));
 }
 
-static void __pcpu_unmap_pages(unsigned long addr, int nr_pages)
+static void ___pcpu_unmap_pages(unsigned long addr, int nr_pages)
 {
 	vunmap_range_noflush(addr, addr + (nr_pages << PAGE_SHIFT));
 }
 
+static void __pcpu_unmap_pages(unsigned long addr, int nr_pages,
+			       unsigned long vm_flags)
+{
+	unsigned long size = nr_pages << PAGE_SHIFT;
+
+	asi_unmap(ASI_GLOBAL_NONSENSITIVE, (void *)addr, size);
+	___pcpu_unmap_pages(addr, nr_pages);
+}
+
 /**
  * pcpu_unmap_pages - unmap pages out of a pcpu_chunk
  * @chunk: chunk of interest
@@ -153,6 +162,8 @@ static void __pcpu_unmap_pages(unsigned long addr, int nr_pages)
 static void pcpu_unmap_pages(struct pcpu_chunk *chunk,
 			     struct page **pages, int page_start, int page_end)
 {
+	struct vm_struct **vms = (struct vm_struct **)chunk->data;
+	unsigned long vm_flags = vms ? vms[0]->flags : VM_ALLOC;
 	unsigned int cpu;
 	int i;
 
@@ -165,7 +176,7 @@ static void pcpu_unmap_pages(struct pcpu_chunk *chunk,
 			pages[pcpu_page_idx(cpu, i)] = page;
 		}
 		__pcpu_unmap_pages(pcpu_chunk_addr(chunk, cpu, page_start),
-				   page_end - page_start);
+				   page_end - page_start, vm_flags);
 	}
 }
 
@@ -190,13 +201,38 @@ static void pcpu_post_unmap_tlb_flush(struct pcpu_chunk *chunk,
 		pcpu_chunk_addr(chunk, pcpu_high_unit_cpu, page_end));
 }
 
-static int __pcpu_map_pages(unsigned long addr, struct page **pages,
-			    int nr_pages)
+/*
+ * __pcpu_map_pages() should not be called during the percpu initialization,
+ * as asi_map() depends on the page allocator (which isn't available yet
+ * during percpu initialization). Instead, ___pcpu_map_pages() can be used
+ * during the percpu initialization. But, any pages that are mapped with
+ * ___pcpu_map_pages() will be treated as sensitive memory, unless
+ * they are explicitly mapped with asi_map() later.
+ */
+static int ___pcpu_map_pages(unsigned long addr, struct page **pages,
+			     int nr_pages)
 {
 	return vmap_pages_range_noflush(addr, addr + (nr_pages << PAGE_SHIFT),
 					PAGE_KERNEL, pages, PAGE_SHIFT);
 }
 
+static int __pcpu_map_pages(unsigned long addr, struct page **pages,
+			    int nr_pages, unsigned long vm_flags)
+{
+	unsigned long size = nr_pages << PAGE_SHIFT;
+	int err;
+
+	err = ___pcpu_map_pages(addr, pages, nr_pages);
+	if (err)
+		return err;
+
+	/*
+	 * If this fails, pcpu_map_pages()->__pcpu_unmap_pages() will call
+	 * asi_unmap() and clean up any partial mappings.
+	 */
+	return asi_map(ASI_GLOBAL_NONSENSITIVE, (void *)addr, size);
+}
+
 /**
  * pcpu_map_pages - map pages into a pcpu_chunk
  * @chunk: chunk of interest
@@ -214,13 +250,15 @@ static int __pcpu_map_pages(unsigned long addr, struct page **pages,
 static int pcpu_map_pages(struct pcpu_chunk *chunk,
 			  struct page **pages, int page_start, int page_end)
 {
+	struct vm_struct **vms = (struct vm_struct **)chunk->data;
+	unsigned long vm_flags = vms ? vms[0]->flags : VM_ALLOC;
 	unsigned int cpu, tcpu;
 	int i, err;
 
 	for_each_possible_cpu(cpu) {
 		err = __pcpu_map_pages(pcpu_chunk_addr(chunk, cpu, page_start),
 				       &pages[pcpu_page_idx(cpu, page_start)],
-				       page_end - page_start);
+				       page_end - page_start, vm_flags);
 		if (err < 0)
 			goto err;
 
@@ -232,7 +270,7 @@ static int pcpu_map_pages(struct pcpu_chunk *chunk,
 err:
 	for_each_possible_cpu(tcpu) {
 		__pcpu_unmap_pages(pcpu_chunk_addr(chunk, tcpu, page_start),
-				   page_end - page_start);
+				   page_end - page_start, vm_flags);
 		if (tcpu == cpu)
 			break;
 	}
diff --git a/mm/percpu.c b/mm/percpu.c
index da21680ff294cb53dfb42bf0d3b3bbd2654d2cfa..c2d913c579bf07892957ac7f601a6a71defadc4b 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -3273,8 +3273,8 @@ int __init pcpu_page_first_chunk(size_t reserved_size, pcpu_fc_cpu_to_node_fn_t
 			pcpu_populate_pte(unit_addr + (i << PAGE_SHIFT));
 
 		/* pte already populated, the following shouldn't fail */
-		rc = __pcpu_map_pages(unit_addr, &pages[unit * unit_pages],
-				      unit_pages);
+		rc = ___pcpu_map_pages(unit_addr, &pages[unit * unit_pages],
+				       unit_pages);
 		if (rc < 0)
 			panic("failed to map percpu area, err=%d\n", rc);
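The split between the triple-underscore raw helpers and the double-underscore wrappers, plus the error contract (on asi_map failure, the caller runs the unmap path to undo both layers), can be modeled in userspace. Everything here - the names, return codes, and stub state - is a hypothetical sketch, not kernel code:

```c
#include <assert.h>

static int raw_mapped;   /* models the vmap_pages_range_noflush() mapping */
static int asi_mapped;   /* models the ASI_GLOBAL_NONSENSITIVE mapping */

/* ___pcpu_map_pages() analogue: raw mapping only, usable at early boot. */
static int raw_map(void) { raw_mapped = 1; return 0; }

/* asi_map() analogue; 'ok' lets a caller force the failure path. */
static int asi_map_stub(int ok) { if (ok) asi_mapped = 1; return ok ? 0 : -1; }

/* __pcpu_map_pages() analogue: raw map, then register as nonsensitive. */
static int map_pages(int asi_ok)
{
	int err = raw_map();
	if (err)
		return err;
	/* On failure the caller must run the unmap path, which undoes
	 * both the ASI registration and the raw mapping. */
	return asi_map_stub(asi_ok);
}

/* __pcpu_unmap_pages() analogue: tears down both layers. */
static void unmap_pages(void)
{
	asi_mapped = 0;
	raw_mapped = 0;
}
```

This mirrors why pcpu_page_first_chunk() must call the raw helper directly: at that point the asi_map() step cannot run yet, so early-boot pages stay sensitive until mapped explicitly later.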
From patchwork Fri Jan 10 18:40:45 2025
Date: Fri, 10 Jan 2025 18:40:45 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-19-8419288bc805@google.com>
Subject: [PATCH RFC v2 19/29] mm: asi: Stabilize CR3 in switch_mm_irqs_off()
From: Brendan Jackman

An ASI-restricted CR3 is unstable, as interrupts can cause ASI-exits.
Although we already unconditionally ASI-exit during context-switch, and
before returning from the VM-run path, it's still possible to reach
switch_mm_irqs_off() in a restricted context: KVM code updates static keys,
which requires using a temporary mm.

Signed-off-by: Brendan Jackman
---
 arch/x86/mm/tlb.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index c55733e144c7538ce7f97b74ea2b1b9c22497c32..ce5598f96ea7a84dc0e8623022ab5bfbba401b48 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -546,6 +546,9 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 	bool need_flush;
 	u16 new_asid;
 
+	/* Stabilize CR3 before reading or writing it. */
+	asi_exit();
+
 	/* We don't want flush_tlb_func() to run concurrently with us. */
 	if (IS_ENABLED(CONFIG_PROVE_LOCKING))
 		WARN_ON_ONCE(!irqs_disabled());

From patchwork Fri Jan 10 18:40:46 2025
Date: Fri, 10 Jan 2025 18:40:46 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-20-8419288bc805@google.com>
Subject: [PATCH RFC v2 20/29] mm: asi: Make TLB flushing correct under ASI
From: Brendan Jackman
This is the absolute minimum change for TLB flushing to be correct under ASI.
There are two arguably orthogonal changes in here, but they feel small enough
for a single commit.
.:: CR3 stabilization

As noted in the comment, ASI can destabilize CR3, but we can stabilize it
again by calling asi_exit(); this makes it safe to read CR3 and write it
back. This is enough to be correct - we don't have to worry about
invalidating the other ASI address space (i.e. we don't need to invalidate
the restricted address space if we are currently unrestricted, or vice
versa) because we currently never set the noflush bit in CR3 for ASI
transitions.

Even without using CR3's noflush bit there are trivial optimizations still
on the table here: where invpcid_flush_single_context() is available (i.e.
with the INVPCID_SINGLE feature) we can use it in lieu of the CR3
read/write, and avoid the extremely costly asi_exit().

.:: Invalidating kernel mappings

Before ASI, with KPTI off, we always either disable PCID or use global
mappings for kernel memory. However, ASI disables global kernel mappings
regardless of these factors, so we need to invalidate other address spaces
to trigger a flush when we switch into them.

Note that there is currently a pointless write of
cpu_tlbstate.invalidate_other in the case of KPTI and !PCID. We've added
another case of that (ASI, !KPTI and !PCID). I think that's preferable to
expanding the conditional in flush_tlb_one_kernel().

Signed-off-by: Brendan Jackman
---
 arch/x86/mm/tlb.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index ce5598f96ea7a84dc0e8623022ab5bfbba401b48..07b1657bee8e4cf17452ea57c838823e76f482c0 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -231,7 +231,7 @@ static void clear_asid_other(void)
 	 * This is only expected to be set if we have disabled
 	 * kernel _PAGE_GLOBAL pages.
 	 */
-	if (!static_cpu_has(X86_FEATURE_PTI)) {
+	if (!static_cpu_has(X86_FEATURE_PTI) && !static_asi_enabled()) {
 		WARN_ON_ONCE(1);
 		return;
 	}
@@ -1040,7 +1040,6 @@ static void put_flush_tlb_info(void)
 noinstr u16 asi_pcid(struct asi *asi, u16 asid)
 {
 	return kern_pcid(asid) | ((asi->class_id + 1) << X86_CR3_ASI_PCID_BITS_SHIFT);
-	// return kern_pcid(asid) | ((asi->index + 1) << X86_CR3_ASI_PCID_BITS_SHIFT);
 }
 
 void asi_flush_tlb_range(struct asi *asi, void *addr, size_t len)
@@ -1192,15 +1191,19 @@ void flush_tlb_one_kernel(unsigned long addr)
 	 * use PCID if we also use global PTEs for the kernel mapping, and
 	 * INVLPG flushes global translations across all address spaces.
 	 *
-	 * If PTI is on, then the kernel is mapped with non-global PTEs, and
-	 * __flush_tlb_one_user() will flush the given address for the current
-	 * kernel address space and for its usermode counterpart, but it does
-	 * not flush it for other address spaces.
+	 * If PTI or ASI is on, then the kernel is mapped with non-global PTEs,
+	 * and __flush_tlb_one_user() will flush the given address for the
+	 * current kernel address space and, if PTI is on, for its usermode
+	 * counterpart, but it does not flush it for other address spaces.
 	 */
 	flush_tlb_one_user(addr);
 
-	if (!static_cpu_has(X86_FEATURE_PTI))
+	/* Nothing more to do if PTI and ASI are completely off. */
+	if (!static_cpu_has(X86_FEATURE_PTI) && !static_asi_enabled()) {
+		VM_WARN_ON_ONCE(static_cpu_has(X86_FEATURE_PCID) &&
+				!(__default_kernel_pte_mask & _PAGE_GLOBAL));
 		return;
+	}
 
 	/*
 	 * See above. We need to propagate the flush to all other address
@@ -1289,6 +1292,16 @@ STATIC_NOPV void native_flush_tlb_local(void)
 	invalidate_user_asid(this_cpu_read(cpu_tlbstate.loaded_mm_asid));
 
+	/*
+	 * Restricted ASI CR3 is unstable outside of critical section, so we
+	 * couldn't flush via a CR3 read/write. asi_exit() stabilizes it.
+	 * We don't expect any flushes in a critical section.
+	 */
+	if (WARN_ON(asi_in_critical_section()))
+		native_flush_tlb_global();
+	else
+		asi_exit();
+
 	/* If current->mm == NULL then the read_cr3() "borrows" an mm */
 	native_write_cr3(__native_read_cr3());
 }

From patchwork Fri Jan 10 18:40:47 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935250
Date: Fri, 10 Jan 2025 18:40:47 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-21-8419288bc805@google.com>
Subject: [PATCH RFC v2 21/29] KVM: x86: asi: Restricted address space for VM execution
From: Brendan Jackman
An ASI restricted address space is added for KVM.
This protects userspace from attack by the guest, and the guest from attack
by other processes. It doesn't attempt to protect the guest from attack by
the current process.

This change incorporates an extra asi_exit() at the end of vcpu_run. We
expect later iterations of ASI to drop that call as we gain the ability to
context switch within the ASI domain.

Signed-off-by: Brendan Jackman
---
 arch/x86/include/asm/kvm_host.h |  3 ++
 arch/x86/kvm/svm/svm.c          |  2 ++
 arch/x86/kvm/vmx/vmx.c          | 38 ++++++++++++--------
 arch/x86/kvm/x86.c              | 77 ++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 105 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6d9f763a7bb9d5db422ea5625b2c28420bd14f26..00cda452dd6ca6ec57ff85ca194ee4aeb6af3be7 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -37,6 +37,7 @@
 #include
 #include
 #include
+#include
 
 #define __KVM_HAVE_ARCH_VCPU_DEBUGFS
 
@@ -1535,6 +1536,8 @@ struct kvm_arch {
 	 */
 #define SPLIT_DESC_CACHE_MIN_NR_OBJECTS (SPTE_ENT_PER_PAGE + 1)
 	struct kvm_mmu_memory_cache split_desc_cache;
+
+	struct asi *asi;
 };
 
 struct kvm_vm_stat {

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9df3e1e5ae81a1346409632edd693cb7e0740f72..f2c3154292b4f6c960b490b0773f53bea43897bb 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4186,6 +4186,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
 	guest_state_enter_irqoff();
 
 	amd_clear_divider();
+	asi_enter(vcpu->kvm->arch.asi);
 
 	if (sev_es_guest(vcpu->kvm))
 		__svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted,
@@ -4193,6 +4194,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
 	else
 		__svm_vcpu_run(svm, spec_ctrl_intercepted);
 
+	asi_relax();
 	guest_state_exit_irqoff();
 }

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d28618e9277ede83ad2edc1b1778ea44123aa797..181d230b1c057fed33f7b29b7b0e378dbdfeb174
100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -49,6 +49,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
@@ -7282,14 +7283,34 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 					unsigned int flags)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	unsigned long cr3;
 
 	guest_state_enter_irqoff();
+	asi_enter(vcpu->kvm->arch.asi);
+
+	/*
+	 * Refresh vmcs.HOST_CR3 if necessary. This must be done immediately
+	 * prior to VM-Enter, as the kernel may load a new ASID (PCID) any time
+	 * it switches back to the current->mm, which can occur in KVM context
+	 * when switching to a temporary mm to patch kernel code, e.g. if KVM
+	 * toggles a static key while handling a VM-Exit.
+	 * Also, this must be done after asi_enter(), as it changes CR3
+	 * when switching address spaces.
+	 */
+	cr3 = __get_current_cr3_fast();
+	if (unlikely(cr3 != vmx->loaded_vmcs->host_state.cr3)) {
+		vmcs_writel(HOST_CR3, cr3);
+		vmx->loaded_vmcs->host_state.cr3 = cr3;
+	}
 
 	/*
 	 * L1D Flush includes CPU buffer clear to mitigate MDS, but VERW
 	 * mitigation for MDS is done late in VMentry and is still
 	 * executed in spite of L1D Flush. This is because an extra VERW
 	 * should not matter much after the big hammer L1D Flush.
+	 *
+	 * This is only after asi_enter() for performance reasons.
+	 * RFC: This also needs to be integrated with ASI's tainting model.
 	 */
 	if (static_branch_unlikely(&vmx_l1d_should_flush))
 		vmx_l1d_flush(vcpu);
@@ -7310,6 +7331,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 
 	vmx->idt_vectoring_info = 0;
 
+	asi_relax();
+
 	vmx_enable_fb_clear(vmx);
 
 	if (unlikely(vmx->fail)) {
@@ -7338,7 +7361,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
-	unsigned long cr3, cr4;
+	unsigned long cr4;
 
 	/* Record the guest's net vcpu time for enforced NMI injections.
 	 */
 	if (unlikely(!enable_vnmi &&
@@ -7381,19 +7404,6 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 	vmcs_writel(GUEST_RIP, vcpu->arch.regs[VCPU_REGS_RIP]);
 	vcpu->arch.regs_dirty = 0;
 
-	/*
-	 * Refresh vmcs.HOST_CR3 if necessary. This must be done immediately
-	 * prior to VM-Enter, as the kernel may load a new ASID (PCID) any time
-	 * it switches back to the current->mm, which can occur in KVM context
-	 * when switching to a temporary mm to patch kernel code, e.g. if KVM
-	 * toggles a static key while handling a VM-Exit.
-	 */
-	cr3 = __get_current_cr3_fast();
-	if (unlikely(cr3 != vmx->loaded_vmcs->host_state.cr3)) {
-		vmcs_writel(HOST_CR3, cr3);
-		vmx->loaded_vmcs->host_state.cr3 = cr3;
-	}
-
 	cr4 = cr4_read_shadow();
 	if (unlikely(cr4 != vmx->loaded_vmcs->host_state.cr4)) {
 		vmcs_writel(HOST_CR4, cr4);

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 83fe0a78146fc198115aba0e76ba57ecfb1dd8d9..3e0811eb510650abc601e4adce1ce4189835a730 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -85,6 +85,7 @@
 #include
 #include
 #include
+#include
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
@@ -9674,6 +9675,55 @@ static void kvm_x86_check_cpu_compat(void *ret)
 	*(int *)ret = kvm_x86_check_processor_compatibility();
 }
 
+#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
+static inline int kvm_x86_init_asi_class(void)
+{
+	static struct asi_taint_policy policy = {
+		/*
+		 * Prevent going to the guest with sensitive data potentially
+		 * left in sidechannels by code running in the unrestricted
+		 * address space, or another MM.
+		 */
+		.protect_data = ASI_TAINT_KERNEL_DATA | ASI_TAINT_OTHER_MM_DATA,
+		/*
+		 * Prevent going to the guest with branch predictor state
+		 * influenced by other processes. Note this bit is about
+		 * protecting the guest from other parts of the system, while
+		 * data_taints is about protecting other parts of the system
+		 * from the guest.
+		 */
+		.prevent_control = ASI_TAINT_OTHER_MM_CONTROL,
+		.set = ASI_TAINT_GUEST_DATA,
+	};
+
+	/*
+	 * Inform ASI that the guest will gain control of the branch predictor,
+	 * unless we're just unconditionally blasting it after VM Exit.
+	 *
+	 * RFC: This is a bit simplified - on some configurations we could avoid
+	 * a duplicated RSB-fill if we had a separate taint specifically for the
+	 * RSB.
+	 */
+	if (!cpu_feature_enabled(X86_FEATURE_IBPB_ON_VMEXIT) ||
+	    !IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) ||
+	    !cpu_feature_enabled(X86_FEATURE_RSB_VMEXIT))
+		policy.set = ASI_TAINT_GUEST_CONTROL;
+
+	/*
+	 * And the same for data left behind by code in the userspace domain
+	 * (i.e. the VMM itself, plus kernel code serving its syscalls etc).
+	 * This should eventually be configurable: users whose VMMs contain
+	 * no secrets can disable it to avoid paying a mitigation cost on
+	 * transition between their guest and userspace.
+	 */
+	policy.protect_data |= ASI_TAINT_USER_DATA;
+
+	return asi_init_class(ASI_CLASS_KVM, &policy);
+}
+#else /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */
+static inline int kvm_x86_init_asi_class(void) { return 0; }
+#endif /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */
+
 int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 {
 	u64 host_pat;
@@ -9737,6 +9787,10 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	kvm_caps.supported_vm_types = BIT(KVM_X86_DEFAULT_VM);
 	kvm_caps.supported_mce_cap = MCG_CTL_P | MCG_SER_P;
 
+	r = kvm_x86_init_asi_class();
+	if (r < 0)
+		goto out_mmu_exit;
+
 	if (boot_cpu_has(X86_FEATURE_XSAVE)) {
 		kvm_host.xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
 		kvm_caps.supported_xcr0 = kvm_host.xcr0 & KVM_SUPPORTED_XCR0;
@@ -9754,7 +9808,7 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 
 	r = ops->hardware_setup();
 	if (r != 0)
-		goto out_mmu_exit;
+		goto out_asi_uninit;
 
 	kvm_ops_update(ops);
 
@@ -9810,6 +9864,8 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 out_unwind_ops:
 	kvm_x86_ops.enable_virtualization_cpu = NULL;
 	kvm_x86_call(hardware_unsetup)();
+out_asi_uninit:
+	asi_uninit_class(ASI_CLASS_KVM);
 out_mmu_exit:
 	kvm_mmu_vendor_module_exit();
 out_free_percpu:
@@ -9841,6 +9897,7 @@ void kvm_x86_vendor_exit(void)
 	cancel_work_sync(&pvclock_gtod_work);
 #endif
 	kvm_x86_call(hardware_unsetup)();
+	asi_uninit_class(ASI_CLASS_KVM);
 	kvm_mmu_vendor_module_exit();
 	free_percpu(user_return_msrs);
 	kmem_cache_destroy(x86_emulator_cache);
@@ -11574,6 +11631,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 
 	r = vcpu_run(vcpu);
 
+	/*
+	 * At present ASI doesn't have the capability to transition directly
+	 * from the restricted address space to the user address space. So we
+	 * just return to the unrestricted address space in between.
+	 */
+	asi_exit();
+
 out:
 	kvm_put_guest_fpu(vcpu);
 	if (kvm_run->kvm_valid_regs)
@@ -12705,6 +12769,14 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	if (ret)
 		goto out_uninit_mmu;
 
+	ret = asi_init(kvm->mm, ASI_CLASS_KVM, &kvm->arch.asi);
+	if (ret)
+		goto out_uninit_mmu;
+
+	ret = static_call(kvm_x86_vm_init)(kvm);
+	if (ret)
+		goto out_asi_destroy;
+
 	INIT_HLIST_HEAD(&kvm->arch.mask_notifier_list);
 	atomic_set(&kvm->arch.noncoherent_dma_count, 0);
@@ -12742,6 +12814,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 
 	return 0;
 
+out_asi_destroy:
+	asi_destroy(kvm->arch.asi);
 out_uninit_mmu:
 	kvm_mmu_uninit_vm(kvm);
 	kvm_page_track_cleanup(kvm);
@@ -12883,6 +12957,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 	kvm_destroy_vcpus(kvm);
 	kvfree(rcu_dereference_check(kvm->arch.apic_map, 1));
 	kfree(srcu_dereference_check(kvm->arch.pmu_event_filter, &kvm->srcu, 1));
+	asi_destroy(kvm->arch.asi);
 	kvm_mmu_uninit_vm(kvm);
 	kvm_page_track_cleanup(kvm);
 	kvm_xen_destroy_vm(kvm);

From patchwork Fri Jan 10 18:40:48 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935251
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 01292E77188 for ; Fri, 10 Jan 2025 18:41:49 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3897B6B00D2; Fri, 10 Jan 2025 13:41:39 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 30F506B00D4; Fri, 10 Jan 2025 13:41:39 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0EAF56B00D3; Fri, 10 Jan 2025 13:41:39 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id DE8E76B00D1 for ; Fri, 10 Jan 2025 13:41:38 -0500 (EST) Received: from smtpin27.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id A1DE9AE2BF for ; Fri, 10 Jan 2025 18:41:38 +0000 (UTC) X-FDA: 82992410676.27.7E31E84 Received: from mail-wm1-f73.google.com (mail-wm1-f73.google.com [209.85.128.73]) by imf08.hostedemail.com (Postfix) with ESMTP id AB9AA160020 for ; Fri, 10 Jan 2025 18:41:36 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=gJUAv31l; spf=pass (imf08.hostedemail.com: domain of 332mBZwgKCPolcemocpdiqqing.eqonkpwz-oomxcem.qti@flex--jackmanb.bounces.google.com designates 209.85.128.73 as permitted sender) smtp.mailfrom=332mBZwgKCPolcemocpdiqqing.eqonkpwz-oomxcem.qti@flex--jackmanb.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1736534496; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; 
bh=DPVLK2dGaKuBzLzuN4/iPTz64M+48MZ0VyD4n5dtrVo=; b=WUgPqOsYTemM/wQRZ+qv1N64gdZKINgH6Pxp3SGnq0eeomwQ3VzxsFlSUypLJE3LfbPt0f rZw5Uy4lm6aPZyDi8v450p961g2/IO9W9cWuE+cDKa67PWcb44LRqF4nwma89eUkx7u1HO 20QG8Qat55T1iTTz+nVyUKgn3v3iBfc= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1736534496; a=rsa-sha256; cv=none; b=ZpLMy2AhAJfQ8h7qI7UAvk5g4hJDx9FazfIBxEZwiMUUj8BO693q0/ufG/KPW3LtvvcmSa EEDv5y2DdnvjDSrLoTg0+9fjVmlPdpXe3/iPtlvKpi2iDx2o8D167uZZDyWnpIj0wv4ZlJ 6P3lGhmE3JSbDcYtDlOIiTAsWssejKo= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=gJUAv31l; spf=pass (imf08.hostedemail.com: domain of 332mBZwgKCPolcemocpdiqqing.eqonkpwz-oomxcem.qti@flex--jackmanb.bounces.google.com designates 209.85.128.73 as permitted sender) smtp.mailfrom=332mBZwgKCPolcemocpdiqqing.eqonkpwz-oomxcem.qti@flex--jackmanb.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com Received: by mail-wm1-f73.google.com with SMTP id 5b1f17b1804b1-43625ceae52so12794755e9.0 for ; Fri, 10 Jan 2025 10:41:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1736534495; x=1737139295; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=DPVLK2dGaKuBzLzuN4/iPTz64M+48MZ0VyD4n5dtrVo=; b=gJUAv31l+uzwIVMHuSEoq5fJi33TlbjJ/XkSu6akab3AmCoK0kKZhbztxuerwXk4pu Hb7tiTbwfrApcAr4S9bq8GGCBqLomJ7n/FUazDs4Z6g41j3i6UiewuvU9Bc8bgU67TCS VV7rvHB/aCvKRBBgN0yEcX6Uqh+VDuMnr33Mx+/8CGpjHY+QH61PoEbPXNLRQuqx32Ni skXlgCGGRM4YnkMUPCxxMufOXXtoyo8T6y5Ju64QPhy229cY5q16uyoJMj80qUbxhmV4 n9uvOJ9WYBKRHu2xa0uja5KPhlN5oyCQ1XPkCY0p77jqVR1UR2jHnGP+47d1+nUWRLjM QzWw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1736534495; x=1737139295; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; 
bh=DPVLK2dGaKuBzLzuN4/iPTz64M+48MZ0VyD4n5dtrVo=; b=U1oRBRlzct8QxIyquwNw13vNMUdhphjfRcny4x4abn16xOL1NeqGut6U0WGWOAvJPy cXNAXPOn6cROAK13rQgeWCtTCICvU85sU5JPEGpTtvK1ZdxAWEIPrCPJZFsa2w99FBN5 yYIUI0JOlSTgsrBKJICltf6d/K+naTbk0yUPvXqQWSkGZceAd5qvt7RQ8RrmgbO+YjM3 ZlP3U4aZnaGFkvG4e24tYYxghUwxrBkNsdugJX6+SKJ/1G7nYfd8nOWulqHLOrBRmnLq fIPUj5tPNmudeotlnYiqaWSj2NzthXnuXiSbpJweEydk8BH9EJ4FNEcT/6ChIhXFbf6c cb0g== X-Forwarded-Encrypted: i=1; AJvYcCUDP2y+OzUgODfVaO4abv6s5IxSJiF6SYXhTmo9kW8MdEpP2chlCXyYwuvGUeM/LWs3gkslYFqVzw==@kvack.org X-Gm-Message-State: AOJu0YyFOsP7q0bJARpzIZ8eWK+E11fCCJ066A9b0O11nMIWNQv3OxB1 YUYD+OqeLSsZ1dOlPtLxkF085BNACzkxrfRC9sMBFMyLC+umH6+vpEQLReyo6h7F6yplNktpMCT 4y7xAEfOdZQ== X-Google-Smtp-Source: AGHT+IGa4COP7j9J4GokCTwr7XdO0ZTV5zuQPMzgfz+JYaXuU8H1oEoG9KOP7c/HjvN6cC1FzOnvXoWDhc/niQ== X-Received: from wmbbd12.prod.google.com ([2002:a05:600c:1f0c:b0:434:fd41:173c]) (user=jackmanb job=prod-delivery.src-stubby-dispatcher) by 2002:a05:600c:25a:b0:431:547e:81d0 with SMTP id 5b1f17b1804b1-436ee0a061emr59243545e9.11.1736534495189; Fri, 10 Jan 2025 10:41:35 -0800 (PST) Date: Fri, 10 Jan 2025 18:40:48 +0000 In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> Mime-Version: 1.0 References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> X-Mailer: b4 0.15-dev Message-ID: <20250110-asi-rfc-v2-v2-22-8419288bc805@google.com> Subject: [PATCH RFC v2 22/29] mm: asi: exit ASI before accessing CR3 from C code where appropriate From: Brendan Jackman To: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Richard Henderson , Matt Turner , Vineet Gupta , Russell King , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Huacai Chen , WANG Xuerui , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Jonas Bonn , Stefan Kristiansson , Stafford Horne , "James E.J. 
Bottomley" , Helge Deller , Michael Ellerman , Nicholas Piggin , Christophe Leroy , Naveen N Rao , Madhavan Srinivasan , Paul Walmsley , Palmer Dabbelt , Albert Ou , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Yoshinori Sato , Rich Felker , John Paul Adrian Glaubitz , "David S. Miller" , Andreas Larsson , Richard Weinberger , Anton Ivanov , Johannes Berg , Chris Zankel , Max Filippov , Arnd Bergmann , Andrew Morton , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Uladzislau Rezki , Christoph Hellwig , Masami Hiramatsu , Mathieu Desnoyers , Mike Rapoport , Arnaldo Carvalho de Melo , Namhyung Kim , Mark Rutland , Alexander Shishkin , Jiri Olsa , Ian Rogers , Adrian Hunter , Dennis Zhou , Tejun Heo , Christoph Lameter , Sean Christopherson , Paolo Bonzini , Ard Biesheuvel , Josh Poimboeuf , Pawan Gupta Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, loongarch@lists.linux.dev, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-um@lists.infradead.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, kvm@vger.kernel.org, linux-efi@vger.kernel.org, Brendan Jackman , Yosry Ahmed X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: AB9AA160020 X-Stat-Signature: j19ea5fb1mes1af4be6thgxyzp7f4b5j X-Rspam-User: X-HE-Tag: 1736534496-122709 X-HE-Meta: 
U2FsdGVkX18/K1SG3jEy19/hVgVTnlHpjfmuwQ5BxC7ZxUFLp9Xdtr3j9FnaGRtwLo/OuAekpuIEWG8nVgGPSZ/13q0wMmTDXcdtuA+p+vrfi5UMfhyKngNnRu7kki6UVcJdJW5PGj2sEPXZ7JFdfgun9YeDEsbO0Y1fdHEaS3mAv6S6G/EvaiS61y0rGIhbui6Sa34sUdf879l1TAOAMDTjvBmXO8GUEIPBEbYv8evjo7VzzWEqvFNzOhVFcrHRQcK8nsvOHuRzQMZMZuK2J5oZx38OuN/cGQpV636+Xvib/TxI5uuIFn7WID5H/1eOAFmL6J66OrgWcCdjnlSRRKno696rVdvUbj47xhahnOMn5B2G7k7oomJ+tBJUh/LwCK5LRwhc+MShq8Tfxnipjr/M2jq8YHmtsDt+9XaWaWljHZ4Gabi10kXoSkSw4iWAiiTL1By1S6ZEpT9tVjvO3mB2QqpZBawQzji2OSxl6/0rWqCpwYbOkqOid6K5i0MQKj4iYOOrQcoT3A0ki3MnGCYkzciGlJTP85dF4bArKzXBl3eITBV0YSZzvs81fKnt0rhYcMk6Hz/vehuDYnONIUDbQtnRz1sMJ795gMy6ICqofhoUojPL0IpXJF0PvMXIUsTmM6ZndCOgr+wY19iNhPNBz+PNHf5l72xyZHwVMrNPtVWU0dfnOMdQXEnacxyC/IQ47grj5+3IBVtnaa3M6YyWFhJsJKhV0yLLrAKM+JECI6O3OaDUzBMu/HpWSFX+rK5/XmqvMCD/MrHbwkhOKQZyJuAMi+X70RPyXBWQwKMVkLRH65PsemlZr1+Pu2MbXLkp+gvCBlRIjDInZTOIdKsj7gzJYHROkiBfxFgjJmB0Tsu2T0yuzpwNJ+cdbFgRg28Y+GqaenmGX+HPMn3gHOLKHILTo0nEEM7qzCsqLLAKMzjlpyjfd0ZQqvLhd+bMrqHcp2IPm5ilgyIb1Hg 4tvxsqzL mn8lltWsDLigkke6aoD+w2y1bmXpx1J+zsyGt0Ypg2txc/ViZD5g4IicVv6HUoJsu9ZwpSIh/3+lr8/GlOEUa5XkwWfxrXqDbe9RLBFSVsHVGMJwh+RYEXyfLiLUAGS5ZCKB5p4OR7w22tuyQ6RvujlxKFYp6gAg5QmjYKJOCNdqUr1IOOrKyd1CSUAvzFqFdKG9X7oyAilYRUwDvkjVNWk6YnC7oF9RJN7Qo79PrjzFgql74gLgOScmqUmafOETae/JqQ9utKOASUuFnYiIap20t7IcwSxETMI/cqpzy0RlLyW+QkV0bfnbqgKZXWSiqP5ZpTSjIpiYwC2kBG52pImuKVZ9sz8KegGLoLOiIYGA68QcAgSmsS0s+bYzpqGKEzhQcON3AUXU2+AEzJFgC7ITsLhXjxlilbTC5Tyaa424tz9+twYyFZfcyX+x6hnqctu63F91a/DvQ8FjAgSnlIMR4jNNzLmGmdhBS+aq06KFZVVQv4Z3rwm/9FZ0TP4FJ4eFWrmj1j3yPw5kB3O51TaT6/vc6BFR9e+9picLe2Ex3L02lWh4yDrfykg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Because asi_exit()s can be triggered by NMIs, CR3 is unstable when in the ASI restricted address space. 
(Exception: code in the ASI critical section can treat it as stable,
because if an asi_exit() occurs during an exception it will be undone
before the critical section resumes.)

Code that accesses CR3 needs to become aware of this. Most importantly:
if code reads CR3 and then writes a derived value back, and a concurrent
asi_exit() occurred in between, the address space switch would be
undone, which would totally break ASI.

So, make sure that an asi_exit() is performed before accessing CR3.
Exceptions are made for cases that need to access the current CR3 value,
restricted or not, without exiting ASI.

(An alternative approach would be to enter an ASI critical section when
a stable CR3 is needed. This would be worth exploring if the ASI exits
introduced by this patch turned out to cause performance issues.)

Add calls to asi_exit() to __native_read_cr3() and native_write_cr3(),
and introduce 'raw' variants that do not perform an ASI exit. Introduce
similar variants for the wrappers: __read_cr3(), read_cr3_pa(), and
write_cr3(). Add a forward declaration of asi_exit(), because the one in
asm-generic/asi.h is only declared when !CONFIG_ADDRESS_SPACE_ISOLATION,
and arch/x86/asm/asi.h cannot be included either, as that would cause a
circular dependency.

The 'raw' variants are used in the following cases:

- In __show_regs(), where the actual values of registers are dumped for
  debugging.

- In dump_pagetable() and show_fault_oops(), where the active page
  tables during a page fault are dumped for debugging.

- In switch_mm_verify_cr3() and cr3_matches_current_mm(), where the
  actual value of CR3 is needed for a debug check, and the code
  explicitly checks for an ASI-restricted CR3.

- In exc_page_fault() for ASI faults. The code is ASI-aware and
  explicitly performs an ASI exit before reading CR3.

- In load_new_mm_cr3(), where a new CR3 is loaded during context
  switching. At this point, it is guaranteed that ASI already exited.
  Calling asi_exit() at that point, where loaded_mm ==
  LOADED_MM_SWITCHING, would cause the VM_BUG_ON() in asi_exit() to
  fire mistakenly, even though loaded_mm is never accessed.

- In __get_current_cr3_fast(), as it is called from an ASI critical
  section and the value is only used for debug checking. In
  nested_vmx_check_vmentry_hw(), do an explicit asi_exit() before
  calling __get_current_cr3_fast(), since in that case we are not in a
  critical section and do need a stable CR3 value.

- In __asi_enter() and asi_exit(), for obvious reasons.

- In vmx_set_constant_host_state(), when CR3 is initialized in the VMCS
  with the most likely value. Preemption is enabled, so once ASI
  supports context switching, exiting ASI there will not be reliable as
  rescheduling may cause re-entering ASI. It doesn't matter if the
  wrong value of CR3 is used in this context: before entering the
  guest, ASI is either explicitly entered or exited, and CR3 is updated
  again in the VMCS if needed.

- In efi_5level_switch(), as it is called from startup_64_mixed_mode()
  during boot, before ASI can be entered. startup_64_mixed_mode() is
  under arch/x86/boot/compressed/* and cannot call asi_exit() anyway
  (see below).

Finally, code in arch/x86/boot/compressed/ident_map_64.c and
arch/x86/boot/compressed/pgtable_64.c extensively accesses CR3 during
boot. This code under arch/x86/boot/compressed/* cannot call asi_exit()
due to restrictions on its compilation (it cannot use functions defined
in .c files outside that directory). Instead of changing all its CR3
accesses to use 'raw' variants, undefine
CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION in these files. This makes
the asi_exit() calls in the CR3 helpers use the noop variant defined in
include/asm-generic/asi.h. This is fine because the code is executed
early in boot, where asi_exit() would be a noop anyway.

With this change, the number of existing *_cr3() calls is 44, and the
number of *_cr3_raw() calls is 22.
The choice was made to make the existing functions exit ASI by default
and to add new variants that do not exit ASI, because most call sites
that use the new *_cr3_raw() variants are either ASI-aware code or
debugging code. In contrast, code that uses the existing variants is
usually in important code paths (e.g. TLB flushes) and is ignorant of
ASI. Hence, new code is most likely to be correct and less risky by
using the variants that exit ASI by default.

Signed-off-by: Yosry Ahmed
Signed-off-by: Brendan Jackman
---
 arch/x86/Kconfig                        |  2 +-
 arch/x86/boot/compressed/ident_map_64.c | 10 ++++++++
 arch/x86/boot/compressed/pgtable_64.c   | 11 +++++++++
 arch/x86/include/asm/processor.h        |  5 ++++
 arch/x86/include/asm/special_insns.h    | 41 +++++++++++++++++++++++++++++++--
 arch/x86/kernel/process_32.c            |  2 +-
 arch/x86/kernel/process_64.c            |  2 +-
 arch/x86/kvm/vmx/nested.c               |  6 +++++
 arch/x86/kvm/vmx/vmx.c                  |  8 ++++++-
 arch/x86/mm/asi.c                       |  4 ++--
 arch/x86/mm/fault.c                     |  8 +++----
 arch/x86/mm/tlb.c                       | 16 +++++++++----
 arch/x86/virt/svm/sev.c                 |  2 +-
 drivers/firmware/efi/libstub/x86-5lvl.c |  2 +-
 include/asm-generic/asi.h               |  1 +
 15 files changed, 101 insertions(+), 19 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 1fcb52cb8cd5084ac3cef04af61b7d1653162bdb..ae31f36ce23d7c29d1e90b726c5a2e6ea5a63c8d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2531,7 +2531,7 @@ config MITIGATION_ADDRESS_SPACE_ISOLATION
 	  The !PARAVIRT dependency is only because of lack of testing; in theory
 	  the code is written to work under paravirtualization. In practice
 	  there are likely to be unhandled cases, in particular concerning TLB
-	  flushes.
+	  flushes and CR3 manipulation.
 
 config ADDRESS_SPACE_ISOLATION_DEFAULT_ON
 
diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index dfb9c2deb77cfc4e9986976bf2fd1652666f8f15..957b6f818aec361191432b420b61ba6ae017cf6c 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -11,6 +11,16 @@
 /* No MITIGATION_PAGE_TABLE_ISOLATION support needed either: */
 #undef CONFIG_MITIGATION_PAGE_TABLE_ISOLATION
 
+/*
+ * CR3 access helpers (e.g. write_cr3()) will call asi_exit() to exit the
+ * restricted address space first. We cannot call the version defined in
+ * arch/x86/mm/asi.c here, so make sure we always call the noop version in
+ * asm-generic/asi.h. It does not matter because early during boot asi_exit()
+ * would be a noop anyway. The alternative is spamming the code with *_raw()
+ * variants of the CR3 helpers.
+ */
+#undef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
+
 #include "error.h"
 #include "misc.h"
 
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index c882e1f67af01c50a20bfe00a32138dc771ee88c..034ad7101126c19864cfacc7c363fd31fedecd2b 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -1,4 +1,15 @@
 // SPDX-License-Identifier: GPL-2.0
+
+/*
+ * CR3 access helpers (e.g. write_cr3()) will call asi_exit() to exit the
+ * restricted address space first. We cannot call the version defined in
+ * arch/x86/mm/asi.c here, so make sure we always call the noop version in
+ * asm-generic/asi.h. It does not matter because early during boot asi_exit()
+ * would be a noop anyway. The alternative is spamming the code with *_raw()
+ * variants of the CR3 helpers.
+ */
+#undef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
+
 #include "misc.h"
 #include
 #include
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a32a53405f45e4c0473fe081e216029cf5bd0cdd..9375a7f877d60e8f556dedefbe74593c1a5a6e10 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -226,6 +226,11 @@ static __always_inline unsigned long read_cr3_pa(void)
 	return __read_cr3() & CR3_ADDR_MASK;
 }
 
+static __always_inline unsigned long read_cr3_pa_raw(void)
+{
+	return __read_cr3_raw() & CR3_ADDR_MASK;
+}
+
 static inline unsigned long native_read_cr3_pa(void)
 {
 	return __native_read_cr3() & CR3_ADDR_MASK;
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 6e103358966f6f1333aa07be97aec5f8af794120..1c886b3f04a56893b7408466a2c94d23f5d11857 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -5,6 +5,7 @@
 #ifdef __KERNEL__
 #include
 #include
+#include
 #include
 #include
 
@@ -42,18 +43,32 @@ static __always_inline void native_write_cr2(unsigned long val)
 	asm volatile("mov %0,%%cr2": : "r" (val) : "memory");
 }
 
-static __always_inline unsigned long __native_read_cr3(void)
+void asi_exit(void);
+
+static __always_inline unsigned long __native_read_cr3_raw(void)
 {
 	unsigned long val;
 	asm volatile("mov %%cr3,%0\n\t" : "=r" (val) : __FORCE_ORDER);
 	return val;
 }
 
-static __always_inline void native_write_cr3(unsigned long val)
+static __always_inline unsigned long __native_read_cr3(void)
+{
+	asi_exit();
+	return __native_read_cr3_raw();
+}
+
+static __always_inline void native_write_cr3_raw(unsigned long val)
 {
 	asm volatile("mov %0,%%cr3": : "r" (val) : "memory");
 }
 
+static __always_inline void native_write_cr3(unsigned long val)
+{
+	asi_exit();
+	native_write_cr3_raw(val);
+}
+
 static inline unsigned long native_read_cr4(void)
 {
 	unsigned long val;
@@ -152,17 +167,39 @@ static __always_inline void write_cr2(unsigned long x)
 
 /*
  * Careful! CR3 contains more than just an address. You probably want
  * read_cr3_pa() instead.
+ *
+ * The implementation interacts with ASI to ensure that the returned value is
+ * stable as long as preemption is disabled.
  */
 static __always_inline unsigned long __read_cr3(void)
 {
 	return __native_read_cr3();
 }
 
+/*
+ * The return value of this is unstable under ASI, even with preemption off.
+ * Call __read_cr3 instead unless you have a good reason not to.
+ */
+static __always_inline unsigned long __read_cr3_raw(void)
+{
+	return __native_read_cr3_raw();
+}
+
+/* This interacts with ASI like __read_cr3. */
 static __always_inline void write_cr3(unsigned long x)
 {
 	native_write_cr3(x);
 }
 
+/*
+ * Like __read_cr3_raw, this doesn't interact with ASI. It's very unlikely that
+ * this should be called from outside ASI-specific code.
+ */
+static __always_inline void write_cr3_raw(unsigned long x)
+{
+	native_write_cr3_raw(x);
+}
+
 static inline void __write_cr4(unsigned long x)
 {
 	native_write_cr4(x);
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 0917c7f25720be91372bacddb1a3032323b8996f..14828a867b713a50297953c5a0ccfd03d83debc0 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -79,7 +79,7 @@ void __show_regs(struct pt_regs *regs, enum show_regs_mode mode,
 
 	cr0 = read_cr0();
 	cr2 = read_cr2();
-	cr3 = __read_cr3();
+	cr3 = __read_cr3_raw();
 	cr4 = __read_cr4();
 	printk("%sCR0: %08lx CR2: %08lx CR3: %08lx CR4: %08lx\n",
 	       log_lvl, cr0, cr2, cr3, cr4);
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 226472332a70dd02902f81c21207d770e698aeed..ca135731b54b7f5f1123c2b8b27afdca7b868bcc 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -113,7 +113,7 @@ void __show_regs(struct pt_regs *regs, enum show_regs_mode mode,
 
 	cr0 = read_cr0();
 	cr2 = read_cr2();
-	cr3 = __read_cr3();
+	cr3 = __read_cr3_raw();
 	cr4 = __read_cr4();
 
 	printk("%sFS: %016lx(%04x) GS:%016lx(%04x) knlGS:%016lx\n",
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 931a7361c30f2da28073eb832efce0b378e3b325..7eb033dabe4a606947c4d7e5b96be2e42d8f2478 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3214,6 +3214,12 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
 	 */
 	vmcs_writel(GUEST_RFLAGS, 0);
 
+	/*
+	 * Stabilize CR3 to ensure the VM Exit returns to the correct address
+	 * space. This is costly, we wouldn't do this in hot-path code.
+	 */
+	asi_exit();
+
 	cr3 = __get_current_cr3_fast();
 	if (unlikely(cr3 != vmx->loaded_vmcs->host_state.cr3)) {
 		vmcs_writel(HOST_CR3, cr3);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 181d230b1c057fed33f7b29b7b0e378dbdfeb174..0e90463f1f2183b8d716f85d5c8a8af8958fef0b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4323,8 +4323,14 @@ void vmx_set_constant_host_state(struct vcpu_vmx *vmx)
 	/*
 	 * Save the most likely value for this task's CR3 in the VMCS.
 	 * We can't use __get_current_cr3_fast() because we're not atomic.
+	 *
+	 * Use __read_cr3_raw() to avoid exiting ASI if we are in the restricted
+	 * address space. Preemption is enabled, so rescheduling could make us
+	 * re-enter ASI anyway. It's okay to avoid exiting ASI here because
+	 * vmx_vcpu_enter_exit() and nested_vmx_check_vmentry_hw() will
+	 * explicitly enter or exit ASI and update CR3 in the VMCS if needed.
 	 */
-	cr3 = __read_cr3();
+	cr3 = __read_cr3_raw();
 	vmcs_writel(HOST_CR3, cr3);		/* 22.2.3  FIXME: shadow tables */
 	vmx->loaded_vmcs->host_state.cr3 = cr3;
 
diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c
index bc2cf0475a0e7344a66d81453f55034b2fc77eef..a9f9bfbf85eb47d16ef8d0bfbc7713f07052d3ed 100644
--- a/arch/x86/mm/asi.c
+++ b/arch/x86/mm/asi.c
@@ -488,7 +488,7 @@ noinstr void __asi_enter(void)
 	pcid = asi_pcid(target, this_cpu_read(cpu_tlbstate.loaded_mm_asid));
 	asi_cr3 = build_cr3_pcid_noinstr(target->pgd, pcid,
 					 tlbstate_lam_cr3_mask(), false);
-	write_cr3(asi_cr3);
+	write_cr3_raw(asi_cr3);
 	maybe_flush_data(target);
 
 	/*
@@ -559,7 +559,7 @@ noinstr void asi_exit(void)
 
 		/* Tainting first makes reentrancy easier to reason about. */
 		this_cpu_or(asi_taints, ASI_TAINT_KERNEL_DATA);
-		write_cr3(unrestricted_cr3);
+		write_cr3_raw(unrestricted_cr3);
 
 		/*
 		 * Must not update curr_asi until after CR3 write, otherwise a
 		 * re-entrant call might not enter this branch. (This means we
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index ee8f5417174e2956391d538f41e2475553ca4972..ca48e4f5a27be30ff93d1c3d194aad23d99ae43c 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -295,7 +295,7 @@ static bool low_pfn(unsigned long pfn)
 
 static void dump_pagetable(unsigned long address)
 {
-	pgd_t *base = __va(read_cr3_pa());
+	pgd_t *base = __va(read_cr3_pa_raw());
 	pgd_t *pgd = &base[pgd_index(address)];
 	p4d_t *p4d;
 	pud_t *pud;
@@ -351,7 +351,7 @@ static int bad_address(void *p)
 
 static void dump_pagetable(unsigned long address)
 {
-	pgd_t *base = __va(read_cr3_pa());
+	pgd_t *base = __va(read_cr3_pa_raw());
 	pgd_t *pgd = base + pgd_index(address);
 	p4d_t *p4d;
 	pud_t *pud;
@@ -519,7 +519,7 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad
 	pgd_t *pgd;
 	pte_t *pte;
 
-	pgd = __va(read_cr3_pa());
+	pgd = __va(read_cr3_pa_raw());
 	pgd += pgd_index(address);
 
 	pte = lookup_address_in_pgd_attr(pgd, address, &level, &nx, &rw);
@@ -1578,7 +1578,7 @@ DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault)
 		 * be losing some stats here. However for now this keeps ASI
 		 * page faults nice and fast.
 		 */
-		pgd = (pgd_t *)__va(read_cr3_pa()) + pgd_index(address);
+		pgd = (pgd_t *)__va(read_cr3_pa_raw()) + pgd_index(address);
 		if (!user_mode(regs) && kernel_access_ok(error_code, address, pgd)) {
 			warn_if_bad_asi_pf(error_code, address);
 			return;
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 07b1657bee8e4cf17452ea57c838823e76f482c0..0c9f477a44a4da971cb7744d01d9101900ead1a5 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -331,8 +331,14 @@ static void load_new_mm_cr3(pgd_t *pgdir, u16 new_asid, unsigned long lam,
 	 * Caution: many callers of this function expect
 	 * that load_cr3() is serializing and orders TLB
 	 * fills with respect to the mm_cpumask writes.
+	 *
+	 * The context switching code will explicitly exit ASI when needed, do
+	 * not use write_cr3() as it has an implicit ASI exit. Calling
+	 * asi_exit() here, where loaded_mm == LOADED_MM_SWITCHING, will cause
+	 * the VM_BUG_ON() in asi_exit() to fire mistakenly even though
+	 * loaded_mm is never accessed.
 	 */
-	write_cr3(new_mm_cr3);
+	write_cr3_raw(new_mm_cr3);
 }
 
 void leave_mm(void)
@@ -559,11 +565,11 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 	 * without going through leave_mm() / switch_mm_irqs_off() or that
 	 * does something like write_cr3(read_cr3_pa()).
 	 *
-	 * Only do this check if CONFIG_DEBUG_VM=y because __read_cr3()
+	 * Only do this check if CONFIG_DEBUG_VM=y because __read_cr3_raw()
 	 * isn't free.
 	 */
 #ifdef CONFIG_DEBUG_VM
-	if (WARN_ON_ONCE(__read_cr3() != build_cr3(prev->pgd, prev_asid,
+	if (WARN_ON_ONCE(__read_cr3_raw() != build_cr3(prev->pgd, prev_asid,
 						   tlbstate_lam_cr3_mask()))) {
 		/*
 		 * If we were to BUG here, we'd be very likely to kill
@@ -1173,7 +1179,7 @@ noinstr unsigned long __get_current_cr3_fast(void)
 	 */
 	VM_WARN_ON_ONCE(asi && asi_in_critical_section());
 
-	VM_BUG_ON(cr3 != __read_cr3());
+	VM_BUG_ON(cr3 != __read_cr3_raw());
 	return cr3;
 }
 EXPORT_SYMBOL_GPL(__get_current_cr3_fast);
@@ -1373,7 +1379,7 @@ static inline bool cr3_matches_current_mm(void)
 	 * find a current ASI domain.
 	 */
 	barrier();
-	pgd_cr3 = __va(read_cr3_pa());
+	pgd_cr3 = __va(read_cr3_pa_raw());
 	return pgd_cr3 == current->mm->pgd || pgd_cr3 == pgd_asi;
 }
 
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 9a6a943d8e410c0289200adb9deafe8e45d33a4b..63d391395a5c7f4ddec28116814ccd6c52bbb428 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -379,7 +379,7 @@ void snp_dump_hva_rmpentry(unsigned long hva)
 	pgd_t *pgd;
 	pte_t *pte;
 
-	pgd = __va(read_cr3_pa());
+	pgd = __va(read_cr3_pa_raw());
 	pgd += pgd_index(hva);
 	pte = lookup_address_in_pgd(pgd, hva, &level);
diff --git a/drivers/firmware/efi/libstub/x86-5lvl.c b/drivers/firmware/efi/libstub/x86-5lvl.c
index 77359e802181fd82b6a624cf74183e6a318cec9b..3b97a5aea983a109fbdc6d23a219e4a04024c512 100644
--- a/drivers/firmware/efi/libstub/x86-5lvl.c
+++ b/drivers/firmware/efi/libstub/x86-5lvl.c
@@ -66,7 +66,7 @@ void efi_5level_switch(void)
 	bool have_la57 = native_read_cr4() & X86_CR4_LA57;
 	bool need_toggle = want_la57 ^ have_la57;
 	u64 *pgt = (void *)la57_toggle + PAGE_SIZE;
-	u64 *cr3 = (u64 *)__native_read_cr3();
+	u64 *cr3 = (u64 *)__native_read_cr3_raw();
 	u64 *new_cr3;
 
 	if (!la57_toggle || !need_toggle)
diff --git a/include/asm-generic/asi.h b/include/asm-generic/asi.h
index 7867b8c23449058a1dd06308ab5351e0d210a489..4f033d3ef5929707fd280f74fc800193e45143c1 100644
--- a/include/asm-generic/asi.h
+++ b/include/asm-generic/asi.h
@@ -71,6 +71,7 @@ static inline pgd_t *asi_pgd(struct asi *asi) { return NULL; }
 
 static inline void asi_handle_switch_mm(void) { }
 
+struct thread_struct;
 static inline void asi_init_thread_state(struct thread_struct *thread) { }
 
 static inline void asi_intr_enter(void) { }

From patchwork Fri Jan 10 18:40:49 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935252
34WmBZwgKCPwnegoqerfksskpi.gsqpmry1-qqozego.svk@flex--jackmanb.bounces.google.com designates 209.85.128.73 as permitted sender) smtp.mailfrom=34WmBZwgKCPwnegoqerfksskpi.gsqpmry1-qqozego.svk@flex--jackmanb.bounces.google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1736534499; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=aGIlyFkOxdGIUAlLSc7B7HeTqLb/6yPSKg6NF5x4oeg=; b=GVD/8iuj2kRKBCtjz57fQeAWl8kjH84fIU7SDur9XkPFZo22uZL0P0pzuHmbVjlHmDdo/W aM46oGvoboiOA26POBXbydIfuRNEh+NuvnZDMJbGYWVcHopei3DA+2lc+c6Zux/hlFk3e8 Lv+cYgHn0M7pmbuQ98hpvokpBzcOEiY= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1736534499; a=rsa-sha256; cv=none; b=BKzsiTan3PVjb1WzSJXNjJEzeg+dkKryYiw3mNl/A/yeV/fLXDcbVo2kgih2NK4R6dOjLk /aAxo3LYiLeUoS0SHQQCXnl2RCiyqY8mYUshKXOzBCbS5sIEG4IW2Vr3zKPrSfoSqhieTv X/S/C8qgAmVgYbi7tgJNJySLYfsA5N0= ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=daTiQU1+; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf20.hostedemail.com: domain of 34WmBZwgKCPwnegoqerfksskpi.gsqpmry1-qqozego.svk@flex--jackmanb.bounces.google.com designates 209.85.128.73 as permitted sender) smtp.mailfrom=34WmBZwgKCPwnegoqerfksskpi.gsqpmry1-qqozego.svk@flex--jackmanb.bounces.google.com Received: by mail-wm1-f73.google.com with SMTP id 5b1f17b1804b1-4362552ce62so12186455e9.0 for ; Fri, 10 Jan 2025 10:41:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1736534497; x=1737139297; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=aGIlyFkOxdGIUAlLSc7B7HeTqLb/6yPSKg6NF5x4oeg=; b=daTiQU1+sbAuxvfXnajxYNBJnZpOPKhjSeu18bbcgGcUTF3tAT/rZ7qYG2pjwkaJhZ 
GBGjkHL+TvIvgB/j5gatwRqd4AiVNZq4VYIEGtFlSeEwj0PHVUV/6lVtJ8UsAtTKYCpI fmlaJz4KW/lOyNsZFLEMqzytcCy5Q72dOaPJXptbvJpstByIQdQK4x/T3zVlbwRUUJN9 NgPNoV2dxmom752SLoj8ijECZ/LouKIjG6AKp4K4ygR6HJWxYcOoRWeZJZj/ontQeYUW D7nhttUEKri/WNYBLt6mGCwb59cYtdiJCzpGPp1nLuEDEfkewcmnjjtUgMk3JIVCQVkd 5nFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1736534497; x=1737139297; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=aGIlyFkOxdGIUAlLSc7B7HeTqLb/6yPSKg6NF5x4oeg=; b=KQSIkGvBPhJPWcj4Qv3ooei6WcXtk1rpi8xKTEGnzoZ7BAuAbtZfaHkmZ6p4CZfVet ihG+yNpHDItXpZkIMseb/tfkr8jZW6Nwe1H9nmLIWWPAIURO0/ytHCcDilRROCUZKMDC 1WJeQwe0L+BBweZjWM+jodfUAPOEn14Ii0XPzjm4XMHFj0fBjdYI8riawHhsYdZ4Aexc 4O1mhDBTm54Lf7EKzGob83nj/Lv6VhaEfvZvRZO3Ah2/uyNRdM3AhwctmCnedxpx682W MYSsc1xjt0JievJxeFplORZUcTCV0at8j0QtZgWbZeUiXaJRbgWmGtLAb13lXU3Ugf3G ng8Q== X-Forwarded-Encrypted: i=1; AJvYcCWbHx6jrZCI4CffHpTcEj9kL8V2OPdQWuPAGc91HXH79zdpMRtJoIlug/mCCvnyNX7LNJWWvQoleA==@kvack.org X-Gm-Message-State: AOJu0YwTCRU7XNMZaklNiWNXDkURT25HKh++SpVNXLFvbtnM3VgNeTq6 x6eFSJNeHqD8Sy5d+W4Ft+ahA77MIXiaGY0XpkS4N7hLwlAAm3yOAHm/V2UgMmVqCKe30/u4w8Z fQTTUGwLISA== X-Google-Smtp-Source: AGHT+IFRy9/+C5CnBm/ma6a3ETfEwH3prKyhyhclEMfLJAYKbacCY4ZfacRZW3At4TIIMxUaqY38zEy/zVSAnQ== X-Received: from wmdn10.prod.google.com ([2002:a05:600c:294a:b0:436:d819:e4eb]) (user=jackmanb job=prod-delivery.src-stubby-dispatcher) by 2002:a05:600c:5848:b0:436:f3f6:9582 with SMTP id 5b1f17b1804b1-436f3f695dfmr6272215e9.8.1736534497408; Fri, 10 Jan 2025 10:41:37 -0800 (PST) Date: Fri, 10 Jan 2025 18:40:49 +0000 In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> Mime-Version: 1.0 References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> X-Mailer: b4 0.15-dev Message-ID: <20250110-asi-rfc-v2-v2-23-8419288bc805@google.com> Subject: [PATCH RFC v2 23/29] mm: asi: exit ASI before suspend-like operations From: Brendan Jackman To: Thomas 
Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , Richard Henderson , Matt Turner , Vineet Gupta , Russell King , Catalin Marinas , Will Deacon , Guo Ren , Brian Cain , Huacai Chen , WANG Xuerui , Geert Uytterhoeven , Michal Simek , Thomas Bogendoerfer , Dinh Nguyen , Jonas Bonn , Stefan Kristiansson , Stafford Horne , "James E.J. Bottomley" , Helge Deller , Michael Ellerman , Nicholas Piggin , Christophe Leroy , Naveen N Rao , Madhavan Srinivasan , Paul Walmsley , Palmer Dabbelt , Albert Ou , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Yoshinori Sato , Rich Felker , John Paul Adrian Glaubitz , "David S. Miller" , Andreas Larsson , Richard Weinberger , Anton Ivanov , Johannes Berg , Chris Zankel , Max Filippov , Arnd Bergmann , Andrew Morton , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Uladzislau Rezki , Christoph Hellwig , Masami Hiramatsu , Mathieu Desnoyers , Mike Rapoport , Arnaldo Carvalho de Melo , Namhyung Kim , Mark Rutland , Alexander Shishkin , Jiri Olsa , Ian Rogers , Adrian Hunter , Dennis Zhou , Tejun Heo , Christoph Lameter , Sean Christopherson , Paolo Bonzini , Ard Biesheuvel , Josh Poimboeuf , Pawan Gupta Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org, loongarch@lists.linux.dev, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-um@lists.infradead.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, kvm@vger.kernel.org, 
linux-efi@vger.kernel.org, Brendan Jackman , Yosry Ahmed From: Yosry
Ahmed During suspend-like operations (suspend, hibernate, kexec w/ preserve_context), the processor state (including CR3) is usually saved and restored later. In the kexec case, this only happens when KEXEC_PRESERVE_CONTEXT is used to jump back to the original kernel. In relocate_kernel(), some registers including CR3 are stored in VA_CONTROL_PAGE. If preserve_context is set (passed into relocate_kernel() in RCX), after running the new kernel the code under 'virtual_mapped' restores these registers. This is similar to what happens in suspend and hibernate. Note that even when KEXEC_PRESERVE_CONTEXT is not set, relocate_kernel() still accesses CR3. It mainly reads and writes it to flush the TLB. This could be problematic and cause improper ASI enters (see below), but it is assumed to be safe because the kernel will essentially reboot in this case anyway. Saving and restoring CR3 in this fashion can cause a problem if the suspend/hibernate/kexec is performed within an ASI domain. A restricted CR3 will be saved, and later restored after ASI has potentially already exited (e.g. from an NMI after CR3 is stored). This will cause an _improper_ ASI enter, where code starts executing in a restricted address space, yet ASI metadata (especially curr_asi) says otherwise. Exit ASI early in all these paths by registering a syscore_suspend() callback. syscore_suspend() is called in all the above paths (for kexec, only with KEXEC_PRESERVE_CONTEXT) after IRQs are finally disabled before the operation. This is not currently strictly required, but it is convenient because when ASI gains the ability to persist across context switching, there will be additional synchronization requirements that this simplifies. Note: If the CR3 accesses in relocate_kernel() when KEXEC_PRESERVE_CONTEXT is not set are concerning, they could be handled by registering a syscore_shutdown() callback to exit ASI.
syscore_shutdown() is called in the kexec path where KEXEC_PRESERVE_CONTEXT is not set since commit 7bb943806ff6 ("kexec: do syscore_shutdown() in kernel_kexec"). Signed-off-by: Yosry Ahmed Signed-off-by: Brendan Jackman --- arch/x86/mm/asi.c | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index a9f9bfbf85eb47d16ef8d0bfbc7713f07052d3ed..c5073af1a82ded1c6fc467cd7a5d29a39d676bb4 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -6,6 +6,7 @@ #include #include +#include #include #include @@ -243,6 +244,32 @@ static int asi_map_percpu(struct asi *asi, void *percpu_addr, size_t len) return 0; } +#ifdef CONFIG_PM_SLEEP +static int asi_suspend(void) +{ + /* + * Must be called after IRQs are disabled and rescheduling is no longer + * possible (so that we cannot re-enter ASI before suspending). + */ + lockdep_assert_irqs_disabled(); + + /* + * Suspend operations sometimes save CR3 as part of the saved state, + * which is restored later (e.g. do_suspend_lowlevel() in the suspend + * path, swsusp_arch_suspend() in the hibernate path, relocate_kernel() + * in the kexec path). Saving a restricted CR3 and restoring it later + * could lead to improperly entering ASI. Exit ASI before such + * operations.
+ */ + asi_exit(); + return 0; +} + +static struct syscore_ops asi_syscore_ops = { + .suspend = asi_suspend, +}; +#endif /* CONFIG_PM_SLEEP */ + static int __init asi_global_init(void) { int err; @@ -306,6 +333,10 @@ static int __init asi_global_init(void) asi_clone_pgd(asi_global_nonsensitive_pgd, init_mm.pgd, VMEMMAP_START + (1UL << PGDIR_SHIFT)); +#ifdef CONFIG_PM_SLEEP + register_syscore_ops(&asi_syscore_ops); +#endif + return 0; } subsys_initcall(asi_global_init);
From patchwork Fri Jan 10 18:40:50 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brendan Jackman X-Patchwork-Id: 13935253
Date: Fri, 10 Jan 2025 18:40:50 +0000 In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> Mime-Version: 1.0 References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> X-Mailer: b4 0.15-dev
Message-ID: <20250110-asi-rfc-v2-v2-24-8419288bc805@google.com> Subject: [PATCH RFC v2 24/29] mm: asi: Add infrastructure for mapping userspace addresses From: Brendan Jackman Cc: Junaid Shahid , Reiji Watanabe
In preparation for sandboxing bare-metal processes, teach ASI to map userspace addresses into the restricted address space. Add a new policy helper that determines, based on the class, whether to do this. If the helper returns true, mirror userspace mappings into the ASI pagetables. Later, it will be possible for users who do not have a significant security boundary between KVM guests and their VMM process to take advantage of this to reduce mitigation costs when switching between those two domains. To illustrate this idea, it's now reflected in the KVM taint policy, although the KVM class is still hard-coded not to map userspace addresses. Co-developed-by: Junaid Shahid Signed-off-by: Junaid Shahid Co-developed-by: Reiji Watanabe Signed-off-by: Reiji Watanabe Signed-off-by: Brendan Jackman --- arch/x86/include/asm/asi.h | 11 +++++ arch/x86/include/asm/pgalloc.h | 6 +++ arch/x86/include/asm/pgtable_64.h | 4 ++ arch/x86/kvm/x86.c | 12 +++-- arch/x86/mm/asi.c | 92 +++++++++++++++++++++++++++++++++++++++ include/asm-generic/asi.h | 4 ++ 6 files changed, 125 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index 555edb5f292e4d6baba782f51d014aa48dc850b6..e925d7d2cfc85bca8480c837548654e7a5a7009e 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -133,6 +133,7 @@ struct asi { struct mm_struct *mm; int64_t ref_count; enum asi_class_id class_id; + spinlock_t pgd_lock; }; DECLARE_PER_CPU_ALIGNED(struct asi *, curr_asi); @@ -147,6 +148,7 @@ const char *asi_class_name(enum asi_class_id class_id); int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi **out_asi); void asi_destroy(struct asi *asi); +void asi_clone_user_pgtbl(struct mm_struct *mm, pgd_t *pgdp); /* Enter an ASI domain (restricted
address space) and begin the critical section. */ void asi_enter(struct asi *asi); @@ -286,6 +288,15 @@ static __always_inline bool asi_in_critical_section(void) void asi_handle_switch_mm(void); +/* + * This function returns true when we would like to map userspace addresses + * in the restricted address space. + */ +static inline bool asi_maps_user_addr(enum asi_class_id class_id) +{ + return false; +} + #endif /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */ #endif diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h index dcd836b59bebd329c3d265b98e48ef6eb4c9e6fc..edf9fe76c53369eefcd5bf14a09cbf802cf1ea21 100644 --- a/arch/x86/include/asm/pgalloc.h +++ b/arch/x86/include/asm/pgalloc.h @@ -114,12 +114,16 @@ static inline void p4d_populate(struct mm_struct *mm, p4d_t *p4d, pud_t *pud) { paravirt_alloc_pud(mm, __pa(pud) >> PAGE_SHIFT); set_p4d(p4d, __p4d(_PAGE_TABLE | __pa(pud))); + if (!pgtable_l5_enabled()) + asi_clone_user_pgtbl(mm, (pgd_t *)p4d); } static inline void p4d_populate_safe(struct mm_struct *mm, p4d_t *p4d, pud_t *pud) { paravirt_alloc_pud(mm, __pa(pud) >> PAGE_SHIFT); set_p4d_safe(p4d, __p4d(_PAGE_TABLE | __pa(pud))); + if (!pgtable_l5_enabled()) + asi_clone_user_pgtbl(mm, (pgd_t *)p4d); } extern void ___pud_free_tlb(struct mmu_gather *tlb, pud_t *pud); @@ -137,6 +141,7 @@ static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4d) return; paravirt_alloc_p4d(mm, __pa(p4d) >> PAGE_SHIFT); set_pgd(pgd, __pgd(_PAGE_TABLE | __pa(p4d))); + asi_clone_user_pgtbl(mm, pgd); } static inline void pgd_populate_safe(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4d) @@ -145,6 +150,7 @@ static inline void pgd_populate_safe(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4 return; paravirt_alloc_p4d(mm, __pa(p4d) >> PAGE_SHIFT); set_pgd_safe(pgd, __pgd(_PAGE_TABLE | __pa(p4d))); + asi_clone_user_pgtbl(mm, pgd); } static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr) diff --git a/arch/x86/include/asm/pgtable_64.h 
b/arch/x86/include/asm/pgtable_64.h index d1426b64c1b9715cd9e4d1d7451ae4feadd8b2f5..fe6d83ec632a6894527784f2ebdbd013161c6f09 100644 --- a/arch/x86/include/asm/pgtable_64.h +++ b/arch/x86/include/asm/pgtable_64.h @@ -157,6 +157,8 @@ static inline void native_set_p4d(p4d_t *p4dp, p4d_t p4d) static inline void native_p4d_clear(p4d_t *p4d) { native_set_p4d(p4d, native_make_p4d(0)); + if (!pgtable_l5_enabled()) + asi_clone_user_pgtbl(NULL, (pgd_t *)p4d); } static inline void native_set_pgd(pgd_t *pgdp, pgd_t pgd) @@ -167,6 +169,8 @@ static inline void native_set_pgd(pgd_t *pgdp, pgd_t pgd) static inline void native_pgd_clear(pgd_t *pgd) { native_set_pgd(pgd, native_make_pgd(0)); + if (pgtable_l5_enabled()) + asi_clone_user_pgtbl(NULL, pgd); } /* diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 3e0811eb510650abc601e4adce1ce4189835a730..920475fe014f6503dd88c7bbdb6b2707c084a689 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9712,11 +9712,15 @@ static inline int kvm_x86_init_asi_class(void) /* * And the same for data left behind by code in the userspace domain * (i.e. the VMM itself, plus kernel code serving its syscalls etc). - * This should eventually be configurable: users whose VMMs contain - * no secrets can disable it to avoid paying a mitigation cost on - * transition between their guest and userspace. + * + * + * If we decided to map userspace into the guest's restricted address + * space then we don't bother with this since we assume either no bugs + * allow the guest to leak that data, or the user doesn't care about + * that security boundary. 
*/ - policy.protect_data |= ASI_TAINT_USER_DATA; + if (!asi_maps_user_addr(ASI_CLASS_KVM)) + policy.protect_data |= ASI_TAINT_USER_DATA; return asi_init_class(ASI_CLASS_KVM, &policy); } diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index c5073af1a82ded1c6fc467cd7a5d29a39d676bb4..093103c1bc2677c81d68008aca064fab53b73a62 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -14,6 +14,7 @@ #include #include #include +#include #include "mm_internal.h" #include "../../../mm/internal.h" @@ -351,6 +352,33 @@ static void __asi_destroy(struct asi *asi) memset(asi, 0, sizeof(struct asi)); } +static void __asi_init_user_pgds(struct mm_struct *mm, struct asi *asi) +{ + int i; + + if (!asi_maps_user_addr(asi->class_id)) + return; + + /* + * The code below must be executed only after the given asi is + * available in mm->asi[index] to ensure at least either this + * function or asi_clone_user_pgtbl() will copy entries in the + * unrestricted pgd to the restricted pgd. + */ + if (WARN_ON_ONCE(&mm->asi[asi->class_id] != asi)) + return; + + /* + * See the comment for asi_clone_user_pgtbl() for why we hold the lock here.
+ */ + spin_lock(&asi->pgd_lock); + + for (i = 0; i < KERNEL_PGD_BOUNDARY; i++) + set_pgd(asi->pgd + i, READ_ONCE(*(mm->pgd + i))); + + spin_unlock(&asi->pgd_lock); +} + int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi **out_asi) { struct asi *asi; @@ -388,6 +416,7 @@ int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi **out_ asi->mm = mm; asi->class_id = class_id; + spin_lock_init(&asi->pgd_lock); for (i = KERNEL_PGD_BOUNDARY; i < PTRS_PER_PGD; i++) set_pgd(asi->pgd + i, asi_global_nonsensitive_pgd[i]); @@ -398,6 +427,7 @@ int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi **out_ else *out_asi = asi; + __asi_init_user_pgds(mm, asi); mutex_unlock(&mm->asi_init_lock); return err; @@ -891,3 +921,65 @@ void asi_unmap(struct asi *asi, void *addr, size_t len) asi_flush_tlb_range(asi, addr, len); } + +/* + * This function copies the given unrestricted pgd entry for + * userspace addresses to the corresponding restricted pgd entries. + * This means the unrestricted pgd entry must already be updated + * before this function is called. + * We map the entire userspace address range into the restricted + * address spaces by copying unrestricted pgd entries to the + * restricted page tables, so that we don't need to maintain + * consistency of lower level PTEs between the unrestricted page + * table and the restricted page tables. + */ +void asi_clone_user_pgtbl(struct mm_struct *mm, pgd_t *pgdp) +{ + unsigned long pgd_idx; + struct asi *asi; + int i; + + if (!static_asi_enabled()) + return; + + /* We shouldn't need to take care of non-userspace mappings. */ + if (!pgdp_maps_userspace(pgdp)) + return; + + /* + * The mm will be NULL for p{4,g}d_clear(). We need to get + * the owner mm for this pgd in this case. The pgd page has + * a valid pt_mm only when SHARED_KERNEL_PMD == 0.
+ */ + BUILD_BUG_ON(SHARED_KERNEL_PMD); + if (!mm) { + mm = pgd_page_get_mm(virt_to_page(pgdp)); + if (WARN_ON_ONCE(!mm)) + return; + } + + /* + * Compute a PGD index of the given pgd entry. This will be the + * index of the ASI PGD entry to be updated. + */ + pgd_idx = pgdp - PTR_ALIGN_DOWN(pgdp, PAGE_SIZE); + + for (i = 0; i < ARRAY_SIZE(mm->asi); i++) { + asi = mm->asi + i; + + if (!asi_pgd(asi) || !asi_maps_user_addr(asi->class_id)) + continue; + + /* + * We need to synchronize concurrent callers of + * asi_clone_user_pgtbl() among themselves, as well as + * __asi_init_user_pgds(). The lock makes sure that reading + * the unrestricted pgd and updating the corresponding + * ASI pgd are not interleaved by concurrent calls. + * We cannot rely on mm->page_table_lock here because it + * is not always held when pgd/p4d_clear_bad() is called. + */ + spin_lock(&asi->pgd_lock); + set_pgd(asi_pgd(asi) + pgd_idx, READ_ONCE(*pgdp)); + spin_unlock(&asi->pgd_lock); + } +} diff --git a/include/asm-generic/asi.h b/include/asm-generic/asi.h index 4f033d3ef5929707fd280f74fc800193e45143c1..d103343292fad567dcd73e45e986fb3974e59898 100644 --- a/include/asm-generic/asi.h +++ b/include/asm-generic/asi.h @@ -95,6 +95,10 @@ void asi_flush_tlb_range(struct asi *asi, void *addr, size_t len) { } static inline void asi_check_boottime_disable(void) { } +static inline void asi_clone_user_pgtbl(struct mm_struct *mm, pgd_t *pgdp) { } + +static inline bool asi_maps_user_addr(enum asi_class_id class_id) { return false; } + #endif /* !CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */ #endif /* !_ASSEMBLY_ */ From patchwork Fri Jan 10 18:40:51 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brendan Jackman X-Patchwork-Id: 13935254
Date: Fri, 10 Jan 2025 18:40:51 +0000 In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> Mime-Version: 1.0 References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com> X-Mailer: b4 0.15-dev Message-ID: <20250110-asi-rfc-v2-v2-25-8419288bc805@google.com> Subject: [PATCH RFC v2 25/29] mm: asi: Restricted execution for bare-metal processes From: Brendan Jackman
Now userspace gets a restricted address space too. The critical section begins on exit to userspace and ends when it makes a system call. Other entries from userspace just interrupt the critical section via asi_intr_enter().
The reason system calls have to actually asi_relax() (i.e. fully terminate the
critical section instead of just interrupting it) is that a system call is the
one type of kernel entry that can lead to a transition into a _different_ ASI
domain, namely the KVM one: transitioning into a different domain while a
critical section exists (i.e. while asi_state.target is not NULL) is not
supported, even if the section has been paused by asi_intr_enter() (i.e. even
if asi_state.intr_nest_depth is nonzero) - there must be an asi_relax() between
any two asi_enter()s.

The restricted address space for bare-metal tasks naturally contains the entire
userspace address region, although the task's own memory is still missing from
the direct map.

This implementation creates new userspace-specific APIs for asi_init(),
asi_destroy() and asi_enter(), which seems a little ugly; maybe this suggests a
general rework of these APIs, given that the "generic" versions now only have
one caller each. For RFC code this seems good enough though.
Signed-off-by: Brendan Jackman
---
 arch/x86/include/asm/asi.h   |  8 ++++++--
 arch/x86/mm/asi.c            | 49 ++++++++++++++++++++++++++++++++++++++++----
 include/asm-generic/asi.h    |  9 +++++++-
 include/linux/entry-common.h | 11 ++++++++++
 init/main.c                  |  2 ++
 kernel/entry/common.c        |  1 +
 kernel/fork.c                |  4 +++-
 7 files changed, 76 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h
index e925d7d2cfc85bca8480c837548654e7a5a7009e..c3c1a57f0147ae9bd11d89c8bf7c8a4477728f51 100644
--- a/arch/x86/include/asm/asi.h
+++ b/arch/x86/include/asm/asi.h
@@ -140,19 +140,23 @@ DECLARE_PER_CPU_ALIGNED(struct asi *, curr_asi);

 void asi_check_boottime_disable(void);

-void asi_init_mm_state(struct mm_struct *mm);
+int asi_init_mm_state(struct mm_struct *mm);

 int asi_init_class(enum asi_class_id class_id, struct asi_taint_policy *taint_policy);
+void asi_init_userspace_class(void);
 void asi_uninit_class(enum asi_class_id class_id);
 const char *asi_class_name(enum asi_class_id class_id);

 int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi **out_asi);
 void asi_destroy(struct asi *asi);
+void asi_destroy_userspace(struct mm_struct *mm);
 void asi_clone_user_pgtbl(struct mm_struct *mm, pgd_t *pgdp);

 /* Enter an ASI domain (restricted address space) and begin the critical section. */
 void asi_enter(struct asi *asi);
+void asi_enter_userspace(void);
+
 /*
  * Leave the "tense" state if we are in it, i.e. end the critical section. We
  * will stay relaxed until the next asi_enter.
@@ -294,7 +298,7 @@ void asi_handle_switch_mm(void);
  */
 static inline bool asi_maps_user_addr(enum asi_class_id class_id)
 {
-	return false;
+	return class_id == ASI_CLASS_USERSPACE;
 }

 #endif /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */
diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c
index 093103c1bc2677c81d68008aca064fab53b73a62..1e9dc568e79e8686a4dbf47f765f2c2535d025ec 100644
--- a/arch/x86/mm/asi.c
+++ b/arch/x86/mm/asi.c
@@ -25,6 +25,7 @@ const char *asi_class_names[] = {
 #if IS_ENABLED(CONFIG_KVM)
 	[ASI_CLASS_KVM] = "KVM",
 #endif
+	[ASI_CLASS_USERSPACE] = "userspace",
 };

 DEFINE_PER_CPU_ALIGNED(struct asi *, curr_asi);
@@ -67,6 +68,32 @@ int asi_init_class(enum asi_class_id class_id, struct asi_taint_policy *taint_po
 }
 EXPORT_SYMBOL_GPL(asi_init_class);

+void __init asi_init_userspace_class(void)
+{
+	static struct asi_taint_policy policy = {
+		/*
+		 * Prevent going to userspace with sensitive data potentially
+		 * left in sidechannels by code running in the unrestricted
+		 * address space, or another MM. Note we don't check for guest
+		 * data here. This reflects the assumption that the guest trusts
+		 * its VMM (absent fancy HW features, which are orthogonal).
+		 */
+		.protect_data = ASI_TAINT_KERNEL_DATA | ASI_TAINT_OTHER_MM_DATA,
+		/*
+		 * Don't go into userspace with control flow state controlled by
+		 * other processes, or any KVM guest the process is running.
+		 * Note this bit is about protecting userspace from other parts
+		 * of the system, while data_taints is about protecting other
+		 * parts of the system from the guest.
+		 */
+		.prevent_control = ASI_TAINT_GUEST_CONTROL | ASI_TAINT_OTHER_MM_CONTROL,
+		.set = ASI_TAINT_USER_CONTROL | ASI_TAINT_USER_DATA,
+	};
+	int err = asi_init_class(ASI_CLASS_USERSPACE, &policy);
+
+	WARN_ON(err);
+}
+
 void asi_uninit_class(enum asi_class_id class_id)
 {
 	if (!boot_cpu_has(X86_FEATURE_ASI))
@@ -385,7 +412,8 @@ int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi **out_
 	int err = 0;
 	uint i;

-	*out_asi = NULL;
+	if (out_asi)
+		*out_asi = NULL;

 	if (!boot_cpu_has(X86_FEATURE_ASI))
 		return 0;
@@ -424,7 +452,7 @@ int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi **out_
 exit_unlock:
 	if (err)
 		__asi_destroy(asi);
-	else
+	else if (out_asi)
 		*out_asi = asi;

 	__asi_init_user_pgds(mm, asi);
@@ -515,6 +543,12 @@ static __always_inline void maybe_flush_data(struct asi *next_asi)
 	this_cpu_and(asi_taints, ~ASI_TAINTS_DATA_MASK);
 }

+void asi_destroy_userspace(struct mm_struct *mm)
+{
+	VM_BUG_ON(!asi_class_initialized(ASI_CLASS_USERSPACE));
+	asi_destroy(&mm->asi[ASI_CLASS_USERSPACE]);
+}
+
 noinstr void __asi_enter(void)
 {
 	u64 asi_cr3;
@@ -584,6 +618,11 @@ noinstr void asi_enter(struct asi *asi)
 }
 EXPORT_SYMBOL_GPL(asi_enter);

+noinstr void asi_enter_userspace(void)
+{
+	asi_enter(&current->mm->asi[ASI_CLASS_USERSPACE]);
+}
+
 noinstr void asi_relax(void)
 {
 	if (static_asi_enabled()) {
@@ -633,13 +672,15 @@ noinstr void asi_exit(void)
 }
 EXPORT_SYMBOL_GPL(asi_exit);

-void asi_init_mm_state(struct mm_struct *mm)
+int asi_init_mm_state(struct mm_struct *mm)
 {
 	if (!boot_cpu_has(X86_FEATURE_ASI))
-		return;
+		return 0;

 	memset(mm->asi, 0, sizeof(mm->asi));
 	mutex_init(&mm->asi_init_lock);
+
+	return asi_init(mm, ASI_CLASS_USERSPACE, NULL);
 }

 void asi_handle_switch_mm(void)
diff --git a/include/asm-generic/asi.h b/include/asm-generic/asi.h
index d103343292fad567dcd73e45e986fb3974e59898..c93f9e779ce1fa61e3df7835f5ab744cce7d667b 100644
--- a/include/asm-generic/asi.h
+++ b/include/asm-generic/asi.h
@@ -15,6 +15,7 @@ enum asi_class_id {
 #if IS_ENABLED(CONFIG_KVM)
 	ASI_CLASS_KVM,
 #endif
+	ASI_CLASS_USERSPACE,
 	ASI_MAX_NUM_CLASSES,
 };
 static_assert(order_base_2(X86_CR3_ASI_PCID_BITS) <= ASI_MAX_NUM_CLASSES);
@@ -37,8 +38,10 @@ int asi_init_class(enum asi_class_id class_id,

 static inline void asi_uninit_class(enum asi_class_id class_id) { }

+static inline void asi_init_userspace_class(void) { }
+
 struct mm_struct;
-static inline void asi_init_mm_state(struct mm_struct *mm) { }
+static inline int asi_init_mm_state(struct mm_struct *mm) { return 0; }

 static inline int asi_init(struct mm_struct *mm, enum asi_class_id class_id,
 			   struct asi **out_asi)
@@ -48,8 +51,12 @@ static inline int asi_init(struct mm_struct *mm, enum asi_class_id class_id,

 static inline void asi_destroy(struct asi *asi) { }

+static inline void asi_destroy_userspace(struct mm_struct *mm) { }
+
 static inline void asi_enter(struct asi *asi) { }

+static inline void asi_enter_userspace(void) { }
+
 static inline void asi_relax(void) { }

 static inline bool asi_is_relaxed(void) { return true; }
diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index 1e50cdb83ae501467ecc30ee52f1379d409f962e..f04c4c038556f84ddf3bc09b6c1dd22a9dbd2f6b 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -191,6 +191,16 @@ static __always_inline long syscall_enter_from_user_mode(struct pt_regs *regs, l
 {
 	long ret;

+	/*
+	 * End the ASI critical section for userspace. Syscalls are the only
+	 * place this happens - all other entry from userspace is handled via
+	 * ASI's interrupt-tracking. The reason syscalls are special is that's
+	 * where it's possible to switch to another ASI domain within the same
+	 * task (i.e. KVM_RUN), an asi_relax() is required here in case of an
+	 * upcoming asi_enter().
+	 */
+	asi_relax();
+
 	enter_from_user_mode(regs);

 	instrumentation_begin();
@@ -355,6 +365,7 @@ static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs)
  */
 static __always_inline void exit_to_user_mode(void)
 {
+
 	instrumentation_begin();
 	trace_hardirqs_on_prepare();
 	lockdep_hardirqs_on_prepare();
diff --git a/init/main.c b/init/main.c
index c4778edae7972f512d5eefe8400075ac35a70d1c..d19e149d385e8321d2f3e7c28aa75802af62d09c 100644
--- a/init/main.c
+++ b/init/main.c
@@ -953,6 +953,8 @@ void start_kernel(void)
 	/* Architectural and non-timekeeping rng init, before allocator init */
 	random_init_early(command_line);

+	asi_init_userspace_class();
+
 	/*
 	 * These use large bootmem allocations and must precede
 	 * initalization of page allocator
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 5b6934e23c21d36a3238dc03e391eb9e3beb4cfb..874254ed5958d62eaeaef4fe3e8c02e56deaf5ed 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -218,6 +218,7 @@ __visible noinstr void syscall_exit_to_user_mode(struct pt_regs *regs)
 	__syscall_exit_to_user_mode_work(regs);
 	instrumentation_end();
 	exit_to_user_mode();
+	asi_enter_userspace();
 }

 noinstr void irqentry_enter_from_user_mode(struct pt_regs *regs)
diff --git a/kernel/fork.c b/kernel/fork.c
index bb73758790d08112265d398b16902ff9a4c2b8fe..54068d2415939b92409ca8a45111176783c6acbd 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -917,6 +917,7 @@ void __mmdrop(struct mm_struct *mm)
 	/* Ensure no CPUs are using this as their lazy tlb mm */
 	cleanup_lazy_tlbs(mm);

+	asi_destroy_userspace(mm);
 	WARN_ON_ONCE(mm == current->active_mm);
 	mm_free_pgd(mm);
 	destroy_context(mm);
@@ -1297,7 +1298,8 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	if (mm_alloc_pgd(mm))
 		goto fail_nopgd;

-	asi_init_mm_state(mm);
+	if (asi_init_mm_state(mm))
+		goto fail_nocontext;

 	if (init_new_context(p, mm))
 		goto fail_nocontext;

From patchwork Fri Jan 10 18:40:52 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935255
Date: Fri, 10 Jan 2025 18:40:52 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-26-8419288bc805@google.com>
Subject: [PATCH RFC v2 26/29] x86: Create library for flushing L1D for L1TF
From: Brendan Jackman

ASI will need to use this L1D flushing logic
so put it in a library where it can be used independently of KVM.

Since we're creating this library, it starts to look messy if we don't also use
it in the double-opt-in (both kernel cmdline and prctl) mm-switching flush
logic, which is there for mitigating Snoop-Assisted L1 Data Sampling
("SAL1DS"). However, that logic doesn't use any software-based fallback for
flushing on CPUs without the L1D_FLUSH command, and in that case the prctl
opt-in will fail.

One option would be to just start using the software fallback sequence
currently done by VMX code, but Linus didn't seem happy with a similar sequence
being used here [1]. CPUs affected by SAL1DS are a subset of those affected by
L1TF, so it wouldn't be completely insane to assume that the same sequence
works for both cases, but I'll err on the side of caution and avoid the risk of
giving users a false impression that the kernel has really flushed L1D for
them.

[1] https://lore.kernel.org/linux-kernel/CAHk-=whC4PUhErcoDhCbTOdmPPy-Pj8j9ytsdcyz9TorOb4KUw@mail.gmail.com/

Instead, create this awkward library that is scoped specifically to L1TF, which
will be used only by VMX and ASI, and which has an annoying "only sometimes
works" doc-comment. Users of the library can then infer from that comment
whether they have flushed L1D.

No functional change intended.
Checkpatch-args: --ignore=COMMIT_LOG_LONG_LINE Signed-off-by: Brendan Jackman --- arch/x86/Kconfig | 4 ++ arch/x86/include/asm/l1tf.h | 11 ++++++ arch/x86/kvm/Kconfig | 1 + arch/x86/kvm/vmx/vmx.c | 66 +++---------------------------- arch/x86/lib/Makefile | 1 + arch/x86/lib/l1tf.c | 94 +++++++++++++++++++++++++++++++++++++++++++++ 6 files changed, 117 insertions(+), 60 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index ae31f36ce23d7c29d1e90b726c5a2e6ea5a63c8d..ca984dc7ee2f2b68c3ce1bcb5055047ca4f2a65d 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2523,6 +2523,7 @@ config MITIGATION_ADDRESS_SPACE_ISOLATION bool "Allow code to run with a reduced kernel address space" default n depends on X86_64 && !PARAVIRT && !UML + select X86_L1TF_FLUSH_LIB help This feature provides the ability to run some kernel code with a reduced kernel address space. This can be used to @@ -3201,6 +3202,9 @@ config HAVE_ATOMIC_IOMAP def_bool y depends on X86_32 +config X86_L1TF_FLUSH_LIB + def_bool n + source "arch/x86/kvm/Kconfig" source "arch/x86/Kconfig.assembler" diff --git a/arch/x86/include/asm/l1tf.h b/arch/x86/include/asm/l1tf.h new file mode 100644 index 0000000000000000000000000000000000000000..e0be19c588bb5ec5c76a1861492e48b88615b4b8 --- /dev/null +++ b/arch/x86/include/asm/l1tf.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_L1TF_FLUSH_H +#define _ASM_L1TF_FLUSH_H + +#ifdef CONFIG_X86_L1TF_FLUSH_LIB +int l1tf_flush_setup(void); +void l1tf_flush(void); +#endif /* CONFIG_X86_L1TF_FLUSH_LIB */ + +#endif + diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index f09f13c01c6bbd28fa37fdf50547abf4403658c9..81c71510e33e52447882ab7b22682199c57b492e 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -92,6 +92,7 @@ config KVM_SW_PROTECTED_VM config KVM_INTEL tristate "KVM for Intel (and compatible) processors support" depends on KVM && IA32_FEAT_CTL + select X86_L1TF_FLUSH_LIB help Provides support for KVM on processors 
equipped with Intel's VT extensions, a.k.a. Virtual Machine Extensions (VMX). diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 0e90463f1f2183b8d716f85d5c8a8af8958fef0b..b1a02f27b3abce0ef6ac448b66bef2c653a52eef 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -42,6 +42,7 @@ #include #include #include +#include #include #include #include @@ -250,9 +251,6 @@ static void *vmx_l1d_flush_pages; static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf) { - struct page *page; - unsigned int i; - if (!boot_cpu_has_bug(X86_BUG_L1TF)) { l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_NOT_REQUIRED; return 0; @@ -288,26 +286,11 @@ static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf) l1tf = VMENTER_L1D_FLUSH_ALWAYS; } - if (l1tf != VMENTER_L1D_FLUSH_NEVER && !vmx_l1d_flush_pages && - !boot_cpu_has(X86_FEATURE_FLUSH_L1D)) { - /* - * This allocation for vmx_l1d_flush_pages is not tied to a VM - * lifetime and so should not be charged to a memcg. - */ - page = alloc_pages(GFP_KERNEL, L1D_CACHE_ORDER); - if (!page) - return -ENOMEM; - vmx_l1d_flush_pages = page_address(page); + if (l1tf != VMENTER_L1D_FLUSH_NEVER) { + int err = l1tf_flush_setup(); - /* - * Initialize each page with a different pattern in - * order to protect against KSM in the nested - * virtualization case. - */ - for (i = 0; i < 1u << L1D_CACHE_ORDER; ++i) { - memset(vmx_l1d_flush_pages + i * PAGE_SIZE, i + 1, - PAGE_SIZE); - } + if (err) + return err; } l1tf_vmx_mitigation = l1tf; @@ -6652,20 +6635,8 @@ int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath) return ret; } -/* - * Software based L1D cache flush which is used when microcode providing - * the cache control MSR is not loaded. - * - * The L1D cache is 32 KiB on Nehalem and later microarchitectures, but to - * flush it is required to read in 64 KiB because the replacement algorithm - * is not exactly LRU. 
This could be sized at runtime via topology - * information but as all relevant affected CPUs have 32KiB L1D cache size - * there is no point in doing so. - */ static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu) { - int size = PAGE_SIZE << L1D_CACHE_ORDER; - /* * This code is only executed when the flush mode is 'cond' or * 'always' @@ -6695,32 +6666,7 @@ static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu) vcpu->stat.l1d_flush++; - if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) { - native_wrmsrl(MSR_IA32_FLUSH_CMD, L1D_FLUSH); - return; - } - - asm volatile( - /* First ensure the pages are in the TLB */ - "xorl %%eax, %%eax\n" - ".Lpopulate_tlb:\n\t" - "movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t" - "addl $4096, %%eax\n\t" - "cmpl %%eax, %[size]\n\t" - "jne .Lpopulate_tlb\n\t" - "xorl %%eax, %%eax\n\t" - "cpuid\n\t" - /* Now fill the cache */ - "xorl %%eax, %%eax\n" - ".Lfill_cache:\n" - "movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t" - "addl $64, %%eax\n\t" - "cmpl %%eax, %[size]\n\t" - "jne .Lfill_cache\n\t" - "lfence\n" - :: [flush_pages] "r" (vmx_l1d_flush_pages), - [size] "r" (size) - : "eax", "ebx", "ecx", "edx"); + l1tf_flush(); } void vmx_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr) diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile index 98583a9dbab337e09a2e58905e5200499a496a07..b0a45bd70b40743a3fccb352b9641caacac83275 100644 --- a/arch/x86/lib/Makefile +++ b/arch/x86/lib/Makefile @@ -37,6 +37,7 @@ lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o insn-eval.o lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o lib-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o lib-$(CONFIG_MITIGATION_RETPOLINE) += retpoline.o +lib-$(CONFIG_X86_L1TF_FLUSH_LIB) += l1tf.o obj-y += msr.o msr-reg.o msr-reg-export.o hweight.o obj-y += iomem.o diff --git a/arch/x86/lib/l1tf.c b/arch/x86/lib/l1tf.c new file mode 100644 index 0000000000000000000000000000000000000000..c474f18ae331c8dfa7a029c457dd3cf75bebf808 --- /dev/null +++ b/arch/x86/lib/l1tf.c 
@@ -0,0 +1,94 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include + +#include +#include +#include + +#define L1D_CACHE_ORDER 4 +static void *l1tf_flush_pages; + +int l1tf_flush_setup(void) +{ + struct page *page; + unsigned int i; + + if (l1tf_flush_pages || boot_cpu_has(X86_FEATURE_FLUSH_L1D)) + return 0; + + page = alloc_pages(GFP_KERNEL, L1D_CACHE_ORDER); + if (!page) + return -ENOMEM; + l1tf_flush_pages = page_address(page); + + /* + * Initialize each page with a different pattern in + * order to protect against KSM in the nested + * virtualization case. + */ + for (i = 0; i < 1u << L1D_CACHE_ORDER; ++i) { + memset(l1tf_flush_pages + i * PAGE_SIZE, i + 1, + PAGE_SIZE); + } + + return 0; +} +EXPORT_SYMBOL(l1tf_flush_setup); + +/* + * Flush L1D in a way that: + * + * - definitely works on CPUs X86_FEATURE_FLUSH_L1D (because the SDM says so). + * - almost definitely works on other CPUs with L1TF (because someone on LKML + * said someone from Intel said so). + * - may or may not work on other CPUs. + * + * Don't call unless l1tf_flush_setup() has returned successfully. + */ +noinstr void l1tf_flush(void) +{ + int size = PAGE_SIZE << L1D_CACHE_ORDER; + + if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) { + native_wrmsrl(MSR_IA32_FLUSH_CMD, L1D_FLUSH); + return; + } + + if (WARN_ON(!l1tf_flush_pages)) + return; + + /* + * This sequence was provided by Intel for the purpose of mitigating + * L1TF on VMX. + * + * The L1D cache is 32 KiB on Nehalem and some later microarchitectures, + * but to flush it is required to read in 64 KiB because the replacement + * algorithm is not exactly LRU. This could be sized at runtime via + * topology information but as all relevant affected CPUs have 32KiB L1D + * cache size there is no point in doing so. 
+	 */
+	asm volatile(
+		/* First ensure the pages are in the TLB */
+		"xorl %%eax, %%eax\n"
+		".Lpopulate_tlb:\n\t"
+		"movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
+		"addl $4096, %%eax\n\t"
+		"cmpl %%eax, %[size]\n\t"
+		"jne .Lpopulate_tlb\n\t"
+		"xorl %%eax, %%eax\n\t"
+		"cpuid\n\t"
+		/* Now fill the cache */
+		"xorl %%eax, %%eax\n"
+		".Lfill_cache:\n"
+		"movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
+		"addl $64, %%eax\n\t"
+		"cmpl %%eax, %[size]\n\t"
+		"jne .Lfill_cache\n\t"
+		"lfence\n"
+		:: [flush_pages] "r" (l1tf_flush_pages),
+		   [size] "r" (size)
+		: "eax", "ebx", "ecx", "edx");
+}
+EXPORT_SYMBOL(l1tf_flush);

From patchwork Fri Jan 10 18:40:53 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935256
Date: Fri, 10 Jan 2025 18:40:53 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-27-8419288bc805@google.com>
Subject: [PATCH RFC v2 27/29] mm: asi: Add some mitigations on address space
 transitions
From: Brendan Jackman
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Richard Henderson,
 Matt Turner, Vineet Gupta, Russell King, Catalin Marinas, Will Deacon,
 Guo Ren, Brian Cain, Huacai Chen, WANG Xuerui, Geert Uytterhoeven,
 Michal Simek, Thomas Bogendoerfer, Dinh Nguyen, Jonas Bonn,
 Stefan Kristiansson, Stafford Horne, "James E.J. Bottomley", Helge Deller,
 Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
 Madhavan Srinivasan, Paul Walmsley, Palmer Dabbelt, Albert Ou,
 Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger,
 Sven Schnelle, Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
 "David S. Miller", Andreas Larsson, Richard Weinberger, Anton Ivanov,
 Johannes Berg, Chris Zankel, Max Filippov, Arnd Bergmann, Andrew Morton,
 Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
 Mel Gorman, Valentin Schneider, Uladzislau Rezki, Christoph Hellwig,
 Masami Hiramatsu, Mathieu Desnoyers, Mike Rapoport,
 Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin,
 Jiri Olsa, Ian Rogers, Adrian Hunter, Dennis Zhou, Tejun Heo,
 Christoph Lameter, Sean Christopherson, Paolo Bonzini, Ard Biesheuvel,
 Josh Poimboeuf, Pawan Gupta
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
 linux-alpha@vger.kernel.org, linux-snps-arc@lists.infradead.org,
 linux-arm-kernel@lists.infradead.org, linux-csky@vger.kernel.org,
 linux-hexagon@vger.kernel.org, loongarch@lists.linux.dev,
 linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org,
 linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
 linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
 sparclinux@vger.kernel.org, linux-um@lists.infradead.org,
 linux-arch@vger.kernel.org, linux-mm@kvack.org,
 linux-trace-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
 kvm@vger.kernel.org, linux-efi@vger.kernel.org, Brendan Jackman

Here ASI actually starts becoming a real exploit mitigation. On CPUs with
L1TF, flush L1D when the ASI data taints say so. On all CPUs, do some
general branch predictor clearing whenever the control taints say so.
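The taint-driven policy just described can be sketched in userspace C. This is only an illustration of the check-flush-clear pattern, not the kernel code: the `TAINT_*` masks, counters, and function bodies here are stand-ins for the per-CPU `asi_taints` word and the real `maybe_flush_data()`/`maybe_flush_control()` in the asi.c diff below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical taint masks, mirroring the data/control split in ASI. */
#define TAINT_DATA_MASK    0x0fu
#define TAINT_CONTROL_MASK 0xf0u

static unsigned int cpu_taints;   /* stand-in for this-CPU asi_taints */
static int l1d_flushes, predictor_flushes;

/* Flush L1D only when a data taint is pending and the CPU needs it. */
static void maybe_flush_data(bool do_l1tf_flush)
{
	if (!(cpu_taints & TAINT_DATA_MASK))
		return;
	if (do_l1tf_flush)
		l1d_flushes++;            /* l1tf_flush() in the patch */
	cpu_taints &= ~TAINT_DATA_MASK;   /* clear only the data taints */
}

/* Clear branch-predictor state only when a control taint is pending. */
static void maybe_flush_control(void)
{
	if (!(cpu_taints & TAINT_CONTROL_MASK))
		return;
	predictor_flushes++;              /* IBPB + fill_return_buffer() */
	cpu_taints &= ~TAINT_CONTROL_MASK;
}
```

The point of the sketch is that clearing the taints after flushing makes the flushes one-shot: a second address-space transition with no new taints does no extra work.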
This policy is very much just a starting point for discussion. Primarily
it's a vague gesture at the fact that there is leeway in how ASI is used:
it can be used to target CPU-specific issues (as is the case for L1TF
here), or it can be used as a fairly broad mitigation
(asi_maybe_flush_control() mitigates several known Spectre-style attacks
and very likely also some unknown ones).

Signed-off-by: Brendan Jackman
---
 arch/x86/include/asm/nospec-branch.h |  2 ++
 arch/x86/kvm/vmx/vmx.c               |  1 +
 arch/x86/lib/l1tf.c                  |  2 ++
 arch/x86/lib/retpoline.S             | 10 ++++++++++
 arch/x86/mm/asi.c                    | 29 +++++++++++++++++++++--------
 5 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 96b410b1d4e841eb02f53a4691ee794ceee4ad2c..4582fb1fb42f6fd226534012d969ed13085e943a 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -614,6 +614,8 @@ static __always_inline void mds_idle_clear_cpu_buffers(void)
 		mds_clear_cpu_buffers();
 }

+extern void fill_return_buffer(void);
+
 #endif /* __ASSEMBLY__ */

 #endif /* _ASM_X86_NOSPEC_BRANCH_H_ */
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b1a02f27b3abce0ef6ac448b66bef2c653a52eef..a532783caaea97291cd92a2e2cac617f74f76c7e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6635,6 +6635,7 @@ int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 	return ret;
 }

+/* Must be reentrant, for use by vmx_post_asi_enter. */
 static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu)
 {
 	/*
diff --git a/arch/x86/lib/l1tf.c b/arch/x86/lib/l1tf.c
index c474f18ae331c8dfa7a029c457dd3cf75bebf808..ffe1c3d0ef43ff8f1781f2e446aed041f4ce3179 100644
--- a/arch/x86/lib/l1tf.c
+++ b/arch/x86/lib/l1tf.c
@@ -46,6 +46,8 @@ EXPORT_SYMBOL(l1tf_flush_setup);
  * - may or may not work on other CPUs.
  *
  * Don't call unless l1tf_flush_setup() has returned successfully.
+ *
+ * Must be reentrant, for use by ASI.
 */
 noinstr void l1tf_flush(void)
 {
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 391059b2c6fbc4a571f0582c7c4654147a930cef..6d126fff6bf839889086fe21464d8af07316d7e5 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -396,3 +396,13 @@ SYM_CODE_END(__x86_return_thunk)
 EXPORT_SYMBOL(__x86_return_thunk)

 #endif /* CONFIG_MITIGATION_RETHUNK */
+
+.pushsection .noinstr.text, "ax"
+SYM_CODE_START(fill_return_buffer)
+	UNWIND_HINT_FUNC
+	ENDBR
+	__FILL_RETURN_BUFFER(%_ASM_AX,RSB_CLEAR_LOOPS)
+	RET
+SYM_CODE_END(fill_return_buffer)
+__EXPORT_THUNK(fill_return_buffer)
+.popsection
diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c
index 1e9dc568e79e8686a4dbf47f765f2c2535d025ec..f10f6614b26148e5ba423d8a44f640674573ee40 100644
--- a/arch/x86/mm/asi.c
+++ b/arch/x86/mm/asi.c
@@ -10,6 +10,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -38,6 +39,8 @@ struct asi __asi_global_nonsensitive = {
 	.mm = &init_mm,
 };

+static bool do_l1tf_flush __ro_after_init;
+
 static inline bool asi_class_id_valid(enum asi_class_id class_id)
 {
 	return class_id >= 0 && class_id < ASI_MAX_NUM_CLASSES;
@@ -361,6 +364,15 @@ static int __init asi_global_init(void)
 	asi_clone_pgd(asi_global_nonsensitive_pgd, init_mm.pgd,
 		      VMEMMAP_START + (1UL << PGDIR_SHIFT));

+	if (boot_cpu_has_bug(X86_BUG_L1TF)) {
+		int err = l1tf_flush_setup();
+
+		if (err)
+			pr_warn("Failed to setup L1TF flushing for ASI (%pe)", ERR_PTR(err));
+		else
+			do_l1tf_flush = true;
+	}
+
 #ifdef CONFIG_PM_SLEEP
 	register_syscore_ops(&asi_syscore_ops);
 #endif
@@ -512,10 +524,12 @@ static __always_inline void maybe_flush_control(struct asi *next_asi)
 	if (!taints)
 		return;

-	/*
-	 * This is where we'll do the actual dirty work of clearing uarch state.
-	 * For now we just pretend, clear the taints.
-	 */
+	/* Clear normal indirect branch predictions, if we haven't */
+	if (cpu_feature_enabled(X86_FEATURE_IBPB))
+		__wrmsr(MSR_IA32_PRED_CMD, PRED_CMD_IBPB, 0);
+
+	fill_return_buffer();
+
 	this_cpu_and(asi_taints, ~ASI_TAINTS_CONTROL_MASK);
 }

@@ -536,10 +550,9 @@ static __always_inline void maybe_flush_data(struct asi *next_asi)
 	if (!taints)
 		return;

-	/*
-	 * This is where we'll do the actual dirty work of clearing uarch state.
-	 * For now we just pretend, clear the taints.
-	 */
+	if (do_l1tf_flush)
+		l1tf_flush();
+
 	this_cpu_and(asi_taints, ~ASI_TAINTS_DATA_MASK);
 }

From patchwork Fri Jan 10 18:40:54 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935257
Date: Fri, 10 Jan 2025 18:40:54 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-28-8419288bc805@google.com>
Subject: [PATCH RFC v2 28/29] x86/pti: Disable PTI when ASI is on
From: Brendan Jackman

Now that ASI has support for sandboxing userspace, PTI is no longer
needed: although userspace now has much more mapped than it would under
KPTI, in theory none of that data is important to protect.

Note that one particular impact of this is it makes locally defeating
KASLR easier. I don't think this is a great loss given [1] etc.

Why do we pass in an argument instead of just having
pti_check_boottime_disable() check boot_cpu_has(X86_FEATURE_ASI)? Just
for clarity: I wanted it to be at least _sort of_ visible that it would
break if you reordered asi_check_boottime_disable() afterwards.
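The ordering dependency described above can be shown with a minimal userspace sketch. Everything here is illustrative, not kernel code: the `PTI_*` values mirror the `pti_mode` enum visible in the diff below, but the function bodies and the `asi_enabled_flag` global are invented for the demonstration.

```c
#include <assert.h>
#include <stdbool.h>

enum pti_mode { PTI_AUTO, PTI_FORCE_OFF, PTI_FORCE_ON };

static bool asi_enabled_flag;
static enum pti_mode pti_mode = PTI_AUTO;

/* Pretend "asi=on" was found on the command line. */
static void asi_check_boottime_disable(void)
{
	asi_enabled_flag = true;
}

/*
 * Taking asi_enabled as an argument, rather than reading a global inside
 * the function, makes the dependency visible at the call site: the value
 * passed is only meaningful after asi_check_boottime_disable() has run.
 */
static void pti_check_boottime_disable(bool asi_enabled)
{
	if (asi_enabled) {
		pti_mode = PTI_FORCE_OFF;   /* "disabled by ASI" */
		return;
	}
	pti_mode = PTI_FORCE_ON;
}
```

If a caller reordered the two checks, it would be passing a stale `asi_enabled_flag`, which is exactly the mistake the explicit argument is meant to make noticeable.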
[1]: https://gruss.cc/files/prefetch.pdf
     and https://dl.acm.org/doi/pdf/10.1145/3623652.3623669

Signed-off-by: Brendan Jackman
---
 arch/x86/include/asm/pti.h |  6 ++++--
 arch/x86/mm/init.c         |  2 +-
 arch/x86/mm/pti.c          | 14 +++++++++++++-
 3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/pti.h b/arch/x86/include/asm/pti.h
index ab167c96b9ab474b33d778453db0bb550f42b0ac..79b9ba927db9b76ac3cc72cdda6f8b5fc413d352 100644
--- a/arch/x86/include/asm/pti.h
+++ b/arch/x86/include/asm/pti.h
@@ -3,12 +3,14 @@
 #define _ASM_X86_PTI_H
 #ifndef __ASSEMBLY__
+#include
+
 #ifdef CONFIG_MITIGATION_PAGE_TABLE_ISOLATION
 extern void pti_init(void);
-extern void pti_check_boottime_disable(void);
+extern void pti_check_boottime_disable(bool asi_enabled);
 extern void pti_finalize(void);
 #else
-static inline void pti_check_boottime_disable(void) { }
+static inline void pti_check_boottime_disable(bool asi_enabled) { }
 #endif

 #endif /* __ASSEMBLY__ */
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index ded3a47f2a9c1f554824d4ad19f3b48bce271274..4ccf6d60705652805342abefc5e71cd00c563207 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -754,8 +754,8 @@ void __init init_mem_mapping(void)
 {
 	unsigned long end;

-	pti_check_boottime_disable();
 	asi_check_boottime_disable();
+	pti_check_boottime_disable(boot_cpu_has(X86_FEATURE_ASI));
 	probe_page_size_mask();
 	setup_pcid();
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index 851ec8f1363a8b389ea4579cc68bf3300a4df27c..b7132080d3c9b6962a0252383190335e171bafa6 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -76,7 +76,7 @@ static enum pti_mode {
 	PTI_FORCE_ON
 } pti_mode;

-void __init pti_check_boottime_disable(void)
+void __init pti_check_boottime_disable(bool asi_enabled)
 {
 	if (hypervisor_is_type(X86_HYPER_XEN_PV)) {
 		pti_mode = PTI_FORCE_OFF;
@@ -91,6 +91,18 @@ void __init pti_check_boottime_disable(void)
 		return;
 	}

+	if (asi_enabled) {
+		/*
+		 * Having both ASI and PTI enabled is not a totally ridiculous
+		 * thing to do; if you want ASI but you are not confident in the
+		 * sensitivity annotations then it provides useful
+		 * defence-in-depth. But, the implementation doesn't support it.
+		 */
+		if (pti_mode != PTI_FORCE_OFF)
+			pti_print_if_insecure("disabled by ASI");
+		return;
+	}
+
 	if (pti_mode == PTI_FORCE_ON)
 		pti_print_if_secure("force enabled on command line.");

From patchwork Fri Jan 10 18:40:55 2025
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935258
Date: Fri, 10 Jan 2025 18:40:55 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
Message-ID: <20250110-asi-rfc-v2-v2-29-8419288bc805@google.com>
Subject: [PATCH RFC v2 29/29] mm: asi: Stop ignoring asi=on cmdline flag
From: Brendan Jackman
At this point the minimum requirements are in place for the kernel to
operate correctly with ASI enabled.

Signed-off-by: Brendan Jackman
---
 arch/x86/mm/asi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c
index f10f6614b26148e5ba423d8a44f640674573ee40..3e3956326936ea8550308ad004dbbb3738546f9f 100644
--- a/arch/x86/mm/asi.c
+++ b/arch/x86/mm/asi.c
@@ -207,14 +207,14 @@ void __init asi_check_boottime_disable(void)
 		pr_info("ASI disabled through kernel command line.\n");
 	} else if (ret == 2 && !strncmp(arg, "on", 2)) {
 		enabled = true;
-		pr_info("Ignoring asi=on param while ASI implementation is incomplete.\n");
+		pr_info("ASI enabled through kernel command line.\n");
 	} else {
 		pr_info("ASI %s by default.\n",
 			enabled ? "enabled" : "disabled");
 	}

 	if (enabled)
-		pr_info("ASI enablement ignored due to incomplete implementation.\n");
+		setup_force_cpu_cap(X86_FEATURE_ASI);
 }

 /*