From patchwork Fri Sep 27 20:33:34 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mathieu Desnoyers
X-Patchwork-Id: 13814541
From: Mathieu Desnoyers
To: Linus Torvalds
Cc: linux-kernel@vger.kernel.org, Mathieu Desnoyers, Boqun Feng,
	Greg Kroah-Hartman, Sebastian Andrzej Siewior, "Paul E. McKenney",
	Will Deacon, Peter Zijlstra, Alan Stern, John Stultz,
	Neeraj Upadhyay, Frederic Weisbecker, Joel Fernandes, Josh Triplett,
	Uladzislau Rezki, Steven Rostedt, Lai Jiangshan, Zqiang,
	Ingo Molnar, Waiman Long, Mark Rutland, Thomas Gleixner,
	Vlastimil Babka, maged.michael@gmail.com, Mateusz Guzik,
	rcu@vger.kernel.org, linux-mm@kvack.org, lkmm@lists.linux.dev
Subject: [RFC PATCH] compiler.h: Introduce ptr_eq() to preserve address dependency
Date: Fri, 27 Sep 2024 16:33:34 -0400
Message-Id: <20240927203334.976821-1-mathieu.desnoyers@efficios.com>
X-Mailer: git-send-email 2.39.2
MIME-Version: 1.0

Compiler CSE and SSA GVN optimizations can cause the address dependency
of addresses returned by rcu_dereference to be lost when comparing those
pointers with either constants or previously loaded pointers.

Introduce ptr_eq() to compare two addresses while preserving the address
dependencies for later use of the address. It should be used when
comparing an address returned by rcu_dereference().

This is needed to prevent the compiler CSE and SSA GVN optimizations
from replacing the registers holding @a or @b based on their equality,
which does not preserve address dependencies and allows the following
misordering speculations:

- If @b is a constant, the compiler can issue the loads which depend on
  @a before loading @a.
- If @b is a register populated by a prior load, weakly-ordered CPUs can
  speculate loads which depend on @a before loading @a.

The same logic applies with @a and @b swapped.

The compiler barrier() is ineffective at fixing this issue. It does not
prevent the compiler CSE from losing the address dependency:

int fct_2_volatile_barriers(void)
{
	int *a, *b;

	do {
		a = READ_ONCE(p);
		asm volatile ("" : : : "memory");
		b = READ_ONCE(p);
	} while (a != b);
	asm volatile ("" : : : "memory");    <----- barrier()
	return *b;
}

With gcc 14.2 (arm64):

fct_2_volatile_barriers:
	adrp	x0, .LANCHOR0
	add	x0, x0, :lo12:.LANCHOR0
.L2:
	ldr	x1, [x0]	<------ x1 populated by first load.
	ldr	x2, [x0]
	cmp	x1, x2
	bne	.L2
	ldr	w0, [x1]	<------ x1 is used for access which should depend on b.
	ret

On weakly-ordered architectures, this lets CPU speculation use the
result from the first load to speculate "ldr w0, [x1]" before
"ldr x2, [x0]". Based on the RCU documentation, the control dependency
does not prevent the CPU from speculating loads.
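
For illustration only, here is a sketch of the same loop rewritten to
use ptr_eq() (the function name fct_ptr_eq is made up for this example;
p and READ_ONCE() are as in the example above):

int fct_ptr_eq(void)
{
	int *a, *b;

	do {
		a = READ_ONCE(p);
		asm volatile ("" : : : "memory");
		b = READ_ONCE(p);
	} while (!ptr_eq(a, b));	/* compare without losing the dependency */
	asm volatile ("" : : : "memory");
	return *b;			/* load must depend on the second READ_ONCE() */
}

Because ptr_eq() compares copies of @a and @b that have been made opaque
with OPTIMIZER_HIDE_VAR(), the compiler cannot conclude from the
comparison that the caller's @a and @b registers are interchangeable,
so the final load keeps its address dependency on the second READ_ONCE().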
Suggested-by: Linus Torvalds
Suggested-by: Boqun Feng
Signed-off-by: Mathieu Desnoyers
Cc: Greg Kroah-Hartman
Cc: Sebastian Andrzej Siewior
Cc: "Paul E. McKenney"
Cc: Will Deacon
Cc: Peter Zijlstra
Cc: Boqun Feng
Cc: Alan Stern
Cc: John Stultz
Cc: Neeraj Upadhyay
Cc: Linus Torvalds
Cc: Boqun Feng
Cc: Frederic Weisbecker
Cc: Joel Fernandes
Cc: Josh Triplett
Cc: Uladzislau Rezki
Cc: Steven Rostedt
Cc: Lai Jiangshan
Cc: Zqiang
Cc: Ingo Molnar
Cc: Waiman Long
Cc: Mark Rutland
Cc: Thomas Gleixner
Cc: Vlastimil Babka
Cc: maged.michael@gmail.com
Cc: Mateusz Guzik
Cc: rcu@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: lkmm@lists.linux.dev
Reviewed-by: Boqun Feng
Acked-by: Paul E. McKenney
---
 include/linux/compiler.h | 62 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 2df665fa2964..f26705c267e8 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -186,6 +186,68 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 	__asm__ ("" : "=r" (var) : "0" (var))
 #endif
 
+/*
+ * Compare two addresses while preserving the address dependencies for
+ * later use of the address. It should be used when comparing an address
+ * returned by rcu_dereference().
+ *
+ * This is needed to prevent the compiler CSE and SSA GVN optimizations
+ * from replacing the registers holding @a or @b based on their
+ * equality, which does not preserve address dependencies and allows the
+ * following misordering speculations:
+ *
+ * - If @b is a constant, the compiler can issue the loads which depend
+ *   on @a before loading @a.
+ * - If @b is a register populated by a prior load, weakly-ordered
+ *   CPUs can speculate loads which depend on @a before loading @a.
+ *
+ * The same logic applies with @a and @b swapped.
+ *
+ * Return value: true if pointers are equal, false otherwise.
+ *
+ * The compiler barrier() is ineffective at fixing this issue. It does
+ * not prevent the compiler CSE from losing the address dependency:
+ *
+ * int fct_2_volatile_barriers(void)
+ * {
+ *     int *a, *b;
+ *
+ *     do {
+ *         a = READ_ONCE(p);
+ *         asm volatile ("" : : : "memory");
+ *         b = READ_ONCE(p);
+ *     } while (a != b);
+ *     asm volatile ("" : : : "memory"); <-- barrier()
+ *     return *b;
+ * }
+ *
+ * With gcc 14.2 (arm64):
+ *
+ * fct_2_volatile_barriers:
+ *         adrp    x0, .LANCHOR0
+ *         add     x0, x0, :lo12:.LANCHOR0
+ * .L2:
+ *         ldr     x1, [x0]   <-- x1 populated by first load.
+ *         ldr     x2, [x0]
+ *         cmp     x1, x2
+ *         bne     .L2
+ *         ldr     w0, [x1]   <-- x1 is used for access which should depend on b.
+ *         ret
+ *
+ * On weakly-ordered architectures, this lets CPU speculation use the
+ * result from the first load to speculate "ldr w0, [x1]" before
+ * "ldr x2, [x0]".
+ * Based on the RCU documentation, the control dependency does not
+ * prevent the CPU from speculating loads.
+ */
+static __always_inline
+int ptr_eq(const volatile void *a, const volatile void *b)
+{
+	OPTIMIZER_HIDE_VAR(a);
+	OPTIMIZER_HIDE_VAR(b);
+	return a == b;
+}
+
 #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__)
 
 /**
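
As an illustrative sketch of the intended RCU use case (not part of the
patch above; "struct foo", gp, old, field and do_stuff() are made-up
names), comparing a pointer loaded by rcu_dereference() against a
previously loaded pointer would look like:

	struct foo *old = ...;			/* pointer loaded earlier */
	struct foo *cur = rcu_dereference(gp);

	if (ptr_eq(cur, old))
		do_stuff(cur->field);		/* dependency on cur is preserved */

With a plain "cur == old" comparison, the compiler would be allowed to
reuse old's register for the cur->field access, losing the address
dependency on the rcu_dereference() load and allowing weakly-ordered
CPUs to speculate that access before the load of cur.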