From patchwork Fri Nov 24 13:26:23 2023
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13467671
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Linus Torvalds,
    Ryan Roberts, Matthew Wilcox, Hugh Dickins, Yin Fengwei, Yang Shi,
    Ying Huang, Zi Yan, Peter Zijlstra, Ingo Molnar, Will Deacon,
    Waiman Long, "Paul E. McKenney"
Subject: [PATCH WIP v1 18/20] atomic_seqcount: use atomic add-return instead of atomic cmpxchg on 64bit
Date: Fri, 24 Nov 2023 14:26:23 +0100
Message-ID: <20231124132626.235350-19-david@redhat.com>
In-Reply-To: <20231124132626.235350-1-david@redhat.com>
References: <20231124132626.235350-1-david@redhat.com>

Turns out that it can be beneficial on some HW to use an add-return
instead of an atomic cmpxchg. However, we have to deal with more possible
races now: in the worst case, each and every CPU might try becoming the
exclusive writer at the same time, so we need the same number of bits for
exclusive writers as for shared writers. In case we detect that we didn't
end up being the exclusive writer, simply back off and convert to a
shared writer.
Only implement this optimization on 64bit, where we can steal more bits
from the actual sequence without sorrow.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/atomic_seqcount.h | 43 +++++++++++++++++++++++++++------
 1 file changed, 36 insertions(+), 7 deletions(-)

diff --git a/include/linux/atomic_seqcount.h b/include/linux/atomic_seqcount.h
index 00286a9da221..9cd40903863d 100644
--- a/include/linux/atomic_seqcount.h
+++ b/include/linux/atomic_seqcount.h
@@ -42,9 +42,10 @@ typedef struct raw_atomic_seqcount {
 #define ATOMIC_SEQCOUNT_SHARED_WRITERS_MAX	0x0000000000008000ul
 #define ATOMIC_SEQCOUNT_SHARED_WRITERS_MASK	0x000000000000fffful
 #define ATOMIC_SEQCOUNT_EXCLUSIVE_WRITER	0x0000000000010000ul
-#define ATOMIC_SEQCOUNT_WRITERS_MASK		0x000000000001fffful
-/* We have 48bit for the actual sequence. */
-#define ATOMIC_SEQCOUNT_SEQUENCE_STEP		0x0000000000020000ul
+#define ATOMIC_SEQCOUNT_EXCLUSIVE_WRITERS_MASK	0x00000000ffff0000ul
+#define ATOMIC_SEQCOUNT_WRITERS_MASK		0x00000000fffffffful
+/* We have 32bit for the actual sequence. */
+#define ATOMIC_SEQCOUNT_SEQUENCE_STEP		0x0000000100000000ul
 
 #else /* CONFIG_64BIT */
 
@@ -53,6 +54,7 @@ typedef struct raw_atomic_seqcount {
 #define ATOMIC_SEQCOUNT_SHARED_WRITERS_MAX	0x00000040ul
 #define ATOMIC_SEQCOUNT_SHARED_WRITERS_MASK	0x0000007ful
 #define ATOMIC_SEQCOUNT_EXCLUSIVE_WRITER	0x00000080ul
+#define ATOMIC_SEQCOUNT_EXCLUSIVE_WRITERS_MASK	0x00000080ul
 #define ATOMIC_SEQCOUNT_WRITERS_MASK		0x000000fful
 /* We have 24bit for the actual sequence. */
 #define ATOMIC_SEQCOUNT_SEQUENCE_STEP		0x00000100ul
@@ -144,7 +146,7 @@ static inline bool raw_read_atomic_seqcount_retry(raw_atomic_seqcount_t *s,
 static inline bool raw_write_atomic_seqcount_begin(raw_atomic_seqcount_t *s,
 						   bool try_exclusive)
 {
-	unsigned long seqcount, seqcount_new;
+	unsigned long __maybe_unused seqcount, seqcount_new;
 
 	BUILD_BUG_ON(IS_ENABLED(CONFIG_PREEMPT_RT));
 #ifdef CONFIG_DEBUG_ATOMIC_SEQCOUNT
@@ -160,6 +162,32 @@ static inline bool raw_write_atomic_seqcount_begin(raw_atomic_seqcount_t *s,
 	if (unlikely(seqcount & ATOMIC_SEQCOUNT_WRITERS_MASK))
 		goto shared;
 
+#ifdef CONFIG_64BIT
+	BUILD_BUG_ON(__builtin_popcount(ATOMIC_SEQCOUNT_EXCLUSIVE_WRITERS_MASK) !=
+		     __builtin_popcount(ATOMIC_SEQCOUNT_SHARED_WRITERS_MASK));
+
+	/* See comment for atomic_long_try_cmpxchg() below. */
+	seqcount = atomic_long_add_return(ATOMIC_SEQCOUNT_EXCLUSIVE_WRITER,
+					  &s->sequence);
+	if (likely((seqcount & ATOMIC_SEQCOUNT_WRITERS_MASK) ==
+		   ATOMIC_SEQCOUNT_EXCLUSIVE_WRITER))
+		return true;
+
+	/*
+	 * Whoops, we raced with another writer. Back off, converting ourselves
+	 * to a shared writer and wait for any exclusive writers.
+	 */
+	atomic_long_add(ATOMIC_SEQCOUNT_SHARED_WRITER - ATOMIC_SEQCOUNT_EXCLUSIVE_WRITER,
+			&s->sequence);
+	/*
+	 * No need for __smp_mb__after_atomic(): the reader side already
+	 * realizes that it has to retry and the memory barrier from
+	 * atomic_long_add_return() is sufficient for that.
+	 */
+	while (atomic_long_read(&s->sequence) & ATOMIC_SEQCOUNT_EXCLUSIVE_WRITERS_MASK)
+		cpu_relax();
+	return false;
+#else
 	seqcount_new = seqcount | ATOMIC_SEQCOUNT_EXCLUSIVE_WRITER;
 	/*
 	 * Store the sequence before any store in the critical section. Further,
@@ -168,6 +196,7 @@ static inline bool raw_write_atomic_seqcount_begin(raw_atomic_seqcount_t *s,
 	 */
 	if (atomic_long_try_cmpxchg(&s->sequence, &seqcount, seqcount_new))
 		return true;
+#endif
 shared:
 	/*
 	 * Indicate that there is a shared writer, and spin until the exclusive
@@ -185,10 +214,10 @@ static inline bool raw_write_atomic_seqcount_begin(raw_atomic_seqcount_t *s,
 		DEBUG_LOCKS_WARN_ON((seqcount & ATOMIC_SEQCOUNT_SHARED_WRITERS_MASK) >
 				    ATOMIC_SEQCOUNT_SHARED_WRITERS_MAX);
 #endif /* CONFIG_DEBUG_ATOMIC_SEQCOUNT */
-	if (likely(!(seqcount & ATOMIC_SEQCOUNT_EXCLUSIVE_WRITER)))
+	if (likely(!(seqcount & ATOMIC_SEQCOUNT_EXCLUSIVE_WRITERS_MASK)))
 		return false;
 
-	while (atomic_long_read(&s->sequence) & ATOMIC_SEQCOUNT_EXCLUSIVE_WRITER)
+	while (atomic_long_read(&s->sequence) & ATOMIC_SEQCOUNT_EXCLUSIVE_WRITERS_MASK)
 		cpu_relax();
 	return false;
 }
@@ -209,7 +238,7 @@ static inline void raw_write_atomic_seqcount_end(raw_atomic_seqcount_t *s,
 	if (likely(exclusive)) {
 #ifdef CONFIG_DEBUG_ATOMIC_SEQCOUNT
 		DEBUG_LOCKS_WARN_ON(!(atomic_long_read(&s->sequence) &
-				      ATOMIC_SEQCOUNT_EXCLUSIVE_WRITER));
+				      ATOMIC_SEQCOUNT_EXCLUSIVE_WRITERS_MASK));
 #endif /* CONFIG_DEBUG_ATOMIC_SEQCOUNT */
 		val -= ATOMIC_SEQCOUNT_EXCLUSIVE_WRITER;
 	} else {
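
For illustration only, below is a minimal, self-contained userspace sketch
of the add-return acquisition scheme described in the changelog. All
names, constants and memory-ordering choices are invented for this
example (they merely mimic the kernel macros); it is not the kernel
implementation and it omits the debug checks and the !try_exclusive path.

/* sketch_seqcount.c -- hypothetical illustration, not kernel code. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Invented bit layout mimicking the 64bit variant in the patch. */
#define SHARED_WRITER           0x0000000000000001ull
#define SHARED_WRITERS_MASK     0x000000000000ffffull
#define EXCLUSIVE_WRITER        0x0000000000010000ull
#define EXCLUSIVE_WRITERS_MASK  0x00000000ffff0000ull
#define WRITERS_MASK            0x00000000ffffffffull
#define SEQUENCE_STEP           0x0000000100000000ull

typedef struct {
	_Atomic uint64_t sequence;
} sketch_seqcount_t;

/* Returns true if we became the exclusive writer, false if we went shared. */
static bool sketch_write_begin(sketch_seqcount_t *s)
{
	uint64_t seq = atomic_load_explicit(&s->sequence, memory_order_relaxed);

	/* Somebody is already writing: become a shared writer right away. */
	if (seq & WRITERS_MASK)
		goto shared;

	/*
	 * Announce ourselves as exclusive writer with a single add-return
	 * instead of a cmpxchg loop.
	 */
	seq = atomic_fetch_add_explicit(&s->sequence, EXCLUSIVE_WRITER,
					memory_order_acq_rel) + EXCLUSIVE_WRITER;
	if ((seq & WRITERS_MASK) == EXCLUSIVE_WRITER)
		return true;

	/*
	 * We raced with another writer: convert our exclusive announcement
	 * into a shared one (the unsigned difference wraps, but the atomic
	 * addition is modular, so the net effect is -EXCLUSIVE +SHARED) ...
	 */
	atomic_fetch_add_explicit(&s->sequence,
				  SHARED_WRITER - EXCLUSIVE_WRITER,
				  memory_order_acq_rel);
	/* ... and wait until no exclusive writers are left. */
	while (atomic_load_explicit(&s->sequence, memory_order_acquire) &
	       EXCLUSIVE_WRITERS_MASK)
		;
	return false;

shared:
	atomic_fetch_add_explicit(&s->sequence, SHARED_WRITER,
				  memory_order_acq_rel);
	while (atomic_load_explicit(&s->sequence, memory_order_acquire) &
	       EXCLUSIVE_WRITERS_MASK)
		;
	return false;
}

static void sketch_write_end(sketch_seqcount_t *s, bool exclusive)
{
	/* Bump the sequence and drop our writer bit in one atomic step. */
	uint64_t val = SEQUENCE_STEP -
		       (exclusive ? EXCLUSIVE_WRITER : SHARED_WRITER);

	atomic_fetch_add_explicit(&s->sequence, val, memory_order_release);
}

int main(void)
{
	sketch_seqcount_t s = { 0 };
	bool exclusive = sketch_write_begin(&s);

	/* ... writer-side critical section would go here ... */
	sketch_write_end(&s, exclusive);

	printf("was exclusive: %d, sequence now: 0x%llx\n", exclusive,
	       (unsigned long long)atomic_load_explicit(&s.sequence,
							memory_order_relaxed));
	return 0;
}

Compared to a cmpxchg loop, the add-return variant never has to retry;
the price is that every concurrent attempt consumes an exclusive-writer
bit, which is why the 64bit layout reserves as many exclusive-writer bits
as shared-writer bits.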