From patchwork Thu Apr 26 10:34:25 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 10365497 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 28604601D3 for ; Thu, 26 Apr 2018 10:50:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1829A28497 for ; Thu, 26 Apr 2018 10:50:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0B34D283D1; Thu, 26 Apr 2018 10:50:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI autolearn=unavailable version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 74E0628619 for ; Thu, 26 Apr 2018 10:50:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:References: In-Reply-To:Message-Id:Date:Subject:To:From:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=6WsDqZBg0lmcszsik4nXX5GPbl+zvbvEZaeGvwcb7Gw=; b=re6nyEfy32qAIG6eAm2YQaqSlS sNvdOkJPy1Veoeg/O/1odM7uDZqU4R9D0AH7po7VTWyKwdSZCNdPFjwLtSwzY/QFiczBqg04MCsVL YthlUnFjAuSX/8/I1L9eyPQHobVStqKo4AYNO/i0f8YyYy0MexAwrIVKQTMqHVRa6v0t9XcpHEp8h PLiRSNQb0NhLNvmL/iYUE2d0plr+O5AvQB3AWSMHvyt/JkJ8UB4kPh1o/VAfayDvVhrtr/UaA7sgr Kg3sGldEEPY3RCQAlCqaavRCBGkjX+tTtxQJSLU52BS57S6M49/4Iq5R9FdNZMMp6kKMnEFxMvALk mR2sbX0w==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.90_1 #2 (Red Hat Linux)) id 1fBeTe-0004d0-IU; Thu, 26 Apr 2018 10:50:14 +0000 Received: from foss.arm.com ([217.140.101.70]) by bombadil.infradead.org with esmtp (Exim 4.90_1 #2 (Red Hat Linux)) id 1fBeET-00022u-0W for linux-arm-kernel@lists.infradead.org; Thu, 26 Apr 2018 10:34:41 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7FE1819CC; Thu, 26 Apr 2018 03:34:10 -0700 (PDT) Received: from edgewater-inn.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.72.51.249]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 51F8B3F5AF; Thu, 26 Apr 2018 03:34:10 -0700 (PDT) Received: by edgewater-inn.cambridge.arm.com (Postfix, from userid 1000) id A92981AE519F; Thu, 26 Apr 2018 11:34:29 +0100 (BST) From: Will Deacon To: linux-kernel@vger.kernel.org Subject: [PATCH v3 11/14] locking/qspinlock: Elide back-to-back RELEASE operations with smp_wmb() Date: Thu, 26 Apr 2018 11:34:25 +0100 Message-Id: <1524738868-31318-12-git-send-email-will.deacon@arm.com> X-Mailer: git-send-email 2.1.4 In-Reply-To: <1524738868-31318-1-git-send-email-will.deacon@arm.com> References: <1524738868-31318-1-git-send-email-will.deacon@arm.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20180426_033433_183263_00AA2368 X-CRM114-Status: GOOD ( 17.42 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peterz@infradead.org, boqun.feng@gmail.com, will.deacon@arm.com, longman@redhat.com, paulmck@linux.vnet.ibm.com, mingo@kernel.org, linux-arm-kernel@lists.infradead.org MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP The qspinlock slowpath must ensure that the MCS node is fully initialised before it can be reached by another other CPU. This is currently enforced by using a RELEASE operation when updating the tail and also when linking the node into the waitqueue (since the control dependency off xchg_tail is insufficient to enforce sufficient ordering -- see 95bcade33a8a ("locking/qspinlock: Ensure node is initialised before updating prev->next")). Back-to-back RELEASE operations may be expensive on some architectures, particularly those that implement them using fences under the hood. We can replace the two RELEASE operations with a single smp_wmb() fence and use RELAXED operations for the subsequent publishing of the node. Cc: Peter Zijlstra Cc: Ingo Molnar Signed-off-by: Will Deacon --- kernel/locking/qspinlock.c | 33 +++++++++++++++++---------------- 1 file changed, 17 insertions(+), 16 deletions(-) diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c index 7b8c81ebb15e..fa5d2ab369f9 100644 --- a/kernel/locking/qspinlock.c +++ b/kernel/locking/qspinlock.c @@ -164,10 +164,10 @@ static __always_inline void clear_pending_set_locked(struct qspinlock *lock) static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail) { /* - * Use release semantics to make sure that the MCS node is properly - * initialized before changing the tail code. + * We can use relaxed semantics since the caller ensures that the + * MCS node is properly initialized before updating the tail. */ - return (u32)xchg_release(&lock->tail, + return (u32)xchg_relaxed(&lock->tail, tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET; } @@ -212,10 +212,11 @@ static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail) for (;;) { new = (val & _Q_LOCKED_PENDING_MASK) | tail; /* - * Use release semantics to make sure that the MCS node is - * properly initialized before changing the tail code. + * We can use relaxed semantics since the caller ensures that + * the MCS node is properly initialized before updating the + * tail. */ - old = atomic_cmpxchg_release(&lock->val, val, new); + old = atomic_cmpxchg_relaxed(&lock->val, val, new); if (old == val) break; @@ -388,12 +389,18 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) goto release; /* + * Ensure that the initialisation of @node is complete before we + * publish the updated tail via xchg_tail() and potentially link + * @node into the waitqueue via WRITE_ONCE(prev->next, node) below. + */ + smp_wmb(); + + /* + * Publish the updated tail. * We have already touched the queueing cacheline; don't bother with * pending stuff. * * p,*,* -> n,*,* - * - * RELEASE, such that the stores to @node must be complete. */ old = xchg_tail(lock, tail); next = NULL; @@ -405,14 +412,8 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) if (old & _Q_TAIL_MASK) { prev = decode_tail(old); - /* - * We must ensure that the stores to @node are observed before - * the write to prev->next. The address dependency from - * xchg_tail is not sufficient to ensure this because the read - * component of xchg_tail is unordered with respect to the - * initialisation of @node. - */ - smp_store_release(&prev->next, node); + /* Link @node into the waitqueue. */ + WRITE_ONCE(prev->next, node); pv_wait_node(node, prev); arch_mcs_spin_lock_contended(&node->locked);