From patchwork Thu Apr 5 16:59:07 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 10325063 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 6851C6032A for ; Thu, 5 Apr 2018 17:06:58 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 55071292F7 for ; Thu, 5 Apr 2018 17:06:58 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 497E329305; Thu, 5 Apr 2018 17:06:58 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID autolearn=unavailable version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id C82BE292F7 for ; Thu, 5 Apr 2018 17:06:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:References: In-Reply-To:Message-Id:Date:Subject:To:From:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=gnqUraA5rdbGcIZ4xpKYfTTm0xz9VNi3ufpJDTDcsdw=; b=L0kPINr1OPXy2f+jamTIm7zNLu e6Z4fyC8Ol5V2zro+gMRwgRxZofxEv4iJFSJDOEHHBsfzGJmj6At8E1Vr06aYaGDh8bsy2I8pzaOJ 8nduiTp4ELI82hGcDapqTtnBpZySRbZmbFFitksw2YJXLWvpdYp13/QTHgdaaqa6/e9Q/Wj6Ug+YN O4By38gNiM+rxI8Kss8X1pJ1ELCWAkBCl6IN5hlbGUJDDc9y+RMBgKwZYomjgy3qDRen6GgheGezm j5QFcNJ1NKgIQWO+6WMifuTEd9P2W2iZPSPAAaYNNxHxg2HeQXiRshzKgjFPWrl9tnjL/g8NxG3+G Nns9xq+A==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.90_1 #2 (Red Hat Linux)) id 1f48LV-0001nn-9u; Thu, 05 Apr 2018 17:06:45 +0000 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70] helo=foss.arm.com) by bombadil.infradead.org with esmtp (Exim 4.90_1 #2 (Red Hat Linux)) id 1f48EI-0003ZP-SJ for linux-arm-kernel@lists.infradead.org; Thu, 05 Apr 2018 16:59:23 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2F6DB16BA; Thu, 5 Apr 2018 09:58:56 -0700 (PDT) Received: from edgewater-inn.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.72.51.249]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 01D353F7F3; Thu, 5 Apr 2018 09:58:56 -0700 (PDT) Received: by edgewater-inn.cambridge.arm.com (Postfix, from userid 1000) id 306311AE55AC; Thu, 5 Apr 2018 17:59:09 +0100 (BST) From: Will Deacon To: linux-kernel@vger.kernel.org Subject: [PATCH 10/10] locking/qspinlock: Elide back-to-back RELEASE operations with smp_wmb() Date: Thu, 5 Apr 2018 17:59:07 +0100 Message-Id: <1522947547-24081-11-git-send-email-will.deacon@arm.com> X-Mailer: git-send-email 2.1.4 In-Reply-To: <1522947547-24081-1-git-send-email-will.deacon@arm.com> References: <1522947547-24081-1-git-send-email-will.deacon@arm.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20180405_095918_984708_CEF98FC7 X-CRM114-Status: GOOD ( 18.56 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peterz@infradead.org, catalin.marinas@arm.com, boqun.feng@gmail.com, Will Deacon , paulmck@linux.vnet.ibm.com, mingo@kernel.org, linux-arm-kernel@lists.infradead.org MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP The qspinlock slowpath must ensure that the MCS node is fully initialised before it can be reached by another other CPU. This is currently enforced by using a RELEASE operation when updating the tail and also when linking the node into the waitqueue (since the control dependency off xchg_tail is insufficient to enforce sufficient ordering -- see 95bcade33a8a ("locking/qspinlock: Ensure node is initialised before updating prev->next")). Back-to-back RELEASE operations may be expensive on some architectures, particularly those that implement them using fences under the hood. We can replace the two RELEASE operations with a single smp_wmb() fence and use RELAXED operations for the subsequent publishing of the node. Cc: Peter Zijlstra Cc: Ingo Molnar Signed-off-by: Will Deacon --- kernel/locking/qspinlock.c | 32 +++++++++++++++----------------- 1 file changed, 15 insertions(+), 17 deletions(-) diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c index 3ad8786a47e2..42c61f7b37c5 100644 --- a/kernel/locking/qspinlock.c +++ b/kernel/locking/qspinlock.c @@ -141,10 +141,10 @@ static __always_inline void clear_pending_set_locked(struct qspinlock *lock) static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail) { /* - * Use release semantics to make sure that the MCS node is properly - * initialized before changing the tail code. + * We can use relaxed semantics since the caller ensures that the + * MCS node is properly initialized before updating the tail. */ - return (u32)xchg_release(&lock->tail, + return (u32)xchg_relaxed(&lock->tail, tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET; } @@ -178,10 +178,11 @@ static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail) for (;;) { new = (val & _Q_LOCKED_PENDING_MASK) | tail; /* - * Use release semantics to make sure that the MCS node is - * properly initialized before changing the tail code. + * We can use relaxed semantics since the caller ensures that + * the MCS node is properly initialized before updating the + * tail. */ - old = atomic_cmpxchg_release(&lock->val, val, new); + old = atomic_cmpxchg_relaxed(&lock->val, val, new); if (old == val) break; @@ -340,12 +341,17 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) goto release; /* + * Ensure that the initialisation of @node is complete before we + * publish the updated tail and potentially link @node into the + * waitqueue. + */ + smp_wmb(); + + /* * We have already touched the queueing cacheline; don't bother with * pending stuff. * * p,*,* -> n,*,* - * - * RELEASE, such that the stores to @node must be complete. */ old = xchg_tail(lock, tail); next = NULL; @@ -356,15 +362,7 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) */ if (old & _Q_TAIL_MASK) { prev = decode_tail(old); - - /* - * We must ensure that the stores to @node are observed before - * the write to prev->next. The address dependency from - * xchg_tail is not sufficient to ensure this because the read - * component of xchg_tail is unordered with respect to the - * initialisation of @node. - */ - smp_store_release(&prev->next, node); + WRITE_ONCE(prev->next, node); pv_wait_node(node, prev); arch_mcs_spin_lock_contended(&node->locked);