From patchwork Sun Jul 10 04:21:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 12912392 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 85A34C43334 for ; Sun, 10 Jul 2022 04:09:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229560AbiGJEJh (ORCPT ); Sun, 10 Jul 2022 00:09:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40422 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229450AbiGJEJg (ORCPT ); Sun, 10 Jul 2022 00:09:36 -0400 Received: from dggsgout12.his.huawei.com (unknown [45.249.212.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 06FE412D3B; Sat, 9 Jul 2022 21:09:35 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.143]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4LgYR83wbqz6PCRP; Sun, 10 Jul 2022 12:08:36 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.127.227]) by APP3 (Coremail) with SMTP id _Ch0CgB32mn4UMpiguJUAg--.16364S5; Sun, 10 Jul 2022 12:09:33 +0800 (CST) From: Yu Kuai To: axboe@kernel.dk, asml.silence@gmail.com, osandov@fb.com, jack@suse.cz Cc: kbusch@kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com Subject: [PATCH RFC v3 1/3] sbitmap: fix that same waitqueue can be woken up continuously Date: Sun, 10 Jul 2022 12:21:58 +0800 Message-Id: <20220710042200.20936-2-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220710042200.20936-1-yukuai1@huaweicloud.com> References: <20220710042200.20936-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: _Ch0CgB32mn4UMpiguJUAg--.16364S5 X-Coremail-Antispam: 1UD129KBjvJXoW7Zw47GrW5ur4fWFWxXFyftFb_yoW8Cry7pa 17XFyvvr4jqry2k3ykXF4UAw15GwnF9r9rGr4rG3WjvFsrKr45J3WvvFs8ur18uFsrCayr JF47tFW3Gr4UXFUanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBK14x267AKxVW5JVWrJwAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWUuVWrJwAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jr4l82xGYIkIc2 x26xkF7I0E14v26r1I6r4UM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UM2 8EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0DM2AI xVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20x vE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xv r2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2IY04 v7MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_ Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc40Y0x 0EwIxGrwCI42IY6xIIjxv20xvE14v26r1j6r1xMIIF0xvE2Ix0cI8IcVCY1x0267AKxVW8 JVWxJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIx AIcVC2z280aVCY1x0267AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7VUbec_DUUUUU= = X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org From: Yu Kuai __sbq_wake_up __sbq_wake_up sbq_wake_ptr -> assume 0 sbq_wake_ptr -> 0 atomic_dec_return atomic_dec_return atomic_cmpxchg -> succeed atomic_cmpxchg -> failed return true __sbq_wake_up sbq_wake_ptr atomic_read(&sbq->wake_index) -> still 0 sbq_index_atomic_inc -> inc to 1 if (waitqueue_active(&ws->wait)) if (wake_index != atomic_read(&sbq->wake_index)) atomic_set -> reset from 1 to 0 wake_up_nr -> wake up first waitqueue // continue to wake up in first waitqueue Fix the problem by using atomic_cmpxchg() instead of atomic_set() to update 'wake_index'. Fixes: 417232880c8a ("sbitmap: Replace cmpxchg with xchg") Signed-off-by: Yu Kuai --- lib/sbitmap.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/lib/sbitmap.c b/lib/sbitmap.c index 29eb0484215a..b46fce1beb3a 100644 --- a/lib/sbitmap.c +++ b/lib/sbitmap.c @@ -579,19 +579,24 @@ EXPORT_SYMBOL_GPL(sbitmap_queue_min_shallow_depth); static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq) { - int i, wake_index; + int i, wake_index, old_wake_index; +again: if (!atomic_read(&sbq->ws_active)) return NULL; - wake_index = atomic_read(&sbq->wake_index); + old_wake_index = wake_index = atomic_read(&sbq->wake_index); for (i = 0; i < SBQ_WAIT_QUEUES; i++) { struct sbq_wait_state *ws = &sbq->ws[wake_index]; if (waitqueue_active(&ws->wait)) { - if (wake_index != atomic_read(&sbq->wake_index)) - atomic_set(&sbq->wake_index, wake_index); - return ws; + if (wake_index == old_wake_index) + return ws; + + if (atomic_cmpxchg(&sbq->wake_index, old_wake_index, + wake_index) == old_wake_index) + return ws; + goto again; } wake_index = sbq_index_inc(wake_index); From patchwork Sun Jul 10 04:21:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 12912395 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9E39DCCA480 for ; Sun, 10 Jul 2022 04:09:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229587AbiGJEJj (ORCPT ); Sun, 10 Jul 2022 00:09:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40430 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229562AbiGJEJh (ORCPT ); Sun, 10 Jul 2022 00:09:37 -0400 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6FBFE12D3C; Sat, 9 Jul 2022 21:09:35 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.143]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4LgYRK1vLQzl0rR; Sun, 10 Jul 2022 12:08:45 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.127.227]) by APP3 (Coremail) with SMTP id _Ch0CgB32mn4UMpiguJUAg--.16364S6; Sun, 10 Jul 2022 12:09:33 +0800 (CST) From: Yu Kuai To: axboe@kernel.dk, asml.silence@gmail.com, osandov@fb.com, jack@suse.cz Cc: kbusch@kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com Subject: [PATCH RFC v3 2/3] sbitmap: fix invalid wakeup on the wrong waitqueue Date: Sun, 10 Jul 2022 12:21:59 +0800 Message-Id: <20220710042200.20936-3-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220710042200.20936-1-yukuai1@huaweicloud.com> References: <20220710042200.20936-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: _Ch0CgB32mn4UMpiguJUAg--.16364S6 X-Coremail-Antispam: 1UD129KBjvJXoWxWr17AF4xtFWrXr4xtF1DKFg_yoW5Gr17pr 43GFn2qanYvrWIywsrJr4jy3WYk3yvk3yxGrW5G348Cr17trsIkr10grn3Zry8CrZ5XFy3 Jr43JrZ3CFyUXaDanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBK14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWUuVWrJwAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jryl82xGYIkIc2 x26xkF7I0E14v26r4j6ryUM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UM2 8EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0DM2AI xVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20x vE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xv r2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2IY04 v7MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_ Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc40Y0x 0EwIxGrwCI42IY6xIIjxv20xvE14v26r1j6r1xMIIF0xvE2Ix0cI8IcVCY1x0267AKxVW8 JVWxJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIx AIcVC2z280aVCY1x0267AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7VUjYiiDUUUUU= = X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org From: Yu Kuai For example, 2 * wake_batch tags are put, while only wake_batch threads are woken: __sbq_wake_up atomic_cmpxchg -> reset wait_cnt __sbq_wake_up -> decrease wait_cnt ... __sbq_wake_up -> wait_cnt is decreased to 0 again atomic_cmpxchg sbq_index_atomic_inc -> increase wake_index wake_up_nr -> wake up and waitqueue might be empty sbq_index_atomic_inc -> increase again, one waitqueue is skipped wake_up_nr -> invalid wake up because old wakequeue might be empty To fix the problem, increasing 'wake_index' before resetting 'wait_cnt'. Fixes: 88459642cba4 ("blk-mq: abstract tag allocation out into sbitmap library") Signed-off-by: Yu Kuai --- lib/sbitmap.c | 45 +++++++++++++++++++++++---------------------- 1 file changed, 23 insertions(+), 22 deletions(-) diff --git a/lib/sbitmap.c b/lib/sbitmap.c index b46fce1beb3a..57095dd88a33 100644 --- a/lib/sbitmap.c +++ b/lib/sbitmap.c @@ -616,32 +616,33 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq) return false; wait_cnt = atomic_dec_return(&ws->wait_cnt); - if (wait_cnt <= 0) { - int ret; + if (wait_cnt > 0) + return false; - wake_batch = READ_ONCE(sbq->wake_batch); + /* + * For concurrent callers of this, callers should call this function + * again to wakeup a new batch on a different 'ws'. + */ + if (wait_cnt < 0) + return true; - /* - * Pairs with the memory barrier in sbitmap_queue_resize() to - * ensure that we see the batch size update before the wait - * count is reset. - */ - smp_mb__before_atomic(); + wake_batch = READ_ONCE(sbq->wake_batch); - /* - * For concurrent callers of this, the one that failed the - * atomic_cmpxhcg() race should call this function again - * to wakeup a new batch on a different 'ws'. - */ - ret = atomic_cmpxchg(&ws->wait_cnt, wait_cnt, wake_batch); - if (ret == wait_cnt) { - sbq_index_atomic_inc(&sbq->wake_index); - wake_up_nr(&ws->wait, wake_batch); - return false; - } + /* + * Pairs with the memory barrier in sbitmap_queue_resize() to + * ensure that we see the batch size update before the wait + * count is reset. + */ + smp_mb__before_atomic(); - return true; - } + /* + * Increase wake_index before updating wait_cnt, otherwise concurrent + * callers can see valid wait_cnt in old waitqueue, which can cause + * invalid wakeup on the old waitqueue. + */ + sbq_index_atomic_inc(&sbq->wake_index); + atomic_set(&ws->wait_cnt, wake_batch); + wake_up_nr(&ws->wait, wake_batch); return false; } From patchwork Sun Jul 10 04:22:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 12912394 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 01496CCA479 for ; Sun, 10 Jul 2022 04:09:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229572AbiGJEJi (ORCPT ); Sun, 10 Jul 2022 00:09:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229558AbiGJEJh (ORCPT ); Sun, 10 Jul 2022 00:09:37 -0400 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C63EF13CC8; Sat, 9 Jul 2022 21:09:35 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.143]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4LgYRK4s5fzkX0H; Sun, 10 Jul 2022 12:08:45 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.127.227]) by APP3 (Coremail) with SMTP id _Ch0CgB32mn4UMpiguJUAg--.16364S7; Sun, 10 Jul 2022 12:09:33 +0800 (CST) From: Yu Kuai To: axboe@kernel.dk, asml.silence@gmail.com, osandov@fb.com, jack@suse.cz Cc: kbusch@kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com Subject: [PATCH RFC v3 3/3] sbitmap: fix that 'wait_cnt' can be decreased while waitqueue is empty Date: Sun, 10 Jul 2022 12:22:00 +0800 Message-Id: <20220710042200.20936-4-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220710042200.20936-1-yukuai1@huaweicloud.com> References: <20220710042200.20936-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: _Ch0CgB32mn4UMpiguJUAg--.16364S7 X-Coremail-Antispam: 1UD129KBjvJXoW7ZF17ZryrtFyrXryxWFy8Krg_yoW8Kw1fpr 42kF1vqanYvrW7t39rJr4jq3WYkws5ArZ7KrWfK3W5Cw12qrsavr10kF4Y9r18Zrs8XFW5 Jr4fJ393CFyUXaDanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBK14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWUuVWrJwAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JrWl82xGYIkIc2 x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UM2 8EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0DM2AI xVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20x vE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xv r2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2IY04 v7MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_ Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc40Y0x 0EwIxGrwCI42IY6xIIjxv20xvE14v26r1j6r1xMIIF0xvE2Ix0cI8IcVCY1x0267AKxVW8 JVWxJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIx AIcVC2z280aVCY1x0267AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7VUbJ73DUUUUU= = X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org From: Yu Kuai As pointed out by Jan Kara, following race is still possible: CPU1 CPU2 __sbq_wake_up __sbq_wake_up sbq_wake_ptr() sbq_wake_ptr() -> the same wait_cnt = atomic_dec_return() /* decreased to 0 */ sbq_index_atomic_inc() /* move to next waitqueue */ atomic_set() /* reset wait_cnt */ wake_up_nr() /* wake up on the old waitqueue */ wait_cnt = atomic_dec_return() /* * decrease wait_cnt in the old * waitqueue, while it can be * empty. */ Fix the problem by waking up before updating 'wake_index' and 'wait_cnt'. With this patch, noted that 'wait_cnt' is still decreased in the old empty waitqueue, however, the wakeup is redirected to a active waitqueue, and the extra decrement on the old empty waitqueue is not handled. Fixes: 88459642cba4 ("blk-mq: abstract tag allocation out into sbitmap library") Signed-off-by: Yu Kuai --- lib/sbitmap.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/lib/sbitmap.c b/lib/sbitmap.c index 57095dd88a33..55826ebbe7db 100644 --- a/lib/sbitmap.c +++ b/lib/sbitmap.c @@ -616,22 +616,33 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq) return false; wait_cnt = atomic_dec_return(&ws->wait_cnt); - if (wait_cnt > 0) - return false; - /* * For concurrent callers of this, callers should call this function * again to wakeup a new batch on a different 'ws'. */ - if (wait_cnt < 0) + if (wait_cnt < 0 || !waitqueue_active(&ws->wait)) return true; + if (wait_cnt > 0) + return false; + wake_batch = READ_ONCE(sbq->wake_batch); + /* + * Wake up first in case that concurrent callers decrease wait_cnt + * while waitqueue is empty. + */ + wake_up_nr(&ws->wait, wake_batch); + /* * Pairs with the memory barrier in sbitmap_queue_resize() to * ensure that we see the batch size update before the wait * count is reset. + * + * Also pairs with the implicit barrier between becrementing wait_cnt + * and checking for waitqueue_active() to make sure waitqueue_active() + * sees result of the wakeup if atomic_dec_return() has seen the result + * of atomic_set(). */ smp_mb__before_atomic(); @@ -642,7 +653,6 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq) */ sbq_index_atomic_inc(&sbq->wake_index); atomic_set(&ws->wait_cnt, wake_batch); - wake_up_nr(&ws->wait, wake_batch); return false; }