From patchwork Wed Apr 26 08:20:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13224227 X-Patchwork-Delegate: song@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 63B6AC77B60 for ; Wed, 26 Apr 2023 08:22:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240114AbjDZIW1 (ORCPT ); Wed, 26 Apr 2023 04:22:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44900 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230484AbjDZIWZ (ORCPT ); Wed, 26 Apr 2023 04:22:25 -0400 Received: from dggsgout12.his.huawei.com (unknown [45.249.212.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EC35D3591; Wed, 26 Apr 2023 01:22:23 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Q5sL34HWpz4f4Vy1; Wed, 26 Apr 2023 16:22:19 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP2 (Coremail) with SMTP id Syh0CgBnW+k430hkUexVIA--.50201S5; Wed, 26 Apr 2023 16:22:21 +0800 (CST) From: Yu Kuai To: song@kernel.org, akpm@osdl.org, neilb@suse.de Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v2 1/7] md/raid10: prevent soft lockup while flush writes Date: Wed, 26 Apr 2023 16:20:25 +0800 Message-Id: <20230426082031.1299149-2-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230426082031.1299149-1-yukuai1@huaweicloud.com> References: <20230426082031.1299149-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: Syh0CgBnW+k430hkUexVIA--.50201S5 X-Coremail-Antispam: 1UD129KBjvJXoW7ZFyxGw43ZryUArW3Jw1rXrb_yoW8ZrWrp3 90gFWYyw1UCw13AwsIyF4xKFyrZa90q3y7CFWkZw13XF13XFyUGayDXryjgrWDuryfGrW3 CF4vkrZ7Xw15tFJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUU9m14x267AKxVW5JVWrJwAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jr4l82xGYIkIc2 x26xkF7I0E14v26r1I6r4UM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwCF04k20xvY0x0EwIxGrwCFx2 IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v2 6r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67 AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IY s7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr 0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUqAp5UUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org From: Yu Kuai Currently, there is no limit for raid1/raid10 plugged bio. While flushing writes, raid1 has cond_resched() while raid10 doesn't, and too many writes can cause soft lockup. Follow up soft lockup can be triggered easily with writeback test for raid10 with ramdisks: watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293] Call Trace: call_rcu+0x16/0x20 put_object+0x41/0x80 __delete_object+0x50/0x90 delete_object_full+0x2b/0x40 kmemleak_free+0x46/0xa0 slab_free_freelist_hook.constprop.0+0xed/0x1a0 kmem_cache_free+0xfd/0x300 mempool_free_slab+0x1f/0x30 mempool_free+0x3a/0x100 bio_free+0x59/0x80 bio_put+0xcf/0x2c0 free_r10bio+0xbf/0xf0 raid_end_bio_io+0x78/0xb0 one_write_done+0x8a/0xa0 raid10_end_write_request+0x1b4/0x430 bio_endio+0x175/0x320 brd_submit_bio+0x3b9/0x9b7 [brd] __submit_bio+0x69/0xe0 submit_bio_noacct_nocheck+0x1e6/0x5a0 submit_bio_noacct+0x38c/0x7e0 flush_pending_writes+0xf0/0x240 raid10d+0xac/0x1ed0 Fix the problem by adding cond_resched() to raid10 like what raid1 did. Note that unlimited plugged bio still need to be optimized, for example, in the case of lots of dirty pages writeback, this will take lots of memory and io latency is quite bad. Signed-off-by: Yu Kuai --- drivers/md/raid10.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 32fb4ff0acdb..6b31f848a6d9 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -921,6 +921,7 @@ static void flush_pending_writes(struct r10conf *conf) else submit_bio_noacct(bio); bio = next; + cond_resched(); } blk_finish_plug(&plug); } else @@ -1145,6 +1146,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule) else submit_bio_noacct(bio); bio = next; + cond_resched(); } kfree(plug); } From patchwork Wed Apr 26 08:20:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13224228 X-Patchwork-Delegate: song@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C8CDC77B7C for ; Wed, 26 Apr 2023 08:22:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240128AbjDZIW3 (ORCPT ); Wed, 26 Apr 2023 04:22:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44924 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240084AbjDZIW0 (ORCPT ); Wed, 26 Apr 2023 04:22:26 -0400 Received: from dggsgout12.his.huawei.com (unknown [45.249.212.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0953E3A93; Wed, 26 Apr 2023 01:22:25 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Q5sL407Ghz4f4WXs; Wed, 26 Apr 2023 16:22:20 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP2 (Coremail) with SMTP id Syh0CgBnW+k430hkUexVIA--.50201S6; Wed, 26 Apr 2023 16:22:21 +0800 (CST) From: Yu Kuai To: song@kernel.org, akpm@osdl.org, neilb@suse.de Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v2 2/7] md/raid1-10: factor out a helper to add bio to plug Date: Wed, 26 Apr 2023 16:20:26 +0800 Message-Id: <20230426082031.1299149-3-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230426082031.1299149-1-yukuai1@huaweicloud.com> References: <20230426082031.1299149-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: Syh0CgBnW+k430hkUexVIA--.50201S6 X-Coremail-Antispam: 1UD129KBjvJXoWxXrWUXFWkCr1xtw47ur15Jwb_yoW5uw1fpa 15Ka4avrWDXrW5Xw4kJFWDuFW5K3ZIgFZFkrZ3C3s3Jr17XFWUWa15JFWrCr98uFZxury7 Jrn0krZrCa13KFUanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUU9m14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jryl82xGYIkIc2 x26xkF7I0E14v26r4j6ryUM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwCF04k20xvY0x0EwIxGrwCFx2 IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v2 6r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67 AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IY s7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr 0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUc6pPUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org From: Yu Kuai The code in raid1 and raid10 is identical, prepare to limit the number of pluged bios. Signed-off-by: Yu Kuai --- drivers/md/raid1-10.c | 16 ++++++++++++++++ drivers/md/raid1.c | 12 +----------- drivers/md/raid10.c | 11 +---------- 3 files changed, 18 insertions(+), 21 deletions(-) diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c index e61f6cad4e08..5ba0c158ccd7 100644 --- a/drivers/md/raid1-10.c +++ b/drivers/md/raid1-10.c @@ -109,3 +109,19 @@ static void md_bio_reset_resync_pages(struct bio *bio, struct resync_pages *rp, size -= len; } while (idx++ < RESYNC_PAGES && size > 0); } + +static inline bool md_add_bio_to_plug(struct mddev *mddev, struct bio *bio, + blk_plug_cb_fn unplug) +{ + struct raid1_plug_cb *plug = NULL; + struct blk_plug_cb *cb = blk_check_plugged(unplug, mddev, + sizeof(*plug)); + + if (!cb) + return false; + + plug = container_of(cb, struct raid1_plug_cb, cb); + bio_list_add(&plug->pending, bio); + + return true; +} diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 2f1011ffdf09..92083f5329f9 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -1343,8 +1343,6 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio, struct bitmap *bitmap = mddev->bitmap; unsigned long flags; struct md_rdev *blocked_rdev; - struct blk_plug_cb *cb; - struct raid1_plug_cb *plug = NULL; int first_clone; int max_sectors; bool write_behind = false; @@ -1573,15 +1571,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio, r1_bio->sector); /* flush_pending_writes() needs access to the rdev so...*/ mbio->bi_bdev = (void *)rdev; - - cb = blk_check_plugged(raid1_unplug, mddev, sizeof(*plug)); - if (cb) - plug = container_of(cb, struct raid1_plug_cb, cb); - else - plug = NULL; - if (plug) { - bio_list_add(&plug->pending, mbio); - } else { + if (!md_add_bio_to_plug(mddev, mbio, raid1_unplug)) { spin_lock_irqsave(&conf->device_lock, flags); bio_list_add(&conf->pending_bio_list, mbio); spin_unlock_irqrestore(&conf->device_lock, flags); diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 6b31f848a6d9..2a2e138fa804 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -1287,8 +1287,6 @@ static void raid10_write_one_disk(struct mddev *mddev, struct r10bio *r10_bio, const blk_opf_t do_sync = bio->bi_opf & REQ_SYNC; const blk_opf_t do_fua = bio->bi_opf & REQ_FUA; unsigned long flags; - struct blk_plug_cb *cb; - struct raid1_plug_cb *plug = NULL; struct r10conf *conf = mddev->private; struct md_rdev *rdev; int devnum = r10_bio->devs[n_copy].devnum; @@ -1328,14 +1326,7 @@ static void raid10_write_one_disk(struct mddev *mddev, struct r10bio *r10_bio, atomic_inc(&r10_bio->remaining); - cb = blk_check_plugged(raid10_unplug, mddev, sizeof(*plug)); - if (cb) - plug = container_of(cb, struct raid1_plug_cb, cb); - else - plug = NULL; - if (plug) { - bio_list_add(&plug->pending, mbio); - } else { + if (!md_add_bio_to_plug(mddev, mbio, raid10_unplug)) { spin_lock_irqsave(&conf->device_lock, flags); bio_list_add(&conf->pending_bio_list, mbio); spin_unlock_irqrestore(&conf->device_lock, flags); From patchwork Wed Apr 26 08:20:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13224231 X-Patchwork-Delegate: song@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1AF4AC77B7C for ; Wed, 26 Apr 2023 08:22:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240163AbjDZIWe (ORCPT ); Wed, 26 Apr 2023 04:22:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44952 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239964AbjDZIW2 (ORCPT ); Wed, 26 Apr 2023 04:22:28 -0400 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4E6B53C0F; Wed, 26 Apr 2023 01:22:25 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Q5sL45f2Vz4f4d6C; Wed, 26 Apr 2023 16:22:20 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP2 (Coremail) with SMTP id Syh0CgBnW+k430hkUexVIA--.50201S7; Wed, 26 Apr 2023 16:22:22 +0800 (CST) From: Yu Kuai To: song@kernel.org, akpm@osdl.org, neilb@suse.de Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v2 3/7] md/raid1-10: factor out a helper to submit normal write Date: Wed, 26 Apr 2023 16:20:27 +0800 Message-Id: <20230426082031.1299149-4-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230426082031.1299149-1-yukuai1@huaweicloud.com> References: <20230426082031.1299149-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: Syh0CgBnW+k430hkUexVIA--.50201S7 X-Coremail-Antispam: 1UD129KBjvJXoWxAF1rXFW5ZF4kXFyDWr4Durg_yoW5uw1Dp3 9Iqa4fZ3y7XF47Wa1Duay8J3WFga1UtrWUCFW7CayfAFy3ZryDta18JryIgryDJFyrCry7 ZF18K3y7Ww43JFDanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUU9m14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JrWl82xGYIkIc2 x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwCF04k20xvY0x0EwIxGrwCFx2 IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v2 6r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67 AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IY s7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr 0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUd8n5UUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org From: Yu Kuai There are multiple places to do the same thing, factor out a helper to prevent redundant code, and the helper will be used in following patch as well. Signed-off-by: Yu Kuai --- drivers/md/raid1-10.c | 16 ++++++++++++++++ drivers/md/raid1.c | 13 ++----------- drivers/md/raid10.c | 26 ++++---------------------- 3 files changed, 22 insertions(+), 33 deletions(-) diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c index 5ba0c158ccd7..25d55036a6fb 100644 --- a/drivers/md/raid1-10.c +++ b/drivers/md/raid1-10.c @@ -110,6 +110,22 @@ static void md_bio_reset_resync_pages(struct bio *bio, struct resync_pages *rp, } while (idx++ < RESYNC_PAGES && size > 0); } +static inline void md_submit_write(struct bio *bio) +{ + struct md_rdev *rdev = (struct md_rdev *)bio->bi_bdev; + + bio->bi_next = NULL; + bio_set_dev(bio, rdev->bdev); + if (test_bit(Faulty, &rdev->flags)) + bio_io_error(bio); + else if (unlikely(bio_op(bio) == REQ_OP_DISCARD && + !bdev_max_discard_sectors(bio->bi_bdev))) + /* Just ignore it */ + bio_endio(bio); + else + submit_bio_noacct(bio); +} + static inline bool md_add_bio_to_plug(struct mddev *mddev, struct bio *bio, blk_plug_cb_fn unplug) { diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 92083f5329f9..d79525d6ec8f 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -799,17 +799,8 @@ static void flush_bio_list(struct r1conf *conf, struct bio *bio) while (bio) { /* submit pending writes */ struct bio *next = bio->bi_next; - struct md_rdev *rdev = (void *)bio->bi_bdev; - bio->bi_next = NULL; - bio_set_dev(bio, rdev->bdev); - if (test_bit(Faulty, &rdev->flags)) { - bio_io_error(bio); - } else if (unlikely((bio_op(bio) == REQ_OP_DISCARD) && - !bdev_max_discard_sectors(bio->bi_bdev))) - /* Just ignore it */ - bio_endio(bio); - else - submit_bio_noacct(bio); + + md_submit_write(bio); bio = next; cond_resched(); } diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 2a2e138fa804..626cb8ed2099 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -909,17 +909,8 @@ static void flush_pending_writes(struct r10conf *conf) while (bio) { /* submit pending writes */ struct bio *next = bio->bi_next; - struct md_rdev *rdev = (void*)bio->bi_bdev; - bio->bi_next = NULL; - bio_set_dev(bio, rdev->bdev); - if (test_bit(Faulty, &rdev->flags)) { - bio_io_error(bio); - } else if (unlikely((bio_op(bio) == REQ_OP_DISCARD) && - !bdev_max_discard_sectors(bio->bi_bdev))) - /* Just ignore it */ - bio_endio(bio); - else - submit_bio_noacct(bio); + + md_submit_write(bio); bio = next; cond_resched(); } @@ -1134,17 +1125,8 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule) while (bio) { /* submit pending writes */ struct bio *next = bio->bi_next; - struct md_rdev *rdev = (void*)bio->bi_bdev; - bio->bi_next = NULL; - bio_set_dev(bio, rdev->bdev); - if (test_bit(Faulty, &rdev->flags)) { - bio_io_error(bio); - } else if (unlikely((bio_op(bio) == REQ_OP_DISCARD) && - !bdev_max_discard_sectors(bio->bi_bdev))) - /* Just ignore it */ - bio_endio(bio); - else - submit_bio_noacct(bio); + + md_submit_write(bio); bio = next; cond_resched(); } From patchwork Wed Apr 26 08:20:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13224229 X-Patchwork-Delegate: song@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 24722C7618E for ; Wed, 26 Apr 2023 08:22:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240143AbjDZIWb (ORCPT ); Wed, 26 Apr 2023 04:22:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44922 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240066AbjDZIW0 (ORCPT ); Wed, 26 Apr 2023 04:22:26 -0400 Received: from dggsgout12.his.huawei.com (unknown [45.249.212.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 02D3F35AC; Wed, 26 Apr 2023 01:22:25 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Q5sL45q37z4f4WHg; Wed, 26 Apr 2023 16:22:20 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP2 (Coremail) with SMTP id Syh0CgBnW+k430hkUexVIA--.50201S8; Wed, 26 Apr 2023 16:22:22 +0800 (CST) From: Yu Kuai To: song@kernel.org, akpm@osdl.org, neilb@suse.de Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v2 4/7] md/raid1-10: submit write io directly if bitmap is not enabled Date: Wed, 26 Apr 2023 16:20:28 +0800 Message-Id: <20230426082031.1299149-5-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230426082031.1299149-1-yukuai1@huaweicloud.com> References: <20230426082031.1299149-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: Syh0CgBnW+k430hkUexVIA--.50201S8 X-Coremail-Antispam: 1UD129KBjvJXoWxuF15JFyrGw47Ww45Ww18AFb_yoW5AFy3pa yDGa4Ykr15JFW3X3ZxAa4DAFyFywn7tr9rKryfC395uFy3XFsxGFWrGay5twn7CrnxCFsx Xr1YkryUCr1UXrJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUU9C14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwCF04k20xvY0x0EwIxGrw CFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE 14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2 IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxK x2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI 0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUQSdkUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org From: Yu Kuai Commit 6cce3b23f6f8 ("[PATCH] md: write intent bitmap support for raid10") add bitmap support, and it changed that write io is submitted through daemon thread because bitmap need to be updated before write io. And later, plug is used to fix performance regression because all the write io will go to demon thread, which means io can't be issued concurrently. However, if bitmap is not enabled, the write io should not go to demon thread in the first place, and plug is not needed as well. Fixes: 6cce3b23f6f8 ("[PATCH] md: write intent bitmap support for raid10") Signed-off-by: Yu Kuai --- drivers/md/md-bitmap.c | 4 +--- drivers/md/md-bitmap.h | 7 +++++++ drivers/md/raid1-10.c | 13 +++++++++++-- 3 files changed, 19 insertions(+), 5 deletions(-) diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c index 2a21633c5917..17e988f88303 100644 --- a/drivers/md/md-bitmap.c +++ b/drivers/md/md-bitmap.c @@ -1023,7 +1023,6 @@ static int md_bitmap_file_test_bit(struct bitmap *bitmap, sector_t block) return set; } - /* this gets called when the md device is ready to unplug its underlying * (slave) device queues -- before we let any writes go down, we need to * sync the dirty pages of the bitmap file to disk */ @@ -1033,8 +1032,7 @@ void md_bitmap_unplug(struct bitmap *bitmap) int dirty, need_write; int writing = 0; - if (!bitmap || !bitmap->storage.filemap || - test_bit(BITMAP_STALE, &bitmap->flags)) + if (!md_bitmap_enabled(bitmap)) return; /* look at each page to see if there are any set bits that need to be diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h index cfd7395de8fd..3a4750952b3a 100644 --- a/drivers/md/md-bitmap.h +++ b/drivers/md/md-bitmap.h @@ -273,6 +273,13 @@ int md_bitmap_copy_from_slot(struct mddev *mddev, int slot, sector_t *lo, sector_t *hi, bool clear_bits); void md_bitmap_free(struct bitmap *bitmap); void md_bitmap_wait_behind_writes(struct mddev *mddev); + +static inline bool md_bitmap_enabled(struct bitmap *bitmap) +{ + return bitmap && bitmap->storage.filemap && + !test_bit(BITMAP_STALE, &bitmap->flags); +} + #endif #endif diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c index 25d55036a6fb..4167bda139b1 100644 --- a/drivers/md/raid1-10.c +++ b/drivers/md/raid1-10.c @@ -130,9 +130,18 @@ static inline bool md_add_bio_to_plug(struct mddev *mddev, struct bio *bio, blk_plug_cb_fn unplug) { struct raid1_plug_cb *plug = NULL; - struct blk_plug_cb *cb = blk_check_plugged(unplug, mddev, - sizeof(*plug)); + struct blk_plug_cb *cb; + + /* + * If bitmap is not enabled, it's safe to submit the io directly, and + * this can get optimal performance. + */ + if (!md_bitmap_enabled(mddev->bitmap)) { + md_submit_write(bio); + return true; + } + cb = blk_check_plugged(unplug, mddev, sizeof(*plug)); if (!cb) return false; From patchwork Wed Apr 26 08:20:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13224232 X-Patchwork-Delegate: song@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D19F6C7618E for ; Wed, 26 Apr 2023 08:22:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240148AbjDZIWg (ORCPT ); Wed, 26 Apr 2023 04:22:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44960 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240126AbjDZIW2 (ORCPT ); Wed, 26 Apr 2023 04:22:28 -0400 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B8B4C198C; Wed, 26 Apr 2023 01:22:25 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Q5sL55Nq2z4f553M; Wed, 26 Apr 2023 16:22:21 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP2 (Coremail) with SMTP id Syh0CgBnW+k430hkUexVIA--.50201S9; Wed, 26 Apr 2023 16:22:22 +0800 (CST) From: Yu Kuai To: song@kernel.org, akpm@osdl.org, neilb@suse.de Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v2 5/7] md/md-bitmap: add a new helper to unplug bitmap asynchrously Date: Wed, 26 Apr 2023 16:20:29 +0800 Message-Id: <20230426082031.1299149-6-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230426082031.1299149-1-yukuai1@huaweicloud.com> References: <20230426082031.1299149-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: Syh0CgBnW+k430hkUexVIA--.50201S9 X-Coremail-Antispam: 1UD129KBjvJXoWxZFWfZF1UZw4rXr1xAFyxZrb_yoWrAw4DpF 98twn8Kr45JFWaq343Ary2kF1YyF1vqrnrJryfG3s8CFy3XFnxJF48GFyjywn8GrsxGFs8 Zw1rJF98Cr1fWF7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUU9K14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwCF04k20xvY0x0EwIxGrw CFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE 14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2 IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAv wI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVW8JVWxJwCI42IY6I8E87Iv6xkF7I0E14 v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjfUOBTYUUUUU X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org From: Yu Kuai If bitmap is enabled, bitmap must update before submitting write io, this is why unplug callback must move these io to 'conf->pending_io_list' if 'current->bio_list' is not empty, which will suffer performance degradation. This patch add a new helper md_bitmap_unplug_async() to submit bitmap io in a kworker, so that submit bitmap io in raid10_unplug() doesn't require that 'current->bio_list' is empty. This patch prepare to limit the number of plugged bio. Signed-off-by: Yu Kuai --- drivers/md/md-bitmap.c | 51 +++++++++++++++++++++++++++++++++++++++--- drivers/md/md-bitmap.h | 3 +++ 2 files changed, 51 insertions(+), 3 deletions(-) diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c index 17e988f88303..1afee157b96a 100644 --- a/drivers/md/md-bitmap.c +++ b/drivers/md/md-bitmap.c @@ -1023,10 +1023,12 @@ static int md_bitmap_file_test_bit(struct bitmap *bitmap, sector_t block) return set; } -/* this gets called when the md device is ready to unplug its underlying +/* + * This gets called when the md device is ready to unplug its underlying * (slave) device queues -- before we let any writes go down, we need to - * sync the dirty pages of the bitmap file to disk */ -void md_bitmap_unplug(struct bitmap *bitmap) + * sync the dirty pages of the bitmap file to disk. + */ +static void md_do_bitmap_unplug(struct bitmap *bitmap) { unsigned long i; int dirty, need_write; @@ -1059,8 +1061,42 @@ void md_bitmap_unplug(struct bitmap *bitmap) if (test_bit(BITMAP_WRITE_ERROR, &bitmap->flags)) md_bitmap_file_kick(bitmap); } + +void md_bitmap_unplug(struct bitmap *bitmap) +{ + return md_do_bitmap_unplug(bitmap); +} EXPORT_SYMBOL(md_bitmap_unplug); +struct bitmap_unplug_work { + struct work_struct work; + struct bitmap *bitmap; + struct completion *done; +}; + +static void md_bitmap_unplug_fn(struct work_struct *work) +{ + struct bitmap_unplug_work *unplug_work = + container_of(work, struct bitmap_unplug_work, work); + + md_do_bitmap_unplug(unplug_work->bitmap); + complete(unplug_work->done); +} + +void md_bitmap_unplug_async(struct bitmap *bitmap) +{ + DECLARE_COMPLETION_ONSTACK(done); + struct bitmap_unplug_work unplug_work; + + INIT_WORK(&unplug_work.work, md_bitmap_unplug_fn); + unplug_work.bitmap = bitmap; + unplug_work.done = &done; + + queue_work(bitmap->unplug_wq, &unplug_work.work); + wait_for_completion(&done); +} +EXPORT_SYMBOL(md_bitmap_unplug_async); + static void md_bitmap_set_memory_bits(struct bitmap *bitmap, sector_t offset, int needed); /* * bitmap_init_from_disk -- called at bitmap_create time to initialize * the in-memory bitmap from the on-disk bitmap -- also, sets up the @@ -1776,6 +1812,9 @@ void md_bitmap_free(struct bitmap *bitmap) if (!bitmap) /* there was no bitmap */ return; + if (bitmap->unplug_wq) + destroy_workqueue(bitmap->unplug_wq); + if (bitmap->sysfs_can_clear) sysfs_put(bitmap->sysfs_can_clear); @@ -1866,6 +1905,12 @@ struct bitmap *md_bitmap_create(struct mddev *mddev, int slot) if (!bitmap) return ERR_PTR(-ENOMEM); + bitmap->unplug_wq = create_workqueue("md_bitmap"); + if (!bitmap->unplug_wq) { + err = -ENOMEM; + goto error; + } + spin_lock_init(&bitmap->counts.lock); atomic_set(&bitmap->pending_writes, 0); init_waitqueue_head(&bitmap->write_wait); diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h index 3a4750952b3a..55531669db24 100644 --- a/drivers/md/md-bitmap.h +++ b/drivers/md/md-bitmap.h @@ -231,6 +231,8 @@ struct bitmap { struct kernfs_node *sysfs_can_clear; int cluster_slot; /* Slot offset for clustered env */ + + struct workqueue_struct *unplug_wq; }; /* the bitmap API */ @@ -264,6 +266,7 @@ void md_bitmap_sync_with_cluster(struct mddev *mddev, sector_t new_lo, sector_t new_hi); void md_bitmap_unplug(struct bitmap *bitmap); +void md_bitmap_unplug_async(struct bitmap *bitmap); void md_bitmap_daemon_work(struct mddev *mddev); int md_bitmap_resize(struct bitmap *bitmap, sector_t blocks, From patchwork Wed Apr 26 08:20:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13224233 X-Patchwork-Delegate: song@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2131EC7618E for ; Wed, 26 Apr 2023 08:22:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240176AbjDZIWi (ORCPT ); Wed, 26 Apr 2023 04:22:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44930 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240118AbjDZIW1 (ORCPT ); Wed, 26 Apr 2023 04:22:27 -0400 Received: from dggsgout12.his.huawei.com (unknown [45.249.212.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D8E6A3591; Wed, 26 Apr 2023 01:22:25 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Q5sL54Zfhz4f4WY6; Wed, 26 Apr 2023 16:22:21 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP2 (Coremail) with SMTP id Syh0CgBnW+k430hkUexVIA--.50201S10; Wed, 26 Apr 2023 16:22:23 +0800 (CST) From: Yu Kuai To: song@kernel.org, akpm@osdl.org, neilb@suse.de Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v2 6/7] md/raid1-10: don't handle pluged bio by daemon thread Date: Wed, 26 Apr 2023 16:20:30 +0800 Message-Id: <20230426082031.1299149-7-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230426082031.1299149-1-yukuai1@huaweicloud.com> References: <20230426082031.1299149-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: Syh0CgBnW+k430hkUexVIA--.50201S10 X-Coremail-Antispam: 1UD129KBjvJXoWxZw4DAw1ruF48GFyfWrW8tFb_yoWrAF1xp3 yYqa1YqrWUJFW5Xw4DZFW8uFyFq3WvgFZrAFZ5uws5uFy3XF9xWFWrGFW0q34DZrsxGFy7 Ary5trWDGa1YvFJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUU9K14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwCF04k20xvY0x0EwIxGrw CFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE 14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2 IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAv wI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVW8JVWxJwCI42IY6I8E87Iv6xkF7I0E14 v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjfUOBTYUUUUU X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org From: Yu Kuai current->bio_list will be set under submit_bio() context, in this case bitmap io will be added to the list and wait for current io submission to finish, while current io submission must wait for bitmap io to be done. commit 874807a83139 ("md/raid1{,0}: fix deadlock in bitmap_unplug.") fix the deadlock by handling plugged bio by daemon thread. On the one hand, the deadlock won't exist after commit a214b949d8e3 ("blk-mq: only flush requests from the plug in blk_mq_submit_bio"). On the other hand, current solution makes it impossible to flush plugged bio in raid1/10_make_request(), because this will cause that all the writes will goto daemon thread. In order to limit the number of plugged bio, commit 874807a83139 ("md/raid1{,0}: fix deadlock in bitmap_unplug.") is reverted, and the deadlock is fixed by handling bitmap io asynchronously. Signed-off-by: Yu Kuai --- drivers/md/raid1-10.c | 14 ++++++++++++++ drivers/md/raid1.c | 4 ++-- drivers/md/raid10.c | 8 +++----- 3 files changed, 19 insertions(+), 7 deletions(-) diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c index 4167bda139b1..98d678b7df3f 100644 --- a/drivers/md/raid1-10.c +++ b/drivers/md/raid1-10.c @@ -150,3 +150,17 @@ static inline bool md_add_bio_to_plug(struct mddev *mddev, struct bio *bio, return true; } + +/* + * current->bio_list will be set under submit_bio() context, in this case bitmap + * io will be added to the list and wait for current io submission to finish, + * while current io submission must wait for bitmap io to be done. In order to + * avoid such deadlock, submit bitmap io asynchronously. + */ +static inline void md_prepare_flush_writes(struct bitmap *bitmap) +{ + if (current->bio_list) + md_bitmap_unplug_async(bitmap); + else + md_bitmap_unplug(bitmap); +} diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index d79525d6ec8f..639e09cecf01 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -794,7 +794,7 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect static void flush_bio_list(struct r1conf *conf, struct bio *bio) { /* flush any pending bitmap writes to disk before proceeding w/ I/O */ - md_bitmap_unplug(conf->mddev->bitmap); + md_prepare_flush_writes(conf->mddev->bitmap); wake_up(&conf->wait_barrier); while (bio) { /* submit pending writes */ @@ -1166,7 +1166,7 @@ static void raid1_unplug(struct blk_plug_cb *cb, bool from_schedule) struct r1conf *conf = mddev->private; struct bio *bio; - if (from_schedule || current->bio_list) { + if (from_schedule) { spin_lock_irq(&conf->device_lock); bio_list_merge(&conf->pending_bio_list, &plug->pending); spin_unlock_irq(&conf->device_lock); diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 626cb8ed2099..bd9e655ca408 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -902,9 +902,7 @@ static void flush_pending_writes(struct r10conf *conf) __set_current_state(TASK_RUNNING); blk_start_plug(&plug); - /* flush any pending bitmap writes to disk - * before proceeding w/ I/O */ - md_bitmap_unplug(conf->mddev->bitmap); + md_prepare_flush_writes(conf->mddev->bitmap); wake_up(&conf->wait_barrier); while (bio) { /* submit pending writes */ @@ -1108,7 +1106,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule) struct r10conf *conf = mddev->private; struct bio *bio; - if (from_schedule || current->bio_list) { + if (from_schedule) { spin_lock_irq(&conf->device_lock); bio_list_merge(&conf->pending_bio_list, &plug->pending); spin_unlock_irq(&conf->device_lock); @@ -1120,7 +1118,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule) /* we aren't scheduling, so we can do the write-out directly. */ bio = bio_list_get(&plug->pending); - md_bitmap_unplug(mddev->bitmap); + md_prepare_flush_writes(mddev->bitmap); wake_up(&conf->wait_barrier); while (bio) { /* submit pending writes */ From patchwork Wed Apr 26 08:20:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13224230 X-Patchwork-Delegate: song@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B522BC77B60 for ; Wed, 26 Apr 2023 08:22:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240154AbjDZIWd (ORCPT ); Wed, 26 Apr 2023 04:22:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44962 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240125AbjDZIW2 (ORCPT ); Wed, 26 Apr 2023 04:22:28 -0400 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 70A333C1B; Wed, 26 Apr 2023 01:22:26 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Q5sL649TFz4f55b3; Wed, 26 Apr 2023 16:22:22 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP2 (Coremail) with SMTP id Syh0CgBnW+k430hkUexVIA--.50201S11; Wed, 26 Apr 2023 16:22:23 +0800 (CST) From: Yu Kuai To: song@kernel.org, akpm@osdl.org, neilb@suse.de Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v2 7/7] md/raid1-10: limit the number of plugged bio Date: Wed, 26 Apr 2023 16:20:31 +0800 Message-Id: <20230426082031.1299149-8-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230426082031.1299149-1-yukuai1@huaweicloud.com> References: <20230426082031.1299149-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: Syh0CgBnW+k430hkUexVIA--.50201S11 X-Coremail-Antispam: 1UD129KBjvJXoWxZr4rKF4fXry8CFyfZry3XFb_yoW5try8pa 1Dta4YvrWUZFW7X3yDJayUCFyFga1DWFZFkrZ5C395ZF17XFWjga15JFWrWr1DZFZxGFy3 J3Z8Kr4xGF15tF7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUU9E14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwCF04k20xvY0x0EwIxGrw CFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE 14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2 IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAv wI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVW8JVWxJwCI42IY6I8E87Iv6xkF7I0E14 v26r4UJVWxJrUvcSsGvfC2KfnxnUUI43ZEXa7VUbmZX7UUUUU== X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org From: Yu Kuai bio can be added to plug infinitely, and following writeback test can trigger huge amount of plugged bio: Test script: modprobe brd rd_nr=4 rd_size=10485760 mdadm -CR /dev/md0 -l10 -n4 /dev/ram[0123] --assume-clean echo 0 > /proc/sys/vm/dirty_background_ratio echo 60 > /proc/sys/vm/dirty_ratio fio -filename=/dev/md0 -ioengine=libaio -rw=write -bs=4k -numjobs=1 -iodepth=128 -name=test Test result: Monitor /sys/block/md0/inflight will found that inflight keep increasing until fio finish writing, after running for about 2 minutes: [root@fedora ~]# cat /sys/block/md0/inflight 0 4474191 Fix the problem by limiting the number of plugged bio based on the number of copies for original bio. Signed-off-by: Yu Kuai --- drivers/md/raid1-10.c | 9 ++++++++- drivers/md/raid1.c | 2 +- drivers/md/raid10.c | 2 +- 3 files changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c index 98d678b7df3f..35fb80aa37aa 100644 --- a/drivers/md/raid1-10.c +++ b/drivers/md/raid1-10.c @@ -21,6 +21,7 @@ #define IO_MADE_GOOD ((struct bio *)2) #define BIO_SPECIAL(bio) ((unsigned long)bio <= 2) +#define MAX_PLUG_BIO 32 /* for managing resync I/O pages */ struct resync_pages { @@ -31,6 +32,7 @@ struct resync_pages { struct raid1_plug_cb { struct blk_plug_cb cb; struct bio_list pending; + unsigned int count; }; static void rbio_pool_free(void *rbio, void *data) @@ -127,7 +129,7 @@ static inline void md_submit_write(struct bio *bio) } static inline bool md_add_bio_to_plug(struct mddev *mddev, struct bio *bio, - blk_plug_cb_fn unplug) + blk_plug_cb_fn unplug, int copies) { struct raid1_plug_cb *plug = NULL; struct blk_plug_cb *cb; @@ -147,6 +149,11 @@ static inline bool md_add_bio_to_plug(struct mddev *mddev, struct bio *bio, plug = container_of(cb, struct raid1_plug_cb, cb); bio_list_add(&plug->pending, bio); + if (++plug->count / MAX_PLUG_BIO >= copies) { + list_del(&cb->list); + cb->callback(cb, false); + } + return true; } diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 639e09cecf01..c6066408a913 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -1562,7 +1562,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio, r1_bio->sector); /* flush_pending_writes() needs access to the rdev so...*/ mbio->bi_bdev = (void *)rdev; - if (!md_add_bio_to_plug(mddev, mbio, raid1_unplug)) { + if (!md_add_bio_to_plug(mddev, mbio, raid1_unplug, disks)) { spin_lock_irqsave(&conf->device_lock, flags); bio_list_add(&conf->pending_bio_list, mbio); spin_unlock_irqrestore(&conf->device_lock, flags); diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index bd9e655ca408..7135cfaf75db 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -1306,7 +1306,7 @@ static void raid10_write_one_disk(struct mddev *mddev, struct r10bio *r10_bio, atomic_inc(&r10_bio->remaining); - if (!md_add_bio_to_plug(mddev, mbio, raid10_unplug)) { + if (!md_add_bio_to_plug(mddev, mbio, raid10_unplug, conf->copies)) { spin_lock_irqsave(&conf->device_lock, flags); bio_list_add(&conf->pending_bio_list, mbio); spin_unlock_irqrestore(&conf->device_lock, flags);