From patchwork Thu Apr 13 00:06:46 2023
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 13209676
From: Tejun Heo
To: axboe@kernel.dk, josef@toxicpanda.com, hch@lst.de
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai1@huaweicloud.com, Tejun Heo
Subject: [PATCH 1/4] blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish()
Date: Wed, 12 Apr 2023 14:06:46 -1000
Message-Id: <20230413000649.115785-2-tj@kernel.org>
In-Reply-To: <20230413000649.115785-1-tj@kernel.org>
References: <20230413000649.115785-1-tj@kernel.org>
X-Mailing-List: linux-block@vger.kernel.org

Now that all RCU flavors have been combined either holding a spin lock,
disabling irq or disabling preemption implies RCU read lock, so there's no
need to use rcu_read_[un]lock() explicitly
while holding queue_lock. This shouldn't cause any behavior changes.

v2: Description updated. Leave __acquires/release on queue_lock alone.

Signed-off-by: Tejun Heo
Reviewed-by: Christoph Hellwig
---
 block/blk-cgroup.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 18331cb92914..0a2c19d74d95 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -699,12 +699,12 @@ struct block_device *blkcg_conf_open_bdev(char **inputp)
  *
  * Parse per-blkg config update from @input and initialize @ctx with the
  * result. @ctx->blkg points to the blkg to be updated and @ctx->body the
- * part of @input following MAJ:MIN. This function returns with RCU read
- * lock and queue lock held and must be paired with blkg_conf_finish().
+ * part of @input following MAJ:MIN. This function returns with queue lock
+ * held and must be paired with blkg_conf_finish().
  */
 int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 		   char *input, struct blkg_conf_ctx *ctx)
-	__acquires(rcu) __acquires(&bdev->bd_queue->queue_lock)
+	__acquires(&bdev->bd_queue->queue_lock)
 {
 	struct block_device *bdev;
 	struct gendisk *disk;
@@ -726,7 +726,6 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 	if (ret)
 		goto fail;

-	rcu_read_lock();
 	spin_lock_irq(&q->queue_lock);

 	if (!blkcg_policy_enabled(q, pol)) {
@@ -755,7 +754,6 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,

 		/* Drop locks to do new blkg allocation with GFP_KERNEL. */
 		spin_unlock_irq(&q->queue_lock);
-		rcu_read_unlock();

 		new_blkg = blkg_alloc(pos, disk, GFP_KERNEL);
 		if (unlikely(!new_blkg)) {
@@ -769,7 +767,6 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 			goto fail_exit_queue;
 		}

-		rcu_read_lock();
 		spin_lock_irq(&q->queue_lock);

 		if (!blkcg_policy_enabled(q, pol)) {
@@ -805,7 +802,6 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 	radix_tree_preload_end();
 fail_unlock:
 	spin_unlock_irq(&q->queue_lock);
-	rcu_read_unlock();
 fail_exit_queue:
 	blk_queue_exit(q);
 fail:
@@ -832,10 +828,9 @@ EXPORT_SYMBOL_GPL(blkg_conf_prep);
  * with blkg_conf_prep().
  */
 void blkg_conf_finish(struct blkg_conf_ctx *ctx)
-	__releases(&ctx->bdev->bd_queue->queue_lock) __releases(rcu)
+	__releases(&ctx->bdev->bd_queue->queue_lock)
 {
 	spin_unlock_irq(&bdev_get_queue(ctx->bdev)->queue_lock);
-	rcu_read_unlock();
 	blkdev_put_no_open(ctx->bdev);
 }
 EXPORT_SYMBOL_GPL(blkg_conf_finish);

From patchwork Thu Apr 13 00:06:47 2023
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 13209678
From: Tejun Heo
To: axboe@kernel.dk, josef@toxicpanda.com, hch@lst.de
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai1@huaweicloud.com, Tejun Heo
Subject: [PATCH 2/4] blkcg: Restructure blkg_conf_prep() and friends
Date: Wed, 12 Apr 2023 14:06:47 -1000
Message-Id: <20230413000649.115785-3-tj@kernel.org>
In-Reply-To: <20230413000649.115785-1-tj@kernel.org>
References: <20230413000649.115785-1-tj@kernel.org>
X-Mailing-List: linux-block@vger.kernel.org

We want to support lazy init of rq-qos policies so that iolatency is enabled
lazily on configuration instead of gendisk initialization. The way blkg
config helpers are structured now is a bit awkward for that. Let's
restructure:

* blkcg_conf_open_bdev() is renamed to blkg_conf_open_bdev(). The blkcg_
  prefix was used because the bdev opening step is blkg-independent.
  However, the distinction is too subtle and confuses more than helps. Let's
  switch to blkg prefix so that it's consistent with the type and other
  helper names.

* struct blkg_conf_ctx now remembers the original input string and is always
  initialized by the new blkg_conf_init().

* blkg_conf_open_bdev() is updated to take a pointer to blkg_conf_ctx like
  blkg_conf_prep() and can be called multiple times safely. Instead of
  modifying the double pointer to input string directly,
  blkg_conf_open_bdev() now sets blkg_conf_ctx->body.

* blkg_conf_finish() is renamed to blkg_conf_exit() for symmetry and now
  must be called on all blkg_conf_ctx's which were initialized with
  blkg_conf_init().
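
As an illustration only, the init / open / exit lifecycle listed above can be
mimicked in userspace C. This is a sketch, not the kernel code: struct
conf_ctx and the conf_*() helpers are hypothetical stand-ins for struct
blkg_conf_ctx and blkg_conf_init/open_bdev/exit(), no block device is
opened, and a plain flag takes the place of ctx->bdev.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for struct blkg_conf_ctx. */
struct conf_ctx {
	char *input;		/* original input string, "MAJ:MIN body" */
	char *body;		/* part of input following MAJ:MIN */
	int dev_open;		/* stands in for ctx->bdev != NULL */
	unsigned int major, minor;
};

/* Always the first call: zero the context and remember the input. */
void conf_init(struct conf_ctx *ctx, char *input)
{
	assert(ctx && input);
	*ctx = (struct conf_ctx){ .input = input };
}

/*
 * Parse the MAJ:MIN prefix and record where the body starts.
 * Calling this again on an already opened context is a no-op.
 */
int conf_open_dev(struct conf_ctx *ctx)
{
	char *input = ctx->input;
	unsigned int major, minor;
	int key_len;

	if (ctx->dev_open)
		return 0;

	if (sscanf(input, "%u:%u%n", &major, &minor, &key_len) != 2)
		return -EINVAL;

	input += key_len;
	if (!isspace((unsigned char)*input))
		return -EINVAL;
	while (isspace((unsigned char)*input))
		input++;

	ctx->major = major;
	ctx->minor = minor;
	ctx->body = input;
	ctx->dev_open = 1;
	return 0;
}

/* Safe on every initialized context, whether or not the open succeeded. */
void conf_exit(struct conf_ctx *ctx)
{
	if (ctx->dev_open) {
		ctx->body = NULL;
		ctx->dev_open = 0;
	}
}
```

A caller would run conf_init(), then conf_open_dev() as often as it likes,
and finish with conf_exit() on every path, including the error paths. That
single cleanup call is what lets the converted kernel callers replace their
early "return ret" with a goto to a common exit label.
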

Combined, this allows the users to either open the bdev first or do it all
together with blkg_conf_prep(), which will help implement lazy init of
rq-qos policies.

blkg_conf_init/exit() will also be used to implement synchronization against
device removal. This is necessary because iolat / iocost are configured
through cgroupfs instead of one of the files under /sys/block/DEVICE. As
cgroupfs operations aren't synchronized with the block layer, the lazy init
and other configuration operations may race against device removal. This
patch makes blkg_conf_init/exit() used consistently for all
cgroup-originating configurations, making them a good place to implement
explicit synchronization.

Users are updated accordingly. No behavior change is intended by this patch.

v2: bfq wasn't updated in v1 causing a build error. Fixed.

v3: Update the description to include future use of blkg_conf_init/exit() as
    synchronization points.

Signed-off-by: Tejun Heo
Cc: Josef Bacik
Cc: Christoph Hellwig
Cc: Yu Kuai
Reviewed-by: Christoph Hellwig
---
 block/bfq-cgroup.c    |   8 ++--
 block/blk-cgroup.c    | 105 +++++++++++++++++++++++++++---------------
 block/blk-cgroup.h    |  10 ++--
 block/blk-iocost.c    |  58 +++++++++++++----------
 block/blk-iolatency.c |   8 ++--
 block/blk-throttle.c  |  16 ++++---
 6 files changed, 127 insertions(+), 78 deletions(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 74f7d051665b..2c90e5de0acd 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -1105,9 +1105,11 @@ static ssize_t bfq_io_set_device_weight(struct kernfs_open_file *of,
 	struct bfq_group *bfqg;
 	u64 v;

-	ret = blkg_conf_prep(blkcg, &blkcg_policy_bfq, buf, &ctx);
+	blkg_conf_init(&ctx, buf);
+
+	ret = blkg_conf_prep(blkcg, &blkcg_policy_bfq, &ctx);
 	if (ret)
-		return ret;
+		goto out;

 	if (sscanf(ctx.body, "%llu", &v) == 1) {
 		/* require "default" on dfl */
@@ -1129,7 +1131,7 @@ static ssize_t bfq_io_set_device_weight(struct kernfs_open_file *of,
 		ret = 0;
 	}
 out:
-	blkg_conf_finish(&ctx);
+	blkg_conf_exit(&ctx);
 	return ret ?: nbytes;
 }

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 0a2c19d74d95..c154b08a7e92 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -653,69 +653,93 @@ u64 __blkg_prfill_u64(struct seq_file *sf, struct blkg_policy_data *pd, u64 v)
 EXPORT_SYMBOL_GPL(__blkg_prfill_u64);

 /**
- * blkcg_conf_open_bdev - parse and open bdev for per-blkg config update
- * @inputp: input string pointer
+ * blkg_conf_init - initialize a blkg_conf_ctx
+ * @ctx: blkg_conf_ctx to initialize
+ * @input: input string
  *
- * Parse the device node prefix part, MAJ:MIN, of per-blkg config update
- * from @input and get and return the matching bdev. *@inputp is
- * updated to point past the device node prefix. Returns an ERR_PTR()
- * value on error.
+ * Initialize @ctx which can be used to parse blkg config input string @input.
+ * Once initialized, @ctx can be used with blkg_conf_open_bdev() and
+ * blkg_conf_prep(), and must be cleaned up with blkg_conf_exit().
+ */
+void blkg_conf_init(struct blkg_conf_ctx *ctx, char *input)
+{
+	*ctx = (struct blkg_conf_ctx){ .input = input };
+}
+EXPORT_SYMBOL_GPL(blkg_conf_init);
+
+/**
+ * blkg_conf_open_bdev - parse and open bdev for per-blkg config update
+ * @ctx: blkg_conf_ctx initialized with blkg_conf_init()
  *
- * Use this function iff blkg_conf_prep() can't be used for some reason.
+ * Parse the device node prefix part, MAJ:MIN, of per-blkg config update from
+ * @ctx->input and get and store the matching bdev in @ctx->bdev. @ctx->body is
+ * set to point past the device node prefix.
+ *
+ * This function may be called multiple times on @ctx and the extra calls become
+ * NOOPs. blkg_conf_prep() implicitly calls this function. Use this function
+ * explicitly if bdev access is needed without resolving the blkcg / policy part
+ * of @ctx->input. Returns -errno on error.
  */
-struct block_device *blkcg_conf_open_bdev(char **inputp)
+int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx)
 {
-	char *input = *inputp;
+	char *input = ctx->input;
 	unsigned int major, minor;
 	struct block_device *bdev;
 	int key_len;

+	if (ctx->bdev)
+		return 0;
+
 	if (sscanf(input, "%u:%u%n", &major, &minor, &key_len) != 2)
-		return ERR_PTR(-EINVAL);
+		return -EINVAL;

 	input += key_len;
 	if (!isspace(*input))
-		return ERR_PTR(-EINVAL);
+		return -EINVAL;
 	input = skip_spaces(input);

 	bdev = blkdev_get_no_open(MKDEV(major, minor));
 	if (!bdev)
-		return ERR_PTR(-ENODEV);
+		return -ENODEV;
 	if (bdev_is_partition(bdev)) {
 		blkdev_put_no_open(bdev);
-		return ERR_PTR(-ENODEV);
+		return -ENODEV;
 	}

-	*inputp = input;
-	return bdev;
+	ctx->body = input;
+	ctx->bdev = bdev;
+	return 0;
 }

 /**
  * blkg_conf_prep - parse and prepare for per-blkg config update
  * @blkcg: target block cgroup
  * @pol: target policy
- * @input: input string
- * @ctx: blkg_conf_ctx to be filled
+ * @ctx: blkg_conf_ctx initialized with blkg_conf_init()
+ *
+ * Parse per-blkg config update from @ctx->input and initialize @ctx
+ * accordingly. On success, @ctx->body points to the part of @ctx->input
+ * following MAJ:MIN, @ctx->bdev points to the target block device and
+ * @ctx->blkg to the blkg being configured.
  *
- * Parse per-blkg config update from @input and initialize @ctx with the
- * result. @ctx->blkg points to the blkg to be updated and @ctx->body the
- * part of @input following MAJ:MIN. This function returns with queue lock
- * held and must be paired with blkg_conf_finish().
+ * blkg_conf_open_bdev() may be called on @ctx beforehand. On success, this
+ * function returns with queue lock held and must be followed by
+ * blkg_conf_exit().
  */
 int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
-		   char *input, struct blkg_conf_ctx *ctx)
+		   struct blkg_conf_ctx *ctx)
 	__acquires(&bdev->bd_queue->queue_lock)
 {
-	struct block_device *bdev;
 	struct gendisk *disk;
 	struct request_queue *q;
 	struct blkcg_gq *blkg;
 	int ret;

-	bdev = blkcg_conf_open_bdev(&input);
-	if (IS_ERR(bdev))
-		return PTR_ERR(bdev);
-	disk = bdev->bd_disk;
+	ret = blkg_conf_open_bdev(ctx);
+	if (ret)
+		return ret;
+
+	disk = ctx->bdev->bd_disk;
 	q = disk->queue;

 	/*
@@ -793,9 +817,7 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 	}
 success:
 	blk_queue_exit(q);
-	ctx->bdev = bdev;
 	ctx->blkg = blkg;
-	ctx->body = input;
 	return 0;

 fail_preloaded:
@@ -805,7 +827,6 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 fail_exit_queue:
 	blk_queue_exit(q);
 fail:
-	blkdev_put_no_open(bdev);
 	/*
 	 * If queue was bypassing, we should retry. Do so after a
 	 * short msleep(). It isn't strictly necessary but queue
@@ -821,19 +842,27 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 EXPORT_SYMBOL_GPL(blkg_conf_prep);

 /**
- * blkg_conf_finish - finish up per-blkg config update
- * @ctx: blkg_conf_ctx initialized by blkg_conf_prep()
+ * blkg_conf_exit - clean up per-blkg config update
+ * @ctx: blkg_conf_ctx initialized with blkg_conf_init()
  *
- * Finish up after per-blkg config update. This function must be paired
- * with blkg_conf_prep().
+ * Clean up after per-blkg config update. This function must be called on all
+ * blkg_conf_ctx's initialized with blkg_conf_init().
  */
-void blkg_conf_finish(struct blkg_conf_ctx *ctx)
+void blkg_conf_exit(struct blkg_conf_ctx *ctx)
 	__releases(&ctx->bdev->bd_queue->queue_lock)
 {
-	spin_unlock_irq(&bdev_get_queue(ctx->bdev)->queue_lock);
-	blkdev_put_no_open(ctx->bdev);
+	if (ctx->blkg) {
+		spin_unlock_irq(&bdev_get_queue(ctx->bdev)->queue_lock);
+		ctx->blkg = NULL;
+	}
+
+	if (ctx->bdev) {
+		blkdev_put_no_open(ctx->bdev);
+		ctx->body = NULL;
+		ctx->bdev = NULL;
+	}
 }
-EXPORT_SYMBOL_GPL(blkg_conf_finish);
+EXPORT_SYMBOL_GPL(blkg_conf_exit);

 static void blkg_iostat_set(struct blkg_iostat *dst, struct blkg_iostat *src)
 {
diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index 2c6788658544..d6ad3abc6eca 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -206,15 +206,17 @@ void blkcg_print_blkgs(struct seq_file *sf, struct blkcg *blkcg,
 u64 __blkg_prfill_u64(struct seq_file *sf, struct blkg_policy_data *pd, u64 v);

 struct blkg_conf_ctx {
+	char *input;
+	char *body;
 	struct block_device *bdev;
 	struct blkcg_gq *blkg;
-	char *body;
 };

-struct block_device *blkcg_conf_open_bdev(char **inputp);
+void blkg_conf_init(struct blkg_conf_ctx *ctx, char *input);
+int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx);
 int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
-		   char *input, struct blkg_conf_ctx *ctx);
-void blkg_conf_finish(struct blkg_conf_ctx *ctx);
+		   struct blkg_conf_ctx *ctx);
+void blkg_conf_exit(struct blkg_conf_ctx *ctx);

 /**
  * bio_issue_as_root_blkg - see if this bio needs to be issued as root blkg
diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 4442c7a85112..285ced3467ab 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -3106,9 +3106,11 @@ static ssize_t ioc_weight_write(struct kernfs_open_file *of, char *buf,
 		return nbytes;
 	}

-	ret = blkg_conf_prep(blkcg, &blkcg_policy_iocost, buf, &ctx);
+	blkg_conf_init(&ctx, buf);
+
+	ret = blkg_conf_prep(blkcg, &blkcg_policy_iocost, &ctx);
 	if (ret)
-		return ret;
+		goto err;

 	iocg = blkg_to_iocg(ctx.blkg);

@@ -3127,12 +3129,14 @@ static ssize_t ioc_weight_write(struct kernfs_open_file *of, char *buf,
 	weight_updated(iocg, &now);
 	spin_unlock(&iocg->ioc->lock);

-	blkg_conf_finish(&ctx);
+	blkg_conf_exit(&ctx);
 	return nbytes;

 einval:
-	blkg_conf_finish(&ctx);
-	return -EINVAL;
+	ret = -EINVAL;
+err:
+	blkg_conf_exit(&ctx);
+	return ret;
 }

 static u64 ioc_qos_prfill(struct seq_file *sf, struct blkg_policy_data *pd,
@@ -3189,19 +3193,22 @@ static const match_table_t qos_tokens = {
 static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 			     size_t nbytes, loff_t off)
 {
-	struct block_device *bdev;
+	struct blkg_conf_ctx ctx;
 	struct gendisk *disk;
 	struct ioc *ioc;
 	u32 qos[NR_QOS_PARAMS];
 	bool enable, user;
-	char *p;
+	char *body, *p;
 	int ret;

-	bdev = blkcg_conf_open_bdev(&input);
-	if (IS_ERR(bdev))
-		return PTR_ERR(bdev);
+	blkg_conf_init(&ctx, input);

-	disk = bdev->bd_disk;
+	ret = blkg_conf_open_bdev(&ctx);
+	if (ret)
+		goto err;
+
+	body = ctx.body;
+	disk = ctx.bdev->bd_disk;
 	if (!queue_is_mq(disk->queue)) {
 		ret = -EOPNOTSUPP;
 		goto err;
@@ -3223,7 +3230,7 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 	enable = ioc->enabled;
 	user = ioc->user_qos_params;

-	while ((p = strsep(&input, " \t\n"))) {
+	while ((p = strsep(&body, " \t\n"))) {
 		substring_t args[MAX_OPT_ARGS];
 		char buf[32];
 		int tok;
@@ -3313,7 +3320,7 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 	blk_mq_unquiesce_queue(disk->queue);
 	blk_mq_unfreeze_queue(disk->queue);

-	blkdev_put_no_open(bdev);
+	blkg_conf_exit(&ctx);
 	return nbytes;
 einval:
 	spin_unlock_irq(&ioc->lock);
@@ -3323,7 +3330,7 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,

 	ret = -EINVAL;
 err:
-	blkdev_put_no_open(bdev);
+	blkg_conf_exit(&ctx);
 	return ret;
 }

@@ -3376,19 +3383,22 @@ static const match_table_t i_lcoef_tokens = {
 static ssize_t ioc_cost_model_write(struct kernfs_open_file *of, char *input,
 				    size_t nbytes, loff_t off)
 {
-	struct block_device *bdev;
+	struct blkg_conf_ctx ctx;
 	struct request_queue *q;
 	struct ioc *ioc;
 	u64 u[NR_I_LCOEFS];
 	bool user;
-	char *p;
+	char *body, *p;
 	int ret;

-	bdev = blkcg_conf_open_bdev(&input);
-	if (IS_ERR(bdev))
-		return PTR_ERR(bdev);
+	blkg_conf_init(&ctx, input);
+
+	ret = blkg_conf_open_bdev(&ctx);
+	if (ret)
+		goto err;

-	q = bdev_get_queue(bdev);
+	body = ctx.body;
+	q = bdev_get_queue(ctx.bdev);
 	if (!queue_is_mq(q)) {
 		ret = -EOPNOTSUPP;
 		goto err;
@@ -3396,7 +3406,7 @@ static ssize_t ioc_cost_model_write(struct kernfs_open_file *of, char *input,

 	ioc = q_to_ioc(q);
 	if (!ioc) {
-		ret = blk_iocost_init(bdev->bd_disk);
+		ret = blk_iocost_init(ctx.bdev->bd_disk);
 		if (ret)
 			goto err;
 		ioc = q_to_ioc(q);
@@ -3409,7 +3419,7 @@ static ssize_t ioc_cost_model_write(struct kernfs_open_file *of, char *input,
 	memcpy(u, ioc->params.i_lcoefs, sizeof(u));
 	user = ioc->user_cost_model;

-	while ((p = strsep(&input, " \t\n"))) {
+	while ((p = strsep(&body, " \t\n"))) {
 		substring_t args[MAX_OPT_ARGS];
 		char buf[32];
 		int tok;
@@ -3456,7 +3466,7 @@ static ssize_t ioc_cost_model_write(struct kernfs_open_file *of, char *input,
 	blk_mq_unquiesce_queue(q);
 	blk_mq_unfreeze_queue(q);

-	blkdev_put_no_open(bdev);
+	blkg_conf_exit(&ctx);
 	return nbytes;

 einval:
@@ -3467,7 +3477,7 @@ static ssize_t ioc_cost_model_write(struct kernfs_open_file *of, char *input,

 	ret = -EINVAL;
 err:
-	blkdev_put_no_open(bdev);
+	blkg_conf_exit(&ctx);
 	return ret;
 }

diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c
index 0dc910568b31..6707164c37f1 100644
--- a/block/blk-iolatency.c
+++ b/block/blk-iolatency.c
@@ -836,9 +836,11 @@ static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf,
 	u64 oldval;
 	int ret;

-	ret = blkg_conf_prep(blkcg, &blkcg_policy_iolatency, buf, &ctx);
+	blkg_conf_init(&ctx, buf);
+
+	ret = blkg_conf_prep(blkcg, &blkcg_policy_iolatency, &ctx);
 	if (ret)
-		return ret;
+		goto out;

 	iolat = blkg_to_lat(ctx.blkg);
 	p = ctx.body;
@@ -874,7 +876,7 @@ static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf,
 		iolatency_clear_scaling(blkg);
 	ret = 0;
 out:
-	blkg_conf_finish(&ctx);
+	blkg_conf_exit(&ctx);
 	return ret ?: nbytes;
 }

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 47e9d8be68f3..9bac95343ba0 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1368,9 +1368,11 @@ static ssize_t tg_set_conf(struct kernfs_open_file *of,
 	int ret;
 	u64 v;

-	ret = blkg_conf_prep(blkcg, &blkcg_policy_throtl, buf, &ctx);
+	blkg_conf_init(&ctx, buf);
+
+	ret = blkg_conf_prep(blkcg, &blkcg_policy_throtl, &ctx);
 	if (ret)
-		return ret;
+		goto out_finish;

 	ret = -EINVAL;
 	if (sscanf(ctx.body, "%llu", &v) != 1)
@@ -1389,7 +1391,7 @@ static ssize_t tg_set_conf(struct kernfs_open_file *of,
 	tg_conf_updated(tg, false);
 	ret = 0;
 out_finish:
-	blkg_conf_finish(&ctx);
+	blkg_conf_exit(&ctx);
 	return ret ?: nbytes;
 }

@@ -1561,9 +1563,11 @@ static ssize_t tg_set_limit(struct kernfs_open_file *of,
 	int ret;
 	int index = of_cft(of)->private;

-	ret = blkg_conf_prep(blkcg, &blkcg_policy_throtl, buf, &ctx);
+	blkg_conf_init(&ctx, buf);
+
+	ret = blkg_conf_prep(blkcg, &blkcg_policy_throtl, &ctx);
 	if (ret)
-		return ret;
+		goto out_finish;

 	tg = blkg_to_tg(ctx.blkg);
 	tg_update_carryover(tg);
@@ -1662,7 +1666,7 @@ static ssize_t tg_set_limit(struct kernfs_open_file *of,
 					tg->td->limit_valid[LIMIT_LOW]);
 	ret = 0;
 out_finish:
-	blkg_conf_finish(&ctx);
+	blkg_conf_exit(&ctx);
 	return ret ?: nbytes;
 }

From patchwork Thu Apr 13 00:06:48 2023
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 13209677
From: Tejun Heo
To: axboe@kernel.dk, josef@toxicpanda.com, hch@lst.de
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai1@huaweicloud.com, Tejun Heo
Subject: [PATCH 3/4] blk-iolatency: s/blkcg_rq_qos/iolat_rq_qos/
Date: Wed, 12 Apr 2023 14:06:48 -1000
Message-Id: <20230413000649.115785-4-tj@kernel.org>
In-Reply-To: <20230413000649.115785-1-tj@kernel.org>
References: <20230413000649.115785-1-tj@kernel.org>
X-Mailing-List: linux-block@vger.kernel.org

The name was too generic given that there are multiple blkcg rq-qos
policies.
Signed-off-by: Tejun Heo Reviewed-by: Christoph Hellwig Cc: Josef Bacik --- block/blk-iolatency.c | 2 +- block/blk-rq-qos.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c index 6707164c37f1..2560708b9109 100644 --- a/block/blk-iolatency.c +++ b/block/blk-iolatency.c @@ -969,7 +969,7 @@ static void iolatency_pd_init(struct blkg_policy_data *pd) { struct iolatency_grp *iolat = pd_to_lat(pd); struct blkcg_gq *blkg = lat_to_blkg(iolat); - struct rq_qos *rqos = blkcg_rq_qos(blkg->q); + struct rq_qos *rqos = iolat_rq_qos(blkg->q); struct blk_iolatency *blkiolat = BLKIOLATENCY(rqos); u64 now = ktime_to_ns(ktime_get()); int cpu; diff --git a/block/blk-rq-qos.h b/block/blk-rq-qos.h index b02a1a3d33a8..f48ee150d667 100644 --- a/block/blk-rq-qos.h +++ b/block/blk-rq-qos.h @@ -74,7 +74,7 @@ static inline struct rq_qos *wbt_rq_qos(struct request_queue *q) return rq_qos_id(q, RQ_QOS_WBT); } -static inline struct rq_qos *blkcg_rq_qos(struct request_queue *q) +static inline struct rq_qos *iolat_rq_qos(struct request_queue *q) { return rq_qos_id(q, RQ_QOS_LATENCY); } From patchwork Thu Apr 13 00:06:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tejun Heo X-Patchwork-Id: 13209679 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17199C7619A for ; Thu, 13 Apr 2023 00:07:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229498AbjDMAHJ (ORCPT ); Wed, 12 Apr 2023 20:07:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57296 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229787AbjDMAHF (ORCPT ); Wed, 12 Apr 2023 20:07:05 -0400 Received: from mail-pl1-x631.google.com 
(mail-pl1-x631.google.com [IPv6:2607:f8b0:4864:20::631]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CC5D07290; Wed, 12 Apr 2023 17:07:03 -0700 (PDT) Received: by mail-pl1-x631.google.com with SMTP id w11so13219495plp.13; Wed, 12 Apr 2023 17:07:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1681344423; x=1683936423; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=6a22uOlRjDtJ8Fbi9zx7mQDhRHqsq8ulm+wYpcSlrL4=; b=nf7U8/vw1JXB5hQIi0zWYWw5ZPhjDCJtGW3DpoxfqeVVj8t09FSkjnxwrOA+mMLVkk BTzi3yVizrqTLQv4IlPg4/pL+zlCPeyVgA+uq9NN2Wb/UfUFEHpoCh2GSaI5Rldwx7zV Pkpznn6YxU+c7712ZZRfj3w3x1rQVz46EtIatX6/2nMrd/aNb7y16CyZGQE0kpXxmIas JPCoLErPvoJeHHuJW6tb1Ep54Lk3u3XIoxQw4HBwojb33fZMnyTTrtFZHFgZ1YDj0IUv BkdJxlSsh6fc3In0b7dCzZ9BdCo2seKA4FIsz04hg79dTTotlHAybz1oAtvYG+FU6+pT F83w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681344423; x=1683936423; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=6a22uOlRjDtJ8Fbi9zx7mQDhRHqsq8ulm+wYpcSlrL4=; b=VkRm/ZjL5cl6RadmP4GmabA+P8igHmgGshWN2ME8se1fwgMsyI7fqN5KQI5yU4AQ/j oDmhhofCFB4QD9PmXespm/9WrRtAg+WI14Mr2e31WusJdLD5mKw1fTWr93tfxpw7dKHI IMc2jNsV26Awg2hK0HIm5iAaadhkm/U6XTDEiwOtNBTNv1J62lYYHdM76lZO892LPEUK /ddoxwEuh8RUlylnF386lZ00VgM7rN9dhtLu9aYY8oECcx14fVglG0otu1POGrqKJAvh 2ZaXmLohYpUAdsbt9T2dhx09WNHKLbRm4h9jNuNuSdLFv0RzBQ2pUQT1rc2tBs8FWBpm vo3A== X-Gm-Message-State: AAQBX9fqRE9eStglKy/bBOXcTulx+EX8IeQxWtAhs9qEnpnJ9omZhe8N K2J/JIgAmxDsWBeth8avI/I= X-Google-Smtp-Source: AKy350brOmX8aAGegEFStx3AvYebyWfnUrUl3m5QZ6vV8rtiDpufL8rfyiB+6H/LHV3E0fNGAp0ZSg== X-Received: by 2002:a17:903:230a:b0:1a5:2a50:e177 with SMTP id d10-20020a170903230a00b001a52a50e177mr40928plh.55.1681344422711; Wed, 12 Apr 2023 17:07:02 -0700 (PDT) 
Received: from localhost (2603-800c-1a02-1bae-a7fa-157f-969a-4cde.res6.spectrum.com.
        [2603:800c:1a02:1bae:a7fa:157f:969a:4cde])
        by smtp.gmail.com with ESMTPSA id
        h9-20020a170902f7c900b00192aa53a7d5sm161665plw.8.2023.04.12.17.07.02
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Wed, 12 Apr 2023 17:07:02 -0700 (PDT)
Sender: Tejun Heo
From: Tejun Heo
To: axboe@kernel.dk, josef@toxicpanda.com, hch@lst.de
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
        yukuai1@huaweicloud.com, Tejun Heo
Subject: [PATCH 4/4] blk-iolatency: Make initialization lazy
Date: Wed, 12 Apr 2023 14:06:49 -1000
Message-Id: <20230413000649.115785-5-tj@kernel.org>
X-Mailer: git-send-email 2.40.0
In-Reply-To: <20230413000649.115785-1-tj@kernel.org>
References: <20230413000649.115785-1-tj@kernel.org>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

Other rq_qos policies such as wbt and iocost are lazy-initialized when
they are configured for the first time for the device, but iolatency is
initialized unconditionally from blkcg_init_disk() during gendisk init.
Lazy init is beneficial because rq_qos policies add runtime overhead when
initialized as every IO has to walk all registered rq_qos callbacks.

This patch switches iolatency to lazy initialization too so that it only
registers its rq_qos policy when it is first configured.

Note that there is a known race condition between blkcg config file
writes and del_gendisk(), and this patch makes iolatency susceptible to
it by exposing the init path to race against the deletion path. However,
that problem already exists in iocost and is being worked on.

Signed-off-by: Tejun Heo
Reviewed-by: Christoph Hellwig
Cc: Josef Bacik
---
 block/blk-cgroup.c    |  8 --------
 block/blk-iolatency.c | 29 ++++++++++++++++++++++++++++-
 block/blk.h           |  6 ------
 3 files changed, 28 insertions(+), 15 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index c154b08a7e92..1c1ebeb51003 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -33,7 +33,6 @@
 #include "blk-cgroup.h"
 #include "blk-ioprio.h"
 #include "blk-throttle.h"
-#include "blk-rq-qos.h"
 
 /*
  * blkcg_pol_mutex protects blkcg_policy[] and policy [de]activation.
@@ -1350,14 +1349,8 @@ int blkcg_init_disk(struct gendisk *disk)
 	if (ret)
 		goto err_ioprio_exit;
 
-	ret = blk_iolatency_init(disk);
-	if (ret)
-		goto err_throtl_exit;
-
 	return 0;
 
-err_throtl_exit:
-	blk_throtl_exit(disk);
 err_ioprio_exit:
 	blk_ioprio_exit(disk);
 err_destroy_all:
@@ -1373,7 +1366,6 @@ int blkcg_init_disk(struct gendisk *disk)
 void blkcg_exit_disk(struct gendisk *disk)
 {
 	blkg_destroy_all(disk);
-	rq_qos_exit(disk->queue);
 	blk_throtl_exit(disk);
 }
 
diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c
index 2560708b9109..fd5fec989e39 100644
--- a/block/blk-iolatency.c
+++ b/block/blk-iolatency.c
@@ -755,7 +755,7 @@ static void blkiolatency_enable_work_fn(struct work_struct *work)
 	}
 }
 
-int blk_iolatency_init(struct gendisk *disk)
+static int blk_iolatency_init(struct gendisk *disk)
 {
 	struct blk_iolatency *blkiolat;
 	int ret;
@@ -824,6 +824,29 @@ static void iolatency_clear_scaling(struct blkcg_gq *blkg)
 	}
 }
 
+static int blk_iolatency_try_init(struct blkg_conf_ctx *ctx)
+{
+	static DEFINE_MUTEX(init_mutex);
+	int ret;
+
+	ret = blkg_conf_open_bdev(ctx);
+	if (ret)
+		return ret;
+
+	/*
+	 * blk_iolatency_init() may fail after rq_qos_add() succeeds which can
+	 * confuse iolat_rq_qos() test. Make the test and init atomic.
+	 */
+	mutex_lock(&init_mutex);
+
+	if (!iolat_rq_qos(ctx->bdev->bd_queue))
+		ret = blk_iolatency_init(ctx->bdev->bd_disk);
+
+	mutex_unlock(&init_mutex);
+
+	return ret;
+}
+
 static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf,
 			     size_t nbytes, loff_t off)
 {
@@ -838,6 +861,10 @@ static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf,
 
 	blkg_conf_init(&ctx, buf);
 
+	ret = blk_iolatency_try_init(&ctx);
+	if (ret)
+		goto out;
+
 	ret = blkg_conf_prep(blkcg, &blkcg_policy_iolatency, &ctx);
 	if (ret)
 		goto out;
diff --git a/block/blk.h b/block/blk.h
index d65d96994a94..62fca868bc61 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -399,12 +399,6 @@ static inline struct bio *blk_queue_bounce(struct bio *bio,
 	return bio;
 }
 
-#ifdef CONFIG_BLK_CGROUP_IOLATENCY
-int blk_iolatency_init(struct gendisk *disk);
-#else
-static inline int blk_iolatency_init(struct gendisk *disk) { return 0; };
-#endif
-
 #ifdef CONFIG_BLK_DEV_ZONED
 void disk_free_zone_bitmaps(struct gendisk *disk);
 void disk_clear_zone_settings(struct gendisk *disk);
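
For readers less familiar with the pattern, the test-and-init step that blk_iolatency_try_init() performs under init_mutex can be sketched in user space. This is only an illustration, assuming a pthread mutex in place of the kernel mutex; struct queue, policy_init() and policy_try_init() are hypothetical names made up for the sketch and are not kernel APIs.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request queue with one optional policy. */
struct queue {
	bool policy_registered;	/* plays the role of iolat_rq_qos() != NULL */
	int init_calls;		/* counts how often init actually ran */
};

/* Hypothetical init routine; a real one could fail part-way through. */
static int policy_init(struct queue *q)
{
	q->init_calls++;
	q->policy_registered = true;
	return 0;
}

/*
 * Check and initialize under one mutex so that concurrent configuration
 * writes cannot both see "not registered" and initialize twice.
 */
static int policy_try_init(struct queue *q)
{
	static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;
	int ret = 0;

	pthread_mutex_lock(&init_mutex);
	if (!q->policy_registered)
		ret = policy_init(q);
	pthread_mutex_unlock(&init_mutex);

	return ret;
}
```

Calling policy_try_init() repeatedly on the same queue runs policy_init() only on the first call; later calls find the policy already registered and return 0 without doing any work.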