From patchwork Sat Dec 10 18:33:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tejun Heo X-Patchwork-Id: 13070392 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D454EC4332F for ; Sat, 10 Dec 2022 18:33:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229529AbiLJSdQ (ORCPT ); Sat, 10 Dec 2022 13:33:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49526 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229475AbiLJSdP (ORCPT ); Sat, 10 Dec 2022 13:33:15 -0500 Received: from mail-pl1-x62e.google.com (mail-pl1-x62e.google.com [IPv6:2607:f8b0:4864:20::62e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 19A8310FEA; Sat, 10 Dec 2022 10:33:13 -0800 (PST) Received: by mail-pl1-x62e.google.com with SMTP id s7so8112789plk.5; Sat, 10 Dec 2022 10:33:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-disposition:mime-version:message-id:subject:cc:to:from:date :sender:from:to:cc:subject:date:message-id:reply-to; bh=erjJZ3BPLuF4fh5E96B8USxD9G40KMRBirufYCK/yho=; b=m87yCXDKsTffULppu3iMrVZ6qDGYyGYS8EBcgLw4eBJkmuxFpBVOL+7JEw4XErKQGC vLtJw+xsdY5BZlrCtOvEQGtNbgh7czgcay3iDqkEGsbY4qDA6JAhhC2RiQiGf63eUABC p9+USAbJEUvcTLi71QGdGxt+iPBTZorJbWqNoxIguBi8PAHdHsS1LV+Ib0F4XuB/96Fu 8yUp988BD3BKPfsWqy4viBsrVQe6N010fxyG8HGst1vYlbmeW/qSYikPO0cUhHobDrDO hFAvxy0vvncTEfbkphhPsS0ZkUBk0i0VcT20KQsFijHFyxfG8rtA7mEhMXuDiPy24wFs N1/g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-disposition:mime-version:message-id:subject:cc:to:from:date :sender:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=erjJZ3BPLuF4fh5E96B8USxD9G40KMRBirufYCK/yho=; b=RnlBDRFGbyRfzHT5+gJTMpqPSsfPVxwoJWc08TbJneCGFogZCEJ3B6W8CyouVTT8XN 0nXvoHvHbNBgQ+SHr3pghDAtaGEAsf/jM5HRQrhrB+EzqLNo8PnvMgt7cfLpfRZerH0k TVAD/39U0uZV4T3xLwgitAvmpLzySrOD7PV+yvlM+up7BN6Hc5qxvSqDBiBiAQ3AKTUE fZsJMB9CUclJ/LyCAadgKlUtJlQbO/SUPK6tdRXDlCceV9v17+EsAKvyzVBs+AFqt3qz e+81Wxg/tIJcjluZFrPxmocDBZvQJYmqei9NIEp1onJAZpQb7i4aFW62TUl29i4zAzEf Tebw== X-Gm-Message-State: ANoB5pnBqCIU/uB0HAswcYdXpbVYIpo9Hq+KdZWVv1dQeKs8TZZASl15 LAbooYmKKjT+xQyhY+v/Xuvn2Zemc3gPbw== X-Google-Smtp-Source: AA0mqf7IM/h/sKpCbZ5v9LMPJBptNljSVRd+DB/beyA2pSYIvoWdAuZWxxlsqtPH9n99+jqe2zXwHQ== X-Received: by 2002:a17:903:138a:b0:189:5f5c:da1f with SMTP id jx10-20020a170903138a00b001895f5cda1fmr10668384plb.5.1670697192213; Sat, 10 Dec 2022 10:33:12 -0800 (PST) Received: from localhost (2603-800c-1a02-1bae-a7fa-157f-969a-4cde.res6.spectrum.com. [2603:800c:1a02:1bae:a7fa:157f:969a:4cde]) by smtp.gmail.com with ESMTPSA id jj12-20020a170903048c00b0018996404dd5sm3284296plb.109.2022.12.10.10.33.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 10 Dec 2022 10:33:11 -0800 (PST) Sender: Tejun Heo Date: Sat, 10 Dec 2022 08:33:10 -1000 From: Tejun Heo To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, darklight2357@icloud.com, Josef Bacik , Linus Torvalds , cgroups@vger.kernel.org Subject: [PATCH 1/2 block/for-6.2] blk-iolatency: Fix memory leak on add_disk() failures Message-ID: MIME-Version: 1.0 Content-Disposition: inline Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org When a gendisk is successfully initialized but add_disk() fails such as when a loop device has invalid number of minor device numbers specified, blkcg_init_disk() is called during init and then blkcg_exit_disk() during error handling. Unfortunately, iolatency gets initialized in the former but doesn't get cleaned up in the latter. This is because, in non-error cases, the cleanup is performed by del_gendisk() calling rq_qos_exit(), the assumption being that rq_qos policies, iolatency being one of them, can only be activated once the disk is fully registered and visible. That assumption is true for wbt and iocost, but not so for iolatency as it gets initialized before add_disk() is called. It is desirable to lazy-init rq_qos policies because they are optional features and add to hot path overhead once initialized - each IO has to walk all the registered rq_qos policies. So, we want to switch iolatency to lazy init too. However, that's a bigger change. As a fix for the immediate problem, let's just add an extra call to rq_qos_exit() in blkcg_exit_disk(). This is safe because duplicate calls to rq_qos_exit() become noop's. Signed-off-by: Tejun Heo Reported-by: darklight2357@icloud.com Cc: Josef Bacik Cc: Linus Torvalds Fixes: d70675121546 ("block: introduce blk-iolatency io controller") Cc: stable@vger.kernel.org # v4.19+ Reviewed-by: Christoph Hellwig --- Hello, I'm posting two patches for the iolatency memory leak issue after add_disk() failure. This one is the immediate fix and should be really safe. However, any change has risks and given that the bug being address is not critical at all, I still think it'd make sense to route it through 6.2-rc1 rather than applying directly to master for 6.1 release. So, it's tagged for the 6.2 merge window. Thanks. block/blk-cgroup.c | 2 ++ 1 file changed, 2 insertions(+) --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -33,6 +33,7 @@ #include "blk-cgroup.h" #include "blk-ioprio.h" #include "blk-throttle.h" +#include "blk-rq-qos.h" /* * blkcg_pol_mutex protects blkcg_policy[] and policy [de]activation. @@ -1322,6 +1323,7 @@ err_unlock: void blkcg_exit_disk(struct gendisk *disk) { blkg_destroy_all(disk); + rq_qos_exit(disk->queue); blk_throtl_exit(disk); } From patchwork Sat Dec 10 18:34:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tejun Heo X-Patchwork-Id: 13070393 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17E22C10F1B for ; Sat, 10 Dec 2022 18:35:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229637AbiLJSfD (ORCPT ); Sat, 10 Dec 2022 13:35:03 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50076 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229475AbiLJSfC (ORCPT ); Sat, 10 Dec 2022 13:35:02 -0500 Received: from mail-pj1-x1031.google.com (mail-pj1-x1031.google.com [IPv6:2607:f8b0:4864:20::1031]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A395413DE1; Sat, 10 Dec 2022 10:35:01 -0800 (PST) Received: by mail-pj1-x1031.google.com with SMTP id u15-20020a17090a3fcf00b002191825cf02so8206578pjm.2; Sat, 10 Dec 2022 10:35:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:sender:from:to:cc:subject:date:message-id :reply-to; bh=uY04nYXLVzuNbDq8d1mF0P2iWu8Vn/aqFvQdGInkT+U=; b=Q4Z+xbyvQLwy6YCSc3VKV/vv66l6CO/FJX6LLn1CZqXpM7GmPTER00jPUfsDLbpxQR NXxBhcjscA5Ksyai91lI8ZTKrgvdD4uGPeLTBSHhLQ1We78ZgiDjQgYuq/K1oxPXp269 olUX7zXOKyl3iiosYHagArqjSc+/xUEUK+gom6eL4PLR4gKFa5127JmBOMa1iMdflG03 0Tvbfcs6cDJS29KnFrDlSP9Zn1qD/q95LQOOrZo3keK1uT5eJzGTigCGJc8qbsaUVan/ mp8hZLY/k20sWRXmcgTtBD4lnyJaKpqTJYwyimmtI8nBHV8WjiFA2HJrcS71uasFC8Yl vZSg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:sender:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=uY04nYXLVzuNbDq8d1mF0P2iWu8Vn/aqFvQdGInkT+U=; b=4Hx9pFy7zsGELBg5rX62wbed3FEodupm/3g8TRiQdmtDbegSS6GisZjeJMJbRWZRrd wM2Vxnav3sPCKwSoNQD+vR9nJp9p5KhLASrNjh+e+bG9dJkptBQC1xrm2xduLvpgyli4 QVwX8BnY0APBAD5MAwuK2UPLXukry3R7Rn/Q9Tvfx/GqxCZfWAUPQUjKL6Mxbdzt3sIu lhsIm5VoGqv3nHn+tT/5QVfQqA66UnWnigWPAxNXEtLPazI3EeLCPoOM7wORTQqpaYyw Pln7ZYiE7/PjTPxg5MqaQiU0LEsMOXWI6IeXJ8lYtLiI4rW9LQ0OAt11DMk5FPfAOe8z 2qpg== X-Gm-Message-State: ANoB5plo0YIUzvJuRGwW8OS6Lz5Z9huanyqbzjYPeYrS+oAakQeXmh3y 7Uh8IEBRgMkKc8fciMYNFa7PcZvTOKeF/Q== X-Google-Smtp-Source: AA0mqf76RYscAg1y9pG9/V2uIzR4qfwCHDjtAQeDrqt0Ng4Drg0kpNnJL8CGXsSELP0WzP1fPI/sCA== X-Received: by 2002:a05:6a20:4294:b0:a6:6d23:9fff with SMTP id o20-20020a056a20429400b000a66d239fffmr14533303pzj.14.1670697300942; Sat, 10 Dec 2022 10:35:00 -0800 (PST) Received: from localhost (2603-800c-1a02-1bae-a7fa-157f-969a-4cde.res6.spectrum.com. [2603:800c:1a02:1bae:a7fa:157f:969a:4cde]) by smtp.gmail.com with ESMTPSA id u3-20020a170902e5c300b00189947bd9f7sm3305098plf.50.2022.12.10.10.35.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 10 Dec 2022 10:35:00 -0800 (PST) Sender: Tejun Heo Date: Sat, 10 Dec 2022 08:34:59 -1000 From: Tejun Heo To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, darklight2357@icloud.com, Josef Bacik , Linus Torvalds , cgroups@vger.kernel.org Subject: [PATCH 2/2 block/for-6.2] blk-iolatency: Make initialization lazy Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Other rq_qos policies such as wbt and iocost are lazy-initialized when they are configured for the first time for the device but iolatency is initialized unconditionally from blkcg_init_disk() during gendisk init. Lazy init is beneficial because rq_qos policies add runtime overhead when initialized as every IO has to walk all registered rq_qos callbacks. This patch switches iolatency to lazy initialization too so that it only registered its rq_qos policy when it is first configured. Note that there is a known race condition between blkcg config file writes and del_gendisk() and this patch makes iolatency susceptible to it by exposing the init path to race against the deletion path. However, that problem already exists in iocost and is being worked on. Signed-off-by: Tejun Heo Cc: Josef Bacik --- Hello, Tagged for-6.2 but this can easily go for the next merge window too. Thanks. block/blk-cgroup.c | 8 -------- block/blk-iolatency.c | 36 ++++++++++++++++++++++++++++++++++-- block/blk-rq-qos.h | 2 +- block/blk.h | 6 ------ 4 files changed, 35 insertions(+), 17 deletions(-) --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -33,7 +33,6 @@ #include "blk-cgroup.h" #include "blk-ioprio.h" #include "blk-throttle.h" -#include "blk-rq-qos.h" /* * blkcg_pol_mutex protects blkcg_policy[] and policy [de]activation. @@ -1300,14 +1299,8 @@ int blkcg_init_disk(struct gendisk *disk if (ret) goto err_ioprio_exit; - ret = blk_iolatency_init(disk); - if (ret) - goto err_throtl_exit; - return 0; -err_throtl_exit: - blk_throtl_exit(disk); err_ioprio_exit: blk_ioprio_exit(disk); err_destroy_all: @@ -1323,7 +1316,6 @@ err_unlock: void blkcg_exit_disk(struct gendisk *disk) { blkg_destroy_all(disk); - rq_qos_exit(disk->queue); blk_throtl_exit(disk); } --- a/block/blk-iolatency.c +++ b/block/blk-iolatency.c @@ -755,7 +755,7 @@ static void blkiolatency_enable_work_fn( } } -int blk_iolatency_init(struct gendisk *disk) +static int blk_iolatency_init(struct gendisk *disk) { struct request_queue *q = disk->queue; struct blk_iolatency *blkiolat; @@ -830,6 +830,31 @@ static void iolatency_clear_scaling(stru } } +static int blk_iolatency_try_init(char *input) +{ + static DEFINE_MUTEX(init_mutex); + struct block_device *bdev; + int ret = 0; + + bdev = blkcg_conf_open_bdev(&input); + if (IS_ERR(bdev)) + return PTR_ERR(bdev); + + /* + * blk_iolatency_init() may fail after rq_qos_add() succeeds which can + * confuse iolat_rq_qos() test. Make the test and init atomic. + */ + mutex_lock(&init_mutex); + + if (!iolat_rq_qos(bdev->bd_queue)) + ret = blk_iolatency_init(bdev->bd_disk); + + mutex_unlock(&init_mutex); + blkdev_put_no_open(bdev); + + return ret; +} + static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf, size_t nbytes, loff_t off) { @@ -842,7 +867,14 @@ static ssize_t iolatency_set_limit(struc u64 oldval; int ret; +retry: ret = blkg_conf_prep(blkcg, &blkcg_policy_iolatency, buf, &ctx); + if (ret == -EOPNOTSUPP) { + ret = blk_iolatency_try_init(buf); + if (ret) + return ret; + goto retry; + } if (ret) return ret; @@ -974,7 +1006,7 @@ static void iolatency_pd_init(struct blk { struct iolatency_grp *iolat = pd_to_lat(pd); struct blkcg_gq *blkg = lat_to_blkg(iolat); - struct rq_qos *rqos = blkcg_rq_qos(blkg->q); + struct rq_qos *rqos = iolat_rq_qos(blkg->q); struct blk_iolatency *blkiolat = BLKIOLATENCY(rqos); u64 now = ktime_to_ns(ktime_get()); int cpu; --- a/block/blk-rq-qos.h +++ b/block/blk-rq-qos.h @@ -74,7 +74,7 @@ static inline struct rq_qos *wbt_rq_qos( return rq_qos_id(q, RQ_QOS_WBT); } -static inline struct rq_qos *blkcg_rq_qos(struct request_queue *q) +static inline struct rq_qos *iolat_rq_qos(struct request_queue *q) { return rq_qos_id(q, RQ_QOS_LATENCY); } --- a/block/blk.h +++ b/block/blk.h @@ -392,12 +392,6 @@ static inline struct bio *blk_queue_boun return bio; } -#ifdef CONFIG_BLK_CGROUP_IOLATENCY -int blk_iolatency_init(struct gendisk *disk); -#else -static inline int blk_iolatency_init(struct gendisk *disk) { return 0; }; -#endif - #ifdef CONFIG_BLK_DEV_ZONED void disk_free_zone_bitmaps(struct gendisk *disk); void disk_clear_zone_settings(struct gendisk *disk);