From patchwork Thu Jan 24 00:16:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Liu Bo X-Patchwork-Id: 10778185 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 663631575 for ; Thu, 24 Jan 2019 00:17:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 57FA92856D for ; Thu, 24 Jan 2019 00:17:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4B0E42DB16; Thu, 24 Jan 2019 00:17:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI,UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EBB472856D for ; Thu, 24 Jan 2019 00:17:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726078AbfAXARA (ORCPT ); Wed, 23 Jan 2019 19:17:00 -0500 Received: from out30-131.freemail.mail.aliyun.com ([115.124.30.131]:46952 "EHLO out30-131.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726168AbfAXARA (ORCPT ); Wed, 23 Jan 2019 19:17:00 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R101e4;CH=green;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e07486;MF=bo.liu@linux.alibaba.com;NM=1;PH=DS;RN=2;SR=0;TI=SMTPD_---0TIsCJom_1548289017; Received: from localhost(mailfrom:bo.liu@linux.alibaba.com fp:SMTPD_---0TIsCJom_1548289017) by smtp.aliyun-inc.com(127.0.0.1); Thu, 24 Jan 2019 08:16:58 +0800 From: Liu Bo To: linux-block@vger.kernel.org Cc: Josef Bacik Subject: [PATCH 1/3] Blk-iolatency: warn on negative inflight IO counter Date: Thu, 24 Jan 2019 08:16:53 +0800 Message-Id: <20190124001655.92508-2-bo.liu@linux.alibaba.com> X-Mailer: git-send-email 2.20.1.2.gb21ebb6 In-Reply-To: <20190124001655.92508-1-bo.liu@linux.alibaba.com> References: <20190124001655.92508-1-bo.liu@linux.alibaba.com> MIME-Version: 1.0 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This is to catch any unexpected negative value of inflight IO counter. Signed-off-by: Liu Bo --- block/blk-iolatency.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c index fc714ef402a6..7200f3189a84 100644 --- a/block/blk-iolatency.c +++ b/block/blk-iolatency.c @@ -591,6 +591,7 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio) u64 now = ktime_to_ns(ktime_get()); bool issue_as_root = bio_issue_as_root_blkg(bio); bool enabled = false; + int inflight = 0; blkg = bio->bi_blkg; if (!blkg || !bio_flagged(bio, BIO_TRACKED)) @@ -609,7 +610,8 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio) } rqw = &iolat->rq_wait; - atomic_dec(&rqw->inflight); + inflight = atomic_dec_return(&rqw->inflight); + WARN_ON_ONCE(inflight < 0); if (!enabled || iolat->min_lat_nsec == 0) goto next; iolatency_record_time(iolat, &bio->bi_issue, now, From patchwork Thu Jan 24 00:16:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Liu Bo X-Patchwork-Id: 10778187 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9F39613BF for ; Thu, 24 Jan 2019 00:17:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 905832856D for ; Thu, 24 Jan 2019 00:17:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 83B8A2DB16; Thu, 24 Jan 2019 00:17:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI,UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0B84A2856D for ; Thu, 24 Jan 2019 00:17:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726310AbfAXARB (ORCPT ); Wed, 23 Jan 2019 19:17:01 -0500 Received: from out30-131.freemail.mail.aliyun.com ([115.124.30.131]:45229 "EHLO out30-131.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726252AbfAXARA (ORCPT ); Wed, 23 Jan 2019 19:17:00 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R131e4;CH=green;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04420;MF=bo.liu@linux.alibaba.com;NM=1;PH=DS;RN=2;SR=0;TI=SMTPD_---0TIsDr.4_1548289018; Received: from localhost(mailfrom:bo.liu@linux.alibaba.com fp:SMTPD_---0TIsDr.4_1548289018) by smtp.aliyun-inc.com(127.0.0.1); Thu, 24 Jan 2019 08:16:58 +0800 From: Liu Bo To: linux-block@vger.kernel.org Cc: Josef Bacik Subject: [PATCH 2/3] blk-iolatency: fix IO hang due to negative inflight counter Date: Thu, 24 Jan 2019 08:16:54 +0800 Message-Id: <20190124001655.92508-3-bo.liu@linux.alibaba.com> X-Mailer: git-send-email 2.20.1.2.gb21ebb6 In-Reply-To: <20190124001655.92508-1-bo.liu@linux.alibaba.com> References: <20190124001655.92508-1-bo.liu@linux.alibaba.com> MIME-Version: 1.0 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Our test reported the following stack, and vmcore showed that ->inflight counter is -1. [ffffc9003fcc38d0] __schedule at ffffffff8173d95d [ffffc9003fcc3958] schedule at ffffffff8173de26 [ffffc9003fcc3970] io_schedule at ffffffff810bb6b6 [ffffc9003fcc3988] blkcg_iolatency_throttle at ffffffff813911cb [ffffc9003fcc3a20] rq_qos_throttle at ffffffff813847f3 [ffffc9003fcc3a48] blk_mq_make_request at ffffffff8137468a [ffffc9003fcc3b08] generic_make_request at ffffffff81368b49 [ffffc9003fcc3b68] submit_bio at ffffffff81368d7d [ffffc9003fcc3bb8] ext4_io_submit at ffffffffa031be00 [ext4] [ffffc9003fcc3c00] ext4_writepages at ffffffffa03163de [ext4] [ffffc9003fcc3d68] do_writepages at ffffffff811c49ae [ffffc9003fcc3d78] __filemap_fdatawrite_range at ffffffff811b6188 [ffffc9003fcc3e30] filemap_write_and_wait_range at ffffffff811b6301 [ffffc9003fcc3e60] ext4_sync_file at ffffffffa030cee8 [ext4] [ffffc9003fcc3ea8] vfs_fsync_range at ffffffff8128594b [ffffc9003fcc3ee8] do_fsync at ffffffff81285abd [ffffc9003fcc3f18] sys_fsync at ffffffff81285d50 [ffffc9003fcc3f28] do_syscall_64 at ffffffff81003c04 [ffffc9003fcc3f50] entry_SYSCALL_64_after_swapgs at ffffffff81742b8e The ->inflight counter may be negative (-1) if 1) blk-iolatency was disabled when the IO was issued, 2) blk-iolatency was enabled before this IO reached its endio, 3) the ->inflight counter is decreased from 0 to -1 in endio() In fact the hang can be easily reproduced by the below script, H=/sys/fs/cgroup/unified/ P=/sys/fs/cgroup/unified/test echo "+io" > $H/cgroup.subtree_control mkdir -p $P echo $$ > $P/cgroup.procs xfs_io -f -d -c "pwrite 0 4k" /dev/sdg echo "`cat /sys/block/sdg/dev` target=1000000" > $P/io.latency xfs_io -f -d -c "pwrite 0 4k" /dev/sdg This fixes the problem by freezing the queue do that while enabling/disabling iolatency, there is no inflight rq running. Note that quiesce_queue is not needed as this only updating iolatency configuration about which dispatching request_queue doesn't care. Signed-off-by: Liu Bo --- block/blk-iolatency.c | 52 +++++++++++++++++++++++++++++++++++++------ 1 file changed, 45 insertions(+), 7 deletions(-) diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c index 7200f3189a84..2620baa1f699 100644 --- a/block/blk-iolatency.c +++ b/block/blk-iolatency.c @@ -72,6 +72,7 @@ #include #include #include +#include #include "blk-rq-qos.h" #include "blk-stat.h" @@ -602,6 +603,9 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio) return; enabled = blk_iolatency_enabled(iolat->blkiolat); + if (!enabled) + return; + while (blkg && blkg->parent) { iolat = blkg_to_lat(blkg); if (!iolat) { @@ -612,7 +616,7 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio) inflight = atomic_dec_return(&rqw->inflight); WARN_ON_ONCE(inflight < 0); - if (!enabled || iolat->min_lat_nsec == 0) + if (iolat->min_lat_nsec == 0) goto next; iolatency_record_time(iolat, &bio->bi_issue, now, issue_as_root); @@ -756,10 +760,13 @@ int blk_iolatency_init(struct request_queue *q) return 0; } -static void iolatency_set_min_lat_nsec(struct blkcg_gq *blkg, u64 val) +/* + * return 1 for enabling iolatency, return -1 for disabling iolatency, otherwise + * return 0. + */ +static int iolatency_set_min_lat_nsec(struct blkcg_gq *blkg, u64 val) { struct iolatency_grp *iolat = blkg_to_lat(blkg); - struct blk_iolatency *blkiolat = iolat->blkiolat; u64 oldval = iolat->min_lat_nsec; iolat->min_lat_nsec = val; @@ -768,9 +775,10 @@ static void iolatency_set_min_lat_nsec(struct blkcg_gq *blkg, u64 val) BLKIOLATENCY_MAX_WIN_SIZE); if (!oldval && val) - atomic_inc(&blkiolat->enabled); + return 1; if (oldval && !val) - atomic_dec(&blkiolat->enabled); + return -1; + return 0; } static void iolatency_clear_scaling(struct blkcg_gq *blkg) @@ -802,6 +810,7 @@ static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf, u64 lat_val = 0; u64 oldval; int ret; + int enable = 0; ret = blkg_conf_prep(blkcg, &blkcg_policy_iolatency, buf, &ctx); if (ret) @@ -836,7 +845,12 @@ static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf, blkg = ctx.blkg; oldval = iolat->min_lat_nsec; - iolatency_set_min_lat_nsec(blkg, lat_val); + enable = iolatency_set_min_lat_nsec(blkg, lat_val); + if (enable) { + WARN_ON_ONCE(!blk_get_queue(blkg->q)); + blkg_get(blkg); + } + if (oldval != iolat->min_lat_nsec) { iolatency_clear_scaling(blkg); } @@ -844,6 +858,24 @@ static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf, ret = 0; out: blkg_conf_finish(&ctx); + if (ret == 0 && enable) { + struct iolatency_grp *tmp = blkg_to_lat(blkg); + struct blk_iolatency *blkiolat = tmp->blkiolat; + + blk_mq_freeze_queue(blkg->q); + + if (enable == 1) + atomic_inc(&blkiolat->enabled); + else if (enable == -1) + atomic_dec(&blkiolat->enabled); + else + WARN_ON_ONCE(1); + + blk_mq_unfreeze_queue(blkg->q); + + blkg_put(blkg); + blk_put_queue(blkg->q); + } return ret ?: nbytes; } @@ -979,8 +1011,14 @@ static void iolatency_pd_offline(struct blkg_policy_data *pd) { struct iolatency_grp *iolat = pd_to_lat(pd); struct blkcg_gq *blkg = lat_to_blkg(iolat); + struct blk_iolatency *blkiolat = iolat->blkiolat; + int ret; - iolatency_set_min_lat_nsec(blkg, 0); + ret = iolatency_set_min_lat_nsec(blkg, 0); + if (ret == 1) + atomic_inc(&blkiolat->enabled); + if (ret == -1) + atomic_dec(&blkiolat->enabled); iolatency_clear_scaling(blkg); } From patchwork Thu Jan 24 00:16:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Liu Bo X-Patchwork-Id: 10778189 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9FA0113BF for ; Thu, 24 Jan 2019 00:17:04 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 928DC2856D for ; Thu, 24 Jan 2019 00:17:04 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 86CB22DB16; Thu, 24 Jan 2019 00:17:04 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI,UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 79BD02856D for ; Thu, 24 Jan 2019 00:17:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726252AbfAXARC (ORCPT ); Wed, 23 Jan 2019 19:17:02 -0500 Received: from out30-130.freemail.mail.aliyun.com ([115.124.30.130]:42822 "EHLO out30-130.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726168AbfAXARC (ORCPT ); Wed, 23 Jan 2019 19:17:02 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R231e4;CH=green;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e07487;MF=bo.liu@linux.alibaba.com;NM=1;PH=DS;RN=2;SR=0;TI=SMTPD_---0TIsrxct_1548289019; Received: from localhost(mailfrom:bo.liu@linux.alibaba.com fp:SMTPD_---0TIsrxct_1548289019) by smtp.aliyun-inc.com(127.0.0.1); Thu, 24 Jan 2019 08:16:59 +0800 From: Liu Bo To: linux-block@vger.kernel.org Cc: Josef Bacik Subject: [PATCH 3/3] blk-mq: remove duplicated definition of blk_mq_freeze_queue Date: Thu, 24 Jan 2019 08:16:55 +0800 Message-Id: <20190124001655.92508-4-bo.liu@linux.alibaba.com> X-Mailer: git-send-email 2.20.1.2.gb21ebb6 In-Reply-To: <20190124001655.92508-1-bo.liu@linux.alibaba.com> References: <20190124001655.92508-1-bo.liu@linux.alibaba.com> MIME-Version: 1.0 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP As the prototype has been defined in "include/linux/blk-mq.h", the one in "block/blk-mq.h" can be removed then. Signed-off-by: Liu Bo --- block/blk-mq.h | 1 - 1 file changed, 1 deletion(-) diff --git a/block/blk-mq.h b/block/blk-mq.h index d943d46b0785..d0b3dd54ef8d 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -36,7 +36,6 @@ struct blk_mq_ctx { struct kobject kobj; } ____cacheline_aligned_in_smp; -void blk_mq_freeze_queue(struct request_queue *q); void blk_mq_free_queue(struct request_queue *q); int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr); void blk_mq_wake_waiters(struct request_queue *q);