From patchwork Fri Nov 30 22:22:21 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 10707355
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 11A0F14BD for ; Fri, 30 Nov 2018 22:22:35 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 04AE12FF89 for ; Fri, 30 Nov 2018 22:22:35 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EB7613008C; Fri, 30 Nov 2018 22:22:34 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A0B252FF89 for ; Fri, 30 Nov 2018 22:22:34 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726933AbeLAJdT (ORCPT ); Sat, 1 Dec 2018 04:33:19 -0500
Received: from mx1.redhat.com ([209.132.183.28]:33120 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726771AbeLAJdT (ORCPT ); Sat, 1 Dec 2018 04:33:19 -0500
Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id CB9B43138BA5; Fri, 30 Nov 2018 22:22:32 +0000 (UTC)
Received: from localhost (unknown [10.16.197.51]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 41A9918C5A; Fri, 30 Nov 2018 22:22:30 +0000 (UTC)
From: Mike Snitzer
To: Jens Axboe
Cc: Mikulas Patocka, dm-devel@redhat.com, linux-block@vger.kernel.org
Subject: [PATCH
 v2 1/6] dm: dont rewrite dm_disk(md)->part0.in_flight
Date: Fri, 30 Nov 2018 17:22:21 -0500
Message-Id: <20181130222226.77216-2-snitzer@redhat.com>
In-Reply-To: <20181130222226.77216-1-snitzer@redhat.com>
References: <20181130222226.77216-1-snitzer@redhat.com>
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.42]); Fri, 30 Nov 2018 22:22:32 +0000 (UTC)
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Mikulas Patocka

generic_start_io_acct and generic_end_io_acct already update the in_flight
variable using atomic operations, so we don't have to overwrite it again.

Signed-off-by: Mikulas Patocka
Signed-off-by: Mike Snitzer
---
 drivers/md/dm.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index a733e4c920af..a8ae7931bce7 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -663,8 +663,7 @@ static void start_io_acct(struct dm_io *io)
 	generic_start_io_acct(md->queue, bio_op(bio), bio_sectors(bio),
 			      &dm_disk(md)->part0);
 
-	atomic_set(&dm_disk(md)->part0.in_flight[rw],
-		   atomic_inc_return(&md->pending[rw]));
+	atomic_inc(&md->pending[rw]);
 
 	if (unlikely(dm_stats_used(&md->stats)))
 		dm_stats_account_io(&md->stats, bio_data_dir(bio),
@@ -693,7 +692,6 @@ static void end_io_acct(struct dm_io *io)
 	 * a flush.
 	 */
 	pending = atomic_dec_return(&md->pending[rw]);
-	atomic_set(&dm_disk(md)->part0.in_flight[rw], pending);
 	pending += atomic_read(&md->pending[rw^0x1]);
 
 	/* nudge anyone waiting on suspend queue */

From patchwork Fri Nov 30 22:22:22 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 10707359
Received: from localhost (unknown [10.16.197.51]) by smtp.corp.redhat.com
 (Postfix) with ESMTPS id 9E04360C67; Fri, 30 Nov 2018 22:22:33 +0000 (UTC)
From: Mike Snitzer
To: Jens Axboe
Cc: Mikulas Patocka, dm-devel@redhat.com, linux-block@vger.kernel.org
Subject: [PATCH v2 2/6] dm rq: leverage blk_mq_queue_busy() to check for outstanding IO
Date: Fri, 30 Nov 2018 17:22:22 -0500
Message-Id: <20181130222226.77216-3-snitzer@redhat.com>
In-Reply-To: <20181130222226.77216-1-snitzer@redhat.com>
References: <20181130222226.77216-1-snitzer@redhat.com>

Now that request-based dm-multipath only supports blk-mq, make use of the
newly introduced blk_mq_queue_busy() to check for outstanding IO -- rather
than (ab)using the block core's in_flight counters.

Signed-off-by: Mike Snitzer
---
 drivers/md/dm-rq.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 1f1fe9a618ea..d2397d8fcbd1 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -130,11 +130,11 @@ static void rq_end_stats(struct mapped_device *md, struct request *orig)
  */
 static void rq_completed(struct mapped_device *md, int rw, bool run_queue)
 {
-	atomic_dec(&md->pending[rw]);
-
 	/* nudge anyone waiting on suspend queue */
-	if (!md_in_flight(md))
-		wake_up(&md->wait);
+	if (unlikely(waitqueue_active(&md->wait))) {
+		if (!blk_mq_queue_busy(md->queue))
+			wake_up(&md->wait);
+	}
 
 	/*
 	 * dm_put() must be at the end of this function. See the comment above
@@ -436,7 +436,6 @@ ssize_t dm_attr_rq_based_seq_io_merge_deadline_store(struct mapped_device *md,
 static void dm_start_request(struct mapped_device *md, struct request *orig)
 {
 	blk_mq_start_request(orig);
-	atomic_inc(&md->pending[rq_data_dir(orig)]);
 
 	if (unlikely(dm_stats_used(&md->stats))) {
 		struct dm_rq_target_io *tio = tio_from_request(orig);

From patchwork Fri Nov 30 22:22:23 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 10707361
Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate
 requested) by mx1.redhat.com (Postfix) with ESMTPS id 5C985309BFE2; Fri, 30 Nov 2018 22:22:37 +0000 (UTC)
From: Mike Snitzer
To: Jens Axboe
Cc: Mikulas Patocka, dm-devel@redhat.com, linux-block@vger.kernel.org
Subject: [PATCH v2 3/6] block: delete part_round_stats and switch to less precise counting
Date: Fri, 30 Nov 2018 17:22:23 -0500
Message-Id: <20181130222226.77216-4-snitzer@redhat.com>
In-Reply-To: <20181130222226.77216-1-snitzer@redhat.com>
References: <20181130222226.77216-1-snitzer@redhat.com>

From: Mikulas Patocka

We want to convert to per-cpu in_flight counters.

The function part_round_stats needs the in_flight counter every jiffy. It
would be too costly to sum all the percpu variables every jiffy, so
part_round_stats must be deleted. part_round_stats is used to calculate two
counters - time_in_queue and io_ticks.

time_in_queue can be calculated without part_round_stats, by adding the
duration of the I/O when the I/O ends (the value is almost as exact as the
previously calculated value, except that time for in-progress I/Os is not
counted).

io_ticks can be approximated by increasing the value when I/O is started or
ended and the jiffies value has changed. If the I/Os take less than a jiffy,
the value is as exact as the previously calculated value. If the I/Os take
more than a jiffy, io_ticks can drift behind the previously calculated value.
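
As a rough illustration of the io_ticks approximation described above, here
is a minimal userspace sketch in C11. It is not the kernel code: struct
part_model, model_update_io_ticks() and model_one_io() are hypothetical
names, and C11 atomics stand in for READ_ONCE()/cmpxchg() and for the
per-cpu statistics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct hd_struct: only the fields
 * that the approximation needs. */
struct part_model {
	_Atomic unsigned long stamp;	/* last tick at which I/O was seen */
	_Atomic unsigned long io_ticks;	/* approximate number of busy ticks */
	struct part_model *whole;	/* whole-disk entry, NULL for the disk itself */
};

/* Charge at most one tick per distinct value of `now`, then repeat for the
 * whole disk, mirroring the partition -> part0 loop in the patch. */
static void model_update_io_ticks(struct part_model *part, unsigned long now)
{
	assert(part != NULL);
	for (; part != NULL; part = part->whole) {
		unsigned long stamp = atomic_load(&part->stamp);

		/* Only the caller that wins the exchange accounts the tick. */
		if (stamp != now &&
		    atomic_compare_exchange_strong(&part->stamp, &stamp, now))
			atomic_fetch_add(&part->io_ticks, 1);
	}
}

/* Convenience driver: a single I/O that starts at `start` and ends at `end`. */
static unsigned long model_one_io(unsigned long start, unsigned long end)
{
	struct part_model disk = { .stamp = 0, .io_ticks = 0, .whole = NULL };

	model_update_io_ticks(&disk, start);	/* I/O start */
	model_update_io_ticks(&disk, end);	/* I/O end */
	return atomic_load(&disk.io_ticks);
}
```

In this model an I/O that starts and ends in the same tick is charged one
tick, while an I/O spanning ticks 100..105 is charged only two, which is the
drift for long-running I/O that the description mentions.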
Signed-off-by: Mikulas Patocka
Signed-off-by: Mike Snitzer
---
 block/bio.c               | 24 +++++++++++++++---
 block/blk-core.c          | 63 +++--------------------------------------------
 block/blk-merge.c         |  1 -
 block/genhd.c             |  4 ---
 block/partition-generic.c |  4 ---
 include/linux/genhd.h     |  3 +--
 6 files changed, 26 insertions(+), 73 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 03895cc0d74a..d5ef043a97aa 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1663,13 +1663,29 @@ void bio_check_pages_dirty(struct bio *bio)
 }
 EXPORT_SYMBOL_GPL(bio_check_pages_dirty);
 
+void update_io_ticks(int cpu, struct hd_struct *part, unsigned long now)
+{
+	unsigned long stamp;
+again:
+	stamp = READ_ONCE(part->stamp);
+	if (unlikely(stamp != now)) {
+		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp)) {
+			__part_stat_add(cpu, part, io_ticks, 1);
+		}
+	}
+	if (part->partno) {
+		part = &part_to_disk(part)->part0;
+		goto again;
+	}
+}
+
 void generic_start_io_acct(struct request_queue *q, int op,
 			   unsigned long sectors, struct hd_struct *part)
 {
 	const int sgrp = op_stat_group(op);
 	int cpu = part_stat_lock();
 
-	part_round_stats(q, cpu, part);
+	update_io_ticks(cpu, part, jiffies);
 	part_stat_inc(cpu, part, ios[sgrp]);
 	part_stat_add(cpu, part, sectors[sgrp], sectors);
 	part_inc_in_flight(q, part, op_is_write(op));
@@ -1681,12 +1697,14 @@ EXPORT_SYMBOL(generic_start_io_acct);
 void generic_end_io_acct(struct request_queue *q, int req_op,
 			 struct hd_struct *part, unsigned long start_time)
 {
-	unsigned long duration = jiffies - start_time;
+	unsigned long now = jiffies;
+	unsigned long duration = now - start_time;
 	const int sgrp = op_stat_group(req_op);
 	int cpu = part_stat_lock();
 
+	update_io_ticks(cpu, part, now);
 	part_stat_add(cpu, part, nsecs[sgrp], jiffies_to_nsecs(duration));
-	part_round_stats(q, cpu, part);
+	part_stat_add(cpu, part, time_in_queue, duration);
 	part_dec_in_flight(q, part, op_is_write(req_op));
 
 	part_stat_unlock();
diff --git a/block/blk-core.c b/block/blk-core.c
index 3f6f5e6c2fe4..6bd4669f05fd 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -583,63 +583,6 @@ struct request *blk_get_request(struct request_queue *q, unsigned int op,
 }
 EXPORT_SYMBOL(blk_get_request);
 
-static void part_round_stats_single(struct request_queue *q, int cpu,
-				    struct hd_struct *part, unsigned long now,
-				    unsigned int inflight)
-{
-	if (inflight) {
-		__part_stat_add(cpu, part, time_in_queue,
-				inflight * (now - part->stamp));
-		__part_stat_add(cpu, part, io_ticks, (now - part->stamp));
-	}
-	part->stamp = now;
-}
-
-/**
- * part_round_stats() - Round off the performance stats on a struct disk_stats.
- * @q: target block queue
- * @cpu: cpu number for stats access
- * @part: target partition
- *
- * The average IO queue length and utilisation statistics are maintained
- * by observing the current state of the queue length and the amount of
- * time it has been in this state for.
- *
- * Normally, that accounting is done on IO completion, but that can result
- * in more than a second's worth of IO being accounted for within any one
- * second, leading to >100% utilisation. To deal with that, we call this
- * function to do a round-off before returning the results when reading
- * /proc/diskstats. This accounts immediately for all queue usage up to
- * the current jiffies and restarts the counters again.
- */
-void part_round_stats(struct request_queue *q, int cpu, struct hd_struct *part)
-{
-	struct hd_struct *part2 = NULL;
-	unsigned long now = jiffies;
-	unsigned int inflight[2];
-	int stats = 0;
-
-	if (part->stamp != now)
-		stats |= 1;
-
-	if (part->partno) {
-		part2 = &part_to_disk(part)->part0;
-		if (part2->stamp != now)
-			stats |= 2;
-	}
-
-	if (!stats)
-		return;
-
-	part_in_flight(q, part, inflight);
-
-	if (stats & 2)
-		part_round_stats_single(q, cpu, part2, now, inflight[1]);
-	if (stats & 1)
-		part_round_stats_single(q, cpu, part, now, inflight[0]);
-}
-EXPORT_SYMBOL_GPL(part_round_stats);
-
 void blk_put_request(struct request *req)
 {
 	blk_mq_free_request(req);
@@ -1408,9 +1351,10 @@ void blk_account_io_done(struct request *req, u64 now)
 		cpu = part_stat_lock();
 		part = req->part;
 
+		update_io_ticks(cpu, part, jiffies);
 		part_stat_inc(cpu, part, ios[sgrp]);
 		part_stat_add(cpu, part, nsecs[sgrp], now - req->start_time_ns);
-		part_round_stats(req->q, cpu, part);
+		part_stat_add(cpu, part, time_in_queue, nsecs_to_jiffies64(now - req->start_time_ns));
 		part_dec_in_flight(req->q, part, rq_data_dir(req));
 
 		hd_struct_put(part);
@@ -1446,11 +1390,12 @@ void blk_account_io_start(struct request *rq, bool new_io)
 			part = &rq->rq_disk->part0;
 			hd_struct_get(part);
 		}
-		part_round_stats(rq->q, cpu, part);
 		part_inc_in_flight(rq->q, part, rw);
 		rq->part = part;
 	}
 
+	update_io_ticks(cpu, part, jiffies);
+
 	part_stat_unlock();
 }
 
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 6be04ef8da5b..c278b6d18a24 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -690,7 +690,6 @@ static void blk_account_io_merge(struct request *req)
 		cpu = part_stat_lock();
 		part = req->part;
 
-		part_round_stats(req->q, cpu, part);
 		part_dec_in_flight(req->q, part, rq_data_dir(req));
 
 		hd_struct_put(part);
diff --git a/block/genhd.c b/block/genhd.c
index 0145bcb0cc76..cdf174d7d329 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1326,7 +1326,6 @@ static int diskstats_show(struct seq_file *seqf, void *v)
 	struct hd_struct *hd;
 	char buf[BDEVNAME_SIZE];
 	unsigned int inflight[2];
-	int cpu;
 
 	/*
 	if (&disk_to_dev(gp)->kobj.entry == block_class.devices.next)
@@ -1338,9 +1337,6 @@ static int diskstats_show(struct seq_file *seqf, void *v)
 
 	disk_part_iter_init(&piter, gp, DISK_PITER_INCL_EMPTY_PART0);
 	while ((hd = disk_part_iter_next(&piter))) {
-		cpu = part_stat_lock();
-		part_round_stats(gp->queue, cpu, hd);
-		part_stat_unlock();
 		part_in_flight(gp->queue, hd, inflight);
 		seq_printf(seqf, "%4d %7d %s "
 			   "%lu %lu %lu %u "
diff --git a/block/partition-generic.c b/block/partition-generic.c
index 5f8db5c5140f..42d6138ac876 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -121,11 +121,7 @@ ssize_t part_stat_show(struct device *dev,
 	struct hd_struct *p = dev_to_part(dev);
 	struct request_queue *q = part_to_disk(p)->queue;
 	unsigned int inflight[2];
-	int cpu;
 
-	cpu = part_stat_lock();
-	part_round_stats(q, cpu, p);
-	part_stat_unlock();
 	part_in_flight(q, p, inflight);
 	return sprintf(buf,
 		"%8lu %8lu %8llu %8u "
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 0c5ee17b4d88..f2a0a52c874f 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -398,8 +398,7 @@ static inline void free_part_info(struct hd_struct *part)
 	kfree(part->info);
 }
 
-/* block/blk-core.c */
-extern void part_round_stats(struct request_queue *q, int cpu, struct hd_struct *part);
+void update_io_ticks(int cpu, struct hd_struct *part, unsigned long now);
 
 /* block/genhd.c */
 extern void device_add_disk(struct device *parent, struct gendisk *disk,

From patchwork Fri Nov 30 22:22:24 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 10707363
Received:
 from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 94B872FF89 for ; Fri, 30 Nov 2018 22:22:40 +0000 (UTC)
From: Mike Snitzer
To: Jens Axboe
Cc: Mikulas Patocka, dm-devel@redhat.com, linux-block@vger.kernel.org
Subject: [PATCH v2 4/6] block: switch to per-cpu in-flight counters
Date: Fri, 30 Nov 2018 17:22:24 -0500
Message-Id: <20181130222226.77216-5-snitzer@redhat.com>
In-Reply-To: <20181130222226.77216-1-snitzer@redhat.com>
References: <20181130222226.77216-1-snitzer@redhat.com>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.44]); Fri, 30 Nov 2018
 22:22:38 +0000 (UTC)

From: Mikulas Patocka

Now that part_round_stats is gone, we can switch to per-cpu in-flight
counters.

We use the local-atomic type local_t, so that if part_inc_in_flight or
part_dec_in_flight is reentrantly called from an interrupt, the value will
be correct.

The other counters could be corrupted due to reentrant interrupt, but the
corruption only results in slight counter skew - the in_flight counter must
be exact, so it needs local_t.

Signed-off-by: Mikulas Patocka
Signed-off-by: Mike Snitzer
---
 block/bio.c           |  4 ++--
 block/blk-core.c      |  4 ++--
 block/blk-merge.c     |  2 +-
 block/genhd.c         | 47 +++++++++++++++++++++++++++++++++++------------
 include/linux/genhd.h |  7 ++++---
 5 files changed, 44 insertions(+), 20 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index d5ef043a97aa..b25b4fef9900 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1688,7 +1688,7 @@ void generic_start_io_acct(struct request_queue *q, int op,
 	update_io_ticks(cpu, part, jiffies);
 	part_stat_inc(cpu, part, ios[sgrp]);
 	part_stat_add(cpu, part, sectors[sgrp], sectors);
-	part_inc_in_flight(q, part, op_is_write(op));
+	part_inc_in_flight(q, cpu, part, op_is_write(op));
 
 	part_stat_unlock();
 }
@@ -1705,7 +1705,7 @@ void generic_end_io_acct(struct request_queue *q, int req_op,
 	update_io_ticks(cpu, part, now);
 	part_stat_add(cpu, part, nsecs[sgrp], jiffies_to_nsecs(duration));
 	part_stat_add(cpu, part, time_in_queue, duration);
-	part_dec_in_flight(q, part, op_is_write(req_op));
+	part_dec_in_flight(q, cpu, part, op_is_write(req_op));
 
 	part_stat_unlock();
 }
diff --git a/block/blk-core.c b/block/blk-core.c
index 6bd4669f05fd..87f06672d9a7 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1355,7 +1355,7 @@ void blk_account_io_done(struct request *req, u64 now)
 		part_stat_inc(cpu, part, ios[sgrp]);
 		part_stat_add(cpu, part, nsecs[sgrp], now - req->start_time_ns);
 		part_stat_add(cpu, part, time_in_queue, nsecs_to_jiffies64(now - req->start_time_ns));
-		part_dec_in_flight(req->q, part, rq_data_dir(req));
+		part_dec_in_flight(req->q, cpu, part, rq_data_dir(req));
 
 		hd_struct_put(part);
 		part_stat_unlock();
@@ -1390,7 +1390,7 @@ void blk_account_io_start(struct request *rq, bool new_io)
 			part = &rq->rq_disk->part0;
 			hd_struct_get(part);
 		}
-		part_inc_in_flight(rq->q, part, rw);
+		part_inc_in_flight(rq->q, cpu, part, rw);
 		rq->part = part;
 	}
 
diff --git a/block/blk-merge.c b/block/blk-merge.c
index c278b6d18a24..c02386cdf0ca 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -690,7 +690,7 @@ static void blk_account_io_merge(struct request *req)
 		cpu = part_stat_lock();
 		part = req->part;
 
-		part_dec_in_flight(req->q, part, rq_data_dir(req));
+		part_dec_in_flight(req->q, cpu, part, rq_data_dir(req));
 
 		hd_struct_put(part);
 		part_stat_unlock();
diff --git a/block/genhd.c b/block/genhd.c
index cdf174d7d329..d4c9dd65def6 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -45,53 +45,76 @@ static void disk_add_events(struct gendisk *disk);
 static void disk_del_events(struct gendisk *disk);
 static void disk_release_events(struct gendisk *disk);
 
-void part_inc_in_flight(struct request_queue *q, struct hd_struct *part, int rw)
+void part_inc_in_flight(struct request_queue *q, int cpu, struct hd_struct *part, int rw)
 {
 	if (queue_is_mq(q))
 		return;
 
-	atomic_inc(&part->in_flight[rw]);
+	local_inc(&per_cpu_ptr(part->dkstats, cpu)->in_flight[rw]);
 	if (part->partno)
-		atomic_inc(&part_to_disk(part)->part0.in_flight[rw]);
+		local_inc(&per_cpu_ptr(part_to_disk(part)->part0.dkstats, cpu)->in_flight[rw]);
 }
 
-void part_dec_in_flight(struct request_queue *q, struct hd_struct *part, int rw)
+void part_dec_in_flight(struct request_queue *q, int cpu, struct hd_struct *part, int rw)
 {
 	if (queue_is_mq(q))
 		return;
 
-	atomic_dec(&part->in_flight[rw]);
+	local_dec(&per_cpu_ptr(part->dkstats, cpu)->in_flight[rw]);
 	if (part->partno)
-		atomic_dec(&part_to_disk(part)->part0.in_flight[rw]);
+		local_dec(&per_cpu_ptr(part_to_disk(part)->part0.dkstats, cpu)->in_flight[rw]);
 }
 
 void part_in_flight(struct request_queue *q, struct hd_struct *part,
 		    unsigned int inflight[2])
 {
+	int cpu;
+
 	if (queue_is_mq(q)) {
 		blk_mq_in_flight(q, part, inflight);
 		return;
 	}
 
-	inflight[0] = atomic_read(&part->in_flight[0]) +
-			atomic_read(&part->in_flight[1]);
+	inflight[0] = 0;
+	for_each_possible_cpu(cpu) {
+		inflight[0] += local_read(&per_cpu_ptr(part->dkstats, cpu)->in_flight[0]) +
+				local_read(&per_cpu_ptr(part->dkstats, cpu)->in_flight[1]);
+	}
+	if ((int)inflight[0] < 0)
+		inflight[0] = 0;
+
 	if (part->partno) {
 		part = &part_to_disk(part)->part0;
-		inflight[1] = atomic_read(&part->in_flight[0]) +
-				atomic_read(&part->in_flight[1]);
+		inflight[1] = 0;
+		for_each_possible_cpu(cpu) {
+			inflight[1] += local_read(&per_cpu_ptr(part->dkstats, cpu)->in_flight[0]) +
+					local_read(&per_cpu_ptr(part->dkstats, cpu)->in_flight[1]);
+		}
+		if ((int)inflight[1] < 0)
+			inflight[1] = 0;
 	}
 }
 
 void part_in_flight_rw(struct request_queue *q, struct hd_struct *part,
 		       unsigned int inflight[2])
 {
+	int cpu;
+
 	if (queue_is_mq(q)) {
 		blk_mq_in_flight_rw(q, part, inflight);
 		return;
 	}
 
-	inflight[0] = atomic_read(&part->in_flight[0]);
-	inflight[1] = atomic_read(&part->in_flight[1]);
+	inflight[0] = 0;
+	inflight[1] = 0;
+	for_each_possible_cpu(cpu) {
+		inflight[0] += local_read(&per_cpu_ptr(part->dkstats, cpu)->in_flight[0]);
+		inflight[1] += local_read(&per_cpu_ptr(part->dkstats, cpu)->in_flight[1]);
+	}
+	if ((int)inflight[0] < 0)
+		inflight[0] = 0;
+	if ((int)inflight[1] < 0)
+		inflight[1] = 0;
 }
 
 struct hd_struct *__disk_get_part(struct gendisk *disk, int partno)
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index f2a0a52c874f..a03aa6502a83 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 
 #ifdef CONFIG_BLOCK
 
@@ -89,6 +90,7 @@ struct disk_stats {
 	unsigned long merges[NR_STAT_GROUPS];
 	unsigned long io_ticks;
 	unsigned long time_in_queue;
+	local_t in_flight[2];
 };
 
 #define PARTITION_META_INFO_VOLNAMELTH 64
@@ -122,7 +124,6 @@ struct hd_struct {
 	int make_it_fail;
 #endif
 	unsigned long stamp;
-	atomic_t in_flight[2];
 #ifdef CONFIG_SMP
 	struct disk_stats __percpu *dkstats;
 #else
@@ -380,9 +381,9 @@ void part_in_flight(struct request_queue *q, struct hd_struct *part,
 		    unsigned int inflight[2]);
 void part_in_flight_rw(struct request_queue *q, struct hd_struct *part,
 		       unsigned int inflight[2]);
-void part_dec_in_flight(struct request_queue *q, struct hd_struct *part,
+void part_dec_in_flight(struct request_queue *q, int cpu, struct hd_struct *part,
 			int rw);
-void part_inc_in_flight(struct request_queue *q, struct hd_struct *part,
+void part_inc_in_flight(struct request_queue *q, int cpu, struct hd_struct *part,
 			int rw);
 
 static inline struct partition_meta_info *alloc_part_info(struct gendisk *disk)

From patchwork Fri Nov 30 22:22:25 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 10707371
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix)
 with ESMTP id 1A6C32FF89 for ; Fri, 30 Nov 2018 22:22:43 +0000 (UTC)
From: Mike Snitzer
To: Jens Axboe
Cc: Mikulas Patocka, dm-devel@redhat.com, linux-block@vger.kernel.org
Subject: [PATCH v2 5/6] block: return just one value from part_in_flight
Date: Fri, 30 Nov 2018 17:22:25 -0500
Message-Id: <20181130222226.77216-6-snitzer@redhat.com>
In-Reply-To: <20181130222226.77216-1-snitzer@redhat.com>
References: <20181130222226.77216-1-snitzer@redhat.com>

From: Mikulas Patocka

The previous patches deleted all the code that needed the second value
returned from part_in_flight - now the kernel only uses the first value.
Consequently, part_in_flight (and blk_mq_in_flight) may be changed so that
they only return one value.

This patch just refactors the code; there's no functional change.
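
For readers who want to see the single-value shape in isolation, the
following is a minimal userspace sketch, assuming plain arrays in place of
the kernel's per-cpu local_t counters. struct cpu_counters and
model_part_in_flight() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-cpu in_flight[2] counters. */
struct cpu_counters {
	long in_flight[2];	/* [0] = reads, [1] = writes */
};

/* Sum reads and writes over all CPUs and return one value. An I/O may be
 * started on one CPU and completed on another, so individual counters can
 * be negative; only the sum is meaningful. A negative sum (possible when
 * the counters are read while they are changing) is clamped to zero. */
static unsigned int model_part_in_flight(const struct cpu_counters *cpus,
					 size_t ncpus)
{
	long sum = 0;
	size_t i;

	assert(cpus != NULL || ncpus == 0);
	for (i = 0; i < ncpus; i++)
		sum += cpus[i].in_flight[0] + cpus[i].in_flight[1];
	if (sum < 0)
		sum = 0;

	return (unsigned int)sum;
}
```

The caller receives a single count and never sees a per-CPU value, which is
the interface change this patch makes to part_in_flight.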
Signed-off-by: Mikulas Patocka
Signed-off-by: Mike Snitzer
---
 block/blk-mq.c            | 12 +++++-------
 block/blk-mq.h            |  3 +--
 block/genhd.c             | 32 +++++++++++---------------------
 block/partition-generic.c |  6 +++---
 include/linux/genhd.h     |  3 +--
 5 files changed, 21 insertions(+), 35 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7dcef565dc0f..88ed969cb9df 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -101,25 +101,23 @@ static bool blk_mq_check_inflight(struct blk_mq_hw_ctx *hctx,
 	struct mq_inflight *mi = priv;
 
 	/*
-	 * index[0] counts the specific partition that was asked for. index[1]
-	 * counts the ones that are active on the whole device, so increment
-	 * that if mi->part is indeed a partition, and not a whole device.
+	 * index[0] counts the specific partition that was asked for.
 	 */
 	if (rq->part == mi->part)
 		mi->inflight[0]++;
-	if (mi->part->partno)
-		mi->inflight[1]++;
 
 	return true;
 }
 
-void blk_mq_in_flight(struct request_queue *q, struct hd_struct *part,
-		      unsigned int inflight[2])
+unsigned int blk_mq_in_flight(struct request_queue *q, struct hd_struct *part)
 {
+	unsigned inflight[2];
 	struct mq_inflight mi = { .part = part, .inflight = inflight, };
 
 	inflight[0] = inflight[1] = 0;
 	blk_mq_queue_tag_busy_iter(q, blk_mq_check_inflight, &mi);
+
+	return inflight[0];
 }
 
 static bool blk_mq_check_inflight_rw(struct blk_mq_hw_ctx *hctx,
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 7291e5379358..4022943cb191 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -184,8 +184,7 @@ static inline bool blk_mq_hw_queue_mapped(struct blk_mq_hw_ctx *hctx)
 	return hctx->nr_ctx && hctx->tags;
 }
 
-void blk_mq_in_flight(struct request_queue *q, struct hd_struct *part,
-		      unsigned int inflight[2]);
+unsigned int blk_mq_in_flight(struct request_queue *q, struct hd_struct *part);
 void blk_mq_in_flight_rw(struct request_queue *q, struct hd_struct *part,
 			 unsigned int inflight[2]);
 
diff --git a/block/genhd.c b/block/genhd.c
index d4c9dd65def6..3397288a2926 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -65,34 +65,24 @@ void part_dec_in_flight(struct request_queue *q, int cpu, struct hd_struct *part
 		local_dec(&per_cpu_ptr(part_to_disk(part)->part0.dkstats, cpu)->in_flight[rw]);
 }
 
-void part_in_flight(struct request_queue *q, struct hd_struct *part,
-		    unsigned int inflight[2])
+unsigned int part_in_flight(struct request_queue *q, struct hd_struct *part)
 {
 	int cpu;
+	int inflight;
 
 	if (queue_is_mq(q)) {
-		blk_mq_in_flight(q, part, inflight);
-		return;
+		return blk_mq_in_flight(q, part);
 	}
 
-	inflight[0] = 0;
+	inflight = 0;
 	for_each_possible_cpu(cpu) {
-		inflight[0] += local_read(&per_cpu_ptr(part->dkstats, cpu)->in_flight[0]) +
+		inflight += local_read(&per_cpu_ptr(part->dkstats, cpu)->in_flight[0]) +
 			    local_read(&per_cpu_ptr(part->dkstats, cpu)->in_flight[1]);
 	}
-	if ((int)inflight[0] < 0)
-		inflight[0] = 0;
+	if (inflight < 0)
+		inflight = 0;
 
-	if (part->partno) {
-		part = &part_to_disk(part)->part0;
-		inflight[1] = 0;
-		for_each_possible_cpu(cpu) {
-			inflight[1] += local_read(&per_cpu_ptr(part->dkstats, cpu)->in_flight[0]) +
-				    local_read(&per_cpu_ptr(part->dkstats, cpu)->in_flight[1]);
-		}
-		if ((int)inflight[1] < 0)
-			inflight[1] = 0;
-	}
+	return (unsigned int)inflight;
 }
 
 void part_in_flight_rw(struct request_queue *q, struct hd_struct *part,
@@ -1348,7 +1338,7 @@ static int diskstats_show(struct seq_file *seqf, void *v)
 	struct disk_part_iter piter;
 	struct hd_struct *hd;
 	char buf[BDEVNAME_SIZE];
-	unsigned int inflight[2];
+	unsigned int inflight;
 
 	/*
 	if (&disk_to_dev(gp)->kobj.entry == block_class.devices.next)
@@ -1360,7 +1350,7 @@ static int diskstats_show(struct seq_file *seqf, void *v)
 
 	disk_part_iter_init(&piter, gp, DISK_PITER_INCL_EMPTY_PART0);
 	while ((hd = disk_part_iter_next(&piter))) {
-		part_in_flight(gp->queue, hd, inflight);
+		inflight = part_in_flight(gp->queue, hd);
 		seq_printf(seqf, "%4d %7d %s "
 			   "%lu %lu %lu %u "
 			   "%lu %lu %lu %u "
@@ -1376,7 +1366,7 @@ static int diskstats_show(struct seq_file *seqf, void *v)
 			   part_stat_read(hd, merges[STAT_WRITE]),
 			   part_stat_read(hd, sectors[STAT_WRITE]),
 			   (unsigned int)part_stat_read_msecs(hd, STAT_WRITE),
-			   inflight[0],
+			   inflight,
 			   jiffies_to_msecs(part_stat_read(hd, io_ticks)),
 			   jiffies_to_msecs(part_stat_read(hd, time_in_queue)),
 			   part_stat_read(hd, ios[STAT_DISCARD]),
diff --git a/block/partition-generic.c b/block/partition-generic.c
index 42d6138ac876..8e596a8dff32 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -120,9 +120,9 @@ ssize_t part_stat_show(struct device *dev,
 {
 	struct hd_struct *p = dev_to_part(dev);
 	struct request_queue *q = part_to_disk(p)->queue;
-	unsigned int inflight[2];
+	unsigned int inflight;
 
-	part_in_flight(q, p, inflight);
+	inflight = part_in_flight(q, p);
 	return sprintf(buf,
 		"%8lu %8lu %8llu %8u "
 		"%8lu %8lu %8llu %8u "
@@ -137,7 +137,7 @@ ssize_t part_stat_show(struct device *dev,
 		part_stat_read(p, merges[STAT_WRITE]),
 		(unsigned long long)part_stat_read(p, sectors[STAT_WRITE]),
 		(unsigned int)part_stat_read_msecs(p, STAT_WRITE),
-		inflight[0],
+		inflight,
 		jiffies_to_msecs(part_stat_read(p, io_ticks)),
 		jiffies_to_msecs(part_stat_read(p, time_in_queue)),
 		part_stat_read(p, ios[STAT_DISCARD]),
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index a03aa6502a83..13b7ce01727a 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -377,8 +377,7 @@ static inline void free_part_stats(struct hd_struct *part)
 #define part_stat_sub(cpu, gendiskp, field, subnd)			\
 	part_stat_add(cpu, gendiskp, field, -subnd)
 
-void part_in_flight(struct request_queue *q, struct hd_struct *part,
-		    unsigned int inflight[2]);
+unsigned int part_in_flight(struct request_queue *q, struct hd_struct *part);
 void part_in_flight_rw(struct request_queue *q, struct hd_struct *part,
 		       unsigned int inflight[2]);
 void part_dec_in_flight(struct request_queue *q, int cpu, struct hd_struct *part,

From patchwork Fri Nov 30 22:22:26 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 10707375
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 666CF14E2
	for ; Fri, 30 Nov 2018 22:22:48 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 58F132FF89
	for ; Fri, 30 Nov 2018 22:22:48 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 4D7893008C; Fri, 30 Nov 2018 22:22:48 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI,
	RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E0E4D2FF89
	for ; Fri, 30 Nov 2018 22:22:47 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1726996AbeLAJdd (ORCPT );
	Sat, 1 Dec 2018 04:33:33 -0500
Received: from mx1.redhat.com ([209.132.183.28]:46292 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1726987AbeLAJdd (ORCPT );
	Sat, 1 Dec 2018 04:33:33 -0500
Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mx1.redhat.com (Postfix) with ESMTPS id D7E123001860;
	Fri, 30 Nov 2018 22:22:46 +0000 (UTC)
Received: from localhost (unknown [10.16.197.51])
	by smtp.corp.redhat.com (Postfix) with ESMTPS id 8DAFE106222C;
	Fri, 30 Nov 2018 22:22:42 +0000 (UTC)
From: Mike Snitzer
To: Jens Axboe
Cc: Mikulas Patocka , dm-devel@redhat.com, linux-block@vger.kernel.org
Subject: [PATCH v2 6/6] dm: remove the pending IO accounting
Date: Fri, 30 Nov 2018 17:22:26 -0500
Message-Id: <20181130222226.77216-7-snitzer@redhat.com>
In-Reply-To: <20181130222226.77216-1-snitzer@redhat.com>
References: <20181130222226.77216-1-snitzer@redhat.com>
X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
	(mx1.redhat.com [10.5.110.42]); Fri, 30 Nov 2018 22:22:46 +0000 (UTC)
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Mikulas Patocka

Remove the "pending" atomic counters, which duplicate block-core's
in_flight counters, and update md_in_flight() to look at the percpu
in_flight counters.

Signed-off-by: Mikulas Patocka
Signed-off-by: Mike Snitzer
---
 drivers/md/dm-core.h |  2 --
 drivers/md/dm.c      | 34 +++++++++++++++-------------------
 2 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
index 224d44503a06..6fe883fac471 100644
--- a/drivers/md/dm-core.h
+++ b/drivers/md/dm-core.h
@@ -65,7 +65,6 @@ struct mapped_device {
 	 */
 	struct work_struct work;
 	wait_queue_head_t wait;
-	atomic_t pending[2];
 	spinlock_t deferred_lock;
 	struct bio_list deferred;
 
@@ -119,7 +118,6 @@ struct mapped_device {
 	struct srcu_struct io_barrier;
 };
 
-int md_in_flight(struct mapped_device *md);
 void disable_write_same(struct mapped_device *md);
 void disable_write_zeroes(struct mapped_device *md);
 
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index a8ae7931bce7..ff6e5a5902f2 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -646,25 +646,30 @@ static void free_tio(struct dm_target_io *tio)
 	bio_put(&tio->clone);
 }
 
-int md_in_flight(struct mapped_device *md)
+static bool md_in_flight(struct mapped_device *md)
 {
-	return atomic_read(&md->pending[READ]) +
-	       atomic_read(&md->pending[WRITE]);
+	int cpu;
+	struct hd_struct *part = &dm_disk(md)->part0;
+
+	for_each_possible_cpu(cpu) {
+		if (local_read(&per_cpu_ptr(part->dkstats, cpu)->in_flight[0]) ||
+		    local_read(&per_cpu_ptr(part->dkstats, cpu)->in_flight[1]))
+			return true;
+	}
+
+	return false;
 }
 
 static void start_io_acct(struct dm_io *io)
 {
 	struct mapped_device *md = io->md;
 	struct bio *bio = io->orig_bio;
-	int rw = bio_data_dir(bio);
 
 	io->start_time = jiffies;
 
 	generic_start_io_acct(md->queue, bio_op(bio), bio_sectors(bio),
 			      &dm_disk(md)->part0);
 
-	atomic_inc(&md->pending[rw]);
-
 	if (unlikely(dm_stats_used(&md->stats)))
 		dm_stats_account_io(&md->stats, bio_data_dir(bio),
 				    bio->bi_iter.bi_sector, bio_sectors(bio),
@@ -676,8 +681,6 @@ static void end_io_acct(struct dm_io *io)
 	struct mapped_device *md = io->md;
 	struct bio *bio = io->orig_bio;
 	unsigned long duration = jiffies - io->start_time;
-	int pending;
-	int rw = bio_data_dir(bio);
 
 	generic_end_io_acct(md->queue, bio_op(bio), &dm_disk(md)->part0,
 			    io->start_time);
@@ -687,16 +690,11 @@ static void end_io_acct(struct dm_io *io)
 				    bio->bi_iter.bi_sector, bio_sectors(bio),
 				    true, duration, &io->stats_aux);
 
-	/*
-	 * After this is decremented the bio must not be touched if it is
-	 * a flush.
-	 */
-	pending = atomic_dec_return(&md->pending[rw]);
-	pending += atomic_read(&md->pending[rw^0x1]);
-
 	/* nudge anyone waiting on suspend queue */
-	if (!pending)
-		wake_up(&md->wait);
+	if (unlikely(waitqueue_active(&md->wait))) {
+		if (!md_in_flight(md))
+			wake_up(&md->wait);
+	}
 }
 
 /*
@@ -1904,8 +1902,6 @@ static struct mapped_device *alloc_dev(int minor)
 	if (!md->disk)
 		goto bad;
 
-	atomic_set(&md->pending[0], 0);
-	atomic_set(&md->pending[1], 0);
 	init_waitqueue_head(&md->wait);
 	INIT_WORK(&md->work, dm_wq_work);
 	init_waitqueue_head(&md->eventq);
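
Review note on the new md_in_flight(): as far as I can tell, the block core increments in_flight[] on the CPU that submits an I/O and decrements it on the CPU that completes it, so a single per-CPU value can be nonzero (even negative) while the device is idle. part_in_flight() in patch 5/6 sums across CPUs and clamps negative results, presumably for that reason. The userspace sketch below only illustrates the difference between the two tests and is not kernel code: NR_CPUS, stats[], io_start(), io_end() and both in_flight_*() helpers are hypothetical stand-ins for the per-CPU dkstats accessors.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Stand-in for the per-CPU in_flight[] counters: 0 = read, 1 = write. */
struct percpu_inflight {
	long in_flight[2];
};

static struct percpu_inflight stats[NR_CPUS];

/* An I/O is accounted on whichever CPU submits it ... */
static void io_start(int cpu, int rw)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	stats[cpu].in_flight[rw]++;
}

/* ... and un-accounted on whichever CPU completes it. */
static void io_end(int cpu, int rw)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	stats[cpu].in_flight[rw]--;
}

/* Per-CPU test, shaped like md_in_flight() in this patch. */
static bool in_flight_any_nonzero(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (stats[cpu].in_flight[0] || stats[cpu].in_flight[1])
			return true;
	return false;
}

/* Summing test, shaped like part_in_flight() in patch 5/6. */
static bool in_flight_sum(void)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += stats[cpu].in_flight[0] + stats[cpu].in_flight[1];
	return sum > 0;
}
```

With one write started via io_start(0, 1) and completed via io_end(1, 1), stats[0] holds +1 and stats[1] holds -1: in_flight_any_nonzero() still reports I/O in flight, while in_flight_sum() reports idle. If the kernel counters behave the same way, the wake_up() in end_io_acct() might be skipped for an idle device, so summing the counters in md_in_flight() may be the safer form.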