From patchwork Tue Oct 12 18:17:34 2021
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 12553465
From: Jens Axboe
To: linux-block@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 1/9] block: add a struct io_batch argument to fops->iopoll()
Date: Tue, 12 Oct 2021 12:17:34 -0600
Message-Id: <20211012181742.672391-2-axboe@kernel.dk>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211012181742.672391-1-axboe@kernel.dk>
References: <20211012181742.672391-1-axboe@kernel.dk>

struct io_batch contains a list head and a completion handler, which will
allow batches of IO completions to be handled more efficiently. For now
there are no functional changes in this patch; we just add the argument to
the file_operations iopoll handler.
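As a rough illustration of the pattern this argument enables (a user-space
sketch with stand-in types, not the kernel structures themselves), a poll
handler can link completed requests onto the batch and let the caller run a
single completion pass at the end:

	#include <stdio.h>

	/* stand-ins for struct request and struct io_batch */
	struct request {
		int tag;
		struct request *rq_next;
	};

	struct io_batch {
		struct request *req_list;
		void (*complete)(struct io_batch *);
	};

	/* a poll handler defers completions by linking them onto the batch */
	static void poll_found(struct io_batch *ib, struct request *rq)
	{
		rq->rq_next = ib->req_list;
		ib->req_list = rq;
	}

	/* the batch handler then completes everything in one pass */
	static void complete_batch(struct io_batch *ib)
	{
		for (struct request *rq = ib->req_list; rq; rq = rq->rq_next)
			printf("completing tag %d\n", rq->tag);
		ib->req_list = NULL;
	}

	int main(void)
	{
		struct io_batch ib = { .req_list = NULL, .complete = complete_batch };
		struct request a = { .tag = 1 }, b = { .tag = 2 };

		poll_found(&ib, &a);
		poll_found(&ib, &b);
		ib.complete(&ib);
		return 0;
	}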
Signed-off-by: Jens Axboe
---
 block/blk-core.c              | 8 ++++----
 block/blk-exec.c              | 2 +-
 block/blk-mq.c                | 9 +++++----
 block/blk-mq.h                | 3 ++-
 block/fops.c                  | 4 ++--
 drivers/block/null_blk/main.c | 2 +-
 drivers/block/rnbd/rnbd-clt.c | 2 +-
 drivers/nvme/host/pci.c       | 4 ++--
 drivers/nvme/host/rdma.c      | 2 +-
 drivers/nvme/host/tcp.c       | 2 +-
 drivers/scsi/scsi_lib.c       | 2 +-
 fs/io_uring.c                 | 2 +-
 fs/iomap/direct-io.c          | 2 +-
 include/linux/blk-mq.h        | 2 +-
 include/linux/blkdev.h        | 4 ++--
 include/linux/fs.h            | 8 +++++++-
 mm/page_io.c                  | 2 +-
 17 files changed, 34 insertions(+), 26 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index d5b0258dd218..877c345936a0 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1099,7 +1099,7 @@ EXPORT_SYMBOL(submit_bio);
  * Note: the caller must either be the context that submitted @bio, or
  * be in a RCU critical section to prevent freeing of @bio.
  */
-int bio_poll(struct bio *bio, unsigned int flags)
+int bio_poll(struct bio *bio, struct io_batch *ib, unsigned int flags)
 {
 	struct request_queue *q = bio->bi_bdev->bd_disk->queue;
 	blk_qc_t cookie = READ_ONCE(bio->bi_cookie);
@@ -1117,7 +1117,7 @@ int bio_poll(struct bio *bio, unsigned int flags)
 	if (WARN_ON_ONCE(!queue_is_mq(q)))
 		ret = 0; /* not yet implemented, should not happen */
 	else
-		ret = blk_mq_poll(q, cookie, flags);
+		ret = blk_mq_poll(q, cookie, ib, flags);
 	blk_queue_exit(q);
 	return ret;
 }
@@ -1127,7 +1127,7 @@ EXPORT_SYMBOL_GPL(bio_poll);
  * Helper to implement file_operations.iopoll. Requires the bio to be stored
  * in iocb->private, and cleared before freeing the bio.
  */
-int iocb_bio_iopoll(struct kiocb *kiocb, unsigned int flags)
+int iocb_bio_iopoll(struct kiocb *kiocb, struct io_batch *ib, unsigned int flags)
 {
 	struct bio *bio;
 	int ret = 0;
@@ -1155,7 +1155,7 @@ int iocb_bio_iopoll(struct kiocb *kiocb, unsigned int flags)
 	rcu_read_lock();
 	bio = READ_ONCE(kiocb->private);
 	if (bio && bio->bi_bdev)
-		ret = bio_poll(bio, flags);
+		ret = bio_poll(bio, ib, flags);
 	rcu_read_unlock();
 
 	return ret;
diff --git a/block/blk-exec.c b/block/blk-exec.c
index 55f0cd34b37b..1b8b47f6e79b 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -77,7 +77,7 @@ static bool blk_rq_is_poll(struct request *rq)
 static void blk_rq_poll_completion(struct request *rq, struct completion *wait)
 {
 	do {
-		bio_poll(rq->bio, 0);
+		bio_poll(rq->bio, NULL, 0);
 		cond_resched();
 	} while (!completion_done(wait));
 }
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7027a25c5271..a38412dcb55f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -4030,7 +4030,7 @@ static bool blk_mq_poll_hybrid(struct request_queue *q, blk_qc_t qc)
 }
 
 static int blk_mq_poll_classic(struct request_queue *q, blk_qc_t cookie,
-			       unsigned int flags)
+			       struct io_batch *ib, unsigned int flags)
 {
 	struct blk_mq_hw_ctx *hctx = blk_qc_to_hctx(q, cookie);
 	long state = get_current_state();
@@ -4041,7 +4041,7 @@ static int blk_mq_poll_classic(struct request_queue *q, blk_qc_t cookie,
 	do {
 		hctx->poll_invoked++;
 
-		ret = q->mq_ops->poll(hctx);
+		ret = q->mq_ops->poll(hctx, ib);
 		if (ret > 0) {
 			hctx->poll_success++;
 			__set_current_state(TASK_RUNNING);
@@ -4062,14 +4062,15 @@ static int blk_mq_poll_classic(struct request_queue *q, blk_qc_t cookie,
 	return 0;
 }
 
-int blk_mq_poll(struct request_queue *q, blk_qc_t cookie, unsigned int flags)
+int blk_mq_poll(struct request_queue *q, blk_qc_t cookie, struct io_batch *ib,
+		unsigned int flags)
 {
 	if (!(flags & BLK_POLL_NOSLEEP) &&
 	    q->poll_nsec != BLK_MQ_POLL_CLASSIC) {
 		if (blk_mq_poll_hybrid(q, cookie))
 			return 1;
 	}
-	return blk_mq_poll_classic(q, cookie, flags);
+	return blk_mq_poll_classic(q, cookie, ib, flags);
 }
 
 unsigned int blk_mq_rq_cpu(struct request *rq)
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 8be447995106..861c5cb076a9 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -38,7 +38,8 @@ struct blk_mq_ctx {
 } ____cacheline_aligned_in_smp;
 
 void blk_mq_submit_bio(struct bio *bio);
-int blk_mq_poll(struct request_queue *q, blk_qc_t cookie, unsigned int flags);
+int blk_mq_poll(struct request_queue *q, blk_qc_t cookie, struct io_batch *ib,
+		unsigned int flags);
 void blk_mq_exit_queue(struct request_queue *q);
 int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr);
 void blk_mq_wake_waiters(struct request_queue *q);
diff --git a/block/fops.c b/block/fops.c
index ce1255529ba2..30487bf6d5e4 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -106,7 +106,7 @@ static ssize_t __blkdev_direct_IO_simple(struct kiocb *iocb,
 		set_current_state(TASK_UNINTERRUPTIBLE);
 		if (!READ_ONCE(bio.bi_private))
 			break;
-		if (!(iocb->ki_flags & IOCB_HIPRI) || !bio_poll(&bio, 0))
+		if (!(iocb->ki_flags & IOCB_HIPRI) || !bio_poll(&bio, NULL, 0))
 			blk_io_schedule();
 	}
 	__set_current_state(TASK_RUNNING);
@@ -288,7 +288,7 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 		if (!READ_ONCE(dio->waiter))
 			break;
 
-		if (!do_poll || !bio_poll(bio, 0))
+		if (!do_poll || !bio_poll(bio, NULL, 0))
 			blk_io_schedule();
 	}
 	__set_current_state(TASK_RUNNING);
diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index 7ce911e2289d..744601f1704d 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -1494,7 +1494,7 @@ static int null_map_queues(struct blk_mq_tag_set *set)
 	return 0;
 }
 
-static int null_poll(struct blk_mq_hw_ctx *hctx)
+static int null_poll(struct blk_mq_hw_ctx *hctx, struct io_batch *ib)
 {
 	struct nullb_queue *nq = hctx->driver_data;
 	LIST_HEAD(list);
diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index bd4a41afbbfc..7bd18aef1086 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -1176,7 +1176,7 @@ static blk_status_t rnbd_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return ret;
 }
 
-static int rnbd_rdma_poll(struct blk_mq_hw_ctx *hctx)
+static int rnbd_rdma_poll(struct blk_mq_hw_ctx *hctx, struct io_batch *ib)
 {
 	struct rnbd_queue *q = hctx->driver_data;
 	struct rnbd_clt_dev *dev = q->dev;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 0dd4b44b59cd..4ad63bb9f415 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1092,7 +1092,7 @@ static void nvme_poll_irqdisable(struct nvme_queue *nvmeq)
 	enable_irq(pci_irq_vector(pdev, nvmeq->cq_vector));
 }
 
-static int nvme_poll(struct blk_mq_hw_ctx *hctx)
+static int nvme_poll(struct blk_mq_hw_ctx *hctx, struct io_batch *ib)
 {
 	struct nvme_queue *nvmeq = hctx->driver_data;
 	bool found;
@@ -1274,7 +1274,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 	 * Did we miss an interrupt?
 	 */
 	if (test_bit(NVMEQ_POLLED, &nvmeq->flags))
-		nvme_poll(req->mq_hctx);
+		nvme_poll(req->mq_hctx, NULL);
 	else
 		nvme_poll_irqdisable(nvmeq);
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 40317e1b9183..4987cfbf5dd4 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -2106,7 +2106,7 @@ static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return ret;
 }
 
-static int nvme_rdma_poll(struct blk_mq_hw_ctx *hctx)
+static int nvme_rdma_poll(struct blk_mq_hw_ctx *hctx, struct io_batch *ib)
 {
 	struct nvme_rdma_queue *queue = hctx->driver_data;
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 3c1c29dd3020..7bc321ceec72 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2429,7 +2429,7 @@ static int nvme_tcp_map_queues(struct blk_mq_tag_set *set)
 	return 0;
 }
 
-static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx)
+static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx, struct io_batch *ib)
 {
 	struct nvme_tcp_queue *queue = hctx->driver_data;
 	struct sock *sk = queue->sock->sk;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 33fd9a01330c..de3fffc447da 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1784,7 +1784,7 @@ static void scsi_mq_exit_request(struct blk_mq_tag_set *set, struct request *rq,
 }
 
-static int scsi_mq_poll(struct blk_mq_hw_ctx *hctx)
+static int scsi_mq_poll(struct blk_mq_hw_ctx *hctx, struct io_batch *ib)
 {
 	struct Scsi_Host *shost = hctx->driver_data;
diff --git a/fs/io_uring.c b/fs/io_uring.c
index e43e130a0e92..082ff64c1bcb 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2412,7 +2412,7 @@ static int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
 		if (READ_ONCE(req->iopoll_completed))
 			break;
 
-		ret = kiocb->ki_filp->f_op->iopoll(kiocb, poll_flags);
+		ret = kiocb->ki_filp->f_op->iopoll(kiocb, NULL, poll_flags);
 		if (unlikely(ret < 0))
 			return ret;
 		else if (ret)
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 8efab177011d..83ecfba53abe 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -630,7 +630,7 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 			break;
 
 		if (!dio->submit.poll_bio ||
-		    !bio_poll(dio->submit.poll_bio, 0))
+		    !bio_poll(dio->submit.poll_bio, NULL, 0))
 			blk_io_schedule();
 	}
 	__set_current_state(TASK_RUNNING);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index f86e28828a95..29555673090d 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -538,7 +538,7 @@ struct blk_mq_ops {
 	/**
 	 * @poll: Called to poll for completion of a specific tag.
 	 */
-	int (*poll)(struct blk_mq_hw_ctx *);
+	int (*poll)(struct blk_mq_hw_ctx *, struct io_batch *);
 
 	/**
 	 * @complete: Mark the request as complete.
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2a8689e949b4..96e1261c4846 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -569,8 +569,8 @@ blk_status_t errno_to_blk_status(int errno);
 #define BLK_POLL_ONESHOT		(1 << 0)
 /* do not sleep to wait for the expected completion time */
 #define BLK_POLL_NOSLEEP		(1 << 1)
-int bio_poll(struct bio *bio, unsigned int flags);
-int iocb_bio_iopoll(struct kiocb *kiocb, unsigned int flags);
+int bio_poll(struct bio *bio, struct io_batch *ib, unsigned int flags);
+int iocb_bio_iopoll(struct kiocb *kiocb, struct io_batch *ib, unsigned int flags);
 
 static inline struct request_queue *bdev_get_queue(struct block_device *bdev)
 {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f98a361a6752..70986a1d62db 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -48,6 +48,7 @@ struct backing_dev_info;
 struct bdi_writeback;
 struct bio;
+struct request;
 struct export_operations;
 struct fiemap_extent_info;
 struct hd_geometry;
@@ -2068,6 +2069,11 @@ struct dir_context {
 
 struct iov_iter;
 
+struct io_batch {
+	struct request *req_list;
+	void (*complete)(struct io_batch *);
+};
+
 struct file_operations {
 	struct module *owner;
 	loff_t (*llseek) (struct file *, loff_t, int);
@@ -2075,7 +2081,7 @@ struct file_operations {
 	ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
 	ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
 	ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
-	int (*iopoll)(struct kiocb *kiocb, unsigned int flags);
+	int (*iopoll)(struct kiocb *kiocb, struct io_batch *, unsigned int flags);
 	int (*iterate) (struct file *, struct dir_context *);
 	int (*iterate_shared) (struct file *, struct dir_context *);
 	__poll_t (*poll) (struct file *, struct poll_table_struct *);
diff --git a/mm/page_io.c b/mm/page_io.c
index a68faab5b310..6010fb07f231 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -424,7 +424,7 @@ int swap_readpage(struct page *page, bool synchronous)
 		if (!READ_ONCE(bio->bi_private))
 			break;
 
-		if (!bio_poll(bio, 0))
+		if (!bio_poll(bio, NULL, 0))
 			blk_io_schedule();
 	}
 	__set_current_state(TASK_RUNNING);
From patchwork Tue Oct 12 18:17:35 2021
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 12553479
From: Jens Axboe
To: linux-block@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 2/9] sbitmap: add helper to clear a batch of tags
Date: Tue, 12 Oct 2021 12:17:35 -0600
Message-Id: <20211012181742.672391-3-axboe@kernel.dk>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211012181742.672391-1-axboe@kernel.dk>
References: <20211012181742.672391-1-axboe@kernel.dk>

sbitmap currently only supports clearing tags one-by-one; add a helper
that allows the caller to pass in an array of tags to clear.
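The win comes from turning N atomic clears into roughly one atomic op per
64-bit word touched. A minimal user-space model of that word-batching idea
(C11 atomics standing in for the kernel's atomic_long_andnot; the kernel
version also handles the per-cpu hint and wakeups):

	#include <stdatomic.h>
	#include <stdio.h>

	#define BITS_PER_WORD 64UL

	/* clear a batch of tags, flushing one atomic and-not per word */
	static void clear_batch(_Atomic unsigned long *map, const int *tags, int nr)
	{
		_Atomic unsigned long *addr = NULL;
		unsigned long mask = 0;

		for (int i = 0; i < nr; i++) {
			_Atomic unsigned long *this_addr = &map[tags[i] / BITS_PER_WORD];

			if (addr && addr != this_addr) {
				atomic_fetch_and(addr, ~mask);
				mask = 0;
			}
			addr = this_addr;
			mask |= 1UL << (tags[i] % BITS_PER_WORD);
		}
		if (mask)
			atomic_fetch_and(addr, ~mask);
	}

	int main(void)
	{
		_Atomic unsigned long map[2] = { ~0UL, ~0UL };
		int tags[] = { 3, 5, 64 };

		clear_batch(map, tags, 3);	/* bits 3, 5 and 64 get cleared */
		printf("%lx %lx\n", (unsigned long)map[0], (unsigned long)map[1]);
		return 0;
	}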
Signed-off-by: Jens Axboe
---
 include/linux/sbitmap.h | 11 +++++++++++
 lib/sbitmap.c           | 44 ++++++++++++++++++++++++++++++++++++++---
 2 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index e30b56023ead..4a6ff274335a 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -528,6 +528,17 @@ void sbitmap_queue_min_shallow_depth(struct sbitmap_queue *sbq,
 void sbitmap_queue_clear(struct sbitmap_queue *sbq, unsigned int nr,
 			 unsigned int cpu);
 
+/**
+ * sbitmap_queue_clear_batch() - Free a batch of allocated bits
+ * &struct sbitmap_queue.
+ * @sbq: Bitmap to free from.
+ * @offset: offset for each tag in array
+ * @tags: array of tags
+ * @nr_tags: number of tags in array
+ */
+void sbitmap_queue_clear_batch(struct sbitmap_queue *sbq, int offset,
+				int *tags, int nr_tags);
+
 static inline int sbq_index_inc(int index)
 {
 	return (index + 1) & (SBQ_WAIT_QUEUES - 1);
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index f398e0ae548e..c6e2f1f2c4d2 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -628,6 +628,46 @@ void sbitmap_queue_wake_up(struct sbitmap_queue *sbq)
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_wake_up);
 
+static inline void sbitmap_update_cpu_hint(struct sbitmap *sb, int cpu, int tag)
+{
+	if (likely(!sb->round_robin && tag < sb->depth))
+		*per_cpu_ptr(sb->alloc_hint, cpu) = tag;
+}
+
+void sbitmap_queue_clear_batch(struct sbitmap_queue *sbq, int offset,
+				int *tags, int nr_tags)
+{
+	struct sbitmap *sb = &sbq->sb;
+	unsigned long *addr = NULL;
+	unsigned long mask = 0;
+	int i;
+
+	smp_mb__before_atomic();
+	for (i = 0; i < nr_tags; i++) {
+		const int tag = tags[i] - offset;
+		unsigned long *this_addr;
+
+		/* since we're clearing a batch, skip the deferred map */
+		this_addr = &sb->map[SB_NR_TO_INDEX(sb, tag)].word;
+		if (!addr) {
+			addr = this_addr;
+		} else if (addr != this_addr) {
+			atomic_long_andnot(mask, (atomic_long_t *) addr);
+			mask = 0;
+			addr = this_addr;
+		}
+		mask |= (1UL << SB_NR_TO_BIT(sb, tag));
+	}
+
+	if (mask)
+		atomic_long_andnot(mask, (atomic_long_t *) addr);
+
+	smp_mb__after_atomic();
+	sbitmap_queue_wake_up(sbq);
+	sbitmap_update_cpu_hint(&sbq->sb, raw_smp_processor_id(),
+					tags[nr_tags - 1] - offset);
+}
+
 void sbitmap_queue_clear(struct sbitmap_queue *sbq, unsigned int nr,
 			 unsigned int cpu)
 {
@@ -652,9 +692,7 @@ void sbitmap_queue_clear(struct sbitmap_queue *sbq, unsigned int nr,
 	 */
 	smp_mb__after_atomic();
 	sbitmap_queue_wake_up(sbq);
-
-	if (likely(!sbq->sb.round_robin && nr < sbq->sb.depth))
-		*per_cpu_ptr(sbq->sb.alloc_hint, cpu) = nr;
+	sbitmap_update_cpu_hint(&sbq->sb, cpu, nr);
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_clear);
From patchwork Tue Oct 12 18:17:36 2021
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 12553467
From: Jens Axboe
To: linux-block@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 3/9] sbitmap: test bit before calling test_and_set_bit()
Date: Tue, 12 Oct 2021 12:17:36 -0600
Message-Id: <20211012181742.672391-4-axboe@kernel.dk>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211012181742.672391-1-axboe@kernel.dk>
References: <20211012181742.672391-1-axboe@kernel.dk>

If we come across bits that are already set, it's quicker to test for that
first and gate the test_and_set_bit() operation on the result of the bit
test.
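The same pattern shown in isolation: the plain load is cheap, and it keeps a
failed locked RMW from dirtying an already-set, contended word (a user-space
sketch with C11 atomics, not the kernel helpers):

	#include <stdatomic.h>
	#include <stdbool.h>
	#include <stdio.h>

	/* gate the locked RMW on a plain read of the bit */
	static bool try_claim_bit(_Atomic unsigned long *word, int nr)
	{
		unsigned long bit = 1UL << nr;

		/* already set: skip test_and_set entirely */
		if (atomic_load_explicit(word, memory_order_relaxed) & bit)
			return false;
		return !(atomic_fetch_or(word, bit) & bit);
	}

	int main(void)
	{
		_Atomic unsigned long word = 0;

		printf("first claim: %d\n", try_claim_bit(&word, 3));	/* 1 */
		printf("second claim: %d\n", try_claim_bit(&word, 3));	/* 0 */
		return 0;
	}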
Signed-off-by: Jens Axboe
---
 lib/sbitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index c6e2f1f2c4d2..11b244a8d00f 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -166,7 +166,7 @@ static int __sbitmap_get_word(unsigned long *word, unsigned long depth,
 			return -1;
 		}
 
-		if (!test_and_set_bit_lock(nr, word))
+		if (!test_bit(nr, word) && !test_and_set_bit_lock(nr, word))
 			break;
 
 		hint = nr + 1;
From patchwork Tue Oct 12 18:17:37 2021
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 12553485
From: Jens Axboe
To: linux-block@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 4/9] block: add support for blk_mq_end_request_batch()
Date: Tue, 12 Oct 2021 12:17:37 -0600
Message-Id: <20211012181742.672391-5-axboe@kernel.dk>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211012181742.672391-1-axboe@kernel.dk>
References: <20211012181742.672391-1-axboe@kernel.dk>

Instead of calling blk_mq_end_request() on a single request, add a helper
that takes the new struct io_batch and completes all requests stored in it.
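The core of the helper is an accumulate-and-flush loop: tags pile up in a
small on-stack array and are freed in one call whenever the array fills or
the hardware queue changes. A stand-alone model of just that loop (the
kernel version also handles scheduler tags, queue references and request
accounting):

	#include <stdio.h>

	#define TAG_COMP_BATCH 32

	struct batch {
		int tags[TAG_COMP_BATCH];
		int nr;
		int hctx;	/* stand-in for the owning hardware queue */
	};

	/* free every accumulated tag for one queue in a single call */
	static void flush_tag_batch(struct batch *b)
	{
		if (b->nr)
			printf("freeing %d tags on hctx %d\n", b->nr, b->hctx);
		b->nr = 0;
	}

	/* flush when the batch fills or the queue changes, then accumulate */
	static void end_request(struct batch *b, int hctx, int tag)
	{
		if (b->nr == TAG_COMP_BATCH || (b->nr && b->hctx != hctx))
			flush_tag_batch(b);
		b->hctx = hctx;
		b->tags[b->nr++] = tag;
	}

	int main(void)
	{
		struct batch b = { .nr = 0 };

		for (int tag = 0; tag < 40; tag++)
			end_request(&b, 0, tag);	/* flushes once at 32 */
		end_request(&b, 1, 7);		/* queue change flushes the rest */
		flush_tag_batch(&b);		/* final drain */
		return 0;
	}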
Signed-off-by: Jens Axboe
---
 block/blk-mq-tag.c     |  6 +++
 block/blk-mq-tag.h     |  1 +
 block/blk-mq.c         | 83 ++++++++++++++++++++++++++++++++++++++----
 include/linux/blk-mq.h | 13 +++++++
 4 files changed, 96 insertions(+), 7 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index c43b97201161..70eb276dc870 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -207,6 +207,12 @@ void blk_mq_put_tag(struct blk_mq_tags *tags, struct blk_mq_ctx *ctx,
 	}
 }
 
+void blk_mq_put_tags(struct blk_mq_tags *tags, int *array, int nr_tags)
+{
+	sbitmap_queue_clear_batch(&tags->bitmap_tags, tags->nr_reserved_tags,
+					array, nr_tags);
+}
+
 struct bt_iter_data {
 	struct blk_mq_hw_ctx *hctx;
 	busy_iter_fn *fn;
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index 71c2f7d8e9b7..e7b6c8dff071 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -42,6 +42,7 @@ unsigned long blk_mq_get_tags(struct blk_mq_alloc_data *data, int nr_tags,
 			      unsigned int *offset);
 extern void blk_mq_put_tag(struct blk_mq_tags *tags, struct blk_mq_ctx *ctx,
 			   unsigned int tag);
+void blk_mq_put_tags(struct blk_mq_tags *tags, int *array, int nr_tags);
 extern int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
 					struct blk_mq_tags **tags,
 					unsigned int depth, bool can_grow);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index a38412dcb55f..9509c52a66a4 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -613,21 +613,26 @@ void blk_mq_free_plug_rqs(struct blk_plug *plug)
 	}
 }
 
-inline void __blk_mq_end_request(struct request *rq, blk_status_t error)
+static inline void __blk_mq_end_request_acct(struct request *rq,
+					     blk_status_t error, u64 now)
 {
-	u64 now = 0;
-
-	if (blk_mq_need_time_stamp(rq))
-		now = ktime_get_ns();
-
 	if (rq->rq_flags & RQF_STATS) {
 		blk_mq_poll_stats_start(rq->q);
 		blk_stat_add(rq, now);
 	}
 
 	blk_mq_sched_completed_request(rq, now);
-
 	blk_account_io_done(rq, now);
+}
+
+inline void __blk_mq_end_request(struct request *rq, blk_status_t error)
+{
+	u64 now = 0;
+
+	if (blk_mq_need_time_stamp(rq))
+		now = ktime_get_ns();
+
+	__blk_mq_end_request_acct(rq, error, now);
 
 	if (rq->end_io) {
 		rq_qos_done(rq->q, rq);
@@ -646,6 +651,70 @@ void blk_mq_end_request(struct request *rq, blk_status_t error)
 }
 EXPORT_SYMBOL(blk_mq_end_request);
 
+#define TAG_COMP_BATCH		32
+#define TAG_SCHED_BATCH		(TAG_COMP_BATCH >> 1)
+
+static inline void blk_mq_flush_tag_batch(struct blk_mq_hw_ctx *hctx,
+					  int *tags, int nr_tags)
+{
+	struct request_queue *q = hctx->queue;
+
+	blk_mq_put_tags(hctx->tags, tags, nr_tags);
+	if (q->elevator)
+		blk_mq_put_tags(hctx->sched_tags, &tags[TAG_SCHED_BATCH], nr_tags);
+	percpu_ref_put_many(&q->q_usage_counter, nr_tags);
+	blk_mq_sched_restart(hctx);
+}
+
+void blk_mq_end_request_batch(struct io_batch *ib)
+{
+	int tags[TAG_COMP_BATCH], nr_tags = 0, acct_tags = 0;
+	struct blk_mq_hw_ctx *last_hctx = NULL;
+	u64 now = 0;
+
+	while (ib->req_list) {
+		struct request *rq;
+
+		rq = ib->req_list;
+		ib->req_list = rq->rq_next;
+		if (!now && blk_mq_need_time_stamp(rq))
+			now = ktime_get_ns();
+		blk_update_request(rq, rq->status, blk_rq_bytes(rq));
+		__blk_mq_end_request_acct(rq, rq->status, now);
+
+		if (rq->q->elevator) {
+			blk_mq_free_request(rq);
+			continue;
+		}
+
+		if (!refcount_dec_and_test(&rq->ref))
+			continue;
+
+		blk_crypto_free_request(rq);
+		blk_pm_mark_last_busy(rq);
+		rq_qos_done(rq->q, rq);
+		WRITE_ONCE(rq->state, MQ_RQ_IDLE);
+
+		if (acct_tags == TAG_COMP_BATCH ||
+		    (last_hctx && last_hctx != rq->mq_hctx)) {
+			blk_mq_flush_tag_batch(last_hctx, tags, nr_tags);
+			acct_tags = nr_tags = 0;
+		}
+		tags[nr_tags] = rq->tag;
+		last_hctx = rq->mq_hctx;
+		if (last_hctx->queue->elevator) {
+			tags[nr_tags + TAG_SCHED_BATCH] = rq->internal_tag;
+			acct_tags++;
+		}
+		nr_tags++;
+		acct_tags++;
+	}
+
+	if (nr_tags)
+		blk_mq_flush_tag_batch(last_hctx, tags, nr_tags);
+}
+EXPORT_SYMBOL_GPL(blk_mq_end_request_batch);
+
 static void blk_complete_reqs(struct llist_head *list)
 {
 	struct llist_node *entry = llist_reverse_order(llist_del_all(list));
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 29555673090d..26f9f6b07734 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -183,9 +183,16 @@ struct request {
 	unsigned int timeout;
 	unsigned long deadline;
 
+	/*
+	 * csd is used for remote completions, fifo_time at scheduler time.
+	 * They are mutually exclusive. result is used at completion time
+	 * like csd, but for batched IO. Batched IO does not use IPI
+	 * completions.
+	 */
 	union {
 		struct __call_single_data csd;
 		u64 fifo_time;
+		blk_status_t status;
 	};
 
 	/*
@@ -545,6 +552,11 @@ struct blk_mq_ops {
 	 */
 	void (*complete)(struct request *);
 
+	/**
+	 * @complete_batch: Mark list of requests as complete
+	 */
+	void (*complete_batch)(struct io_batch *);
+
 	/**
 	 * @init_hctx: Called when the block layer side of a hardware queue has
 	 * been set up, allowing the driver to allocate/init matching
@@ -734,6 +746,7 @@ static inline void blk_mq_set_request_complete(struct request *rq)
 void blk_mq_start_request(struct request *rq);
 void blk_mq_end_request(struct request *rq, blk_status_t error);
 void __blk_mq_end_request(struct request *rq, blk_status_t error);
+void blk_mq_end_request_batch(struct io_batch *ib);
 void blk_mq_requeue_request(struct request *rq, bool kick_requeue_list);
 void blk_mq_kick_requeue_list(struct request_queue *q);
From patchwork Tue Oct 12 18:17:38 2021
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 12553471
From: Jens Axboe
To: linux-block@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 5/9] nvme: move the fast path nvme error and disposition
 helpers
Date: Tue, 12 Oct 2021 12:17:38 -0600
Message-Id: <20211012181742.672391-6-axboe@kernel.dk>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211012181742.672391-1-axboe@kernel.dk>
References: <20211012181742.672391-1-axboe@kernel.dk>

These are called for every IO completion; move them inline into the nvme
private header rather than have them be a function call out of the PCI part
of the nvme driver. We also need them for batched handling, so the patch
also serves as preparation for that.
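For reference, the disposition logic being moved boils down to a small
decision function. A user-space model of it (the real helper reads these
inputs out of the request; the status values and boolean inputs here are
illustrative, only the NVME_SC_DNR bit matches the spec):

	#include <stdbool.h>
	#include <stdio.h>

	#define NVME_SC_DNR 0x4000	/* "do not retry" bit in the status */

	enum nvme_disposition { COMPLETE, RETRY, FAILOVER };

	/* model of the retry/failover decision made on every completion */
	static enum nvme_disposition decide(unsigned int status, int retries,
					    int max_retries, bool multipath,
					    bool path_error)
	{
		if (status == 0)
			return COMPLETE;
		if ((status & NVME_SC_DNR) || retries >= max_retries)
			return COMPLETE;	/* fail it, don't retry */
		if (multipath && path_error)
			return FAILOVER;
		return RETRY;
	}

	int main(void)
	{
		printf("%d\n", decide(0, 0, 5, false, false));	/* COMPLETE */
		printf("%d\n", decide(0x4002, 0, 5, false, false)); /* COMPLETE, DNR */
		printf("%d\n", decide(0x0002, 0, 5, false, false)); /* RETRY */
		printf("%d\n", decide(0x0370, 0, 5, true, true));   /* FAILOVER */
		return 0;
	}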
Signed-off-by: Jens Axboe
---
 drivers/nvme/host/core.c | 73 ++--------------------------------------
 drivers/nvme/host/nvme.h | 72 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 74 insertions(+), 71 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4b7c009fccfe..ec7fa6f31e68 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -44,9 +44,10 @@ static unsigned char shutdown_timeout = 5;
 module_param(shutdown_timeout, byte, 0644);
 MODULE_PARM_DESC(shutdown_timeout, "timeout in seconds for controller shutdown");
 
-static u8 nvme_max_retries = 5;
+u8 nvme_max_retries = 5;
 module_param_named(max_retries, nvme_max_retries, byte, 0644);
 MODULE_PARM_DESC(max_retries, "max number of retries a command may have");
+EXPORT_SYMBOL_GPL(nvme_max_retries);
 
 static unsigned long default_ps_max_latency_us = 100000;
 module_param(default_ps_max_latency_us, ulong, 0644);
@@ -261,48 +262,6 @@ static void nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl)
 	nvme_put_ctrl(ctrl);
 }
 
-static blk_status_t nvme_error_status(u16 status)
-{
-	switch (status & 0x7ff) {
-	case NVME_SC_SUCCESS:
-		return BLK_STS_OK;
-	case NVME_SC_CAP_EXCEEDED:
-		return BLK_STS_NOSPC;
-	case NVME_SC_LBA_RANGE:
-	case NVME_SC_CMD_INTERRUPTED:
-	case NVME_SC_NS_NOT_READY:
-		return BLK_STS_TARGET;
-	case NVME_SC_BAD_ATTRIBUTES:
-	case NVME_SC_ONCS_NOT_SUPPORTED:
-	case NVME_SC_INVALID_OPCODE:
-	case NVME_SC_INVALID_FIELD:
-	case NVME_SC_INVALID_NS:
-		return BLK_STS_NOTSUPP;
-	case NVME_SC_WRITE_FAULT:
-	case NVME_SC_READ_ERROR:
-	case NVME_SC_UNWRITTEN_BLOCK:
-	case NVME_SC_ACCESS_DENIED:
-	case NVME_SC_READ_ONLY:
-	case NVME_SC_COMPARE_FAILED:
-		return BLK_STS_MEDIUM;
-	case NVME_SC_GUARD_CHECK:
-	case NVME_SC_APPTAG_CHECK:
-	case NVME_SC_REFTAG_CHECK:
-	case NVME_SC_INVALID_PI:
-		return BLK_STS_PROTECTION;
-	case NVME_SC_RESERVATION_CONFLICT:
-		return BLK_STS_NEXUS;
-	case NVME_SC_HOST_PATH_ERROR:
-		return BLK_STS_TRANSPORT;
-	case NVME_SC_ZONE_TOO_MANY_ACTIVE:
-		return BLK_STS_ZONE_ACTIVE_RESOURCE;
-	case NVME_SC_ZONE_TOO_MANY_OPEN:
-		return BLK_STS_ZONE_OPEN_RESOURCE;
-	default:
-		return BLK_STS_IOERR;
-	}
-}
-
 static void nvme_retry_req(struct request *req)
 {
 	unsigned long delay = 0;
@@ -318,34 +277,6 @@ static void nvme_retry_req(struct request *req)
 	blk_mq_delay_kick_requeue_list(req->q, delay);
 }
 
-enum nvme_disposition {
-	COMPLETE,
-	RETRY,
-	FAILOVER,
-};
-
-static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
-{
-	if (likely(nvme_req(req)->status == 0))
-		return COMPLETE;
-
-	if (blk_noretry_request(req) ||
-	    (nvme_req(req)->status & NVME_SC_DNR) ||
-	    nvme_req(req)->retries >= nvme_max_retries)
-		return COMPLETE;
-
-	if (req->cmd_flags & REQ_NVME_MPATH) {
-		if (nvme_is_path_error(nvme_req(req)->status) ||
-		    blk_queue_dying(req->q))
-			return FAILOVER;
-	} else {
-		if (blk_queue_dying(req->q))
-			return COMPLETE;
-	}
-
-	return RETRY;
-}
-
 static inline void nvme_end_req(struct request *req)
 {
 	blk_status_t status = nvme_error_status(nvme_req(req)->status);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index ed79a6c7e804..3d11b5cb478d 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -903,4 +903,76 @@ static inline bool nvme_multi_css(struct nvme_ctrl *ctrl)
 	return (ctrl->ctrl_config & NVME_CC_CSS_MASK) == NVME_CC_CSS_CSI;
 }
 
+static inline blk_status_t nvme_error_status(u16 status)
+{
+	switch (status & 0x7ff) {
+	case NVME_SC_SUCCESS:
+		return BLK_STS_OK;
+	case NVME_SC_CAP_EXCEEDED:
+		return BLK_STS_NOSPC;
+	case NVME_SC_LBA_RANGE:
+	case NVME_SC_CMD_INTERRUPTED:
+	case NVME_SC_NS_NOT_READY:
+		return BLK_STS_TARGET;
+	case NVME_SC_BAD_ATTRIBUTES:
+	case NVME_SC_ONCS_NOT_SUPPORTED:
+	case NVME_SC_INVALID_OPCODE:
+	case NVME_SC_INVALID_FIELD:
+	case NVME_SC_INVALID_NS:
+		return BLK_STS_NOTSUPP;
+	case NVME_SC_WRITE_FAULT:
+	case NVME_SC_READ_ERROR:
+	case NVME_SC_UNWRITTEN_BLOCK:
+	case NVME_SC_ACCESS_DENIED:
+	case NVME_SC_READ_ONLY:
+	case NVME_SC_COMPARE_FAILED:
+		return BLK_STS_MEDIUM;
+	case NVME_SC_GUARD_CHECK:
+	case NVME_SC_APPTAG_CHECK:
+	case NVME_SC_REFTAG_CHECK:
+	case NVME_SC_INVALID_PI:
+		return BLK_STS_PROTECTION;
+	case NVME_SC_RESERVATION_CONFLICT:
+		return BLK_STS_NEXUS;
+	case NVME_SC_HOST_PATH_ERROR:
+		return BLK_STS_TRANSPORT;
+	case NVME_SC_ZONE_TOO_MANY_ACTIVE:
+		return BLK_STS_ZONE_ACTIVE_RESOURCE;
+	case NVME_SC_ZONE_TOO_MANY_OPEN:
+		return BLK_STS_ZONE_OPEN_RESOURCE;
+	default:
+		return BLK_STS_IOERR;
+	}
+}
+
+enum nvme_disposition {
+	COMPLETE,
+	RETRY,
+	FAILOVER,
+};
+
+extern u8 nvme_max_retries;
+
+static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
+{
+	if (likely(nvme_req(req)->status == 0))
+		return COMPLETE;
+
+	if (blk_noretry_request(req) ||
+	    (nvme_req(req)->status & NVME_SC_DNR) ||
+	    nvme_req(req)->retries >= nvme_max_retries)
+		return COMPLETE;
+
+	if (req->cmd_flags & REQ_NVME_MPATH) {
+		if (nvme_is_path_error(nvme_req(req)->status) ||
+		    blk_queue_dying(req->q))
+			return FAILOVER;
+	} else {
+		if (blk_queue_dying(req->q))
+			return COMPLETE;
+	}
+
+	return RETRY;
+}
+
 #endif /* _NVME_H */
From patchwork Tue Oct 12 18:17:39 2021
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 12553475
From: Jens Axboe
To: linux-block@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 6/9] nvme: add support for batched completion of polled IO
Date: Tue, 12 Oct 2021 12:17:39 -0600
Message-Id: <20211012181742.672391-7-axboe@kernel.dk>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211012181742.672391-1-axboe@kernel.dk>
References: <20211012181742.672391-1-axboe@kernel.dk>

Take advantage of struct io_batch, if passed in to the nvme poll handler.
If it's set, rather than complete each request individually inline, store
them in the io_batch list. We only do so for requests that will complete
successfully; anything else will be completed inline as before.

Add an mq_ops->complete_batch() handler to do the post-processing of the
io_batch list once polling is complete.
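In outline, the CQE handler either completes a request inline (the slow and
special cases) or pushes it onto the batch list for the post-poll pass. A
compact user-space model of that split (stand-in types; the kernel decides
"special" via end_io and the disposition helper from the previous patch):

	#include <stdbool.h>
	#include <stdio.h>

	struct request {
		int tag;
		bool needs_inline;	/* end_io set, retry/failover, etc. */
		struct request *rq_next;
	};

	struct io_batch {
		struct request *req_list;
	};

	/* defer only the plain, successful completions to the batch */
	static void handle_cqe(struct io_batch *ib, struct request *rq)
	{
		if (!ib || rq->needs_inline) {
			printf("inline completion, tag %d\n", rq->tag);
			return;
		}
		rq->rq_next = ib->req_list;
		ib->req_list = rq;
	}

	int main(void)
	{
		struct io_batch ib = { .req_list = NULL };
		struct request a = { .tag = 1 };
		struct request b = { .tag = 2, .needs_inline = true };

		handle_cqe(&ib, &a);	/* batched */
		handle_cqe(&ib, &b);	/* completed inline */
		for (struct request *rq = ib.req_list; rq; rq = rq->rq_next)
			printf("batched completion, tag %d\n", rq->tag);
		return 0;
	}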
Signed-off-by: Jens Axboe
---
 drivers/nvme/host/pci.c | 69 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 63 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 4ad63bb9f415..4713da708cd4 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -959,7 +959,7 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return ret;
 }
 
-static void nvme_pci_complete_rq(struct request *req)
+static void nvme_pci_unmap_rq(struct request *req)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
 	struct nvme_dev *dev = iod->nvmeq->dev;
@@ -969,9 +969,34 @@ static void nvme_pci_complete_rq(struct request *req)
 			       rq_integrity_vec(req)->bv_len, rq_data_dir(req));
 	if (blk_rq_nr_phys_segments(req))
 		nvme_unmap_data(dev, req);
+}
+
+static void nvme_pci_complete_rq(struct request *req)
+{
+	nvme_pci_unmap_rq(req);
 	nvme_complete_rq(req);
 }
 
+static void nvme_pci_complete_batch(struct io_batch *ib)
+{
+	struct request *req;
+
+	req = ib->req_list;
+	while (req) {
+		nvme_pci_unmap_rq(req);
+		if (req->rq_flags & RQF_SPECIAL_PAYLOAD)
+			nvme_cleanup_cmd(req);
+		if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
+		    req_op(req) == REQ_OP_ZONE_APPEND)
+			req->__sector = nvme_lba_to_sect(req->q->queuedata,
+					le64_to_cpu(nvme_req(req)->result.u64));
+		req->status = nvme_error_status(nvme_req(req)->status);
+		req = req->rq_next;
+	}
+
+	blk_mq_end_request_batch(ib);
+}
+
 /* We read the CQE phase first to check if the rest of the entry is valid */
 static inline bool nvme_cqe_pending(struct nvme_queue *nvmeq)
 {
@@ -996,7 +1021,8 @@ static inline struct blk_mq_tags *nvme_queue_tagset(struct nvme_queue *nvmeq)
 	return nvmeq->dev->tagset.tags[nvmeq->qid - 1];
 }
 
-static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, u16 idx)
+static inline void nvme_handle_cqe(struct nvme_queue *nvmeq,
+				   struct io_batch *ib, u16 idx)
 {
 	struct nvme_completion *cqe = &nvmeq->cqes[idx];
 	__u16 command_id = READ_ONCE(cqe->command_id);
@@ -1023,8 +1049,17 @@ static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, u16 idx)
 	}
 
 	trace_nvme_sq(req, cqe->sq_head, nvmeq->sq_tail);
-	if (!nvme_try_complete_req(req, cqe->status, cqe->result))
-		nvme_pci_complete_rq(req);
+	if (!nvme_try_complete_req(req, cqe->status, cqe->result)) {
+		enum nvme_disposition ret;
+
+		ret = nvme_decide_disposition(req);
+		if (unlikely(!ib || req->end_io || ret != COMPLETE)) {
+			nvme_pci_complete_rq(req);
+		} else {
+			req->rq_next = ib->req_list;
+			ib->req_list = req;
+		}
+	}
 }
 
 static inline void nvme_update_cq_head(struct nvme_queue *nvmeq)
@@ -1050,7 +1085,7 @@ static inline int nvme_process_cq(struct nvme_queue *nvmeq)
 		 * the cqe requires a full read memory barrier
 		 */
 		dma_rmb();
-		nvme_handle_cqe(nvmeq, nvmeq->cq_head);
+		nvme_handle_cqe(nvmeq, NULL, nvmeq->cq_head);
 		nvme_update_cq_head(nvmeq);
 	}
 
@@ -1092,6 +1127,27 @@ static void nvme_poll_irqdisable(struct nvme_queue *nvmeq)
 	enable_irq(pci_irq_vector(pdev, nvmeq->cq_vector));
 }
 
+static inline int nvme_poll_cq(struct nvme_queue *nvmeq, struct io_batch *ib)
+{
+	int found = 0;
+
+	while (nvme_cqe_pending(nvmeq)) {
+		found++;
+		/*
+		 * load-load control dependency between phase and the rest of
+		 * the cqe requires a full read memory barrier
+		 */
+		dma_rmb();
+		nvme_handle_cqe(nvmeq, ib, nvmeq->cq_head);
+		nvme_update_cq_head(nvmeq);
+	}
+
+	if (found)
+		nvme_ring_cq_doorbell(nvmeq);
+	return found;
+}
+
+
 static int nvme_poll(struct blk_mq_hw_ctx *hctx, struct io_batch *ib)
 {
 	struct nvme_queue *nvmeq = hctx->driver_data;
@@ -1101,7 +1157,7 @@ static int nvme_poll(struct blk_mq_hw_ctx *hctx, struct io_batch *ib)
 		return 0;
 
 	spin_lock(&nvmeq->cq_poll_lock);
-	found = nvme_process_cq(nvmeq);
+	found = nvme_poll_cq(nvmeq, ib);
 	spin_unlock(&nvmeq->cq_poll_lock);
 
 	return found;
@@ -1639,6 +1695,7 @@ static const struct blk_mq_ops nvme_mq_admin_ops = {
 static const struct blk_mq_ops nvme_mq_ops = {
 	.queue_rq	= nvme_queue_rq,
 	.complete	= nvme_pci_complete_rq,
+	.complete_batch = nvme_pci_complete_batch,
 	.commit_rqs	= nvme_commit_rqs,
 	.init_hctx	= nvme_init_hctx,
 	.init_request	= nvme_init_request,
From patchwork Tue Oct 12 18:17:40 2021
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 12553469
From: Jens Axboe
To: linux-block@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 7/9] block: assign batch completion handler in blk_poll()
Date: Tue, 12 Oct 2021 12:17:40 -0600
Message-Id: <20211012181742.672391-8-axboe@kernel.dk>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211012181742.672391-1-axboe@kernel.dk>
References: <20211012181742.672391-1-axboe@kernel.dk>

If an io_batch is passed in to blk_poll(), we need to assign the batch
handler associated with this queue. This allows callers to complete the
io_batch by invoking its handler once polling is done.

Signed-off-by: Jens Axboe
---
 block/blk-mq.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9509c52a66a4..62fabc65d6b2 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -4107,6 +4107,19 @@ static int blk_mq_poll_classic(struct request_queue *q, blk_qc_t cookie,
 
 	hctx->poll_considered++;
 
+	/*
+	 * If batching is requested but the target doesn't support batched
+	 * completions, then just clear ib and completions will be handled
+	 * normally.
+	 */
+	if (ib) {
+		ib->complete = q->mq_ops->complete_batch;
+		if (!ib->complete) {
+			WARN_ON_ONCE(ib->req_list);
+			ib = NULL;
+		}
+	}
+
 	do {
 		hctx->poll_invoked++;
 
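From the caller's side, the contract this patch establishes is: if the
queue supplies a batch handler, use it; otherwise fall back to unbatched,
inline completions. A sketch of that fallback in user-space stand-ins (not
the block-layer API itself):

	#include <stddef.h>
	#include <stdio.h>

	struct request;

	struct io_batch {
		struct request *req_list;
		void (*complete)(struct io_batch *);
	};

	struct mq_ops {
		void (*complete_batch)(struct io_batch *);
	};

	/* mirror of the blk_poll() setup: clear ib if unsupported */
	static struct io_batch *batch_setup(struct io_batch *ib,
					    const struct mq_ops *ops)
	{
		if (ib) {
			ib->complete = ops->complete_batch;
			if (!ib->complete)
				ib = NULL;	/* complete inline instead */
		}
		return ib;
	}

	static void my_complete_batch(struct io_batch *ib)
	{
		ib->req_list = NULL;
	}

	int main(void)
	{
		struct io_batch ib = { .req_list = NULL };
		struct mq_ops with = { .complete_batch = my_complete_batch };
		struct mq_ops without = { .complete_batch = NULL };

		printf("%s\n", batch_setup(&ib, &with) ? "batched" : "unbatched");
		printf("%s\n", batch_setup(&ib, &without) ? "batched" : "unbatched");
		return 0;
	}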
From patchwork Tue Oct 12 18:17:41 2021
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 12553473
From: Jens Axboe
To: linux-block@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 8/9] io_uring: utilize the io_batch infrastructure for more efficient polled IO
Date: Tue, 12 Oct 2021 12:17:41 -0600
Message-Id: <20211012181742.672391-9-axboe@kernel.dk>
In-Reply-To: <20211012181742.672391-1-axboe@kernel.dk>
References: <20211012181742.672391-1-axboe@kernel.dk>

Wire up using an io_batch for f_op->iopoll(). If the lower stack supports
it, we can handle high rates of polled IO more efficiently. This raises
the single-core efficiency on my system from ~6.1M IOPS to ~6.6M IOPS
running a random read workload at queue depth 128 on two gen2 Optane
drives.

Signed-off-by: Jens Axboe
---
 fs/io_uring.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 082ff64c1bcb..cbf00ad3ac3f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2390,6 +2390,8 @@ static int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
 {
 	struct io_wq_work_node *pos, *start, *prev;
 	unsigned int poll_flags = BLK_POLL_NOSLEEP;
+	struct file *file = NULL;
+	struct io_batch ib;
 	int nr_events = 0;
 
 	/*
@@ -2399,11 +2401,17 @@ static int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
 	if (ctx->poll_multi_queue && force_nonspin)
 		poll_flags |= BLK_POLL_ONESHOT;
 
+	ib.req_list = NULL;
 	wq_list_for_each(pos, start, &ctx->iopoll_list) {
 		struct io_kiocb *req = container_of(pos, struct io_kiocb, comp_list);
 		struct kiocb *kiocb = &req->rw.kiocb;
 		int ret;
 
+		if (!file)
+			file = kiocb->ki_filp;
+		else if (file != kiocb->ki_filp)
+			break;
+
 		/*
 		 * Move completed and retryable entries to our local lists.
 		 * If we find a request that requires polling, break out
@@ -2412,19 +2420,21 @@ static int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
 		if (READ_ONCE(req->iopoll_completed))
 			break;
 
-		ret = kiocb->ki_filp->f_op->iopoll(kiocb, NULL, poll_flags);
+		ret = kiocb->ki_filp->f_op->iopoll(kiocb, &ib, poll_flags);
 		if (unlikely(ret < 0))
 			return ret;
 		else if (ret)
 			poll_flags |= BLK_POLL_ONESHOT;
 
 		/* iopoll may have completed current req */
-		if (READ_ONCE(req->iopoll_completed))
+		if (ib.req_list || READ_ONCE(req->iopoll_completed))
 			break;
 	}
 
-	if (!pos)
+	if (!pos && !ib.req_list)
 		return 0;
 
+	if (ib.req_list)
+		ib.complete(&ib);
 	prev = start;
 	wq_list_for_each_resume(pos, prev) {
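Abstracted from the diff above, the caller-side lifecycle of a batch is compact enough to restate on its own. This condensed sketch uses the names from the hunks but simplifies the control flow; note that batching is only attempted while all requests target the same file, which is why the loop tracks kiocb->ki_filp:

	struct io_batch ib;

	ib.req_list = NULL;			/* start with an empty batch */
	ret = file->f_op->iopoll(kiocb, &ib, poll_flags);
	if (ret < 0)
		return ret;
	if (ib.req_list)			/* driver chained completions onto it */
		ib.complete(&ib);		/* handler assigned down in blk_poll() */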
From patchwork Tue Oct 12 18:17:42 2021
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 12553477
From: Jens Axboe
To: linux-block@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 9/9] nvme: wire up completion batching for the IRQ path
Date: Tue, 12 Oct 2021 12:17:42 -0600
Message-Id: <20211012181742.672391-10-axboe@kernel.dk>
In-Reply-To: <20211012181742.672391-1-axboe@kernel.dk>
References: <20211012181742.672391-1-axboe@kernel.dk>

Trivial to do now: we just need our own io_batch on the stack, passed in
to the usual command completion handling. I pondered making this
dependent on how many entries we had to process, but even for a single
entry there's no discernible difference in performance or latency.

Running a sync workload over io_uring:

t/io_uring -b512 -d1 -s1 -c1 -p0 -F1 -B1 -n2 /dev/nvme1n1 /dev/nvme2n1

yields the following performance before the patch:

IOPS=254820, BW=124MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=251174, BW=122MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=250806, BW=122MiB/s, IOS/call=1/1, inflight=(1 1)

and the following after:

IOPS=255972, BW=124MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=251920, BW=123MiB/s, IOS/call=1/1, inflight=(1 1)
IOPS=251794, BW=122MiB/s, IOS/call=1/1, inflight=(1 1)

which is certainly not slower; within run-to-run variance it is the
same. For peak performance workloads, benchmarking shows a 2%
improvement.

Signed-off-by: Jens Axboe
---
 drivers/nvme/host/pci.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 4713da708cd4..fb3de6f68eb1 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1076,8 +1076,10 @@ static inline void nvme_update_cq_head(struct nvme_queue *nvmeq)
 
 static inline int nvme_process_cq(struct nvme_queue *nvmeq)
 {
+	struct io_batch ib;
 	int found = 0;
 
+	ib.req_list = NULL;
 	while (nvme_cqe_pending(nvmeq)) {
 		found++;
 		/*
@@ -1085,12 +1087,15 @@ static inline int nvme_process_cq(struct nvme_queue *nvmeq)
 		 * the cqe requires a full read memory barrier
 		 */
 		dma_rmb();
-		nvme_handle_cqe(nvmeq, NULL, nvmeq->cq_head);
+		nvme_handle_cqe(nvmeq, &ib, nvmeq->cq_head);
 		nvme_update_cq_head(nvmeq);
 	}
 
-	if (found)
+	if (found) {
+		if (ib.req_list)
+			nvme_pci_complete_batch(&ib);
 		nvme_ring_cq_doorbell(nvmeq);
+	}
 
 	return found;
 }
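The same on-stack pattern generalizes beyond NVMe: any IRQ or completion path can batch locally during one sweep of its completion queue and drain the batch directly, without routing through blk_poll(). A hypothetical sketch of that shape (all example_* names are placeholders, not kernel APIs):

	static irqreturn_t example_irq_handler(int irq, void *data)
	{
		struct example_queue *q = data;
		struct io_batch ib;
		int found = 0;

		ib.req_list = NULL;
		while (example_reap_completion(q, &ib))	/* chains finished reqs onto ib */
			found++;

		if (ib.req_list)
			example_complete_batch(&ib);	/* drain before re-arming */
		return found ? IRQ_HANDLED : IRQ_NONE;
	}

As in the nvme hunk above, the batch is drained before the completion queue is re-armed, so nothing can observe a reaped-but-uncompleted request.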