From patchwork Thu Jan 12 11:58:55 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nitesh Shetty
X-Patchwork-Id: 13097990
From: Nitesh Shetty
To: Jens Axboe, Alasdair Kergon, Mike Snitzer, dm-devel@redhat.com,
	Keith Busch, Christoph Hellwig, Sagi Grimberg, James Smart,
	Chaitanya Kulkarni, Alexander Viro
Cc: anuj20.g@samsung.com, joshi.k@samsung.com, p.raghav@samsung.com,
	nitheshshetty@gmail.com, gost.dev@samsung.com, Nitesh Shetty,
	Hannes Reinecke, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-fsdevel@vger.kernel.org
Subject: [PATCH v6 1/9] block: Introduce queue limits for copy-offload support
Date: Thu, 12 Jan 2023 17:28:55 +0530
Message-Id: <20230112115908.23662-2-nj.shetty@samsung.com>
X-Mailer: git-send-email 2.35.1.500.gb896f729e2
In-Reply-To: <20230112115908.23662-1-nj.shetty@samsung.com>
References: <20230112115908.23662-1-nj.shetty@samsung.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

Add device limits as sysfs entries:
	- copy_offload (RW)
	- copy_max_bytes (RW)
	- copy_max_bytes_hw (RO)

These limits help the block layer split the copy payload.
copy_offload: selects copy offload (1) or emulation (0).
copy_max_bytes: maximum total length of a copy in a single payload.
copy_max_bytes_hw: reflects the maximum limit supported by the device.

Reviewed-by: Hannes Reinecke
Signed-off-by: Nitesh Shetty
Signed-off-by: Kanchan Joshi
Signed-off-by: Anuj Gupta
---
 Documentation/ABI/stable/sysfs-block | 36 ++++++++++++++++
 block/blk-settings.c                 | 24 +++++++++++
 block/blk-sysfs.c                    | 64 ++++++++++++++++++++++++++++
 include/linux/blkdev.h               | 12 ++++++
 include/uapi/linux/fs.h              |  3 ++
 5 files changed, 139 insertions(+)

diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
index cd14ecb3c9a5..e0c9be009706 100644
--- a/Documentation/ABI/stable/sysfs-block
+++ b/Documentation/ABI/stable/sysfs-block
@@ -155,6 +155,42 @@ Description:
 		last zone of the device which may be smaller.
 
 
+What:		/sys/block//queue/copy_offload
+Date:		November 2022
+Contact:	linux-block@vger.kernel.org
+Description:
+		[RW] When read, this file shows whether offloading copy to
+		device is enabled (1) or disabled (0). Writing '0' to this
+		file will disable offloading copies for this device.
+		Writing '1' will enable this feature. If the device does
+		not support offloading, then writing '1' will result in an
+		error.
+
+
+What:		/sys/block//queue/copy_max_bytes
+Date:		November 2022
+Contact:	linux-block@vger.kernel.org
+Description:
+		[RW] While 'copy_max_bytes_hw' is the hardware limit for the
+		device, the 'copy_max_bytes' setting is the software limit.
+		Setting this value lower will make Linux issue smaller
+		copies from the block layer.
+
+
+What:		/sys/block//queue/copy_max_bytes_hw
+Date:		November 2022
+Contact:	linux-block@vger.kernel.org
+Description:
+		[RO] Devices that support offloading copy functionality may have
+		internal limits on the number of bytes that can be offloaded
+		in a single operation. The `copy_max_bytes_hw`
+		parameter is set by the device driver to the maximum number of
+		bytes that can be copied in a single operation. Copy
+		requests issued to the device must not exceed this limit.
+		A value of 0 means that the device does not
+		support copy offload.
+
+
 What:		/sys/block//queue/crypto/
 Date:		February 2022
 Contact:	linux-block@vger.kernel.org
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 0477c4d527fe..ca6f15a70fdc 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -58,6 +58,8 @@ void blk_set_default_limits(struct queue_limits *lim)
 	lim->zoned = BLK_ZONED_NONE;
 	lim->zone_write_granularity = 0;
 	lim->dma_alignment = 511;
+	lim->max_copy_sectors_hw = 0;
+	lim->max_copy_sectors = 0;
 }
 
 /**
@@ -81,6 +83,8 @@ void blk_set_stacking_limits(struct queue_limits *lim)
 	lim->max_dev_sectors = UINT_MAX;
 	lim->max_write_zeroes_sectors = UINT_MAX;
 	lim->max_zone_append_sectors = UINT_MAX;
+	lim->max_copy_sectors_hw = ULONG_MAX;
+	lim->max_copy_sectors = ULONG_MAX;
 }
 EXPORT_SYMBOL(blk_set_stacking_limits);
 
@@ -177,6 +181,22 @@ void blk_queue_max_discard_sectors(struct request_queue *q,
 }
 EXPORT_SYMBOL(blk_queue_max_discard_sectors);
 
+/**
+ * blk_queue_max_copy_sectors_hw - set max sectors for a single copy payload
+ * @q:  the request queue for the device
+ * @max_copy_sectors: maximum number of sectors to copy
+ **/
+void blk_queue_max_copy_sectors_hw(struct request_queue *q,
+		unsigned int max_copy_sectors)
+{
+	if (max_copy_sectors >= MAX_COPY_TOTAL_LENGTH)
+		max_copy_sectors = MAX_COPY_TOTAL_LENGTH;
+
+	q->limits.max_copy_sectors_hw = max_copy_sectors;
+	q->limits.max_copy_sectors = max_copy_sectors;
+}
+EXPORT_SYMBOL_GPL(blk_queue_max_copy_sectors_hw);
+
 /**
  * blk_queue_max_secure_erase_sectors - set max sectors for a secure erase
  * @q:  the request queue for the device
@@ -572,6 +592,10 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 	t->max_segment_size = min_not_zero(t->max_segment_size,
 					   b->max_segment_size);
 
+	t->max_copy_sectors = min(t->max_copy_sectors, b->max_copy_sectors);
+	t->max_copy_sectors_hw = min(t->max_copy_sectors_hw,
+						b->max_copy_sectors_hw);
+
 	t->misaligned |= b->misaligned;
 
 	alignment = queue_limit_alignment_offset(b, start);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 93d9e9c9a6ea..82a28a6c2e8a 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -212,6 +212,63 @@ static ssize_t queue_discard_zeroes_data_show(struct request_queue *q, char *pag
 	return queue_var_show(0, page);
 }
 
+static ssize_t queue_copy_offload_show(struct request_queue *q, char *page)
+{
+	return queue_var_show(blk_queue_copy(q), page);
+}
+
+static ssize_t queue_copy_offload_store(struct request_queue *q,
+				       const char *page, size_t count)
+{
+	s64 copy_offload;
+	ssize_t ret = queue_var_store64(&copy_offload, page);
+
+	if (ret < 0)
+		return ret;
+
+	if (copy_offload && !q->limits.max_copy_sectors_hw)
+		return -EINVAL;
+
+	if (copy_offload)
+		blk_queue_flag_set(QUEUE_FLAG_COPY, q);
+	else
+		blk_queue_flag_clear(QUEUE_FLAG_COPY, q);
+
+	return count;
+}
+
+static ssize_t queue_copy_max_hw_show(struct request_queue *q, char *page)
+{
+	return sprintf(page, "%llu\n", (unsigned long long)
+			q->limits.max_copy_sectors_hw << SECTOR_SHIFT);
+}
+
+static ssize_t queue_copy_max_show(struct request_queue *q, char *page)
+{
+	return sprintf(page, "%llu\n", (unsigned long long)
+			q->limits.max_copy_sectors << SECTOR_SHIFT);
+}
+
+static ssize_t queue_copy_max_store(struct request_queue *q,
+				       const char *page, size_t count)
+{
+	s64 max_copy;
+	ssize_t ret = queue_var_store64(&max_copy, page);
+
+	if (ret < 0)
+		return ret;
+
+	if (max_copy & (queue_logical_block_size(q) - 1))
+		return -EINVAL;
+
+	max_copy >>= SECTOR_SHIFT;
+	if (max_copy > q->limits.max_copy_sectors_hw)
+		max_copy = q->limits.max_copy_sectors_hw;
+
+	q->limits.max_copy_sectors = max_copy;
+	return count;
+}
+
 static ssize_t queue_write_same_max_show(struct request_queue *q, char *page)
 {
 	return queue_var_show(0, page);
@@ -604,6 +661,10 @@ QUEUE_RO_ENTRY(queue_nr_zones, "nr_zones");
 QUEUE_RO_ENTRY(queue_max_open_zones, "max_open_zones");
 QUEUE_RO_ENTRY(queue_max_active_zones, "max_active_zones");
 
+QUEUE_RW_ENTRY(queue_copy_offload, "copy_offload");
+QUEUE_RO_ENTRY(queue_copy_max_hw, "copy_max_bytes_hw");
+QUEUE_RW_ENTRY(queue_copy_max, "copy_max_bytes");
+
 QUEUE_RW_ENTRY(queue_nomerges, "nomerges");
 QUEUE_RW_ENTRY(queue_rq_affinity, "rq_affinity");
 QUEUE_RW_ENTRY(queue_poll, "io_poll");
@@ -651,6 +712,9 @@ static struct attribute *queue_attrs[] = {
 	&queue_discard_max_entry.attr,
 	&queue_discard_max_hw_entry.attr,
 	&queue_discard_zeroes_data_entry.attr,
+	&queue_copy_offload_entry.attr,
+	&queue_copy_max_hw_entry.attr,
+	&queue_copy_max_entry.attr,
 	&queue_write_same_max_entry.attr,
 	&queue_write_zeroes_max_entry.attr,
 	&queue_zone_append_max_entry.attr,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 43d4e073b111..807ffb5f715d 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -303,6 +303,9 @@ struct queue_limits {
 	unsigned int		discard_alignment;
 	unsigned int		zone_write_granularity;
 
+	unsigned long		max_copy_sectors_hw;
+	unsigned long		max_copy_sectors;
+
 	unsigned short		max_segments;
 	unsigned short		max_integrity_segments;
 	unsigned short		max_discard_segments;
@@ -571,6 +574,7 @@ struct request_queue {
 #define QUEUE_FLAG_NOWAIT       29	/* device supports NOWAIT */
 #define QUEUE_FLAG_SQ_SCHED     30	/* single queue style io dispatch */
 #define QUEUE_FLAG_SKIP_TAGSET_QUIESCE	31 /* quiesce_tagset skip the queue*/
+#define QUEUE_FLAG_COPY		32	/* supports copy offload */
 
 #define QUEUE_FLAG_MQ_DEFAULT	((1UL << QUEUE_FLAG_IO_STAT) |		\
 				 (1UL << QUEUE_FLAG_SAME_COMP) |	\
@@ -591,6 +595,7 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 	test_bit(QUEUE_FLAG_STABLE_WRITES, &(q)->queue_flags)
 #define blk_queue_io_stat(q)	test_bit(QUEUE_FLAG_IO_STAT, &(q)->queue_flags)
 #define blk_queue_add_random(q)	test_bit(QUEUE_FLAG_ADD_RANDOM, &(q)->queue_flags)
+#define blk_queue_copy(q)	test_bit(QUEUE_FLAG_COPY, &(q)->queue_flags)
 #define blk_queue_zone_resetall(q)	\
 	test_bit(QUEUE_FLAG_ZONE_RESETALL, &(q)->queue_flags)
 #define blk_queue_dax(q)	test_bit(QUEUE_FLAG_DAX, &(q)->queue_flags)
@@ -911,6 +916,8 @@ extern void blk_queue_chunk_sectors(struct request_queue *, unsigned int);
 extern void blk_queue_max_segments(struct request_queue *, unsigned short);
 extern void blk_queue_max_discard_segments(struct request_queue *,
 		unsigned short);
+extern void blk_queue_max_copy_sectors_hw(struct request_queue *q,
+		unsigned int max_copy_sectors);
 void blk_queue_max_secure_erase_sectors(struct request_queue *q,
 		unsigned int max_sectors);
 extern void blk_queue_max_segment_size(struct request_queue *, unsigned int);
@@ -1229,6 +1236,11 @@ static inline unsigned int bdev_discard_granularity(struct block_device *bdev)
 	return bdev_get_queue(bdev)->limits.discard_granularity;
 }
 
+static inline unsigned int bdev_max_copy_sectors(struct block_device *bdev)
+{
+	return bdev_get_queue(bdev)->limits.max_copy_sectors;
+}
+
 static inline unsigned int
 bdev_max_secure_erase_sectors(struct block_device *bdev)
 {
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index b7b56871029c..b3ad173f619c 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -64,6 +64,9 @@ struct fstrim_range {
 	__u64 minlen;
 };
 
+/* maximum total copy length */
+#define MAX_COPY_TOTAL_LENGTH	(1 << 27)
+
 /* extent-same (dedupe) ioctls; these MUST match the btrfs ioctl definitions */
 #define FILE_DEDUPE_RANGE_SAME		0
 #define FILE_DEDUPE_RANGE_DIFFERS	1

From patchwork Thu Jan 12 11:58:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nitesh Shetty
X-Patchwork-Id: 13097991
From: Nitesh Shetty
To: Jens Axboe, Alasdair Kergon, Mike Snitzer, dm-devel@redhat.com,
	Keith Busch, Christoph Hellwig, Sagi Grimberg, James Smart,
	Chaitanya Kulkarni, Alexander Viro
Cc: anuj20.g@samsung.com, joshi.k@samsung.com, p.raghav@samsung.com,
	nitheshshetty@gmail.com, gost.dev@samsung.com, Nitesh Shetty,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH v6 2/9] block: Add copy offload support infrastructure
Date: Thu, 12 Jan 2023 17:28:56 +0530
Message-Id: <20230112115908.23662-3-nj.shetty@samsung.com>
X-Mailer: git-send-email 2.35.1.500.gb896f729e2
In-Reply-To: <20230112115908.23662-1-nj.shetty@samsung.com>
References: <20230112115908.23662-1-nj.shetty@samsung.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

Introduce blkdev_issue_copy(), which takes source and destination bdevs
and an array of (source, destination, copy length) tuples.
Introduce the REQ_COPY copy offload operation flag.
Create a read-write bio pair with a token as payload and submit the pair
to the device in order. The read request populates the token with
source-specific information, which is then passed along with the write
request.
This design is courtesy of Mikulas Patocka's token-based copy.

Larger copies are divided based on the max_copy_sectors limit.

Signed-off-by: Nitesh Shetty
Signed-off-by: Anuj Gupta
---
 block/blk-lib.c           | 358 ++++++++++++++++++++++++++++++++++++++
 block/blk.h               |   2 +
 include/linux/blk_types.h |  44 +++++
 include/linux/blkdev.h    |   3 +
 include/uapi/linux/fs.h   |  15 ++
 5 files changed, 422 insertions(+)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index e59c3069e835..2ce3c872ca49 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -115,6 +115,364 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 }
 EXPORT_SYMBOL(blkdev_issue_discard);
 
+/*
+ * For synchronous copy offload/emulation, wait and process all in-flight BIOs.
+ * This must only be called once all bios have been issued so that the refcount
+ * can only decrease. This just waits for all bios to make it through
+ * bio_copy_*_write_end_io. IO errors are propagated through cio->io_err.
+ */
+static int cio_await_completion(struct cio *cio)
+{
+	int ret = 0;
+
+	atomic_dec(&cio->refcount);
+
+	if (cio->endio)
+		return 0;
+
+	if (atomic_read(&cio->refcount)) {
+		__set_current_state(TASK_UNINTERRUPTIBLE);
+		blk_io_schedule();
+	}
+
+	ret = cio->io_err;
+	kfree(cio);
+
+	return ret;
+}
+
+static void blk_copy_offload_write_end_io(struct bio *bio)
+{
+	struct copy_ctx *ctx = bio->bi_private;
+	struct cio *cio = ctx->cio;
+	sector_t clen;
+	int ri = ctx->range_idx;
+
+	if (bio->bi_status) {
+		cio->io_err = blk_status_to_errno(bio->bi_status);
+		clen = (bio->bi_iter.bi_sector << SECTOR_SHIFT) -
+			cio->ranges[ri].dst;
+		cio->ranges[ri].comp_len = min_t(sector_t, clen,
+				cio->ranges[ri].comp_len);
+	}
+	__free_page(bio->bi_io_vec[0].bv_page);
+	bio_put(bio);
+
+	if (atomic_dec_and_test(&ctx->refcount))
+		kfree(ctx);
+	if (atomic_dec_and_test(&cio->refcount)) {
+		if (cio->endio) {
+			cio->endio(cio->private, cio->io_err);
+			kfree(cio);
+		} else
+			blk_wake_io_task(cio->waiter);
+	}
+}
+
+static void blk_copy_offload_read_end_io(struct bio *read_bio)
+{
+	struct copy_ctx *ctx = read_bio->bi_private;
+	struct cio *cio = ctx->cio;
+	sector_t clen;
+	int ri = ctx->range_idx;
+	unsigned long flags;
+
+	if (read_bio->bi_status) {
+		cio->io_err = blk_status_to_errno(read_bio->bi_status);
+		goto err_rw_bio;
+	}
+
+	/* For a zoned device, check whether the completed bio is the first
+	 * entry in the linked list.
+	 * If yes, start the worker to submit write bios.
+	 * If not, just update the status of the bio in ctx;
+	 * once the worker gets scheduled, it will submit writes for all
+	 * the consecutive REQ_COPY_READ_COMPLETE bios.
+	 */
+	if (bdev_is_zoned(ctx->write_bio->bi_bdev)) {
+		spin_lock_irqsave(&cio->list_lock, flags);
+		ctx->status = REQ_COPY_READ_COMPLETE;
+		if (ctx == list_first_entry(&cio->list,
+					struct copy_ctx, list)) {
+			spin_unlock_irqrestore(&cio->list_lock, flags);
+			schedule_work(&ctx->dispatch_work);
+			goto free_read_bio;
+		}
+		spin_unlock_irqrestore(&cio->list_lock, flags);
+	} else
+		schedule_work(&ctx->dispatch_work);
+
+free_read_bio:
+	bio_put(read_bio);
+
+	return;
+
+err_rw_bio:
+	clen = (read_bio->bi_iter.bi_sector << SECTOR_SHIFT) -
+					cio->ranges[ri].src;
+	cio->ranges[ri].comp_len = min_t(sector_t, clen,
+					cio->ranges[ri].comp_len);
+	__free_page(read_bio->bi_io_vec[0].bv_page);
+	bio_put(ctx->write_bio);
+	bio_put(read_bio);
+	if (atomic_dec_and_test(&ctx->refcount))
+		kfree(ctx);
+	if (atomic_dec_and_test(&cio->refcount)) {
+		if (cio->endio) {
+			cio->endio(cio->private, cio->io_err);
+			kfree(cio);
+		} else
+			blk_wake_io_task(cio->waiter);
+	}
+}
+
+static void blk_copy_dispatch_work_fn(struct work_struct *work)
+{
+	struct copy_ctx *ctx = container_of(work, struct copy_ctx,
+			dispatch_work);
+
+	submit_bio(ctx->write_bio);
+}
+
+static void blk_zoned_copy_dispatch_work_fn(struct work_struct *work)
+{
+	struct copy_ctx *ctx = container_of(work, struct copy_ctx,
+			dispatch_work);
+	struct cio *cio = ctx->cio;
+	unsigned long flags = 0;
+
+	atomic_inc(&cio->refcount);
+	spin_lock_irqsave(&cio->list_lock, flags);
+
+	while (!list_empty(&cio->list)) {
+		ctx = list_first_entry(&cio->list, struct copy_ctx, list);
+
+		if (ctx->status == REQ_COPY_READ_PROGRESS)
+			break;
+
+		atomic_inc(&ctx->refcount);
+		ctx->status = REQ_COPY_WRITE_PROGRESS;
+		spin_unlock_irqrestore(&cio->list_lock, flags);
+		submit_bio(ctx->write_bio);
+		spin_lock_irqsave(&cio->list_lock, flags);
+
+		list_del(&ctx->list);
+		if (atomic_dec_and_test(&ctx->refcount))
+			kfree(ctx);
+	}
+
+	spin_unlock_irqrestore(&cio->list_lock, flags);
+	if (atomic_dec_and_test(&cio->refcount))
+		blk_wake_io_task(cio->waiter);
+}
+
+/*
+ * blk_copy_offload	- Use device's native copy offload feature.
+ * We perform the copy operation by sending 2 bios.
+ * 1. First we send a read bio with REQ_COPY flag along with a token and source
+ * and length. Once read bio reaches driver layer, device driver adds all the
+ * source info to token and does a fake completion.
+ * 2. Once read operation completes, we issue write with REQ_COPY flag with same
+ * token. In driver layer, token info is used to form a copy offload command.
+ *
+ * For conventional devices we submit write bio independently once read
+ * completes. For zoned devices, reads can complete out of order, so we
+ * maintain a linked list and submit writes in the order reads are submitted.
+ */
+static int blk_copy_offload(struct block_device *src_bdev,
+		struct block_device *dst_bdev, struct range_entry *ranges,
+		int nr, cio_iodone_t end_io, void *private, gfp_t gfp_mask)
+{
+	struct cio *cio;
+	struct copy_ctx *ctx;
+	struct bio *read_bio, *write_bio;
+	struct page *token;
+	sector_t src_blk, copy_len, dst_blk;
+	sector_t rem, max_copy_len;
+	int ri = 0, ret = 0;
+	unsigned long flags;
+
+	cio = kzalloc(sizeof(struct cio), GFP_KERNEL);
+	if (!cio)
+		return -ENOMEM;
+	cio->ranges = ranges;
+	atomic_set(&cio->refcount, 1);
+	cio->waiter = current;
+	cio->endio = end_io;
+	cio->private = private;
+	if (bdev_is_zoned(dst_bdev)) {
+		INIT_LIST_HEAD(&cio->list);
+		spin_lock_init(&cio->list_lock);
+	}
+
+	max_copy_len = min(bdev_max_copy_sectors(src_bdev),
+			bdev_max_copy_sectors(dst_bdev)) << SECTOR_SHIFT;
+
+	for (ri = 0; ri < nr; ri++) {
+		cio->ranges[ri].comp_len = ranges[ri].len;
+		src_blk = ranges[ri].src;
+		dst_blk = ranges[ri].dst;
+		for (rem = ranges[ri].len; rem > 0; rem -= copy_len) {
+			copy_len = min(rem, max_copy_len);
+
+			token = alloc_page(gfp_mask);
+			if (unlikely(!token)) {
+				ret = -ENOMEM;
+				goto err_token;
+			}
+
+			ctx = kzalloc(sizeof(struct copy_ctx), gfp_mask);
+			if (!ctx) {
+				ret = -ENOMEM;
+				goto err_ctx;
+			}
+			read_bio = bio_alloc(src_bdev, 1, REQ_OP_READ | REQ_COPY
+					| REQ_SYNC | REQ_NOMERGE, gfp_mask);
+			if (!read_bio) {
+				ret = -ENOMEM;
+				goto err_read_bio;
+			}
+			write_bio = bio_alloc(dst_bdev, 1, REQ_OP_WRITE
+					| REQ_COPY | REQ_SYNC | REQ_NOMERGE,
+					gfp_mask);
+			if (!write_bio) {
+				cio->io_err = -ENOMEM;
+				goto err_write_bio;
+			}
+
+			ctx->cio = cio;
+			ctx->range_idx = ri;
+			ctx->write_bio = write_bio;
+			atomic_set(&ctx->refcount, 1);
+
+			if (bdev_is_zoned(dst_bdev)) {
+				INIT_WORK(&ctx->dispatch_work,
+					blk_zoned_copy_dispatch_work_fn);
+				INIT_LIST_HEAD(&ctx->list);
+				spin_lock_irqsave(&cio->list_lock, flags);
+				ctx->status = REQ_COPY_READ_PROGRESS;
+				list_add_tail(&ctx->list, &cio->list);
+				spin_unlock_irqrestore(&cio->list_lock, flags);
+			} else
+				INIT_WORK(&ctx->dispatch_work,
+					blk_copy_dispatch_work_fn);
+
+			__bio_add_page(read_bio, token, PAGE_SIZE, 0);
+			read_bio->bi_iter.bi_size = copy_len;
+			read_bio->bi_iter.bi_sector = src_blk >> SECTOR_SHIFT;
+			read_bio->bi_end_io = blk_copy_offload_read_end_io;
+			read_bio->bi_private = ctx;
+
+			__bio_add_page(write_bio, token, PAGE_SIZE, 0);
+			write_bio->bi_iter.bi_size = copy_len;
+			write_bio->bi_end_io = blk_copy_offload_write_end_io;
+			write_bio->bi_iter.bi_sector = dst_blk >> SECTOR_SHIFT;
+			write_bio->bi_private = ctx;
+
+			atomic_inc(&cio->refcount);
+			submit_bio(read_bio);
+			src_blk += copy_len;
+			dst_blk += copy_len;
+		}
+	}
+
+	/* Wait for completion of all IOs */
+	return cio_await_completion(cio);
+
+err_write_bio:
+	bio_put(read_bio);
+err_read_bio:
+	kfree(ctx);
+err_ctx:
+	__free_page(token);
+err_token:
+	ranges[ri].comp_len = min_t(sector_t,
+			ranges[ri].comp_len, (ranges[ri].len - rem));
+
+	cio->io_err = ret;
+	return cio_await_completion(cio);
+}
+
+static inline int blk_copy_sanity_check(struct block_device *src_bdev,
+	struct block_device *dst_bdev, struct range_entry *ranges, int nr)
+{
+	unsigned int align_mask = max(bdev_logical_block_size(dst_bdev),
+					bdev_logical_block_size(src_bdev)) - 1;
+	sector_t len = 0;
+	int i;
+
+	if (!nr)
+		return -EINVAL;
+
+	if (nr >= MAX_COPY_NR_RANGE)
+		return -EINVAL;
+
+	if (bdev_read_only(dst_bdev))
+		return -EPERM;
+
+	for (i = 0; i < nr; i++) {
+		if (!ranges[i].len)
+			return -EINVAL;
+
+		len += ranges[i].len;
+		if ((ranges[i].dst & align_mask) ||
+				(ranges[i].src & align_mask) ||
+				(ranges[i].len & align_mask))
+			return -EINVAL;
+		ranges[i].comp_len = 0;
+	}
+
+	if (len && len >= MAX_COPY_TOTAL_LENGTH)
+		return -EINVAL;
+
+	return 0;
+}
+
+static inline bool blk_check_copy_offload(struct request_queue *src_q,
+		struct request_queue *dst_q)
+{
+	return blk_queue_copy(dst_q) && blk_queue_copy(src_q);
+}
+
+/*
+ * blkdev_issue_copy - queue a copy
+ * @src_bdev:	source block device
+ * @dst_bdev:	destination block device
+ * @ranges:	array of source/dest/len,
+ *		ranges are expected to be allocated/freed by caller
+ * @nr:		number of source ranges to copy
+ * @end_io:	end_io function to be called on completion of copy operation,
+ *		for synchronous operation this should be NULL
+ * @private:	end_io function will be called with this private data, should be
+ *		NULL, if operation is synchronous in nature
+ * @gfp_mask:   memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *	Copy source ranges from source block device to destination block
+ *	device. Length of a source range cannot be zero. Max total length of
+ *	copy is limited to MAX_COPY_TOTAL_LENGTH and also maximum number of
+ *	entries is limited to MAX_COPY_NR_RANGE.
+ */
+int blkdev_issue_copy(struct block_device *src_bdev,
+	struct block_device *dst_bdev, struct range_entry *ranges, int nr,
+	cio_iodone_t end_io, void *private, gfp_t gfp_mask)
+{
+	struct request_queue *src_q = bdev_get_queue(src_bdev);
+	struct request_queue *dst_q = bdev_get_queue(dst_bdev);
+	int ret = -EINVAL;
+
+	ret = blk_copy_sanity_check(src_bdev, dst_bdev, ranges, nr);
+	if (ret)
+		return ret;
+
+	if (blk_check_copy_offload(src_q, dst_q))
+		ret = blk_copy_offload(src_bdev, dst_bdev, ranges, nr,
+				end_io, private, gfp_mask);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(blkdev_issue_copy);
+
 static int __blkdev_issue_write_zeroes(struct block_device *bdev,
 		sector_t sector, sector_t nr_sects, gfp_t gfp_mask,
 		struct bio **biop, unsigned flags)
diff --git a/block/blk.h b/block/blk.h
index 4c3b3325219a..6d9924a7d559 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -304,6 +304,8 @@ static inline bool bio_may_exceed_limits(struct bio *bio,
 		break;
 	}
 
+	if (unlikely(op_is_copy(bio->bi_opf)))
+		return false;
 	/*
 	 * All drivers must accept single-segments bios that are <= PAGE_SIZE.
* This is a quick and dirty check that relies on the fact that diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 99be590f952f..de1638c87ecf 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -422,6 +422,7 @@ enum req_flag_bits { */ /* for REQ_OP_WRITE_ZEROES: */ __REQ_NOUNMAP, /* do not free blocks when zeroing */ + __REQ_COPY, /* copy request */ __REQ_NR_BITS, /* stops here */ }; @@ -451,6 +452,7 @@ enum req_flag_bits { #define REQ_DRV (__force blk_opf_t)(1ULL << __REQ_DRV) #define REQ_SWAP (__force blk_opf_t)(1ULL << __REQ_SWAP) +#define REQ_COPY ((__force blk_opf_t)(1ULL << __REQ_COPY)) #define REQ_FAILFAST_MASK \ (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER) @@ -477,6 +479,11 @@ static inline bool op_is_write(blk_opf_t op) return !!(op & (__force blk_opf_t)1); } +static inline bool op_is_copy(blk_opf_t op) +{ + return (op & REQ_COPY); +} + /* * Check if the bio or request is one that needs special treatment in the * flush state machine. @@ -536,4 +543,41 @@ struct blk_rq_stat { u64 batch; }; +typedef void (cio_iodone_t)(void *private, int status); + +struct cio { + struct range_entry *ranges; + struct task_struct *waiter; /* waiting task (NULL if none) */ + atomic_t refcount; + int io_err; + cio_iodone_t *endio; /* applicable for async operation */ + void *private; /* applicable for async operation */ + + /* For zoned device we maintain a linked list of IO submissions. + * This is to make sure we maintain the order of submissions. + * Otherwise some reads completing out of order, will submit writes not + * aligned with zone write pointer. 
+ */ + struct list_head list; + spinlock_t list_lock; +}; + +enum copy_io_status { + REQ_COPY_READ_PROGRESS, + REQ_COPY_READ_COMPLETE, + REQ_COPY_WRITE_PROGRESS, +}; + +struct copy_ctx { + struct cio *cio; + struct work_struct dispatch_work; + struct bio *write_bio; + atomic_t refcount; + int range_idx; /* used in error/partial completion */ + + /* For zoned device linked list is maintained. Along with state of IO */ + struct list_head list; + enum copy_io_status status; +}; + #endif /* __LINUX_BLK_TYPES_H */ diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 807ffb5f715d..48e9160b7195 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1063,6 +1063,9 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector, sector_t nr_sects, gfp_t gfp_mask, struct bio **biop); int blkdev_issue_secure_erase(struct block_device *bdev, sector_t sector, sector_t nr_sects, gfp_t gfp); +int blkdev_issue_copy(struct block_device *src_bdev, + struct block_device *dst_bdev, struct range_entry *ranges, + int nr, cio_iodone_t end_io, void *private, gfp_t gfp_mask); #define BLKDEV_ZERO_NOUNMAP (1 << 0) /* do not free blocks */ #define BLKDEV_ZERO_NOFALLBACK (1 << 1) /* don't write explicit zeroes */ diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h index b3ad173f619c..9248b6d259de 100644 --- a/include/uapi/linux/fs.h +++ b/include/uapi/linux/fs.h @@ -67,6 +67,21 @@ struct fstrim_range { /* maximum total copy length */ #define MAX_COPY_TOTAL_LENGTH (1 << 27) +/* Maximum no of entries supported */ +#define MAX_COPY_NR_RANGE (1 << 12) + +/* range entry for copy offload, all fields should be byte addressed */ +struct range_entry { + __u64 src; /* source to be copied */ + __u64 dst; /* destination */ + __u64 len; /* length in bytes to be copied */ + + /* length of data copy actually completed. 
This will be filled by + * kernel, once copy completes + */ + __u64 comp_len; +}; + /* extent-same (dedupe) ioctls; these MUST match the btrfs ioctl definitions */ #define FILE_DEDUPE_RANGE_SAME 0 #define FILE_DEDUPE_RANGE_DIFFERS 1 From patchwork Thu Jan 12 11:58:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nitesh Shetty X-Patchwork-Id: 13097992 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEC1AC54EBD for ; Thu, 12 Jan 2023 13:38:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233128AbjALNiG (ORCPT ); Thu, 12 Jan 2023 08:38:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40106 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233167AbjALNhW (ORCPT ); Thu, 12 Jan 2023 08:37:22 -0500 Received: from mailout2.samsung.com (mailout2.samsung.com [203.254.224.25]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4F9CD6340 for ; Thu, 12 Jan 2023 05:37:20 -0800 (PST) Received: from epcas5p3.samsung.com (unknown [182.195.41.41]) by mailout2.samsung.com (KnoxPortal) with ESMTP id 20230112133714epoutp025d19136666d3059c90b76df8e889cdf0~5kz7ojjLu2559025590epoutp02R for ; Thu, 12 Jan 2023 13:37:14 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 mailout2.samsung.com 20230112133714epoutp025d19136666d3059c90b76df8e889cdf0~5kz7ojjLu2559025590epoutp02R DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=samsung.com; s=mail20170921; t=1673530634; bh=fU5HjIdSnEaxtmpO3Wugku3TLnqNWog9cVAcaBP9h5c=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=rJuAX6fOqOH8z+AuX32Z+7ZWe/tlvI6L+HQKNHeqXM/9DplVDfnYV0g+ARaVAF3TU bpHCicQaCY5yxeUw0erjR5XHrHdueaLAC23Ar/0Gyx/7gd91mL4vmHntb542PlYsMn 
PQLGt6l3Ua6/S52hLRUopkFfjz08LoziYF4LsT8I= Received: from epsnrtp4.localdomain (unknown [182.195.42.165]) by epcas5p4.samsung.com (KnoxPortal) with ESMTP id 20230112133714epcas5p4f3a746e46b20ab75d8ad8ae51564fe33~5kz7D3ub_0230502305epcas5p4v; Thu, 12 Jan 2023 13:37:14 +0000 (GMT) Received: from epsmges5p1new.samsung.com (unknown [182.195.38.174]) by epsnrtp4.localdomain (Postfix) with ESMTP id 4Nt5FN4hfXz4x9Pv; Thu, 12 Jan 2023 13:37:12 +0000 (GMT) Received: from epcas5p3.samsung.com ( [182.195.41.41]) by epsmges5p1new.samsung.com (Symantec Messaging Gateway) with SMTP id 0A.D4.02301.80D00C36; Thu, 12 Jan 2023 22:37:12 +0900 (KST) Received: from epsmtrp2.samsung.com (unknown [182.195.40.14]) by epcas5p3.samsung.com (KnoxPortal) with ESMTPA id 20230112120054epcas5p3ec5887c4e1de59f7529dafca1cd6aa65~5jf0VhKwh1949219492epcas5p3H; Thu, 12 Jan 2023 12:00:54 +0000 (GMT) Received: from epsmgms1p2.samsung.com (unknown [182.195.42.42]) by epsmtrp2.samsung.com (KnoxPortal) with ESMTP id 20230112120054epsmtrp2579dd9ec74f61611bed4cb72ee905aea~5jf0TKh3E3008330083epsmtrp2S; Thu, 12 Jan 2023 12:00:54 +0000 (GMT) X-AuditID: b6c32a49-201ff700000108fd-1c-63c00d082ee9 Received: from epsmtip2.samsung.com ( [182.195.34.31]) by epsmgms1p2.samsung.com (Symantec Messaging Gateway) with SMTP id 82.33.10542.676FFB36; Thu, 12 Jan 2023 21:00:54 +0900 (KST) Received: from green5.sa.corp.samsungelectronics.net (unknown [107.110.206.5]) by epsmtip2.samsung.com (KnoxPortal) with ESMTPA id 20230112120051epsmtip20c3c0f8832f14bcc5d9336453a0310e3~5jfxjNjru0767707677epsmtip2X; Thu, 12 Jan 2023 12:00:51 +0000 (GMT) From: Nitesh Shetty To: Jens Axboe , Alasdair Kergon , Mike Snitzer , dm-devel@redhat.com, Keith Busch , Christoph Hellwig , Sagi Grimberg , James Smart , Chaitanya Kulkarni , Alexander Viro Cc: anuj20.g@samsung.com, joshi.k@samsung.com, p.raghav@samsung.com, nitheshshetty@gmail.com, gost.dev@samsung.com, Nitesh Shetty , Vincent Fu , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, 
linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org Subject: [PATCH v6 3/9] block: add emulation for copy Date: Thu, 12 Jan 2023 17:28:57 +0530 Message-Id: <20230112115908.23662-4-nj.shetty@samsung.com> X-Mailer: git-send-email 2.35.1.500.gb896f729e2 In-Reply-To: <20230112115908.23662-1-nj.shetty@samsung.com> MIME-Version: 1.0 X-CMS-MailID: 20230112120054epcas5p3ec5887c4e1de59f7529dafca1cd6aa65 X-Msg-Generator: CA X-Sendblock-Type: REQ_APPROVE CMS-TYPE: 105P DLP-Filter: Pass X-CFilter-Loop: Reflected X-CMS-RootMailID: 20230112120054epcas5p3ec5887c4e1de59f7529dafca1cd6aa65 References: <20230112115908.23662-1-nj.shetty@samsung.com> Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org
For devices that do not support copy offload, copy emulation is added. Copy emulation is implemented by reading from the source ranges into memory and writing to the corresponding destinations asynchronously. For zoned devices, we maintain a linked list of read submissions and try to submit the corresponding writes in the same order. Emulation is also used if copy offload fails or completes only partially.
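For illustration only, the read-into-memory-then-write idea described above can be sketched in userspace C. This is not the kernel implementation: it is synchronous, it uses pread()/pwrite() on file descriptors instead of bios, and struct demo_range, demo_alloc_buf() and demo_copy_emulate() are hypothetical names invented for this sketch. It only mirrors two ideas from the patch: retrying the bounce-buffer allocation at half the size on failure, and recording the completed length per range.

```c
#define _XOPEN_SOURCE 700
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical userspace stand-in for struct range_entry (byte addressed). */
struct demo_range {
	uint64_t src;      /* source offset */
	uint64_t dst;      /* destination offset */
	uint64_t len;      /* bytes requested */
	uint64_t comp_len; /* bytes actually copied */
};

/* Try to allocate req bytes; on failure retry with half the size. */
static void *demo_alloc_buf(size_t req, size_t *got)
{
	while (req > 0) {
		void *buf = malloc(req);

		if (buf) {
			*got = req;
			return buf;
		}
		req >>= 1;
	}
	return NULL;
}

/*
 * Copy every range through a temporary buffer, at most max_chunk bytes
 * per step. Returns 0 on success, -1 on the first failure; comp_len of
 * each range records how many bytes were copied before stopping.
 */
static int demo_copy_emulate(int src_fd, int dst_fd, struct demo_range *r,
			     int nr, size_t max_chunk)
{
	for (int i = 0; i < nr; i++) {
		uint64_t done = 0;

		r[i].comp_len = 0;
		while (done < r[i].len) {
			uint64_t left = r[i].len - done;
			size_t want = left < max_chunk ? (size_t)left : max_chunk;
			size_t got;
			ssize_t n;
			void *buf = demo_alloc_buf(want, &got);

			if (!buf)
				return -1;
			n = pread(src_fd, buf, got, (off_t)(r[i].src + done));
			if (n <= 0 ||
			    pwrite(dst_fd, buf, (size_t)n,
				   (off_t)(r[i].dst + done)) != n) {
				free(buf);
				return -1;
			}
			free(buf);
			done += (uint64_t)n;
			r[i].comp_len = done;
		}
	}
	return 0;
}
```

A caller would fill an array of demo_range entries, call demo_copy_emulate() with two open descriptors, and compare comp_len against len for each entry to detect a partial copy.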
Signed-off-by: Nitesh Shetty Signed-off-by: Vincent Fu Signed-off-by: Anuj Gupta --- block/blk-lib.c | 241 ++++++++++++++++++++++++++++++++++++++++- block/blk-map.c | 4 +- include/linux/blkdev.h | 3 + 3 files changed, 245 insertions(+), 3 deletions(-) diff --git a/block/blk-lib.c b/block/blk-lib.c index 2ce3c872ca49..43b1d0ef5732 100644 --- a/block/blk-lib.c +++ b/block/blk-lib.c @@ -428,6 +428,239 @@ static inline int blk_copy_sanity_check(struct block_device *src_bdev, return 0; } +static void *blk_alloc_buf(sector_t req_size, sector_t *alloc_size, + gfp_t gfp_mask) +{ + int min_size = PAGE_SIZE; + void *buf; + + while (req_size >= min_size) { + buf = kvmalloc(req_size, gfp_mask); + if (buf) { + *alloc_size = req_size; + return buf; + } + /* retry half the requested size */ + req_size >>= 1; + } + + return NULL; +} + +static void blk_copy_emulate_write_end_io(struct bio *bio) +{ + struct copy_ctx *ctx = bio->bi_private; + struct cio *cio = ctx->cio; + sector_t clen; + int ri = ctx->range_idx; + + if (bio->bi_status) { + cio->io_err = blk_status_to_errno(bio->bi_status); + clen = (bio->bi_iter.bi_sector << SECTOR_SHIFT) - + cio->ranges[ri].dst; + cio->ranges[ri].comp_len = min_t(sector_t, clen, + cio->ranges[ri].comp_len); + } + kvfree(page_address(bio->bi_io_vec[0].bv_page)); + bio_map_kern_endio(bio); + if (atomic_dec_and_test(&ctx->refcount)) + kfree(ctx); + if (atomic_dec_and_test(&cio->refcount)) { + if (cio->endio) { + cio->endio(cio->private, cio->io_err); + kfree(cio); + } else + blk_wake_io_task(cio->waiter); + } +} + +static void blk_copy_emulate_read_end_io(struct bio *read_bio) +{ + struct copy_ctx *ctx = read_bio->bi_private; + struct cio *cio = ctx->cio; + sector_t clen; + int ri = ctx->range_idx; + unsigned long flags; + + if (read_bio->bi_status) { + cio->io_err = blk_status_to_errno(read_bio->bi_status); + goto err_rw_bio; + } + + /* For zoned device, we check if completed bio is first entry in linked + * list, + * if yes, we start the worker to 
submit write bios. + * if not, then we just update status of bio in ctx, + * once the worker gets scheduled, it will submit writes for all + * the consecutive REQ_COPY_READ_COMPLETE bios. + */ + if (bdev_is_zoned(ctx->write_bio->bi_bdev)) { + spin_lock_irqsave(&cio->list_lock, flags); + ctx->status = REQ_COPY_READ_COMPLETE; + if (ctx == list_first_entry(&cio->list, + struct copy_ctx, list)) { + spin_unlock_irqrestore(&cio->list_lock, flags); + schedule_work(&ctx->dispatch_work); + goto free_read_bio; + } + spin_unlock_irqrestore(&cio->list_lock, flags); + } else + schedule_work(&ctx->dispatch_work); + +free_read_bio: + kfree(read_bio); + + return; + +err_rw_bio: + clen = (read_bio->bi_iter.bi_sector << SECTOR_SHIFT) - + cio->ranges[ri].src; + cio->ranges[ri].comp_len = min_t(sector_t, clen, + cio->ranges[ri].comp_len); + __free_page(read_bio->bi_io_vec[0].bv_page); + bio_map_kern_endio(read_bio); + if (atomic_dec_and_test(&ctx->refcount)) + kfree(ctx); + if (atomic_dec_and_test(&cio->refcount)) { + if (cio->endio) { + cio->endio(cio->private, cio->io_err); + kfree(cio); + } else + blk_wake_io_task(cio->waiter); + } +} + +/* + * If native copy offload feature is absent, this function tries to emulate, + * by copying data from source to a temporary buffer and from buffer to + * destination device. 
+ */ +static int blk_copy_emulate(struct block_device *src_bdev, + struct block_device *dst_bdev, struct range_entry *ranges, + int nr, cio_iodone_t end_io, void *private, gfp_t gfp_mask) +{ + struct request_queue *sq = bdev_get_queue(src_bdev); + struct request_queue *dq = bdev_get_queue(dst_bdev); + struct bio *read_bio, *write_bio; + void *buf = NULL; + struct copy_ctx *ctx; + struct cio *cio; + sector_t src, dst, offset, buf_len, req_len, rem = 0; + int ri = 0, ret = 0; + unsigned long flags; + sector_t max_src_hw_len = min_t(unsigned int, queue_max_hw_sectors(sq), + queue_max_segments(sq) << (PAGE_SHIFT - SECTOR_SHIFT)) + << SECTOR_SHIFT; + sector_t max_dst_hw_len = min_t(unsigned int, queue_max_hw_sectors(dq), + queue_max_segments(dq) << (PAGE_SHIFT - SECTOR_SHIFT)) + << SECTOR_SHIFT; + sector_t max_hw_len = min_t(unsigned int, + max_src_hw_len, max_dst_hw_len); + + cio = kzalloc(sizeof(struct cio), GFP_KERNEL); + if (!cio) + return -ENOMEM; + cio->ranges = ranges; + atomic_set(&cio->refcount, 1); + cio->waiter = current; + cio->endio = end_io; + cio->private = private; + + if (bdev_is_zoned(dst_bdev)) { + INIT_LIST_HEAD(&cio->list); + spin_lock_init(&cio->list_lock); + } + + for (ri = 0; ri < nr; ri++) { + offset = ranges[ri].comp_len; + src = ranges[ri].src + offset; + dst = ranges[ri].dst + offset; + /* If IO fails, we truncate comp_len */ + ranges[ri].comp_len = ranges[ri].len; + + for (rem = ranges[ri].len - offset; rem > 0; rem -= buf_len) { + req_len = min_t(int, max_hw_len, rem); + + buf = blk_alloc_buf(req_len, &buf_len, gfp_mask); + if (!buf) { + ret = -ENOMEM; + goto err_alloc_buf; + } + + ctx = kzalloc(sizeof(struct copy_ctx), gfp_mask); + if (!ctx) { + ret = -ENOMEM; + goto err_ctx; + } + + read_bio = bio_map_kern(sq, buf, buf_len, gfp_mask); + if (IS_ERR(read_bio)) { + ret = PTR_ERR(read_bio); + goto err_read_bio; + } + + write_bio = bio_map_kern(dq, buf, buf_len, gfp_mask); + if (IS_ERR(write_bio)) { + ret = PTR_ERR(write_bio); + goto 
err_write_bio; + } + + ctx->cio = cio; + ctx->range_idx = ri; + ctx->write_bio = write_bio; + atomic_set(&ctx->refcount, 1); + + read_bio->bi_iter.bi_sector = src >> SECTOR_SHIFT; + read_bio->bi_iter.bi_size = buf_len; + read_bio->bi_opf = REQ_OP_READ | REQ_SYNC; + bio_set_dev(read_bio, src_bdev); + read_bio->bi_end_io = blk_copy_emulate_read_end_io; + read_bio->bi_private = ctx; + + write_bio->bi_iter.bi_size = buf_len; + write_bio->bi_opf = REQ_OP_WRITE | REQ_SYNC; + bio_set_dev(write_bio, dst_bdev); + write_bio->bi_end_io = blk_copy_emulate_write_end_io; + write_bio->bi_iter.bi_sector = dst >> SECTOR_SHIFT; + write_bio->bi_private = ctx; + + if (bdev_is_zoned(dst_bdev)) { + INIT_WORK(&ctx->dispatch_work, + blk_zoned_copy_dispatch_work_fn); + INIT_LIST_HEAD(&ctx->list); + spin_lock_irqsave(&cio->list_lock, flags); + ctx->status = REQ_COPY_READ_PROGRESS; + list_add_tail(&ctx->list, &cio->list); + spin_unlock_irqrestore(&cio->list_lock, flags); + } else + INIT_WORK(&ctx->dispatch_work, + blk_copy_dispatch_work_fn); + + atomic_inc(&cio->refcount); + submit_bio(read_bio); + + src += buf_len; + dst += buf_len; + } + } + + /* Wait for completion of all IO's*/ + return cio_await_completion(cio); + +err_write_bio: + bio_put(read_bio); +err_read_bio: + kfree(ctx); +err_ctx: + kvfree(buf); +err_alloc_buf: + ranges[ri].comp_len -= min_t(sector_t, + ranges[ri].comp_len, (ranges[ri].len - rem)); + + cio->io_err = ret; + return cio_await_completion(cio); +} + static inline bool blk_check_copy_offload(struct request_queue *src_q, struct request_queue *dst_q) { @@ -460,15 +693,21 @@ int blkdev_issue_copy(struct block_device *src_bdev, struct request_queue *src_q = bdev_get_queue(src_bdev); struct request_queue *dst_q = bdev_get_queue(dst_bdev); int ret = -EINVAL; + bool offload = false; ret = blk_copy_sanity_check(src_bdev, dst_bdev, ranges, nr); if (ret) return ret; - if (blk_check_copy_offload(src_q, dst_q)) + offload = blk_check_copy_offload(src_q, dst_q); + if (offload) ret 
= blk_copy_offload(src_bdev, dst_bdev, ranges, nr, end_io, private, gfp_mask); + if (ret || !offload) + ret = blk_copy_emulate(src_bdev, dst_bdev, ranges, nr, + end_io, private, gfp_mask); + return ret; } EXPORT_SYMBOL_GPL(blkdev_issue_copy); diff --git a/block/blk-map.c b/block/blk-map.c index 19940c978c73..bcf8db2b75f1 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -363,7 +363,7 @@ static void bio_invalidate_vmalloc_pages(struct bio *bio) #endif } -static void bio_map_kern_endio(struct bio *bio) +void bio_map_kern_endio(struct bio *bio) { bio_invalidate_vmalloc_pages(bio); bio_uninit(bio); @@ -380,7 +380,7 @@ static void bio_map_kern_endio(struct bio *bio) * Map the kernel address into a bio suitable for io to a block * device. Returns an error pointer in case of error. */ -static struct bio *bio_map_kern(struct request_queue *q, void *data, +struct bio *bio_map_kern(struct request_queue *q, void *data, unsigned int len, gfp_t gfp_mask) { unsigned long kaddr = (unsigned long)data; diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 48e9160b7195..c5621550e5b4 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1066,6 +1066,9 @@ int blkdev_issue_secure_erase(struct block_device *bdev, sector_t sector, int blkdev_issue_copy(struct block_device *src_bdev, struct block_device *dst_bdev, struct range_entry *ranges, int nr, cio_iodone_t end_io, void *private, gfp_t gfp_mask); +struct bio *bio_map_kern(struct request_queue *q, void *data, unsigned int len, + gfp_t gfp_mask); +void bio_map_kern_endio(struct bio *bio); #define BLKDEV_ZERO_NOUNMAP (1 << 0) /* do not free blocks */ #define BLKDEV_ZERO_NOFALLBACK (1 << 1) /* don't write explicit zeroes */ From patchwork Thu Jan 12 11:58:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Nitesh Shetty X-Patchwork-Id: 13097994 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 61D29C677F1 for ; Thu, 12 Jan 2023 13:38:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234056AbjALNiP (ORCPT ); Thu, 12 Jan 2023 08:38:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40124 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233427AbjALNhW (ORCPT ); Thu, 12 Jan 2023 08:37:22 -0500 Received: from mailout3.samsung.com (mailout3.samsung.com [203.254.224.33]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D7A9B496F6 for ; Thu, 12 Jan 2023 05:37:20 -0800 (PST) Received: from epcas5p3.samsung.com (unknown [182.195.41.41]) by mailout3.samsung.com (KnoxPortal) with ESMTP id 20230112133719epoutp03be41ab238d1de231bd772cfd48337729~5kz-dzlBr1682216822epoutp03M for ; Thu, 12 Jan 2023 13:37:19 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 mailout3.samsung.com 20230112133719epoutp03be41ab238d1de231bd772cfd48337729~5kz-dzlBr1682216822epoutp03M DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=samsung.com; s=mail20170921; t=1673530639; bh=03mtfZ0KyaknfOoQAQSi8hDKEjbikutk8J2fRbEhzmM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Wu/RuSKWLyD4M6PLltsxBZMfrqP7MyiEAS7C3gqYh/Rj5ZvJkadjDEolBz1L/iJCK 01sgsVczntc3ft61T8P1H+bOO0qvdxMxCiA4K4jmW7x6s7Wrh8eeu1XLb8nX69EDXP x5nkUHcW86sXpaydBVhYIzvyyHuI04KpoanCQLOA= Received: from epsnrtp3.localdomain (unknown [182.195.42.164]) by epcas5p4.samsung.com (KnoxPortal) with ESMTP id 20230112133718epcas5p437bad450396846e909113f3d0d882a48~5kz_0tOX52984229842epcas5p4k; Thu, 12 Jan 2023 13:37:18 +0000 (GMT) Received: from epsmges5p3new.samsung.com (unknown [182.195.38.181]) by epsnrtp3.localdomain (Postfix) with ESMTP id 4Nt5FS5GFhz4x9Px; Thu, 12 Jan 2023 13:37:16 +0000 (GMT) Received: from epcas5p1.samsung.com ( [182.195.41.39]) by epsmges5p3new.samsung.com 
(Symantec Messaging Gateway) with SMTP id 21.66.03362.C0D00C36; Thu, 12 Jan 2023 22:37:16 +0900 (KST) Received: from epsmtrp2.samsung.com (unknown [182.195.40.14]) by epcas5p4.samsung.com (KnoxPortal) with ESMTPA id 20230112120131epcas5p4374e6add89990dd546bd0ae38f4386f0~5jgWZWF5v0983009830epcas5p4V; Thu, 12 Jan 2023 12:01:31 +0000 (GMT) Received: from epsmgms1p2.samsung.com (unknown [182.195.42.42]) by epsmtrp2.samsung.com (KnoxPortal) with ESMTP id 20230112120131epsmtrp2e60d5d391c6f729215e19237dc753b83~5jgWT-6i73008330083epsmtrp2h; Thu, 12 Jan 2023 12:01:31 +0000 (GMT) X-AuditID: b6c32a4b-4e5fa70000010d22-d9-63c00d0c30b5 Received: from epsmtip2.samsung.com ( [182.195.34.31]) by epsmgms1p2.samsung.com (Symantec Messaging Gateway) with SMTP id 89.33.10542.A96FFB36; Thu, 12 Jan 2023 21:01:31 +0900 (KST) Received: from green5.sa.corp.samsungelectronics.net (unknown [107.110.206.5]) by epsmtip2.samsung.com (KnoxPortal) with ESMTPA id 20230112120128epsmtip2f6dd2b1d2e6ed1351ebe40f693614486~5jgTeUo3b0767707677epsmtip2c; Thu, 12 Jan 2023 12:01:27 +0000 (GMT) From: Nitesh Shetty To: Jens Axboe , Alasdair Kergon , Mike Snitzer , dm-devel@redhat.com, Keith Busch , Christoph Hellwig , Sagi Grimberg , James Smart , Chaitanya Kulkarni , Alexander Viro Cc: anuj20.g@samsung.com, joshi.k@samsung.com, p.raghav@samsung.com, nitheshshetty@gmail.com, gost.dev@samsung.com, Nitesh Shetty , Hannes Reinecke , Javier González , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org Subject: [PATCH v6 4/9] block: Introduce a new ioctl for copy Date: Thu, 12 Jan 2023 17:28:58 +0530 Message-Id: <20230112115908.23662-5-nj.shetty@samsung.com> X-Mailer: git-send-email 2.35.1.500.gb896f729e2 In-Reply-To: <20230112115908.23662-1-nj.shetty@samsung.com> MIME-Version: 1.0 X-CMS-MailID: 20230112120131epcas5p4374e6add89990dd546bd0ae38f4386f0 X-Msg-Generator: CA X-Sendblock-Type: REQ_APPROVE CMS-TYPE: 105P DLP-Filter: Pass X-CFilter-Loop: Reflected X-CMS-RootMailID: 20230112120131epcas5p4374e6add89990dd546bd0ae38f4386f0 References: <20230112115908.23662-1-nj.shetty@samsung.com> Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org
Add a new BLKCOPY ioctl that offloads copying of one or more source ranges to one or more destinations within a device. The BLKCOPY ioctl accepts a 'copy_range' structure that contains the number of ranges and a reserved field, followed by an array of ranges. Each range is represented by a 'range_entry' that contains the source start offset, the destination start offset and the length of the range (all in bytes). MAX_COPY_NR_RANGE limits the number of entries the ioctl accepts, and MAX_COPY_TOTAL_LENGTH limits the total copy length the ioctl can handle.
Example code to issue BLKCOPY:

/* Sample example to copy three entries with [dest,src,len],
 * [32768, 0, 4096] [36864, 4096, 4096] [40960,8192,4096] on same device
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ioctl.h>
/* needs the uapi header from this series for BLKCOPY and struct copy_range */
#include <linux/fs.h>

int main(void)
{
	int i, ret, fd;
	unsigned long src = 0, dst = 32768, len = 4096;
	struct copy_range *cr;

	cr = malloc(sizeof(*cr) + (sizeof(struct range_entry) * 3));
	if (!cr)
		return 1;

	cr->nr_range = 3;
	cr->reserved = 0;
	for (i = 0; i < cr->nr_range; i++, src += len, dst += len) {
		cr->ranges[i].dst = dst;
		cr->ranges[i].src = src;
		cr->ranges[i].len = len;
		cr->ranges[i].comp_len = 0;
	}

	fd = open("/dev/nvme0n1", O_RDWR);
	if (fd < 0) {
		free(cr);
		return 1;
	}

	ret = ioctl(fd, BLKCOPY, cr);
	if (ret != 0)
		printf("copy failed, ret= %d\n", ret);

	/* comp_len is filled in by the kernel once the copy completes */
	for (i = 0; i < cr->nr_range; i++)
		if (cr->ranges[i].len != cr->ranges[i].comp_len)
			printf("Partial copy for entry %d: requested %llu, completed %llu\n",
				i, (unsigned long long)cr->ranges[i].len,
				(unsigned long long)cr->ranges[i].comp_len);

	close(fd);
	free(cr);
	return ret;
}

Reviewed-by: Hannes Reinecke Signed-off-by: Nitesh Shetty Signed-off-by: Javier González Signed-off-by: Anuj Gupta --- block/ioctl.c | 36 ++++++++++++++++++++++++++++++++++++ include/uapi/linux/fs.h | 9 +++++++++ 2 files changed, 45 insertions(+) diff --git a/block/ioctl.c b/block/ioctl.c index 96617512982e..d636bc1f0047 100644 --- a/block/ioctl.c +++ b/block/ioctl.c @@ -120,6 +120,40 @@ static int blk_ioctl_discard(struct block_device *bdev, fmode_t mode, return err; } +static int blk_ioctl_copy(struct block_device *bdev, fmode_t mode, + unsigned long arg) +{ + struct copy_range ucopy_range, *kcopy_range = NULL; + size_t payload_size = 0; + int ret; + + if (!(mode & FMODE_WRITE)) + return -EBADF; + + if (copy_from_user(&ucopy_range, (void __user *)arg, + sizeof(ucopy_range))) + return -EFAULT; + + if (unlikely(!ucopy_range.nr_range || ucopy_range.reserved || + ucopy_range.nr_range >= MAX_COPY_NR_RANGE)) + return -EINVAL; + + payload_size = (ucopy_range.nr_range * sizeof(struct range_entry)) + + sizeof(ucopy_range); + + kcopy_range = memdup_user((void __user
*)arg, payload_size); + if (IS_ERR(kcopy_range)) + return PTR_ERR(kcopy_range); + + ret = blkdev_issue_copy(bdev, bdev, kcopy_range->ranges, + kcopy_range->nr_range, NULL, NULL, GFP_KERNEL); + if (copy_to_user((void __user *)arg, kcopy_range, payload_size)) + ret = -EFAULT; + + kfree(kcopy_range); + return ret; +} + static int blk_ioctl_secure_erase(struct block_device *bdev, fmode_t mode, void __user *argp) { @@ -482,6 +516,8 @@ static int blkdev_common_ioctl(struct file *file, fmode_t mode, unsigned cmd, return blk_ioctl_discard(bdev, mode, arg); case BLKSECDISCARD: return blk_ioctl_secure_erase(bdev, mode, argp); + case BLKCOPY: + return blk_ioctl_copy(bdev, mode, arg); case BLKZEROOUT: return blk_ioctl_zeroout(bdev, mode, arg); case BLKGETDISKSEQ: diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h index 9248b6d259de..8af10b926a6f 100644 --- a/include/uapi/linux/fs.h +++ b/include/uapi/linux/fs.h @@ -82,6 +82,14 @@ struct range_entry { __u64 comp_len; }; +struct copy_range { + __u64 nr_range; + __u64 reserved; + + /* Ranges always must be at the end */ + struct range_entry ranges[]; +}; + /* extent-same (dedupe) ioctls; these MUST match the btrfs ioctl definitions */ #define FILE_DEDUPE_RANGE_SAME 0 #define FILE_DEDUPE_RANGE_DIFFERS 1 @@ -203,6 +211,7 @@ struct fsxattr { #define BLKROTATIONAL _IO(0x12,126) #define BLKZEROOUT _IO(0x12,127) #define BLKGETDISKSEQ _IOR(0x12,128,__u64) +#define BLKCOPY _IOWR(0x12, 129, struct copy_range) /* * A jump here: 130-136 are reserved for zoned block devices * (see uapi/linux/blkzoned.h) From patchwork Thu Jan 12 11:58:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Nitesh Shetty X-Patchwork-Id: 13097993 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C5E8BC61DB3 
for ; Thu, 12 Jan 2023 13:38:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234031AbjALNiN (ORCPT ); Thu, 12 Jan 2023 08:38:13 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39994 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233517AbjALNh3 (ORCPT ); Thu, 12 Jan 2023 08:37:29 -0500 Received: from mailout3.samsung.com (mailout3.samsung.com [203.254.224.33]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 71D0248CD0 for ; Thu, 12 Jan 2023 05:37:25 -0800 (PST) Received: from epcas5p3.samsung.com (unknown [182.195.41.41]) by mailout3.samsung.com (KnoxPortal) with ESMTP id 20230112133724epoutp035bed86f78755c498197e60f266d3dd36~5k0EEou5B2187321873epoutp03G for ; Thu, 12 Jan 2023 13:37:23 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 mailout3.samsung.com 20230112133724epoutp035bed86f78755c498197e60f266d3dd36~5k0EEou5B2187321873epoutp03G DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=samsung.com; s=mail20170921; t=1673530644; bh=67BJ3YmkgeWMg/JuN/LPXCpYf+4f34A9tqoqFxx6Vt0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=mOQZ4ZxUhkDWus0Cp+1gyfAmqEUFd5AyVdvoymdDVRFChYFata0BqvGNPXnvF7D04 9EpFKDGAI4rCnTVRaXZOwAK2YZtUOLxCmFwngT47M1qPmIKahgsAqzh9M1WjB4TZyd tDY4klSWDqrjJq5j1nMmAxy8YpO3OV8davbTSQK0= Received: from epsnrtp1.localdomain (unknown [182.195.42.162]) by epcas5p2.samsung.com (KnoxPortal) with ESMTP id 20230112133722epcas5p2c5185164963d166d345a81918057d7fe~5k0CwXVaU0973809738epcas5p2Q; Thu, 12 Jan 2023 13:37:22 +0000 (GMT) Received: from epsmges5p3new.samsung.com (unknown [182.195.38.178]) by epsnrtp1.localdomain (Postfix) with ESMTP id 4Nt5FY0K3Cz4x9Pr; Thu, 12 Jan 2023 13:37:21 +0000 (GMT) Received: from epcas5p2.samsung.com ( [182.195.41.40]) by epsmges5p3new.samsung.com (Symantec Messaging Gateway) with SMTP id B2.66.03362.01D00C36; Thu, 12 Jan 2023 22:37:20 +0900 (KST) Received: from epsmtrp2.samsung.com (unknown [182.195.40.14]) by 
epcas5p1.samsung.com (KnoxPortal) with ESMTPA id 20230112120151epcas5p1e7c3ec0c7bd0869b9cf0bea64d65991a~5jgpbcrov1811218112epcas5p11; Thu, 12 Jan 2023 12:01:51 +0000 (GMT) Received: from epsmgms1p1new.samsung.com (unknown [182.195.42.41]) by epsmtrp2.samsung.com (KnoxPortal) with ESMTP id 20230112120151epsmtrp2a8d33c190d8014e7efc890f5f97403f8~5jgpaaZqK3008130081epsmtrp2w; Thu, 12 Jan 2023 12:01:51 +0000 (GMT) X-AuditID: b6c32a4b-4e5fa70000010d22-df-63c00d10613c Received: from epsmtip2.samsung.com ( [182.195.34.31]) by epsmgms1p1new.samsung.com (Symantec Messaging Gateway) with SMTP id CC.B5.02211.FA6FFB36; Thu, 12 Jan 2023 21:01:51 +0900 (KST) Received: from green5.sa.corp.samsungelectronics.net (unknown [107.110.206.5]) by epsmtip2.samsung.com (KnoxPortal) with ESMTPA id 20230112120148epsmtip20a0c64f9dd5e23792d1761512a024e5f~5jgmoYBDX0887508875epsmtip2M; Thu, 12 Jan 2023 12:01:48 +0000 (GMT) From: Nitesh Shetty To: Jens Axboe , Alasdair Kergon , Mike Snitzer , dm-devel@redhat.com, Keith Busch , Christoph Hellwig , Sagi Grimberg , James Smart , Chaitanya Kulkarni , Alexander Viro Cc: anuj20.g@samsung.com, joshi.k@samsung.com, p.raghav@samsung.com, nitheshshetty@gmail.com, gost.dev@samsung.com, Nitesh Shetty , =?utf-8?q?Javier_Gonz=C3=A1lez?= , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org Subject: [PATCH v6 5/9] nvme: add copy offload support Date: Thu, 12 Jan 2023 17:28:59 +0530 Message-Id: <20230112115908.23662-6-nj.shetty@samsung.com> X-Mailer: git-send-email 2.35.1.500.gb896f729e2 In-Reply-To: <20230112115908.23662-1-nj.shetty@samsung.com> MIME-Version: 1.0 X-Brightmail-Tracker: H4sIAAAAAAAAA01Te0xTVxzeube9LY7qFXA7oNNas/EKpYW2O6AwExm7m8skYxu6xdCm3AGh tLUPmPwjDGSzKK9NGYXxCOAG5RErGF4VguOphCUMNlCGDhgwIygd8IdT1nJh87/vfL/v+73O OVzczU54cRPVBlqnVqgExA7WjVu+PgEkr1spKigLQU1DfTj6Mv8ZjixTeQSyLZWw0UR3G4Zq Lb0Y6qh8gqGZe3YO6t14RKDCnnGA5sbMGLJN+qNO2yALjbaXEqj86hwHtc5mAHTjaTmO7DVZ 
HNT4cJmFBib3opFn/eyjeyjz9DBBtZmnONTI79dY1OiwkbLWXSCo69XnqI6JdIK6lLlEUMs3 xwgqt7kOUHbrfso6+wiLcv006UgCrYijdXxardTEJarjwwTHo2OPxUplInGAOAS9KeCrFcl0 mCDi/aiAyESVY1YBP0WhMjqoKIVeLwgMP6LTGA00P0GjN4QJaG2cSivRCvWKZL1RHS9U04ZQ sUgUJHUI5UkJK3/WYFpT6hdz6cOcdNAoNwEXLiQlsHj+G44Tu5EdABZ8TzN4BUCTSWICOxx4 DcCfSh7g24aN9mGcCdgArFqwcphDJgZrx0sIE+ByCdIf3t7gOg0e5D0Mtt0+5NTg5F0M1pdP sZ0Bd1IGJxfXNrOyyNdhl+0K4cQ8MhT2LS/hzjyQDIR507udtAt5GNb/9jeLkeyGg8Wzmxgn D8DMlpLNhiBZ5AKHVvoJptMIWFRUCBjsDv/qb+Yw2Aval2xbmlRY++2PBGPOAtD8q3nL8BY8 P5S32QRO+sKm9kCGfg1eHmrEmMI74aWnsxjD82Br2TY+BOubKrbye8Lx9QyCmYWCNfNsZle5 AFpmLrLzAd/8wjzmF+Yx/1+5AuB1wJPW6pPjab1UG6ymU/+7Y6Um2Qo2X7/f8Vbwx/3Hwh6A cUEPgFxc4MHr7O1SuvHiFGfTaJ0mVmdU0foeIHXsuwD32qPUOL6P2hArloSIJDKZTBISLBML XuXRLeVKNzJeYaCTaFpL67Z9GNfFKx07+cE+y9H3itMuVLc2n79PuLpaBvm9A+GXCeLr8EhW DPY4BwzF7LXmX7OHRD1Pe3hOdKZY8rFcikVQQYJSuU9rrDn4gTLYq2P0pTO+3Z25d6pU9q+y A97wbpad0O0y0KNV6xXRQWme09mr6rLu+DXLwMric95CSrbo4EWhjhJ2sQ8nncoo/aTrYkrl mP+t1Y98l7w9Dp7ymKk+4O8nv6NUNnvLWrpEtEtTZegrq1mfR7PW5dcLrNLTmrO/LIoKQ8fc 7T47I690B+7/ruXExDtXs+Da3bAf3s4xuz3Z5Rsx0vdPnfHdl2+KT3/4WUNvnTAGLNT+nENG ezScbPDcN59LiNIELH2CQuyH6/SKfwHjWb65hgQAAA== X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFjrEIsWRmVeSWpSXmKPExsWy7bCSvO76b/uTDe7/l7VYf+oYs0XThL/M Fqvv9rNZ7H03m9Xi5oGdTBYrVx9lsti98COTxeM7n9ktjv5/y2Yx6dA1RounV2cxWey9pW2x Z+9JFovLu+awWcxf9pTdYseTRkaLbb/nM1t8XtrCbrHu9XsWixO3pC3O/z3O6iDqMev+WTaP nbPusnucv7eRxePy2VKPTas62Tw2L6n32H2zgc2jt/kdm8f7fVfZPPq2rGL0+LxJzmPTk7dM ATxRXDYpqTmZZalF+nYJXBmfni1lKugqr3jacJa9gXFdQhcjJ4eEgInE/11nmbsYuTiEBHYz Sqx5fp0ZIiEpsezvEShbWGLlv+fsEEWNTBKNF7tZuxg5ONgEtCVO/+cAiYsIPGOSOHvvEdgk ZoGnTBI9N/6ygHQLC5hJ3Hr5DWwSi4CqxP6909hAbF4BK4lj798xgwySENCX6L8vCBLmFLCW WHPjC1irEFDJrD1XmSDKBSVOznzCAlLOLKAusX6eEEiYWUBeonnrbOYJjIKzkFTNQqiahaRq ASPzKkbJ1ILi3PTcYsMCw7zUcr3ixNzi0rx0veT83E2M4GjW0tzBuH3VB71DjEwcjIcYJTiY lUR49xzdnyzEm5JYWZValB9fVJqTWnyIUZqDRUmc90LXyXghgfTEktTs1NSC1CKYLBMHp1QD U22iwIyYA0/uTuxh6WjpdvoQnpfVotw29e0J903tcj+9r1hYT3bvvbX+zen/pdc6Xy9UMtHZ 
GTbF616KoGlNboxt1Teb03YbrlQ88tXUySxwWSpyKvV8UfH6d82i87d+eGi0TDE6+2vER4WL upLCLUmvQyas6Gfu0XHya7saIxDy98G7xy6iP3dd/tj5xifvdONpQZmQr072HVynykPNdl67 53s368HR7yfcttVGBNev0Xjy98KbT7derN24vs3yi1/K9NU6j12OswqVRc+5/6rmtNpfh/s8 Fzc+YdpuXF3KOPd8R8Hblx0xnKUrT35VEt3df2tl746+oDVTZ8snVXyQq+ZbZBe10muflcOH D0osxRmJhlrMRcWJAJax+RVVAwAA X-CMS-MailID: 20230112120151epcas5p1e7c3ec0c7bd0869b9cf0bea64d65991a X-Msg-Generator: CA X-Sendblock-Type: REQ_APPROVE CMS-TYPE: 105P DLP-Filter: Pass X-CFilter-Loop: Reflected X-CMS-RootMailID: 20230112120151epcas5p1e7c3ec0c7bd0869b9cf0bea64d65991a References: <20230112115908.23662-1-nj.shetty@samsung.com> Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org For device supporting native copy, nvme driver receives read and write request with BLK_COPY op flags. For read request the nvme driver populates the payload with source information. For write request the driver converts it to nvme copy command using the source information in the payload and submits to the device. current design only supports single source range. This design is courtesy Mikulas Patocka's token based copy trace event support for nvme_copy_cmd. Set the device copy limits to queue limits. 
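
The read-then-write flow described above can be pictured with a minimal
user-space sketch. It is only an illustration under stated assumptions:
the names copy_token, copy_desc, sect_to_lba, copy_read_stage and
copy_write_stage are hypothetical and are not the driver's API, the
namespace is reduced to an opaque pointer, and the 65536-block bound is
an extra check added by this sketch, not something the patch does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space stand-in for the token kept in the payload. */
struct copy_token {
	char subsys[4];       /* tag identifying who wrote the token */
	const void *ns;       /* namespace the source range belongs to */
	uint64_t src_sector;  /* source start, in 512-byte sectors */
	uint64_t sectors;     /* length, in 512-byte sectors */
};

/* Hypothetical subset of what a single-range copy command carries. */
struct copy_desc {
	uint64_t sdlba;    /* destination start LBA */
	uint64_t slba;     /* source start LBA */
	uint16_t nlb;      /* number of logical blocks, 0-based */
	uint8_t nr_range;  /* number of source ranges, 0-based */
};

/* Convert 512-byte sectors to logical blocks of size 1 << lba_shift. */
static uint64_t sect_to_lba(uint64_t sector, unsigned int lba_shift)
{
	assert(lba_shift >= 9);
	return sector >> (lba_shift - 9);
}

/* Read stage: no data moves, only the source description is recorded. */
static void copy_read_stage(struct copy_token *t, const void *ns,
			    uint64_t src_sector, uint64_t sectors)
{
	memcpy(t->subsys, "nvme", 4);
	t->ns = ns;
	t->src_sector = src_sector;
	t->sectors = sectors;
}

/*
 * Write stage: validate the token, then build a single-range descriptor.
 * Returns 0 on success, -1 if the token cannot be used for this write.
 */
static int copy_write_stage(const struct copy_token *t, const void *ns,
			    uint64_t dst_sector, uint64_t sectors,
			    unsigned int lba_shift, struct copy_desc *out)
{
	uint64_t n_lba;

	/* The token must come from the same subsystem and namespace. */
	if (memcmp(t->subsys, "nvme", 4) != 0 || t->ns != ns)
		return -1;
	/* Source and destination lengths must agree. */
	if (t->sectors != sectors)
		return -1;

	n_lba = sect_to_lba(sectors, lba_shift);
	/* nlb is a 0-based 16-bit field (bound is this sketch's assumption). */
	if (n_lba == 0 || n_lba > 65536)
		return -1;

	out->sdlba = sect_to_lba(dst_sector, lba_shift);
	out->slba = sect_to_lba(t->src_sector, lba_shift);
	out->nlb = (uint16_t)(n_lba - 1);
	out->nr_range = 0;  /* 0-based: exactly one source range */
	return 0;
}
```

A caller would run copy_read_stage() for the source range first and then
hand the same token to copy_write_stage() together with the destination
sector. A token recorded for a different namespace or a different length
is rejected, which mirrors the checks the write path performs before it
builds the command.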

Signed-off-by: Kanchan Joshi
Signed-off-by: Nitesh Shetty
Signed-off-by: Javier González
Signed-off-by: Anuj Gupta
---
 drivers/nvme/host/constants.c |   1 +
 drivers/nvme/host/core.c      | 106 +++++++++++++++++++++++++++++++++-
 drivers/nvme/host/fc.c        |   5 ++
 drivers/nvme/host/nvme.h      |   7 +++
 drivers/nvme/host/pci.c       |  27 ++++++++-
 drivers/nvme/host/rdma.c      |   7 +++
 drivers/nvme/host/tcp.c       |  16 +++++
 drivers/nvme/host/trace.c     |  19 ++++++
 include/linux/nvme.h          |  43 +++++++++++++-
 9 files changed, 223 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/constants.c b/drivers/nvme/host/constants.c
index e958d5015585..4e60946d3aa8 100644
--- a/drivers/nvme/host/constants.c
+++ b/drivers/nvme/host/constants.c
@@ -19,6 +19,7 @@ static const char * const nvme_ops[] = {
 	[nvme_cmd_resv_report] = "Reservation Report",
 	[nvme_cmd_resv_acquire] = "Reservation Acquire",
 	[nvme_cmd_resv_release] = "Reservation Release",
+	[nvme_cmd_copy] = "Copy Offload",
 	[nvme_cmd_zone_mgmt_send] = "Zone Management Send",
 	[nvme_cmd_zone_mgmt_recv] = "Zone Management Receive",
 	[nvme_cmd_zone_append] = "Zone Management Append",
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7be562a4e1aa..0af4ce771c5d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -753,6 +753,80 @@ static inline void nvme_setup_flush(struct nvme_ns *ns,
 	cmnd->common.nsid = cpu_to_le32(ns->head->ns_id);
 }
 
+static inline blk_status_t nvme_setup_copy_read(struct nvme_ns *ns,
+		struct request *req)
+{
+	struct bio *bio = req->bio;
+	struct nvme_copy_token *token = bvec_kmap_local(&bio->bi_io_vec[0]);
+
+	memcpy(token->subsys, "nvme", 4);
+	token->ns = ns;
+	token->src_sector = bio->bi_iter.bi_sector;
+	token->sectors = bio->bi_iter.bi_size >> 9;
+
+	return BLK_STS_OK;
+}
+
+static inline blk_status_t nvme_setup_copy_write(struct nvme_ns *ns,
+		struct request *req, struct nvme_command *cmnd)
+{
+	struct nvme_copy_range *range = NULL;
+	struct bio *bio = req->bio;
+	struct nvme_copy_token *token = bvec_kmap_local(&bio->bi_io_vec[0]);
+	sector_t src_sector, dst_sector, n_sectors;
+	u64 src_lba, dst_lba, n_lba;
+	unsigned short nr_range = 1;
+	u16 control = 0;
+
+	if (unlikely(memcmp(token->subsys, "nvme", 4)))
+		return BLK_STS_NOTSUPP;
+	if (unlikely(token->ns != ns))
+		return BLK_STS_NOTSUPP;
+
+	src_sector = token->src_sector;
+	dst_sector = bio->bi_iter.bi_sector;
+	n_sectors = token->sectors;
+	if (WARN_ON(n_sectors != bio->bi_iter.bi_size >> 9))
+		return BLK_STS_NOTSUPP;
+
+	src_lba = nvme_sect_to_lba(ns, src_sector);
+	dst_lba = nvme_sect_to_lba(ns, dst_sector);
+	n_lba = nvme_sect_to_lba(ns, n_sectors);
+
+	if (WARN_ON(!n_lba))
+		return BLK_STS_NOTSUPP;
+
+	if (req->cmd_flags & REQ_FUA)
+		control |= NVME_RW_FUA;
+
+	if (req->cmd_flags & REQ_FAILFAST_DEV)
+		control |= NVME_RW_LR;
+
+	memset(cmnd, 0, sizeof(*cmnd));
+	cmnd->copy.opcode = nvme_cmd_copy;
+	cmnd->copy.nsid = cpu_to_le32(ns->head->ns_id);
+	cmnd->copy.sdlba = cpu_to_le64(dst_lba);
+
+	range = kmalloc_array(nr_range, sizeof(*range),
+			GFP_ATOMIC | __GFP_NOWARN);
+	if (!range)
+		return BLK_STS_RESOURCE;
+
+	range[0].slba = cpu_to_le64(src_lba);
+	range[0].nlb = cpu_to_le16(n_lba - 1);
+
+	cmnd->copy.nr_range = 0;
+
+	req->special_vec.bv_page = virt_to_page(range);
+	req->special_vec.bv_offset = offset_in_page(range);
+	req->special_vec.bv_len = sizeof(*range) * nr_range;
+	req->rq_flags |= RQF_SPECIAL_PAYLOAD;
+
+	cmnd->copy.control = cpu_to_le16(control);
+
+	return BLK_STS_OK;
+}
+
 static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 		struct nvme_command *cmnd)
 {
@@ -979,10 +1053,16 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 		ret = nvme_setup_discard(ns, req, cmd);
 		break;
 	case REQ_OP_READ:
-		ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_read);
+		if (unlikely(req->cmd_flags & REQ_COPY))
+			ret = nvme_setup_copy_read(ns, req);
+		else
+			ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_read);
 		break;
 	case REQ_OP_WRITE:
-		ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_write);
+		if (unlikely(req->cmd_flags & REQ_COPY))
+			ret = nvme_setup_copy_write(ns, req, cmd);
+		else
+			ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_write);
 		break;
 	case REQ_OP_ZONE_APPEND:
 		ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_zone_append);
@@ -1731,6 +1811,26 @@ static void nvme_config_discard(struct gendisk *disk, struct nvme_ns *ns)
 		blk_queue_max_write_zeroes_sectors(queue, UINT_MAX);
 }
 
+static void nvme_config_copy(struct gendisk *disk, struct nvme_ns *ns,
+		struct nvme_id_ns *id)
+{
+	struct nvme_ctrl *ctrl = ns->ctrl;
+	struct request_queue *q = disk->queue;
+
+	if (!(ctrl->oncs & NVME_CTRL_ONCS_COPY)) {
+		blk_queue_max_copy_sectors_hw(q, 0);
+		blk_queue_flag_clear(QUEUE_FLAG_COPY, q);
+		return;
+	}
+
+	/* setting copy limits */
+	if (blk_queue_flag_test_and_set(QUEUE_FLAG_COPY, q))
+		return;
+
+	blk_queue_max_copy_sectors_hw(q,
+		nvme_lba_to_sect(ns, le16_to_cpu(id->mssrl)));
+}
+
 static bool nvme_ns_ids_equal(struct nvme_ns_ids *a, struct nvme_ns_ids *b)
 {
 	return uuid_equal(&a->uuid, &b->uuid) &&
@@ -1930,6 +2030,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
 	set_capacity_and_notify(disk, capacity);
 
 	nvme_config_discard(disk, ns);
+	nvme_config_copy(disk, ns, id);
 	blk_queue_max_write_zeroes_sectors(disk->queue,
 					   ns->ctrl->max_zeroes_sectors);
 }
@@ -5320,6 +5421,7 @@ static inline void _nvme_check_size(void)
 	BUILD_BUG_ON(sizeof(struct nvme_download_firmware) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_format_cmd) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_dsm_cmd) != 64);
+	BUILD_BUG_ON(sizeof(struct nvme_copy_command) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_write_zeroes_cmd) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_abort_cmd) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_get_log_page_command) != 64);
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 4564f16a0b20..fd04dee590b4 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2807,6 +2807,11 @@ nvme_fc_queue_rq(struct blk_mq_hw_ctx *hctx,
 	if (ret)
 		return ret;
 
+	if (unlikely((rq->cmd_flags & REQ_COPY) &&
+				(req_op(rq) == REQ_OP_READ))) {
+		blk_mq_end_request(rq, BLK_STS_OK);
+		return BLK_STS_OK;
+	}
 	/*
 	 * nvme core doesn't quite treat the rq opaquely. Commands such
 	 * as WRITE ZEROES will return a non-zero rq payload_bytes yet
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 424c8a467a0c..6e282956deb4 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -500,6 +500,13 @@ struct nvme_ns {
 
 };
 
+struct nvme_copy_token {
+	char subsys[4];
+	struct nvme_ns *ns;
+	u64 src_sector;
+	u64 sectors;
+};
+
 /* NVMe ns supports metadata actions by the controller (generate/strip) */
 static inline bool nvme_ns_has_pi(struct nvme_ns *ns)
 {
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b13baccedb4a..8a66dd05cd24 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -499,16 +499,19 @@ static inline void nvme_sq_copy_cmd(struct nvme_queue *nvmeq,
 		nvmeq->sq_tail = 0;
 }
 
-static void nvme_commit_rqs(struct blk_mq_hw_ctx *hctx)
+static inline void nvme_commit_sq_db(struct nvme_queue *nvmeq)
 {
-	struct nvme_queue *nvmeq = hctx->driver_data;
-
 	spin_lock(&nvmeq->sq_lock);
 	if (nvmeq->sq_tail != nvmeq->last_sq_tail)
 		nvme_write_sq_db(nvmeq, true);
 	spin_unlock(&nvmeq->sq_lock);
 }
 
+static void nvme_commit_rqs(struct blk_mq_hw_ctx *hctx)
+{
+	nvme_commit_sq_db(hctx->driver_data);
+}
+
 static void **nvme_pci_iod_list(struct request *req)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
@@ -898,6 +901,12 @@ static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct request *req)
 	if (ret)
 		return ret;
 
+	if (unlikely((req->cmd_flags & REQ_COPY) &&
+				(req_op(req) == REQ_OP_READ))) {
+		blk_mq_start_request(req);
+		return BLK_STS_OK;
+	}
+
 	if (blk_rq_nr_phys_segments(req)) {
 		ret = nvme_map_data(dev, req, &iod->cmd);
 		if (ret)
@@ -944,6 +953,18 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	ret = nvme_prep_rq(dev, req);
 	if (unlikely(ret))
 		return ret;
+	if (unlikely((req->cmd_flags & REQ_COPY) &&
+				(req_op(req) == REQ_OP_READ))) {
+		blk_mq_set_request_complete(req);
+		blk_mq_end_request(req, BLK_STS_OK);
+		/* Commit the sq if copy read was the last req in the list,
+		 * as copy read doesn't update sq db
+		 */
+		if (bd->last)
+			nvme_commit_sq_db(nvmeq);
+		return ret;
+	}
+
 	spin_lock(&nvmeq->sq_lock);
 	nvme_sq_copy_cmd(nvmeq, &iod->cmd);
 	nvme_write_sq_db(nvmeq, bd->last);
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index bbad26b82b56..a8bf2a87f42a 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -2043,6 +2043,13 @@ static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 	nvme_start_request(rq);
 
+	if (unlikely((rq->cmd_flags & REQ_COPY) &&
+				(req_op(rq) == REQ_OP_READ))) {
+		blk_mq_end_request(rq, BLK_STS_OK);
+		ret = BLK_STS_OK;
+		goto unmap_qe;
+	}
+
 	if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY) &&
 	    queue->pi_support &&
 	    (c->common.opcode == nvme_cmd_write ||
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 8cedc1ef496c..776e2ba84911 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2346,6 +2346,11 @@ static blk_status_t nvme_tcp_setup_cmd_pdu(struct nvme_ns *ns,
 	if (ret)
 		return ret;
 
+	if (unlikely((rq->cmd_flags & REQ_COPY) &&
+				(req_op(rq) == REQ_OP_READ))) {
+		return BLK_STS_OK;
+	}
+
 	req->state = NVME_TCP_SEND_CMD_PDU;
 	req->status = cpu_to_le16(NVME_SC_SUCCESS);
 	req->offset = 0;
@@ -2414,6 +2419,17 @@ static blk_status_t nvme_tcp_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 	nvme_start_request(rq);
 
+	if (unlikely((rq->cmd_flags & REQ_COPY) &&
+				(req_op(rq) == REQ_OP_READ))) {
+		blk_mq_set_request_complete(rq);
+		blk_mq_end_request(rq, BLK_STS_OK);
+		/* if copy read is the last req queue tcp reqs */
+		if (bd->last && nvme_tcp_queue_more(queue))
+			queue_work_on(queue->io_cpu, nvme_tcp_wq,
+					&queue->io_work);
+		return ret;
+	}
+
 	nvme_tcp_queue_request(req, true, bd->last);
 
 	return BLK_STS_OK;
diff --git a/drivers/nvme/host/trace.c b/drivers/nvme/host/trace.c
index 1c36fcedea20..da4a7494e5a7 100644
--- a/drivers/nvme/host/trace.c
+++ b/drivers/nvme/host/trace.c
@@ -150,6 +150,23 @@ static const char *nvme_trace_read_write(struct trace_seq *p, u8 *cdw10)
 	return ret;
 }
 
+static const char *nvme_trace_copy(struct trace_seq *p, u8 *cdw10)
+{
+	const char *ret = trace_seq_buffer_ptr(p);
+	u64 slba = get_unaligned_le64(cdw10);
+	u8 nr_range = get_unaligned_le16(cdw10 + 8);
+	u16 control = get_unaligned_le16(cdw10 + 10);
+	u32 dsmgmt = get_unaligned_le32(cdw10 + 12);
+	u32 reftag = get_unaligned_le32(cdw10 + 16);
+
+	trace_seq_printf(p,
+			 "slba=%llu, nr_range=%u, ctrl=0x%x, dsmgmt=%u, reftag=%u",
+			 slba, nr_range, control, dsmgmt, reftag);
+	trace_seq_putc(p, 0);
+
+	return ret;
+}
+
 static const char *nvme_trace_dsm(struct trace_seq *p, u8 *cdw10)
 {
 	const char *ret = trace_seq_buffer_ptr(p);
@@ -243,6 +260,8 @@ const char *nvme_trace_parse_nvm_cmd(struct trace_seq *p,
 		return nvme_trace_zone_mgmt_send(p, cdw10);
 	case nvme_cmd_zone_mgmt_recv:
 		return nvme_trace_zone_mgmt_recv(p, cdw10);
+	case nvme_cmd_copy:
+		return nvme_trace_copy(p, cdw10);
 	default:
 		return nvme_trace_common(p, cdw10);
 	}
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 4fad4aa245fb..e92dd69c745a 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -337,7 +337,7 @@ struct nvme_id_ctrl {
 	__u8			nvscc;
 	__u8			nwpc;
 	__le16			acwu;
-	__u8			rsvd534[2];
+	__le16			ocfs;
 	__le32			sgls;
 	__le32			mnan;
 	__u8			rsvd544[224];
@@ -365,6 +365,7 @@ enum {
 	NVME_CTRL_ONCS_WRITE_ZEROES		= 1 << 3,
 	NVME_CTRL_ONCS_RESERVATIONS		= 1 << 5,
 	NVME_CTRL_ONCS_TIMESTAMP		= 1 << 6,
+	NVME_CTRL_ONCS_COPY			= 1 << 8,
 	NVME_CTRL_VWC_PRESENT			= 1 << 0,
 	NVME_CTRL_OACS_SEC_SUPP                 = 1 << 0,
 	NVME_CTRL_OACS_NS_MNGT_SUPP		= 1 << 3,
@@ -414,7 +415,10 @@ struct nvme_id_ns {
 	__le16			npdg;
 	__le16			npda;
 	__le16			nows;
-	__u8			rsvd74[18];
+	__le16			mssrl;
+	__le32			mcl;
+	__u8			msrc;
+	__u8			rsvd91[11];
 	__le32			anagrpid;
 	__u8			rsvd96[3];
 	__u8			nsattr;
@@ -796,6 +800,7 @@ enum nvme_opcode {
 	nvme_cmd_resv_report	= 0x0e,
 	nvme_cmd_resv_acquire	= 0x11,
 	nvme_cmd_resv_release	= 0x15,
+	nvme_cmd_copy		= 0x19,
 	nvme_cmd_zone_mgmt_send	= 0x79,
 	nvme_cmd_zone_mgmt_recv	= 0x7a,
 	nvme_cmd_zone_append	= 0x7d,
@@ -818,7 +823,8 @@ enum nvme_opcode {
 		nvme_opcode_name(nvme_cmd_resv_release),	\
 		nvme_opcode_name(nvme_cmd_zone_mgmt_send),	\
 		nvme_opcode_name(nvme_cmd_zone_mgmt_recv),	\
-		nvme_opcode_name(nvme_cmd_zone_append))
+		nvme_opcode_name(nvme_cmd_zone_append),		\
+		nvme_opcode_name(nvme_cmd_copy))
 
 
 
@@ -995,6 +1001,36 @@ struct nvme_dsm_range {
 	__le64			slba;
 };
 
+struct nvme_copy_command {
+	__u8			opcode;
+	__u8			flags;
+	__u16			command_id;
+	__le32			nsid;
+	__u64			rsvd2;
+	__le64			metadata;
+	union nvme_data_ptr	dptr;
+	__le64			sdlba;
+	__u8			nr_range;
+	__u8			rsvd12;
+	__le16			control;
+	__le16			rsvd13;
+	__le16			dspec;
+	__le32			ilbrt;
+	__le16			lbat;
+	__le16			lbatm;
+};
+
+struct nvme_copy_range {
+	__le64			rsvd0;
+	__le64			slba;
+	__le16			nlb;
+	__le16			rsvd18;
+	__le32			rsvd20;
+	__le32			eilbrt;
+	__le16			elbat;
+	__le16			elbatm;
+};
+
 struct nvme_write_zeroes_cmd {
 	__u8			opcode;
 	__u8			flags;
@@ -1752,6 +1788,7 @@ struct nvme_command {
 		struct nvme_download_firmware dlfw;
 		struct nvme_format_cmd format;
 		struct nvme_dsm_cmd dsm;
+		struct nvme_copy_command copy;
 		struct nvme_write_zeroes_cmd write_zeroes;
 		struct nvme_zone_mgmt_send_cmd zms;
 		struct nvme_zone_mgmt_recv_cmd zmr;

From patchwork Thu Jan 12 11:59:00 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nitesh Shetty
X-Patchwork-Id: 13097995
From: Nitesh Shetty
To: Jens Axboe, Alasdair Kergon, Mike Snitzer, dm-devel@redhat.com,
 Keith Busch, Christoph Hellwig, Sagi Grimberg, James Smart,
 Chaitanya Kulkarni, Alexander Viro
Cc: anuj20.g@samsung.com, joshi.k@samsung.com, p.raghav@samsung.com,
 nitheshshetty@gmail.com, gost.dev@samsung.com, Nitesh Shetty,
 linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH v6 6/9] nvmet: add copy command support for bdev and file ns
Date: Thu, 12 Jan 2023 17:29:00 +0530
Message-Id: <20230112115908.23662-7-nj.shetty@samsung.com>
X-Mailer: git-send-email 2.35.1.500.gb896f729e2
In-Reply-To: <20230112115908.23662-1-nj.shetty@samsung.com>
References: <20230112115908.23662-1-nj.shetty@samsung.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

Add support for handling the copy command on the target.
For bdev-ns we call into blkdev_issue_copy, which the block layer
completes either by an offloaded copy request to the backend bdev or by
emulating the request.

For file-ns we call vfs_copy_file_range to service our request.

Currently the target always advertises copy capability by setting
NVME_CTRL_ONCS_COPY in the controller ONCS.

Signed-off-by: Nitesh Shetty
Signed-off-by: Anuj Gupta
---
 drivers/nvme/target/admin-cmd.c   |  9 +++-
 drivers/nvme/target/io-cmd-bdev.c | 79 +++++++++++++++++++++++++++++++
 drivers/nvme/target/io-cmd-file.c | 52 ++++++++++++++++++++
 drivers/nvme/target/loop.c        |  6 +++
 drivers/nvme/target/nvmet.h       |  2 +
 5 files changed, 146 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index 6a54ed6fb121..66a45121f5a0 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -433,8 +433,7 @@ static void nvmet_execute_identify_ctrl(struct nvmet_req *req)
 	id->nn = cpu_to_le32(NVMET_MAX_NAMESPACES);
 	id->mnan = cpu_to_le32(NVMET_MAX_NAMESPACES);
 	id->oncs = cpu_to_le16(NVME_CTRL_ONCS_DSM |
-			NVME_CTRL_ONCS_WRITE_ZEROES);
-
+			NVME_CTRL_ONCS_WRITE_ZEROES | NVME_CTRL_ONCS_COPY);
 	/* XXX: don't report vwc if the underlying device is write through */
 	id->vwc = NVME_CTRL_VWC_PRESENT;
 
@@ -536,6 +535,12 @@ static void nvmet_execute_identify_ns(struct nvmet_req *req)
 
 	if (req->ns->bdev)
 		nvmet_bdev_set_limits(req->ns->bdev, id);
+	else {
+		id->msrc = (u8)to0based(BIO_MAX_VECS - 1);
+		id->mssrl = cpu_to_le16(BIO_MAX_VECS <<
+				(PAGE_SHIFT - SECTOR_SHIFT));
+		id->mcl = cpu_to_le32(le16_to_cpu(id->mssrl));
+	}
 
 	/*
 	 * We just provide a single LBA format that matches what the
diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index c2d6cea0236b..01f0160125fb 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -46,6 +46,19 @@ void nvmet_bdev_set_limits(struct block_device *bdev, struct nvme_id_ns *id)
 	id->npda = id->npdg;
 	/* NOWS = Namespace Optimal Write Size */
 	id->nows = to0based(bdev_io_opt(bdev) / bdev_logical_block_size(bdev));
+
+	/*Copy limits*/
+	if (bdev_max_copy_sectors(bdev)) {
+		id->msrc = id->msrc;
+		id->mssrl = cpu_to_le16((bdev_max_copy_sectors(bdev) <<
+				SECTOR_SHIFT) / bdev_logical_block_size(bdev));
+		id->mcl = cpu_to_le32(id->mssrl);
+	} else {
+		id->msrc = (u8)to0based(BIO_MAX_VECS - 1);
+		id->mssrl = cpu_to_le16((BIO_MAX_VECS << PAGE_SHIFT) /
+				bdev_logical_block_size(bdev));
+		id->mcl = cpu_to_le32(id->mssrl);
+	}
 }
 
 void nvmet_bdev_ns_disable(struct nvmet_ns *ns)
@@ -184,6 +197,23 @@ static void nvmet_bio_done(struct bio *bio)
 	nvmet_req_bio_put(req, bio);
 }
 
+static void nvmet_bdev_copy_end_io(void *private, int status)
+{
+	struct nvmet_req *req = (struct nvmet_req *)private;
+	int id;
+
+	if (status) {
+		for (id = 0 ; id < req->nr_range; id++) {
+			if (req->ranges[id].len != req->ranges[id].comp_len) {
+				req->cqe->result.u32 = cpu_to_le32(id);
+				break;
+			}
+		}
+	}
+	kfree(req->ranges);
+	nvmet_req_complete(req, errno_to_nvme_status(req, status));
+}
+
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 static int nvmet_bdev_alloc_bip(struct nvmet_req *req, struct bio *bio,
 				struct sg_mapping_iter *miter)
@@ -450,6 +480,51 @@ static void nvmet_bdev_execute_write_zeroes(struct nvmet_req *req)
 	}
 }
 
+static void nvmet_bdev_execute_copy(struct nvmet_req *req)
+{
+	struct nvme_copy_range range;
+	struct range_entry *ranges;
+	struct nvme_command *cmnd = req->cmd;
+	sector_t dest, dest_off = 0;
+	int ret, id, nr_range;
+
+	nr_range = cmnd->copy.nr_range + 1;
+	dest = le64_to_cpu(cmnd->copy.sdlba) << req->ns->blksize_shift;
+	ranges = kmalloc_array(nr_range, sizeof(*ranges), GFP_KERNEL);
+
+	for (id = 0 ; id < nr_range; id++) {
+		ret = nvmet_copy_from_sgl(req, id * sizeof(range),
+					&range, sizeof(range));
+		if (ret)
+			goto out;
+
+		ranges[id].dst = dest + dest_off;
+		ranges[id].src = le64_to_cpu(range.slba) <<
+					req->ns->blksize_shift;
+		ranges[id].len = (le16_to_cpu(range.nlb) + 1) <<
+					req->ns->blksize_shift;
+		ranges[id].comp_len = 0;
+		dest_off += ranges[id].len;
+	}
+	req->ranges = ranges;
+	req->nr_range = nr_range;
+	ret = blkdev_issue_copy(req->ns->bdev, req->ns->bdev, ranges, nr_range,
+			nvmet_bdev_copy_end_io, (void *)req, GFP_KERNEL);
+	if (ret) {
+		for (id = 0 ; id < nr_range; id++) {
+			if (ranges[id].len != ranges[id].comp_len) {
+				req->cqe->result.u32 = cpu_to_le32(id);
+				break;
+			}
+		}
+		goto out;
+	} else
+		return;
+out:
+	kfree(ranges);
+	nvmet_req_complete(req, errno_to_nvme_status(req, ret));
+}
+
 u16 nvmet_bdev_parse_io_cmd(struct nvmet_req *req)
 {
 	switch (req->cmd->common.opcode) {
@@ -468,6 +543,10 @@ u16 nvmet_bdev_parse_io_cmd(struct nvmet_req *req)
 	case nvme_cmd_write_zeroes:
 		req->execute = nvmet_bdev_execute_write_zeroes;
 		return 0;
+	case nvme_cmd_copy:
+		req->execute = nvmet_bdev_execute_copy;
+		return 0;
+
 	default:
 		return nvmet_report_invalid_opcode(req);
 	}
diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c
index 871c4f32f443..76db307f5b30 100644
--- a/drivers/nvme/target/io-cmd-file.c
+++ b/drivers/nvme/target/io-cmd-file.c
@@ -328,6 +328,49 @@ static void nvmet_file_dsm_work(struct work_struct *w)
 	}
 }
 
+static void nvmet_file_copy_work(struct work_struct *w)
+{
+	struct nvmet_req *req = container_of(w, struct nvmet_req, f.work);
+	int nr_range;
+	loff_t pos;
+	struct nvme_command *cmnd = req->cmd;
+	int ret = 0, len = 0, src, id;
+
+	nr_range = cmnd->copy.nr_range + 1;
+	pos = le64_to_cpu(req->cmd->copy.sdlba) << req->ns->blksize_shift;
+	if (unlikely(pos + req->transfer_len > req->ns->size)) {
+		nvmet_req_complete(req, errno_to_nvme_status(req, -ENOSPC));
+		return;
+	}
+
+	for (id = 0 ; id < nr_range; id++) {
+		struct nvme_copy_range range;
+
+		ret = nvmet_copy_from_sgl(req, id * sizeof(range), &range,
+					sizeof(range));
+		if (ret)
+			goto out;
+
+		len = (le16_to_cpu(range.nlb) + 1) << (req->ns->blksize_shift);
+		src = (le64_to_cpu(range.slba) << (req->ns->blksize_shift));
+		ret = vfs_copy_file_range(req->ns->file, src, req->ns->file,
+					pos, len, 0);
+out:
+		if (ret != len) {
+			pos += ret;
+			req->cqe->result.u32 = cpu_to_le32(id);
+			nvmet_req_complete(req, ret < 0 ?
+					errno_to_nvme_status(req, ret) :
+					errno_to_nvme_status(req, -EIO));
+			return;
+
+		} else
+			pos += len;
+	}
+
+	nvmet_req_complete(req, 0);
+
+}
 static void nvmet_file_execute_dsm(struct nvmet_req *req)
 {
 	if (!nvmet_check_data_len_lte(req, nvmet_dsm_len(req)))
@@ -336,6 +379,12 @@ static void nvmet_file_execute_dsm(struct nvmet_req *req)
 	queue_work(nvmet_wq, &req->f.work);
 }
 
+static void nvmet_file_execute_copy(struct nvmet_req *req)
+{
+	INIT_WORK(&req->f.work, nvmet_file_copy_work);
+	queue_work(nvmet_wq, &req->f.work);
+}
+
 static void nvmet_file_write_zeroes_work(struct work_struct *w)
 {
 	struct nvmet_req *req = container_of(w, struct nvmet_req, f.work);
@@ -382,6 +431,9 @@ u16 nvmet_file_parse_io_cmd(struct nvmet_req *req)
 	case nvme_cmd_write_zeroes:
 		req->execute = nvmet_file_execute_write_zeroes;
 		return 0;
+	case nvme_cmd_copy:
+		req->execute = nvmet_file_execute_copy;
+		return 0;
 	default:
 		return nvmet_report_invalid_opcode(req);
 	}
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index f2d24b2d992f..d18ed8067a15 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -146,6 +146,12 @@ static blk_status_t nvme_loop_queue_rq(struct blk_mq_hw_ctx *hctx,
 		return ret;
 
 	nvme_start_request(req);
+	if (unlikely((req->cmd_flags & REQ_COPY)
&& + (req_op(req) == REQ_OP_READ))) { + blk_mq_set_request_complete(req); + blk_mq_end_request(req, BLK_STS_OK); + return BLK_STS_OK; + } iod->cmd.common.flags |= NVME_CMD_SGL_METABUF; iod->req.port = queue->ctrl->port; if (!nvmet_req_init(&iod->req, &queue->nvme_cq, diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h index 89bedfcd974c..c36b55c9ff68 100644 --- a/drivers/nvme/target/nvmet.h +++ b/drivers/nvme/target/nvmet.h @@ -393,6 +393,8 @@ struct nvmet_req { struct device *p2p_client; u16 error_loc; u64 error_slba; + struct range_entry *ranges; + unsigned int nr_range; }; #define NVMET_MAX_MPOOL_BVEC 16 From patchwork Thu Jan 12 11:59:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nitesh Shetty X-Patchwork-Id: 13097996 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2F63C54EBD for ; Thu, 12 Jan 2023 13:39:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233760AbjALNjO (ORCPT ); Thu, 12 Jan 2023 08:39:14 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40834 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233686AbjALNiG (ORCPT ); Thu, 12 Jan 2023 08:38:06 -0500 Received: from mailout3.samsung.com (mailout3.samsung.com [203.254.224.33]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 534794F13A for ; Thu, 12 Jan 2023 05:37:33 -0800 (PST) Received: from epcas5p3.samsung.com (unknown [182.195.41.41]) by mailout3.samsung.com (KnoxPortal) with ESMTP id 20230112133731epoutp0368854f7697124ad3b8a0a8160e3a68b2~5k0LQig6L2187321873epoutp03K for ; Thu, 12 Jan 2023 13:37:31 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 mailout3.samsung.com 
From: Nitesh Shetty
To: Jens Axboe, Alasdair Kergon, Mike Snitzer, dm-devel@redhat.com,
	Keith Busch, Christoph Hellwig, Sagi Grimberg, James Smart,
	Chaitanya Kulkarni, Alexander Viro
Cc: anuj20.g@samsung.com, joshi.k@samsung.com, p.raghav@samsung.com,
	nitheshshetty@gmail.com, gost.dev@samsung.com, Nitesh Shetty,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH v6 7/9] dm: Add support for copy offload.
Date: Thu, 12 Jan 2023 17:29:01 +0530
Message-Id: <20230112115908.23662-8-nj.shetty@samsung.com>
In-Reply-To: <20230112115908.23662-1-nj.shetty@samsung.com>
qrxjT9+299YzCcd60p6VOn/C5GXqRzecKVMlFftG/zKHfpBiW1kiE+uvwTprzW7VEe5j5p2r vr9M12zan77FWF1in91lvjVzCd7T/73920PW4tfL1adPrKx/Y+8eu6T/QOecX1DVd6nrLu17 PzxlaH4o/eWhIrb9x6S7PfHJ3h/a8goSsBOyLvyoumHZvGT6fm+jwze4G3VldAXzhfEvWTV1 eVzWP4kvphtA6CNSxKkYeShq4Jh/AfROagqCBAAA X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFlrGIsWRmVeSWpSXmKPExsWy7bCSvO6hb/uTDX7c4LdYf+oYs0XThL/M Fqvv9rNZ7H03m9Xi5oGdTBYrVx9lsti98COTxdH/b9ksJh26xmjx9OosJou9t7Qt9uw9yWJx edccNov5y56yW+x40shose33fGaLz0tb2C3WvX7PYnHilrTF+b/HWR1EPGbdP8vmsXPWXXaP 8/c2snhcPlvqsWlVJ5vH5iX1HrtvNrB59Da/Y/N4v+8qm0ffllWMHp83yXlsevKWKYAnissm JTUnsyy1SN8ugSvjSu8x1oK7UhXrZvazNTDeFO1i5OSQEDCReHX5OzOILSSwm1Gi83kyRFxS YtnfI8wQtrDEyn/P2bsYuYBqGpkkNp1aztbFyMHBJqAtcfo/B0hcROAZk8TZe4+YQRxmgaVM Egv2NbKCdAsLWEp0XvzBCGKzCKhKvPhxCWwqr4CVxMJZ21hABkkI6Ev03xcECXMKWEusufGF BeIgK4lZe64yQZQLSpyc+QQsziwgL9G8dTbzBEaBWUhSs5CkFjAyrWKUTC0ozk3PLTYsMMxL LdcrTswtLs1L10vOz93ECI5KLc0djNtXfdA7xMjEwXiIUYKDWUmEd8/R/clCvCmJlVWpRfnx RaU5qcWHGKU5WJTEeS90nYwXEkhPLEnNTk0tSC2CyTJxcEo1MOn+u2dx+eyCz54Ky0v0/970 85saK2KY7mSw7CbX3D/Rh/siPi70/DMnZ1+t1d2TogWsk1+umXzR5IHG04ezo/w5Ot6suHFA 5e5Nrje3tPLcW9a1aj5a+sHmSPaPTF+d5wkJGzSv3GPX2pOg8p2J8YW1boPOubYVK5Ni7ui6 P2DsMFY0Frcr/Lz19+pDPEW/XmeXnlnxv8M1xWhBi3pJ2eHmN/Mntv/k33tH5tuG4oWXzC7c 37rj3sbqhIuBLTt705OfLekxu10xhe9RUKLv9GCO8o+BZcYeK1d8urL6RFvlBzfbs4eOnWSP ShaUdvv4osjP94vsnYRnwreaedby3FWQmG3Yd+Zp/7qCHyf27UlRYinOSDTUYi4qTgQAPbEw lTkDAAA= X-CMS-MailID: 20230112120210epcas5p41524bba73af1dcf283d21b9c7ee9d239 X-Msg-Generator: CA X-Sendblock-Type: REQ_APPROVE CMS-TYPE: 105P DLP-Filter: Pass X-CFilter-Loop: Reflected X-CMS-RootMailID: 20230112120210epcas5p41524bba73af1dcf283d21b9c7ee9d239 References: <20230112115908.23662-1-nj.shetty@samsung.com> Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Before enabling copy for dm target, check if underlying devices and dm target support copy. Avoid split happening inside dm target. Fail early if the request needs split, currently splitting copy request is not supported. 
Signed-off-by: Nitesh Shetty
---
 drivers/md/dm-table.c         | 42 +++++++++++++++++++++++++++++++++++
 drivers/md/dm.c               |  7 ++++++
 include/linux/device-mapper.h |  5 +++++
 3 files changed, 54 insertions(+)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 8541d5688f3a..4a1bbbb2493b 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1875,6 +1875,39 @@ static bool dm_table_supports_nowait(struct dm_table *t)
 	return true;
 }
 
+static int device_not_copy_capable(struct dm_target *ti, struct dm_dev *dev,
+				      sector_t start, sector_t len, void *data)
+{
+	struct request_queue *q = bdev_get_queue(dev->bdev);
+
+	return !blk_queue_copy(q);
+}
+
+static bool dm_table_supports_copy(struct dm_table *t)
+{
+	struct dm_target *ti;
+	unsigned int i;
+
+	for (i = 0; i < t->num_targets; i++) {
+		ti = dm_table_get_target(t, i);
+
+		if (!ti->copy_offload_supported)
+			return false;
+
+		/*
+		 * target provides copy support (as implied by setting
+		 * 'copy_offload_supported')
+		 * and it relies on _all_ data devices having copy support.
+		 */
+		if (!ti->type->iterate_devices ||
+		     ti->type->iterate_devices(ti,
+			     device_not_copy_capable, NULL))
+			return false;
+	}
+
+	return true;
+}
+
 static int device_not_discard_capable(struct dm_target *ti, struct dm_dev *dev,
 				      sector_t start, sector_t len, void *data)
 {
@@ -1957,6 +1990,15 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 		q->limits.discard_misaligned = 0;
 	}
 
+	if (!dm_table_supports_copy(t)) {
+		blk_queue_flag_clear(QUEUE_FLAG_COPY, q);
+		/* Must also clear copy limits... */
+		q->limits.max_copy_sectors = 0;
+		q->limits.max_copy_sectors_hw = 0;
+	} else {
+		blk_queue_flag_set(QUEUE_FLAG_COPY, q);
+	}
+
 	if (!dm_table_supports_secure_erase(t))
 		q->limits.max_secure_erase_sectors = 0;
 
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index b424a6ee27ba..0b04093fbeb2 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1690,6 +1690,13 @@ static blk_status_t __split_and_process_bio(struct clone_info *ci)
 	if (unlikely(ci->is_abnormal_io))
 		return __process_abnormal_io(ci, ti);
 
+	if ((unlikely(op_is_copy(ci->bio->bi_opf)) &&
+				max_io_len(ti, ci->sector) < ci->sector_count)) {
+		DMERR("Error, IO size(%u) > max target size(%llu)\n",
+			ci->sector_count, max_io_len(ti, ci->sector));
+		return BLK_STS_IOERR;
+	}
+
 	/*
 	 * Only support bio polling for normal IO, and the target io is
 	 * exactly inside the dm_io instance (verified in dm_poll_dm_io)
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 04c6acf7faaa..da4e77e81011 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -379,6 +379,11 @@ struct dm_target {
 	 * bio_set_dev(). NOTE: ideally a target should _not_ need this.
 	 */
 	bool needs_bio_set_dev:1;
+
+	/*
+	 * copy offload is supported
+	 */
+	bool copy_offload_supported:1;
 };
 
 void *dm_per_bio_data(struct bio *bio, size_t data_size);

From patchwork Thu Jan 12 11:59:03 2023
From: Nitesh Shetty
To: Jens Axboe, Alasdair Kergon, Mike Snitzer, dm-devel@redhat.com,
	Keith Busch, Christoph Hellwig, Sagi Grimberg, James Smart,
	Chaitanya Kulkarni, Alexander Viro
Cc: anuj20.g@samsung.com, joshi.k@samsung.com, p.raghav@samsung.com,
	nitheshshetty@gmail.com, gost.dev@samsung.com, Nitesh Shetty,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH v6 9/9] dm kcopyd: use copy offload support
Date: Thu, 12 Jan 2023 17:29:03 +0530
Message-Id: <20230112115908.23662-10-nj.shetty@samsung.com>
In-Reply-To: <20230112115908.23662-1-nj.shetty@samsung.com>
References: <20230112115908.23662-1-nj.shetty@samsung.com>

Introduce copy_jobs to use copy offload when the underlying devices
support it, and otherwise fall back to the existing method.

run_copy_job() calls the block layer copy offload API if the source and
destination request queues are the same and support copy offload.
On successful completion, the copied count of each destination region is
set to zero; failed regions are processed via the existing method.
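
The fallback flow described above can be sketched as a small userspace
model. This is an illustration under stated assumptions rather than the
kcopyd code itself: `model_job`, `model_dest`, `model_run_copy_job()` and
the two stub offload callbacks are hypothetical names, and the offload
step is reduced to a callback that returns 0 on success.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a kcopyd job; not the kernel structures. */
enum model_list { MODEL_COMPLETE_JOBS, MODEL_PAGES_JOBS };

struct model_dest {
	int queue_id;		/* models the destination request queue */
	unsigned long count;	/* sectors still to be copied the regular way */
};

struct model_job {
	int src_queue_id;	/* models the source request queue */
	bool src_supports_copy;	/* models blk_queue_copy() on the source queue */
	struct model_dest *dests;
	size_t num_dests;
};

/* Offload callback: returns 0 on success, non-zero on failure. */
typedef int (*model_offload_fn)(const struct model_job *job, size_t dest);

/* Try offload per destination; stop at the first one that cannot use it. */
static enum model_list model_run_copy_job(struct model_job *job,
					  model_offload_fn offload)
{
	size_t done = 0;

	for (size_t i = 0; i < job->num_dests; i++) {
		if (job->dests[i].queue_id != job->src_queue_id ||
		    !job->src_supports_copy)
			break;
		if (offload(job, i))
			break;
		job->dests[i].count = 0;	/* nothing left for the regular path */
		done++;
	}
	/* Fully offloaded jobs complete; anything else takes the regular path. */
	return done == job->num_dests ? MODEL_COMPLETE_JOBS : MODEL_PAGES_JOBS;
}

/* Stub offload callbacks used only for illustration. */
static int model_offload_ok(const struct model_job *job, size_t dest)
{
	(void)job;
	(void)dest;
	return 0;
}

static int model_offload_fail_second(const struct model_job *job, size_t dest)
{
	(void)job;
	return dest == 1 ? -1 : 0;
}
```

Destinations that were offloaded successfully keep a zero count, so in
this model only the remaining destinations still carry work when the job
is handed to the regular page-based path.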
Signed-off-by: Nitesh Shetty
Signed-off-by: Anuj Gupta
---
 drivers/md/dm-kcopyd.c | 56 +++++++++++++++++++++++++++++++++++++-----
 1 file changed, 50 insertions(+), 6 deletions(-)

diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c
index 4d3bbbea2e9a..2f9985f671ac 100644
--- a/drivers/md/dm-kcopyd.c
+++ b/drivers/md/dm-kcopyd.c
@@ -74,18 +74,20 @@ struct dm_kcopyd_client {
 	atomic_t nr_jobs;
 
 /*
- * We maintain four lists of jobs:
+ * We maintain five lists of jobs:
  *
- * i) jobs waiting for pages
- * ii) jobs that have pages, and are waiting for the io to be issued.
- * iii) jobs that don't need to do any IO and just run a callback
- * iv) jobs that have completed.
+ * i) jobs waiting to try copy offload
+ * ii) jobs waiting for pages
+ * iii) jobs that have pages, and are waiting for the io to be issued.
+ * iv) jobs that don't need to do any IO and just run a callback
+ * v) jobs that have completed.
  *
- * All four of these are protected by job_lock.
+ * All five of these are protected by job_lock.
  */
 	spinlock_t job_lock;
 	struct list_head callback_jobs;
 	struct list_head complete_jobs;
+	struct list_head copy_jobs;
 	struct list_head io_jobs;
 	struct list_head pages_jobs;
 };
@@ -579,6 +581,43 @@ static int run_io_job(struct kcopyd_job *job)
 	return r;
 }
 
+static int run_copy_job(struct kcopyd_job *job)
+{
+	int r, i, count = 0;
+	struct range_entry range;
+
+	struct request_queue *src_q, *dest_q;
+
+	for (i = 0; i < job->num_dests; i++) {
+		range.dst = job->dests[i].sector << SECTOR_SHIFT;
+		range.src = job->source.sector << SECTOR_SHIFT;
+		range.len = job->source.count << SECTOR_SHIFT;
+
+		src_q = bdev_get_queue(job->source.bdev);
+		dest_q = bdev_get_queue(job->dests[i].bdev);
+
+		if (src_q != dest_q || !blk_queue_copy(src_q))
+			break;
+
+		r = blkdev_issue_copy(job->source.bdev, job->dests[i].bdev,
+				&range, 1, NULL, NULL, GFP_KERNEL);
+		if (r)
+			break;
+
+		job->dests[i].count = 0;
+		count++;
+	}
+
+	if (count == job->num_dests) {
+		push(&job->kc->complete_jobs, job);
+	} else {
+		push(&job->kc->pages_jobs, job);
+		r = 0;
+	}
+
+	return r;
+}
+
 static int run_pages_job(struct kcopyd_job *job)
 {
 	int r;
@@ -659,6 +698,7 @@ static void do_work(struct work_struct *work)
 	spin_unlock_irq(&kc->job_lock);
 
 	blk_start_plug(&plug);
+	process_jobs(&kc->copy_jobs, kc, run_copy_job);
 	process_jobs(&kc->complete_jobs, kc, run_complete_job);
 	process_jobs(&kc->pages_jobs, kc, run_pages_job);
 	process_jobs(&kc->io_jobs, kc, run_io_job);
@@ -676,6 +716,8 @@ static void dispatch_job(struct kcopyd_job *job)
 	atomic_inc(&kc->nr_jobs);
 	if (unlikely(!job->source.count))
 		push(&kc->callback_jobs, job);
+	else if (job->source.bdev->bd_disk == job->dests[0].bdev->bd_disk)
+		push(&kc->copy_jobs, job);
 	else if (job->pages == &zero_page_list)
 		push(&kc->io_jobs, job);
 	else
@@ -916,6 +958,7 @@ struct dm_kcopyd_client *dm_kcopyd_client_create(struct dm_kcopyd_throttle *thro
 	spin_lock_init(&kc->job_lock);
 	INIT_LIST_HEAD(&kc->callback_jobs);
 	INIT_LIST_HEAD(&kc->complete_jobs);
+	INIT_LIST_HEAD(&kc->copy_jobs);
 	INIT_LIST_HEAD(&kc->io_jobs);
 	INIT_LIST_HEAD(&kc->pages_jobs);
 	kc->throttle = throttle;
@@ -971,6 +1014,7 @@ void dm_kcopyd_client_destroy(struct dm_kcopyd_client *kc)
 
 	BUG_ON(!list_empty(&kc->callback_jobs));
 	BUG_ON(!list_empty(&kc->complete_jobs));
+	WARN_ON(!list_empty(&kc->copy_jobs));
 	BUG_ON(!list_empty(&kc->io_jobs));
 	BUG_ON(!list_empty(&kc->pages_jobs));
 	destroy_workqueue(kc->kcopyd_wq);