From patchwork Tue Mar  5 10:15:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13582015
From: Leon Romanovsky
To: Christoph Hellwig, Robin Murphy, Marek Szyprowski, Joerg Roedel,
	Will Deacon, Jason Gunthorpe, Chaitanya Kulkarni
Cc: Chaitanya Kulkarni, Jonathan Corbet, Jens Axboe, Keith Busch,
	Sagi Grimberg, Yishai Hadas, Shameer Kolothum, Kevin Tian,
	Alex Williamson, Jérôme Glisse, Andrew Morton,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
	iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
	kvm@vger.kernel.org, linux-mm@kvack.org, Bart Van Assche,
	Damien Le Moal, Amir Goldstein, "josef@toxicpanda.com",
	"Martin K. Petersen", "daniel@iogearbox.net", Dan Williams,
	"jack@suse.com", Leon Romanovsky, Zhu Yanjun
Subject: [RFC 15/16] block: add dma_link_range() based API
Date: Tue, 5 Mar 2024 12:15:25 +0200
Message-ID: <1e52aa392b9c434f55203c9d630dd06fcdb75c32.1709631413.git.leon@kernel.org>
X-Mailer: git-send-email 2.44.0
In-Reply-To:
References:

From: Chaitanya Kulkarni

Add two helper functions: blk_rq_get_dma_length(), which calculates the
total DMA length of a request, and blk_rq_dma_map(), which creates the
DMA mapping for it.

blk_rq_get_dma_length() returns the total length of the request and is
used when the driver allocates IOVA space for the request with
dma_alloc_iova(). The returned length is stored in iova->size and passed
down the IOVA allocation call chain:

  dma_map_ops->alloc_iova()
    iommu_dma_alloc_iova()
      alloc_iova_fast()
        iova_rcache_get()
        OR
        alloc_iova()

blk_rq_dma_map() iterates through the bvec list and creates a DMA mapping
for each page inside the @iova range with the help of dma_link_range().
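To illustrate the intended calling convention, here is a minimal,
hypothetical driver-side sketch (not part of this patch). It assumes
dma_alloc_iova()/dma_free_iova() and struct dma_iova_attrs from earlier
patches in this series; my_map_ctx, my_map_cb and my_driver_map_rq are
illustrative names only.

	/* Hypothetical sketch; a real driver would build NVMe PRPs/SGLs. */
	struct my_map_ctx {
		dma_addr_t addr[128];	/* per-segment DMA addresses */
		u32 len[128];		/* per-segment lengths */
	};

	/* Called by blk_rq_dma_map() once per linked bvec page. */
	static void my_map_cb(void *cb_data, u32 cnt, dma_addr_t dma_addr,
			      dma_addr_t offset, u32 len)
	{
		struct my_map_ctx *ctx = cb_data;

		ctx->addr[cnt] = dma_addr;
		ctx->len[cnt] = len;
	}

	static int my_driver_map_rq(struct device *dev, struct request *rq,
				    struct dma_iova_attrs *iova,
				    struct my_map_ctx *ctx)
	{
		int nr_linked;

		iova->dev = dev;
		/* Size the IOVA range to cover the whole request. */
		iova->size = blk_rq_get_dma_length(rq);
		if (dma_alloc_iova(iova))	/* assumed helper from this series */
			return -ENOMEM;

		nr_linked = blk_rq_dma_map(rq, my_map_cb, ctx, iova);
		if (!nr_linked) {
			dma_free_iova(iova);	/* assumed counterpart */
			return -EIO;
		}

		return nr_linked;
	}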
Note that @iova is allocated & pre-initialized using dma_alloc_iova() by the caller. After creating a mapping for each page, call into the callback function @cb provided by the drive with a mapped DMA address for this page, offset into the iova space (needed at the time of unlink), length of the mapped page, and page number that is mapped in this request. Driver is responsible for using this DMA address to complete the mapping of underlying protocol-specific data structures, such as NVMe PRPs or NVMe SGLs. This callback approach allows us to iterate bvec list only once to create bvec to DMA mapping and use that DMA address in driver to build the protocol-specific data structure, essentially mapping one bvec page at a time to DMA address and using that DMA address to create underlying protocol-specific data structures. Finally, returning the number of linked count. Signed-off-by: Chaitanya Kulkarni Signed-off-by: Leon Romanovsky --- block/blk-merge.c | 156 +++++++++++++++++++++++++++++++++++++++++ include/linux/blk-mq.h | 9 +++ 2 files changed, 165 insertions(+) diff --git a/block/blk-merge.c b/block/blk-merge.c index 2d470cf2173e..63effc8ac1db 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -583,6 +583,162 @@ int __blk_rq_map_sg(struct request_queue *q, struct request *rq, } EXPORT_SYMBOL(__blk_rq_map_sg); +static dma_addr_t blk_dma_link_page(struct page *page, unsigned int page_offset, + struct dma_iova_attrs *iova, + dma_addr_t dma_offset) +{ + dma_addr_t dma_addr; + int ret; + + dma_addr = dma_link_range(page, page_offset, iova, dma_offset); + ret = dma_mapping_error(iova->dev, dma_addr); + if (ret) { + pr_err("dma_mapping_err %d dma_addr 0x%llx dma_offset %llu\n", + ret, dma_addr, dma_offset); + /* better way ? */ + dma_addr = 0; + } + return dma_addr; +} + +/** + * blk_rq_dma_map: block layer request to DMA mapping helper. + * + * @req : [in] request to be mapped + * @cb : [in] callback to be called for each bvec mapped bvec into + * underlaying driver. + * @cb_data : [in] callback data to be passed, privete to the underlaying + * driver. + * @iova : [in] iova to be used to create DMA mapping for this request's + * bvecs. + * Description: + * Iterates through bvec list and create dma mapping between each bvec page + * using @iova with dma_link_range(). Note that @iova needs to be allocated and + * pre-initialized using dma_alloc_iova() by the caller. After creating + * a mapping for each page, call into the callback function @cb provided by + * driver with mapped dma address for this bvec, offset into iova space, length + * of the mapped page, and bvec number that is mapped in this requets. Driver is + * responsible for using this dma address to complete the mapping of underlaying + * protocol specific data structure, such as NVMe PRPs or NVMe SGLs. This + * callback approach allows us to iterate bvec list only once to create bvec to + * DMA mapping & use that dma address in the driver to build the protocol + * specific data structure, essentially mapping one bvec page at a time to DMA + * address and use that DMA address to create underlaying protocol specific + * data structure. + * + * Caller needs to ensure @iova is initialized & allovated with using + * dma_alloc_iova(). 
+ */
+int blk_rq_dma_map(struct request *req, driver_map_cb cb, void *cb_data,
+		   struct dma_iova_attrs *iova)
+{
+	dma_addr_t curr_dma_offset = 0;
+	dma_addr_t prev_dma_addr = 0;
+	dma_addr_t dma_addr;
+	size_t prev_dma_len = 0;
+	struct req_iterator iter;
+	struct bio_vec bv;
+	int linked_cnt = 0;
+
+	rq_for_each_bvec(bv, req, iter) {
+		if (bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
+			curr_dma_offset = prev_dma_addr + prev_dma_len;
+
+			dma_addr = blk_dma_link_page(bv.bv_page, bv.bv_offset,
+						     iova, curr_dma_offset);
+			if (!dma_addr)
+				break;
+
+			cb(cb_data, linked_cnt, dma_addr, curr_dma_offset,
+			   bv.bv_len);
+
+			prev_dma_len = bv.bv_len;
+			prev_dma_addr = dma_addr;
+			linked_cnt++;
+		} else {
+			unsigned nbytes = bv.bv_len;
+			unsigned total = 0;
+			unsigned offset, len;
+
+			while (nbytes > 0) {
+				struct page *page = bv.bv_page;
+
+				offset = bv.bv_offset + total;
+				len = min(get_max_segment_size(&req->q->limits,
+							       page, offset),
+					  nbytes);
+
+				page += (offset >> PAGE_SHIFT);
+				offset &= ~PAGE_MASK;
+
+				curr_dma_offset = prev_dma_addr + prev_dma_len;
+
+				dma_addr = blk_dma_link_page(page, offset,
+							     iova,
+							     curr_dma_offset);
+				if (!dma_addr)
+					break;
+
+				cb(cb_data, linked_cnt, dma_addr,
+				   curr_dma_offset, len);
+
+				total += len;
+				nbytes -= len;
+
+				prev_dma_len = len;
+				prev_dma_addr = dma_addr;
+				linked_cnt++;
+			}
+		}
+	}
+	return linked_cnt;
+}
+EXPORT_SYMBOL_GPL(blk_rq_dma_map);
+
+/*
+ * Calculate total DMA length needed to satisfy this request.
+ */
+size_t blk_rq_get_dma_length(struct request *rq)
+{
+	struct request_queue *q = rq->q;
+	struct bio *bio = rq->bio;
+	unsigned int offset, len;
+	struct bvec_iter iter;
+	size_t dma_length = 0;
+	struct bio_vec bvec;
+
+	if (rq->rq_flags & RQF_SPECIAL_PAYLOAD)
+		return rq->special_vec.bv_len;
+
+	if (!rq->bio)
+		return 0;
+
+	for_each_bio(bio) {
+		bio_for_each_bvec(bvec, bio, iter) {
+			unsigned int nbytes = bvec.bv_len;
+			unsigned int total = 0;
+
+			if (bvec.bv_offset + bvec.bv_len <= PAGE_SIZE) {
+				dma_length += bvec.bv_len;
+				continue;
+			}
+
+			while (nbytes > 0) {
+				offset = bvec.bv_offset + total;
+				len = min(get_max_segment_size(&q->limits,
+							       bvec.bv_page,
+							       offset), nbytes);
+				total += len;
+				nbytes -= len;
+				dma_length += len;
+			}
+		}
+	}
+
+	return dma_length;
+}
+EXPORT_SYMBOL(blk_rq_get_dma_length);
+
 static inline unsigned int blk_rq_get_max_sectors(struct request *rq,
 						  sector_t offset)
 {
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 7a8150a5f051..80b9c7f2c3a0 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -8,6 +8,7 @@
 #include
 #include
 #include
+#include
 
 struct blk_mq_tags;
 struct blk_flush_queue;
@@ -1144,7 +1145,15 @@ static inline int blk_rq_map_sg(struct request_queue *q, struct request *rq,
 	return __blk_rq_map_sg(q, rq, sglist, &last_sg);
 }
 
+typedef void (*driver_map_cb)(void *cb_data, u32 cnt, dma_addr_t dma_addr,
+			      dma_addr_t offset, u32 len);
+
+int blk_rq_dma_map(struct request *req, driver_map_cb cb, void *cb_data,
+		   struct dma_iova_attrs *iova);
+
 void blk_dump_rq_flags(struct request *, char *);
+size_t blk_rq_get_dma_length(struct request *rq);
 
 #ifdef CONFIG_BLK_DEV_ZONED
 static inline unsigned int blk_rq_zone_no(struct request *rq)