block: fix queue limits checks in blk_rq_map_user_bvec for real

Message ID: 20241025115818.54976-1-hch@lst.de
State: New
Series: block: fix queue limits checks in blk_rq_map_user_bvec for real

Commit Message

Christoph Hellwig Oct. 25, 2024, 11:58 a.m. UTC
blk_rq_map_user_bvec currently only has ad-hoc checks for queue limits,
and the last fix to it enabled valid NVMe I/O to pass, but also allowed
invalid I/O for drivers that set a max_segment_size or seg_boundary
limit.

Fix it once and for all by using the bio_split_rw_at helper from the I/O
path, which indicates if and where a bio would have to be split to
adhere to the queue limits; if it returns a positive value, turn
that into -EREMOTEIO to retry using the copy path.

Fixes: 2ff949441802 ("block: fix sanity checks in blk_rq_map_user_bvec")
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-map.c | 54 ++++++++++++++++---------------------------------
 1 file changed, 17 insertions(+), 37 deletions(-)
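
For context, the -EREMOTEIO convention is what lets the caller retry
through the copy (bounce) path. Below is a minimal sketch of how a
mapping path could consume that return value; map_copy_fallback() is a
hypothetical stand-in, not the exact upstream code:

/*
 * Sketch only: acting on -EREMOTEIO from blk_rq_map_user_bvec().
 */
static int map_user_iter(struct request *rq, struct iov_iter *iter)
{
	int ret = blk_rq_map_user_bvec(rq, iter);

	if (ret != -EREMOTEIO)
		return ret;	/* 0 on success, or a hard error */

	/*
	 * The bvec layout violates a queue limit (max_segment_size,
	 * seg_boundary, ...).  Copying the data into freshly
	 * allocated pages produces a layout that fits the limits.
	 */
	return map_copy_fallback(rq, iter);
}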

Comments

Uday Shankar Oct. 25, 2024, 6:31 p.m. UTC | #1
On Fri, Oct 25, 2024 at 01:58:11PM +0200, Christoph Hellwig wrote:
> blk_rq_map_user_bvec currently only has ad-hoc checks for queue limits,
> and the last fix to it enabled valid NVMe I/O to pass, but also allowed
> invalid I/O for drivers that set a max_segment_size or seg_boundary
> limit.
> 
> Fix it once and for all by using the bio_split_rw_at helper from the I/O
> path, which indicates if and where a bio would have to be split to
> adhere to the queue limits; if it returns a positive value, turn
> that into -EREMOTEIO to retry using the copy path.
> 
> Fixes: 2ff949441802 ("block: fix sanity checks in blk_rq_map_user_bvec")
> Signed-off-by: Christoph Hellwig <hch@lst.de>

This passes my test for NVMe passthrough I/O using a strict subset of a
preregistered buffer (see 2ff949441802 for details).

Tested-by: Uday Shankar <ushankar@purestorage.com>
Keith Busch Oct. 25, 2024, 6:52 p.m. UTC | #2
On Fri, Oct 25, 2024 at 01:58:11PM +0200, Christoph Hellwig wrote:
> blk_rq_map_user_bvec currently only has ad-hoc checks for queue limits,
> and the last fix to it enabled valid NVMe I/O to pass, but also allowed
> invalid I/O for drivers that set a max_segment_size or seg_boundary
> limit.
> 
> Fix it once and for all by using the bio_split_rw_at helper from the I/O
> path, which indicates if and where a bio would have to be split to
> adhere to the queue limits; if it returns a positive value, turn
> that into -EREMOTEIO to retry using the copy path.

Nice cleanup.

Reviewed-by: Keith Busch <kbusch@kernel.org>
John Garry Oct. 25, 2024, 8:43 p.m. UTC | #3
On 25/10/2024 12:58, Christoph Hellwig wrote:
>   
> +	/* check that the data layout matches the hardware restrictions */
> +	ret = bio_split_rw_at(bio, lim, &nsegs, lim->max_hw_sectors);

eh, but doesn't bio_split_rw_at() accept bytes (and not a value in 
sectors) for max size?
Christoph Hellwig Oct. 28, 2024, 9:20 a.m. UTC | #4
On Fri, Oct 25, 2024 at 09:43:04PM +0100, John Garry wrote:
> On 25/10/2024 12:58, Christoph Hellwig wrote:
>>   +	/* check that the data layout matches the hardware restrictions */
>> +	ret = bio_split_rw_at(bio, lim, &nsegs, lim->max_hw_sectors);
>
> eh, but doesn't bio_split_rw_at() accept bytes (and not a value in sectors) 
> for max size?

Yes.  Thanks for the careful review, I've sent out a new version.

(and this really helped me find a bug I had been debugging in
another user of this helper :))
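
For reference, a hedged sketch of what the corrected hunk presumably
looks like in the follow-up version, converting the sector-based limit
to bytes (SECTOR_SHIFT being 9 for 512-byte sectors); this is inferred
from the discussion above, not quoted from v2:

	/* check that the data layout matches the hardware restrictions */
	ret = bio_split_rw_at(bio, lim, &nsegs,
			lim->max_hw_sectors << SECTOR_SHIFT);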

Patch

diff --git a/block/blk-map.c b/block/blk-map.c
index 6ef2ec1f7d78..d1dce1470054 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -561,55 +561,35 @@ EXPORT_SYMBOL(blk_rq_append_bio);
 /* Prepare bio for passthrough IO given ITER_BVEC iter */
 static int blk_rq_map_user_bvec(struct request *rq, const struct iov_iter *iter)
 {
-	struct request_queue *q = rq->q;
-	size_t nr_iter = iov_iter_count(iter);
-	size_t nr_segs = iter->nr_segs;
-	struct bio_vec *bvecs, *bvprvp = NULL;
-	const struct queue_limits *lim = &q->limits;
-	unsigned int nsegs = 0, bytes = 0;
+	const struct queue_limits *lim = &rq->q->limits;
+	unsigned int nsegs;
 	struct bio *bio;
-	size_t i;
+	int ret;
 
-	if (!nr_iter || (nr_iter >> SECTOR_SHIFT) > queue_max_hw_sectors(q))
-		return -EINVAL;
-	if (nr_segs > queue_max_segments(q))
+	if (!iov_iter_count(iter) ||
+	    (iov_iter_count(iter) >> SECTOR_SHIFT) > lim->max_hw_sectors)
 		return -EINVAL;
 
-	/* no iovecs to alloc, as we already have a BVEC iterator */
+	/* reuse the bvecs from the iterator instead of allocating new ones */
 	bio = blk_rq_map_bio_alloc(rq, 0, GFP_KERNEL);
-	if (bio == NULL)
+	if (!bio)
 		return -ENOMEM;
-
 	bio_iov_bvec_set(bio, (struct iov_iter *)iter);
-	blk_rq_bio_prep(rq, bio, nr_segs);
-
-	/* loop to perform a bunch of sanity checks */
-	bvecs = (struct bio_vec *)iter->bvec;
-	for (i = 0; i < nr_segs; i++) {
-		struct bio_vec *bv = &bvecs[i];
 
+	/* check that the data layout matches the hardware restrictions */
+	ret = bio_split_rw_at(bio, lim, &nsegs, lim->max_hw_sectors);
+	if (ret) {
 		/*
-		 * If the queue doesn't support SG gaps and adding this
-		 * offset would create a gap, fallback to copy.
+		 * If we would have to split the bio, try to copy.
 		 */
-		if (bvprvp && bvec_gap_to_prev(lim, bvprvp, bv->bv_offset)) {
-			blk_mq_map_bio_put(bio);
-			return -EREMOTEIO;
-		}
-		/* check full condition */
-		if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len)
-			goto put_bio;
-		if (bytes + bv->bv_len > nr_iter)
-			break;
-
-		nsegs++;
-		bytes += bv->bv_len;
-		bvprvp = bv;
+		if (ret > 0)
+			ret = -EREMOTEIO;
+		blk_mq_map_bio_put(bio);
+		return ret;
 	}
+
+	blk_rq_bio_prep(rq, bio, nsegs);
 	return 0;
-put_bio:
-	blk_mq_map_bio_put(bio);
-	return -EINVAL;
 }
 
 /**