diff mbox series

[2/2] block: don't allow splitting of a REQ_NOWAIT bio

Message ID 20230104160938.62636-3-axboe@kernel.dk (mailing list archive)
State New, archived
Headers show
Series Don't allow REQ_NOWAIT for bio splitting | expand

Commit Message

Jens Axboe Jan. 4, 2023, 4:09 p.m. UTC
If we split a bio marked with REQ_NOWAIT, then we can trigger spurious
EAGAIN if constituent parts of that split bio end up failing request
allocations. Parts will complete just fine, but just a single failure
in one of the chained bios will yield an EAGAIN final result for the
parent bio.

Return EAGAIN early if we end up needing to split such a bio, which
allows for saner recovery handling.

Cc: stable@vger.kernel.org # 5.15+
Link: https://github.com/axboe/liburing/issues/766
Reported-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-merge.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Keith Busch Jan. 4, 2023, 7:13 p.m. UTC | #1
On Wed, Jan 04, 2023 at 09:09:38AM -0700, Jens Axboe wrote:
> If we split a bio marked with REQ_NOWAIT, then we can trigger spurious
> EAGAIN if constituent parts of that split bio end up failing request
> allocations. Parts will complete just fine, but just a single failure
> in one of the chained bios will yield an EAGAIN final result for the
> parent bio.
> 
> Return EAGAIN early if we end up needing to split such a bio, which
> allows for saner recovery handling.

We're losing some performance here for large-ish single depth IO with
nvme. We can get a little back by forcing to use the async worker
earlier in the dispatch instead of getting all the way to the bio
splitting, but the overhead to check for the condition (which is
arbitrary decision anyway since we don't know the queue limits at
io_uring prep time) mostly negates the gain.

It's probably fine, though, since you can still hit peak b/w with just a
little higher qdepth. This patch is a simple way to handle the problem,
so looks good to me.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Jens Axboe Jan. 4, 2023, 8:24 p.m. UTC | #2
On 1/4/23 12:13?PM, Keith Busch wrote:
> On Wed, Jan 04, 2023 at 09:09:38AM -0700, Jens Axboe wrote:
>> If we split a bio marked with REQ_NOWAIT, then we can trigger spurious
>> EAGAIN if constituent parts of that split bio end up failing request
>> allocations. Parts will complete just fine, but just a single failure
>> in one of the chained bios will yield an EAGAIN final result for the
>> parent bio.
>>
>> Return EAGAIN early if we end up needing to split such a bio, which
>> allows for saner recovery handling.
> 
> We're losing some performance here for large-ish single depth IO with
> nvme. We can get a little back by forcing to use the async worker
> earlier in the dispatch instead of getting all the way to the bio
> splitting, but the overhead to check for the condition (which is
> arbitrary decision anyway since we don't know the queue limits at
> io_uring prep time) mostly negates the gain.

Yes, it's not perfect - for perfection, we'd need to be able to either
arbitrarily retry parts of the split bio if we can't get a tag. Or
reserve tags for this request when doing the splits. Either one of those
would require extensive surgery to achieve. In my testing, the cost is
low enough that I think we can live with this for now.

> It's probably fine, though, since you can still hit peak b/w with just a
> little higher qdepth. This patch is a simple way to handle the problem,
> so looks good to me.
> 
> Reviewed-by: Keith Busch <kbusch@kernel.org>

Thanks!
Christoph Hellwig Jan. 10, 2023, 9:21 a.m. UTC | #3
On Wed, Jan 04, 2023 at 09:09:38AM -0700, Jens Axboe wrote:
>  split:
> +	/*
> +	 * We can't sanely support splitting for a REQ_NOWAIT bio. End it
> +	 * with EAGAIN if splitting is required and return an error pointer.
> +	 */
> +	if (bio->bi_opf & REQ_NOWAIT) {
> +		bio->bi_status = BLK_STS_AGAIN;
> +		bio_endio(bio);
> +		return ERR_PTR(-EAGAIN);
> +	}

Hmm.  Just completing the bio here seems a little dangerous in terms
of ownership.  What speaks against letting the caller do it?
Christoph Hellwig Jan. 17, 2023, 6:01 a.m. UTC | #4
And this is the other comment.

On Tue, Jan 10, 2023 at 01:21:24AM -0800, Christoph Hellwig wrote:
> On Wed, Jan 04, 2023 at 09:09:38AM -0700, Jens Axboe wrote:
> >  split:
> > +	/*
> > +	 * We can't sanely support splitting for a REQ_NOWAIT bio. End it
> > +	 * with EAGAIN if splitting is required and return an error pointer.
> > +	 */
> > +	if (bio->bi_opf & REQ_NOWAIT) {
> > +		bio->bi_status = BLK_STS_AGAIN;
> > +		bio_endio(bio);
> > +		return ERR_PTR(-EAGAIN);
> > +	}
> 
> Hmm.  Just completing the bio here seems a little dangerous in terms
> of ownership.  What speaks against letting the caller do it?
---end quoted text---
diff mbox series

Patch

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 071c5f8cf0cf..b7c193d67185 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -309,6 +309,16 @@  static struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim,
 	*segs = nsegs;
 	return NULL;
 split:
+	/*
+	 * We can't sanely support splitting for a REQ_NOWAIT bio. End it
+	 * with EAGAIN if splitting is required and return an error pointer.
+	 */
+	if (bio->bi_opf & REQ_NOWAIT) {
+		bio->bi_status = BLK_STS_AGAIN;
+		bio_endio(bio);
+		return ERR_PTR(-EAGAIN);
+	}
+
 	*segs = nsegs;
 
 	/*