diff mbox series

[2/3] block: mark bdev files as FMODE_NOWAIT if underlying device supports it

Message ID 20230509151910.183637-3-axboe@kernel.dk (mailing list archive)
State New, archived
Headers show
Series None | expand

Commit Message

Jens Axboe May 9, 2023, 3:19 p.m. UTC
We set this unconditionally, but it really should be dependent on if
the underlying device is nowait compliant.

Cc: linux-block@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/fops.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Christoph Hellwig May 10, 2023, 1:30 p.m. UTC | #1
On Tue, May 09, 2023 at 09:19:09AM -0600, Jens Axboe wrote:
> We set this unconditionally, but it really should be dependent on if
> the underlying device is nowait compliant.

Somehow I only see patch 2 of 3 of whatever series this is supposed to
be in my linux-block mbox, something is broken with your patch sending
script.

The change itself looks fine even standalone, though:

Reviewed-by: Christoph Hellwig <hch@lst.de>
Jens Axboe May 10, 2023, 3:32 p.m. UTC | #2
On 5/10/23 7:30 AM, Christoph Hellwig wrote:
> On Tue, May 09, 2023 at 09:19:09AM -0600, Jens Axboe wrote:
>> We set this unconditionally, but it really should be dependent on if
>> the underlying device is nowait compliant.
> 
> Somehow I only see patch 2 of 3 of whatever series this is supposed to
> be in my linux-block mbox, something is broken with your patch sending
> script.
> 
> The change itself looks fine even standalone, though:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Thanks - the 2/3 only is on purpose, 1/3 is a networking ditto and
3/3 is just io_uring now being able to delete some code. So this one
was supposed to be able to stand on its own, should've had the cover
letter for everyone obviously though.
Christoph Hellwig June 20, 2023, 6:18 a.m. UTC | #3
So it turns out this gets into the way of my planned cleanup to move
all the tatic FMODE_ flags out of this basically full field into a new
static one in file_operations.  Do you think it is ok to go back to
always claiming FMODE_NOWAIT for block devices and then just punt for
the drivers that don't support it like the patch below?

---
From 05a591ac066d9d2d57c4967bbd49c8bc63b04abf Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Tue, 20 Jun 2023 07:53:13 +0200
Subject: block: always set FMODE_NOWAIT

Block devices are the only file_operation that do not set FMODE_NOWAIT
unconditionall in ->open and thus get in the way of a planned cleanup to
move this flags into a static field in file_operations.   Switch to
always set FMODE_NOWAIT and just return -EAGAIN if it isn't actually
supported.  This just affects minor ->submit_bio based drivers as all
blk-mq drivers and the important remappers (dm, md, nvme-multipath)
support nowait I/O.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/fops.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/block/fops.c b/block/fops.c
index 555b1b9ecd2cb9..8068d0c85ae75b 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -505,7 +505,7 @@ static int blkdev_open(struct inode *inode, struct file *filp)
 	 * during an unstable branch.
 	 */
 	filp->f_flags |= O_LARGEFILE;
-	filp->f_mode |= FMODE_BUF_RASYNC;
+	filp->f_mode |= FMODE_BUF_RASYNC | FMODE_NOWAIT;
 
 	/*
 	 * Use the file private data to store the holder for exclusive openes.
@@ -519,9 +519,6 @@ static int blkdev_open(struct inode *inode, struct file *filp)
 	if (IS_ERR(bdev))
 		return PTR_ERR(bdev);
 
-	if (bdev_nowait(bdev))
-		filp->f_mode |= FMODE_NOWAIT;
-
 	filp->f_mapping = bdev->bd_inode->i_mapping;
 	filp->f_wb_err = filemap_sample_wb_err(filp->f_mapping);
 	return 0;
@@ -563,6 +560,9 @@ static ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	if ((iocb->ki_flags & (IOCB_NOWAIT | IOCB_DIRECT)) == IOCB_NOWAIT)
 		return -EOPNOTSUPP;
 
+	if (!bdev_nowait(bdev))
+		return -EAGAIN;
+
 	size -= iocb->ki_pos;
 	if (iov_iter_count(from) > size) {
 		shorted = iov_iter_count(from) - size;
@@ -585,6 +585,9 @@ static ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	ssize_t ret = 0;
 	size_t count;
 
+	if (!bdev_nowait(bdev))
+		return -EAGAIN;
+
 	if (unlikely(pos + iov_iter_count(to) > size)) {
 		if (pos >= size)
 			return 0;
Christoph Hellwig June 20, 2023, 8:22 a.m. UTC | #4
On Mon, Jun 19, 2023 at 11:18:55PM -0700, Christoph Hellwig wrote:
> So it turns out this gets into the way of my planned cleanup to move
> all the tatic FMODE_ flags out of this basically full field into a new
> static one in file_operations.  Do you think it is ok to go back to
> always claiming FMODE_NOWAIT for block devices and then just punt for
> the drivers that don't support it like the patch below?

Except that the version I posted if of course completly broken as my
testing rig told me.  This is the version that actually works:

---
From 7916df7434e1570978b9c81c65aa1ec1f3396b13 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Tue, 20 Jun 2023 07:53:13 +0200
Subject: block: always set FMODE_NOWAIT

Block devices are the only file_operations that do not set FMODE_NOWAIT
unconditionall in ->open and thus get in the way of a planned cleanup to
move this flags into a static field in file_operations.   Switch to
always set FMODE_NOWAIT and just return -EAGAIN if it isn't actually
supported.  This just affects minor ->submit_bio based drivers as all
blk-mq drivers and the important remappers (dm, md, nvme-multipath)
support nowait I/O.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/fops.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/block/fops.c b/block/fops.c
index 555b1b9ecd2cb9..58c2f65ae4a57e 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -505,7 +505,7 @@ static int blkdev_open(struct inode *inode, struct file *filp)
 	 * during an unstable branch.
 	 */
 	filp->f_flags |= O_LARGEFILE;
-	filp->f_mode |= FMODE_BUF_RASYNC;
+	filp->f_mode |= FMODE_BUF_RASYNC | FMODE_NOWAIT;
 
 	/*
 	 * Use the file private data to store the holder for exclusive openes.
@@ -519,9 +519,6 @@ static int blkdev_open(struct inode *inode, struct file *filp)
 	if (IS_ERR(bdev))
 		return PTR_ERR(bdev);
 
-	if (bdev_nowait(bdev))
-		filp->f_mode |= FMODE_NOWAIT;
-
 	filp->f_mapping = bdev->bd_inode->i_mapping;
 	filp->f_wb_err = filemap_sample_wb_err(filp->f_mapping);
 	return 0;
@@ -563,6 +560,9 @@ static ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	if ((iocb->ki_flags & (IOCB_NOWAIT | IOCB_DIRECT)) == IOCB_NOWAIT)
 		return -EOPNOTSUPP;
 
+	if ((iocb->ki_flags & IOCB_NOWAIT) && !bdev_nowait(bdev))
+		return -EAGAIN;
+
 	size -= iocb->ki_pos;
 	if (iov_iter_count(from) > size) {
 		shorted = iov_iter_count(from) - size;
@@ -585,6 +585,9 @@ static ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	ssize_t ret = 0;
 	size_t count;
 
+	if ((iocb->ki_flags & IOCB_NOWAIT) && !bdev_nowait(bdev))
+		return -EAGAIN;
+
 	if (unlikely(pos + iov_iter_count(to) > size)) {
 		if (pos >= size)
 			return 0;
Jens Axboe June 20, 2023, 1:24 p.m. UTC | #5
On 6/20/23 12:18?AM, Christoph Hellwig wrote:
> So it turns out this gets into the way of my planned cleanup to move
> all the tatic FMODE_ flags out of this basically full field into a new
> static one in file_operations.  Do you think it is ok to go back to
> always claiming FMODE_NOWAIT for block devices and then just punt for
> the drivers that don't support it like the patch below?

I think we need stronger justification than that, it's much nicer to
have it in the open path than doing the same check over and over for
each IO.

With your new proposed scheme, why can't the check and FMODE_NOWAIT set
still be in open?
Christoph Hellwig June 20, 2023, 1:29 p.m. UTC | #6
On Tue, Jun 20, 2023 at 07:24:56AM -0600, Jens Axboe wrote:
> I think we need stronger justification than that, it's much nicer to
> have it in the open path than doing the same check over and over for
> each IO.
> 
> With your new proposed scheme, why can't the check and FMODE_NOWAIT set
> still be in open?

Because I want to move the by now huge number of static flags out of
file->f_mode and into file->f_op.flags.  It's just that with this patch
this one flag isn't static anymore for block devices.  We could also
do two sets of file operations assuming we never allow run-time changes
for QUEUE_FLAG_NOWAIT.  If we care about optimizing fo async I/O on the
few drivers not supporting QUEUE_FLAG_NOWAIT that's probably the next
best thing.
Jens Axboe June 20, 2023, 2:56 p.m. UTC | #7
On 6/20/23 7:29?AM, Christoph Hellwig wrote:
> On Tue, Jun 20, 2023 at 07:24:56AM -0600, Jens Axboe wrote:
>> I think we need stronger justification than that, it's much nicer to
>> have it in the open path than doing the same check over and over for
>> each IO.
>>
>> With your new proposed scheme, why can't the check and FMODE_NOWAIT set
>> still be in open?
> 
> Because I want to move the by now huge number of static flags out of
> file->f_mode and into file->f_op.flags.  It's just that with this patch
> this one flag isn't static anymore for block devices.  We could also
> do two sets of file operations assuming we never allow run-time changes
> for QUEUE_FLAG_NOWAIT.  If we care about optimizing fo async I/O on the
> few drivers not supporting QUEUE_FLAG_NOWAIT that's probably the next
> best thing.

Doing two sets of fops makes sense to me, and then hopefully down the
line we can shave it down to 1 again once everything sets FMODE_NOWAIT.
diff mbox series

Patch

diff --git a/block/fops.c b/block/fops.c
index d2e6be4e3d1c..ab750e8a040f 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -481,7 +481,7 @@  static int blkdev_open(struct inode *inode, struct file *filp)
 	 * during an unstable branch.
 	 */
 	filp->f_flags |= O_LARGEFILE;
-	filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC;
+	filp->f_mode |= FMODE_BUF_RASYNC;
 
 	if (filp->f_flags & O_NDELAY)
 		filp->f_mode |= FMODE_NDELAY;
@@ -494,6 +494,9 @@  static int blkdev_open(struct inode *inode, struct file *filp)
 	if (IS_ERR(bdev))
 		return PTR_ERR(bdev);
 
+	if (bdev_nowait(bdev))
+		filp->f_mode |= FMODE_NOWAIT;
+
 	filp->private_data = bdev;
 	filp->f_mapping = bdev->bd_inode->i_mapping;
 	filp->f_wb_err = filemap_sample_wb_err(filp->f_mapping);