diff mbox series

[v4,1/7] block: Prevent the use of REQ_FUA with read operations

Message ID 20221031022642.352794-2-damien.lemoal@opensource.wdc.com (mailing list archive)
State New, archived
Headers show
Series Improve libata support for FUA | expand

Commit Message

Damien Le Moal Oct. 31, 2022, 2:26 a.m. UTC
For block devices that do not support FUA, the blk-flush machinery using
preflush/postflush (synchronize cache) does not enforce media access on
the device side for a REQ_FUA read. Furthermore, not all block devices
support FUA for read operations (e.g. ATA devices with NCQ support
turned off). Finally, while all the blk-flush.c code is clearly intended
at processing FUA writes, there is no explicit checks verifying that the
issued request is a write.

Add a check at the beginning of blk_insert_flush() to ensure that any
REQ_FUA read request is failed, reporting "not supported" to the user.

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 block/blk-flush.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Christoph Hellwig Nov. 1, 2022, 3:09 p.m. UTC | #1
On Mon, Oct 31, 2022 at 11:26:36AM +0900, Damien Le Moal wrote:
> +	/*
> +	 * REQ_FUA does not apply to read requests because:
> +	 * - There is no way to reliably force media access for read operations
> +	 *   with a block device that does not support FUA.
> +	 * - Not all block devices support FUA for read operations (e.g. ATA
> +	 *   devices with NCQ support turned off).
> +	 */
> +	if (!op_is_write(rq->cmd_flags) && (rq->cmd_flags & REQ_FUA)) {
> +		blk_mq_end_request(rq, BLK_STS_NOTSUPP);

How could this even happen?  If we want a debug check,  I think it
should be in submit_bio and a WARN_ON_ONCE.
Damien Le Moal Nov. 1, 2022, 10:05 p.m. UTC | #2
On 11/2/22 00:09, Christoph Hellwig wrote:
> On Mon, Oct 31, 2022 at 11:26:36AM +0900, Damien Le Moal wrote:
>> +	/*
>> +	 * REQ_FUA does not apply to read requests because:
>> +	 * - There is no way to reliably force media access for read operations
>> +	 *   with a block device that does not support FUA.
>> +	 * - Not all block devices support FUA for read operations (e.g. ATA
>> +	 *   devices with NCQ support turned off).
>> +	 */
>> +	if (!op_is_write(rq->cmd_flags) && (rq->cmd_flags & REQ_FUA)) {
>> +		blk_mq_end_request(rq, BLK_STS_NOTSUPP);
> 
> How could this even happen?  If we want a debug check,  I think it
> should be in submit_bio and a WARN_ON_ONCE.

I have not found any code that issues a FUA read. So I do not think this
can happen at all currently. The check is about making sure that it
*never* happens.

I thought of having the check higher up in the submit path but I wanted to
avoid adding yet another check in the very hot path. But if you are OK
with that, I will move it.
Christoph Hellwig Nov. 2, 2022, 7:09 a.m. UTC | #3
On Wed, Nov 02, 2022 at 07:05:35AM +0900, Damien Le Moal wrote:
> >> +	if (!op_is_write(rq->cmd_flags) && (rq->cmd_flags & REQ_FUA)) {
> >> +		blk_mq_end_request(rq, BLK_STS_NOTSUPP);
> > 
> > How could this even happen?  If we want a debug check,  I think it
> > should be in submit_bio and a WARN_ON_ONCE.
> 
> I have not found any code that issues a FUA read. So I do not think this
> can happen at all currently. The check is about making sure that it
> *never* happens.
> 
> I thought of having the check higher up in the submit path but I wanted to
> avoid adding yet another check in the very hot path. But if you are OK
> with that, I will move it.

I'd do something like this:

---
From 96847cce848938d1ee368e609ccb28a19854fba3 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Wed, 2 Nov 2022 08:05:41 +0100
Subject: block: add a sanity check for non-write flush/fua bios

Check that the PREFUSH and FUA flags are only set on write bios,
given that the flush state machine expects that.

Reported-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-core.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index e9e2bf15cd909..4e2b01a53c6ab 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -720,12 +720,15 @@ void submit_bio_noacct(struct bio *bio)
 	 * Filter flush bio's early so that bio based drivers without flush
 	 * support don't have to worry about them.
 	 */
-	if (op_is_flush(bio->bi_opf) &&
-	    !test_bit(QUEUE_FLAG_WC, &q->queue_flags)) {
-		bio->bi_opf &= ~(REQ_PREFLUSH | REQ_FUA);
-		if (!bio_sectors(bio)) {
-			status = BLK_STS_OK;
+	if (op_is_flush(bio->bi_opf)) {
+		if (WARN_ON_ONCE(bio_op(bio) != REQ_OP_WRITE))
 			goto end_io;
+		if (!test_bit(QUEUE_FLAG_WC, &q->queue_flags)) {
+			bio->bi_opf &= ~(REQ_PREFLUSH | REQ_FUA);
+			if (!bio_sectors(bio)) {
+				status = BLK_STS_OK;
+				goto end_io;
+			}
 		}
 	}
Damien Le Moal Nov. 2, 2022, 7:27 a.m. UTC | #4
On 2022/11/02 16:09, Christoph Hellwig wrote:
> On Wed, Nov 02, 2022 at 07:05:35AM +0900, Damien Le Moal wrote:
>>>> +	if (!op_is_write(rq->cmd_flags) && (rq->cmd_flags & REQ_FUA)) {
>>>> +		blk_mq_end_request(rq, BLK_STS_NOTSUPP);
>>>
>>> How could this even happen?  If we want a debug check,  I think it
>>> should be in submit_bio and a WARN_ON_ONCE.
>>
>> I have not found any code that issues a FUA read. So I do not think this
>> can happen at all currently. The check is about making sure that it
>> *never* happens.
>>
>> I thought of having the check higher up in the submit path but I wanted to
>> avoid adding yet another check in the very hot path. But if you are OK
>> with that, I will move it.
> 
> I'd do something like this:
> 
> ---
> From 96847cce848938d1ee368e609ccb28a19854fba3 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@lst.de>
> Date: Wed, 2 Nov 2022 08:05:41 +0100
> Subject: block: add a sanity check for non-write flush/fua bios
> 
> Check that the PREFUSH and FUA flags are only set on write bios,
> given that the flush state machine expects that.
> 
> Reported-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/blk-core.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index e9e2bf15cd909..4e2b01a53c6ab 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -720,12 +720,15 @@ void submit_bio_noacct(struct bio *bio)
>  	 * Filter flush bio's early so that bio based drivers without flush
>  	 * support don't have to worry about them.
>  	 */
> -	if (op_is_flush(bio->bi_opf) &&
> -	    !test_bit(QUEUE_FLAG_WC, &q->queue_flags)) {
> -		bio->bi_opf &= ~(REQ_PREFLUSH | REQ_FUA);
> -		if (!bio_sectors(bio)) {
> -			status = BLK_STS_OK;
> +	if (op_is_flush(bio->bi_opf)) {
> +		if (WARN_ON_ONCE(bio_op(bio) != REQ_OP_WRITE))
>  			goto end_io;
> +		if (!test_bit(QUEUE_FLAG_WC, &q->queue_flags)) {
> +			bio->bi_opf &= ~(REQ_PREFLUSH | REQ_FUA);
> +			if (!bio_sectors(bio)) {
> +				status = BLK_STS_OK;
> +				goto end_io;
> +			}
>  		}
>  	}
>  

Indeed looks nicer. I will send a v5 with this.
Jens Axboe Nov. 2, 2022, 3:17 p.m. UTC | #5
On 11/2/22 1:09 AM, Christoph Hellwig wrote:
> On Wed, Nov 02, 2022 at 07:05:35AM +0900, Damien Le Moal wrote:
>>>> +	if (!op_is_write(rq->cmd_flags) && (rq->cmd_flags & REQ_FUA)) {
>>>> +		blk_mq_end_request(rq, BLK_STS_NOTSUPP);
>>>
>>> How could this even happen?  If we want a debug check,  I think it
>>> should be in submit_bio and a WARN_ON_ONCE.
>>
>> I have not found any code that issues a FUA read. So I do not think this
>> can happen at all currently. The check is about making sure that it
>> *never* happens.
>>
>> I thought of having the check higher up in the submit path but I wanted to
>> avoid adding yet another check in the very hot path. But if you are OK
>> with that, I will move it.
> 
> I'd do something like this:

This looks fine, but if we're never expecting this to happen, I do think
it should just go into libata instead as that's the only user that
cares about it. Yes, that'll lose the backtrace for who submitted it
potentially, but you can debug it pretty easily at that point if you
run into it.
Damien Le Moal Nov. 2, 2022, 10:28 p.m. UTC | #6
On 11/3/22 00:17, Jens Axboe wrote:
> On 11/2/22 1:09 AM, Christoph Hellwig wrote:
>> On Wed, Nov 02, 2022 at 07:05:35AM +0900, Damien Le Moal wrote:
>>>>> +	if (!op_is_write(rq->cmd_flags) && (rq->cmd_flags & REQ_FUA)) {
>>>>> +		blk_mq_end_request(rq, BLK_STS_NOTSUPP);
>>>>
>>>> How could this even happen?  If we want a debug check,  I think it
>>>> should be in submit_bio and a WARN_ON_ONCE.
>>>
>>> I have not found any code that issues a FUA read. So I do not think this
>>> can happen at all currently. The check is about making sure that it
>>> *never* happens.
>>>
>>> I thought of having the check higher up in the submit path but I wanted to
>>> avoid adding yet another check in the very hot path. But if you are OK
>>> with that, I will move it.
>>
>> I'd do something like this:
> 
> This looks fine, but if we're never expecting this to happen, I do think
> it should just go into libata instead as that's the only user that
> cares about it. Yes, that'll lose the backtrace for who submitted it
> potentially, but you can debug it pretty easily at that point if you
> run into it.

I had the check in libata initially but Hannes suggested moving it to the
block layer to have the check valid for all block device types. SCSI does
support FUA reads and we would not be catching these with SAS devices.

Will move this back to libata then.
Christoph Hellwig Nov. 3, 2022, 7:49 a.m. UTC | #7
On Wed, Nov 02, 2022 at 09:17:54AM -0600, Jens Axboe wrote:
> 
> This looks fine, but if we're never expecting this to happen, I do think
> it should just go into libata instead as that's the only user that
> cares about it. Yes, that'll lose the backtrace for who submitted it
> potentially, but you can debug it pretty easily at that point if you
> run into it.

FUA and PREFLUSH are bits only defined for writes.  libata might be the
first thing blowing up, but it really is a block layer constraint.
So validity checking what is being sent to the block layer at the
highest possible lyer is a good thing to ensure we don't get us in
trouble by someone accidentally sending one down or even expecting it
to work.  Especially as at least SCSI actually defines semantics for FUA
on reads, but they are completely bogus and useless.
Jens Axboe Nov. 5, 2022, 3:32 p.m. UTC | #8
On 11/3/22 1:49 AM, Christoph Hellwig wrote:
> On Wed, Nov 02, 2022 at 09:17:54AM -0600, Jens Axboe wrote:
>>
>> This looks fine, but if we're never expecting this to happen, I do think
>> it should just go into libata instead as that's the only user that
>> cares about it. Yes, that'll lose the backtrace for who submitted it
>> potentially, but you can debug it pretty easily at that point if you
>> run into it.
> 
> FUA and PREFLUSH are bits only defined for writes.  libata might be the
> first thing blowing up, but it really is a block layer constraint.
> So validity checking what is being sent to the block layer at the
> highest possible lyer is a good thing to ensure we don't get us in
> trouble by someone accidentally sending one down or even expecting it
> to work.  Especially as at least SCSI actually defines semantics for FUA
> on reads, but they are completely bogus and useless.

OK, fair enough. I guess we can stick it in the block layer.
diff mbox series

Patch

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 53202eff545e..4a2693c7535b 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -397,6 +397,18 @@  void blk_insert_flush(struct request *rq)
 	unsigned int policy = blk_flush_policy(fflags, rq);
 	struct blk_flush_queue *fq = blk_get_flush_queue(q, rq->mq_ctx);
 
+	/*
+	 * REQ_FUA does not apply to read requests because:
+	 * - There is no way to reliably force media access for read operations
+	 *   with a block device that does not support FUA.
+	 * - Not all block devices support FUA for read operations (e.g. ATA
+	 *   devices with NCQ support turned off).
+	 */
+	if (!op_is_write(rq->cmd_flags) && (rq->cmd_flags & REQ_FUA)) {
+		blk_mq_end_request(rq, BLK_STS_NOTSUPP);
+		return;
+	}
+
 	/*
 	 * @policy now records what operations need to be done.  Adjust
 	 * REQ_PREFLUSH and FUA for the driver.