Message ID | 20250311024144.1762333-3-shinichiro.kawasaki@wdc.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | block: nvme: fix blktests nvme/039 failure | expand |
On Tue, Mar 11, 2025 at 10:42 AM Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> wrote: > > Commit 1f47ed294a2b ("block: cleanup and fix batch completion adding > conditions") modified the evaluation criteria for the third argument, > 'ioerror', in the blk_mq_add_to_batch() function. Initially, the > function had checked if 'ioerror' equals zero. Following the commit, it > started checking for negative error values, with the presumption that > such values, for instance -EIO, would be passed in. > > However, blk_mq_add_to_batch() callers do not pass negative error > values. Instead, they pass status codes defined in various ways: > > - NVMe PCI and Apple drivers pass NVMe status code > - virtio_blk driver passes the virtblk request header status byte > - null_blk driver passes blk_status_t > > These codes are either zero or positive, therefore the revised check > fails to function as intended. Specifically, with the NVMe PCI driver, > this modification led to the failure of the blktests test case nvme/039. > In this test scenario, errors are artificially injected to the NVMe > driver, resulting in positive NVMe status codes passed to > blk_mq_add_to_batch(), which unexpectedly processes the failed I/O in a > batch. Hence the failure. > > To correct the ioerror check within blk_mq_add_to_batch(), make all > callers to uniformly pass the argument as blk_status_t. Modify the > callers to translate their specific status codes into blk_status_t. For > this translation, export the helper function nvme_error_status(). Adjust > blk_mq_add_to_batch() to translate blk_status_t back into the error > number for the appropriate check. > > Fixes: 1f47ed294a2b ("block: cleanup and fix batch completion adding conditions") > Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> > --- > drivers/block/null_blk/main.c | 2 +- > drivers/block/virtio_blk.c | 5 +++-- For virtio_blk: Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
On 3/11/25 03:41, Shin'ichiro Kawasaki wrote: > Commit 1f47ed294a2b ("block: cleanup and fix batch completion adding > conditions") modified the evaluation criteria for the third argument, > 'ioerror', in the blk_mq_add_to_batch() function. Initially, the > function had checked if 'ioerror' equals zero. Following the commit, it > started checking for negative error values, with the presumption that > such values, for instance -EIO, would be passed in. > > However, blk_mq_add_to_batch() callers do not pass negative error > values. Instead, they pass status codes defined in various ways: > > - NVMe PCI and Apple drivers pass NVMe status code > - virtio_blk driver passes the virtblk request header status byte > - null_blk driver passes blk_status_t > > These codes are either zero or positive, therefore the revised check > fails to function as intended. Specifically, with the NVMe PCI driver, > this modification led to the failure of the blktests test case nvme/039. > In this test scenario, errors are artificially injected to the NVMe > driver, resulting in positive NVMe status codes passed to > blk_mq_add_to_batch(), which unexpectedly processes the failed I/O in a > batch. Hence the failure. > > To correct the ioerror check within blk_mq_add_to_batch(), make all > callers to uniformly pass the argument as blk_status_t. Modify the > callers to translate their specific status codes into blk_status_t. For > this translation, export the helper function nvme_error_status(). Adjust > blk_mq_add_to_batch() to translate blk_status_t back into the error > number for the appropriate check. > > Fixes: 1f47ed294a2b ("block: cleanup and fix batch completion adding conditions") > Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> > --- > drivers/block/null_blk/main.c | 2 +- > drivers/block/virtio_blk.c | 5 +++-- > drivers/nvme/host/apple.c | 3 ++- > drivers/nvme/host/core.c | 3 ++- > drivers/nvme/host/nvme.h | 1 + > drivers/nvme/host/pci.c | 5 +++-- > include/linux/blk-mq.h | 5 +++-- > 7 files changed, 15 insertions(+), 9 deletions(-) > > diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c > index d94ef37480bd..31f23f6a8ed0 100644 > --- a/drivers/block/null_blk/main.c > +++ b/drivers/block/null_blk/main.c > @@ -1549,7 +1549,7 @@ static int null_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob) > cmd = blk_mq_rq_to_pdu(req); > cmd->error = null_process_cmd(cmd, req_op(req), blk_rq_pos(req), > blk_rq_sectors(req)); > - if (!blk_mq_add_to_batch(req, iob, (__force int) cmd->error, > + if (!blk_mq_add_to_batch(req, iob, cmd->error, > blk_mq_end_request_batch)) > blk_mq_end_request(req, cmd->error); > nr++; > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c > index 6a61ec35f426..74f73cb8becd 100644 > --- a/drivers/block/virtio_blk.c > +++ b/drivers/block/virtio_blk.c > @@ -1207,11 +1207,12 @@ static int virtblk_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob) > > while ((vbr = virtqueue_get_buf(vq->vq, &len)) != NULL) { > struct request *req = blk_mq_rq_from_pdu(vbr); > + u8 status = virtblk_vbr_status(vbr); > > found++; > if (!blk_mq_complete_request_remote(req) && > - !blk_mq_add_to_batch(req, iob, virtblk_vbr_status(vbr), > - virtblk_complete_batch)) > + !blk_mq_add_to_batch(req, iob, virtblk_result(status), > + virtblk_complete_batch)) > virtblk_request_done(req); > } > > diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c > index a060f69558e7..ae859f772485 100644 > --- a/drivers/nvme/host/apple.c > +++ b/drivers/nvme/host/apple.c > @@ -599,7 +599,8 @@ static inline void apple_nvme_handle_cqe(struct apple_nvme_queue *q, > } > > if (!nvme_try_complete_req(req, cqe->status, cqe->result) && > - !blk_mq_add_to_batch(req, iob, nvme_req(req)->status, > + !blk_mq_add_to_batch(req, iob, > + nvme_error_status(nvme_req(req)->status), > apple_nvme_complete_batch)) > apple_nvme_complete_rq(req); > } > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c > index 8359d0aa0e44..725f2052bcb2 100644 > --- a/drivers/nvme/host/core.c > +++ b/drivers/nvme/host/core.c > @@ -274,7 +274,7 @@ void nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl) > nvme_put_ctrl(ctrl); > } > > -static blk_status_t nvme_error_status(u16 status) > +blk_status_t nvme_error_status(u16 status) > { > switch (status & NVME_SCT_SC_MASK) { > case NVME_SC_SUCCESS: > @@ -315,6 +315,7 @@ static blk_status_t nvme_error_status(u16 status) > return BLK_STS_IOERR; > } > } > +EXPORT_SYMBOL_GPL(nvme_error_status); > > static void nvme_retry_req(struct request *req) > { > diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h > index 7be92d07430e..4e166da4aa81 100644 > --- a/drivers/nvme/host/nvme.h > +++ b/drivers/nvme/host/nvme.h > @@ -904,6 +904,7 @@ void nvme_stop_keep_alive(struct nvme_ctrl *ctrl); > int nvme_reset_ctrl(struct nvme_ctrl *ctrl); > int nvme_reset_ctrl_sync(struct nvme_ctrl *ctrl); > int nvme_delete_ctrl(struct nvme_ctrl *ctrl); > +blk_status_t nvme_error_status(u16 status); > void nvme_queue_scan(struct nvme_ctrl *ctrl); > int nvme_get_log(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page, u8 lsp, u8 csi, > void *log, size_t size, u64 offset); > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c > index 640590b21728..873564efc346 100644 > --- a/drivers/nvme/host/pci.c > +++ b/drivers/nvme/host/pci.c > @@ -1130,8 +1130,9 @@ static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, > > trace_nvme_sq(req, cqe->sq_head, nvmeq->sq_tail); > if (!nvme_try_complete_req(req, cqe->status, cqe->result) && > - !blk_mq_add_to_batch(req, iob, nvme_req(req)->status, > - nvme_pci_complete_batch)) > + !blk_mq_add_to_batch(req, iob, > + nvme_error_status(nvme_req(req)->status), > + nvme_pci_complete_batch)) > nvme_pci_complete_rq(req); > } > > diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h > index 71f4f0cc3dac..9ba479a52ce7 100644 > --- a/include/linux/blk-mq.h > +++ b/include/linux/blk-mq.h > @@ -857,7 +857,8 @@ static inline bool blk_mq_is_reserved_rq(struct request *rq) > * ->end_io handler. > */ > static inline bool blk_mq_add_to_batch(struct request *req, > - struct io_comp_batch *iob, int ioerror, > + struct io_comp_batch *iob, > + blk_status_t status, > void (*complete)(struct io_comp_batch *)) > { > /* > @@ -874,7 +875,7 @@ static inline bool blk_mq_add_to_batch(struct request *req, > if (!blk_rq_is_passthrough(req)) { > if (req->end_io) > return false; > - if (ioerror < 0) > + if (blk_status_to_errno(status) < 0) > return false; > } > That feels a bit odd; errno will always be negative, so we really are just checking if it's non-zero. Maybe it's better to use a 'bool' value here, that would avoid all the rather pointless casting, too. Cheers, Hannes
On Tue, Mar 11, 2025 at 11:41:44AM +0900, Shin'ichiro Kawasaki wrote: > However, blk_mq_add_to_batch() callers do not pass negative error > values. Instead, they pass status codes defined in various ways: > > - NVMe PCI and Apple drivers pass NVMe status code > - virtio_blk driver passes the virtblk request header status byte > - null_blk driver passes blk_status_t The __force cast in null_blk should have been a big fat warning.. > To correct the ioerror check within blk_mq_add_to_batch(), make all > callers to uniformly pass the argument as blk_status_t. Modify the > callers to translate their specific status codes into blk_status_t. For > this translation, export the helper function nvme_error_status(). Adjust > blk_mq_add_to_batch() to translate blk_status_t back into the error > number for the appropriate check. This still looks a bit ugly because of all the conversions to a blk_status_t just to convert it back to a errno just to check for a non-zero value (blk_status_to_errno can't return a positive value). I suspect simply passing a "bool is_error" might actually be cleaner than that, combined with a proper kerneldoc comment for blk_mq_add_to_batch explaining how to set it?
On Mar 11, 2025 / 01:13, Christoph Hellwig wrote: > On Tue, Mar 11, 2025 at 11:41:44AM +0900, Shin'ichiro Kawasaki wrote: > > However, blk_mq_add_to_batch() callers do not pass negative error > > values. Instead, they pass status codes defined in various ways: > > > > - NVMe PCI and Apple drivers pass NVMe status code > > - virtio_blk driver passes the virtblk request header status byte > > - null_blk driver passes blk_status_t > > The __force cast in null_blk should have been a big fat warning.. > > > To correct the ioerror check within blk_mq_add_to_batch(), make all > > callers to uniformly pass the argument as blk_status_t. Modify the > > callers to translate their specific status codes into blk_status_t. For > > this translation, export the helper function nvme_error_status(). Adjust > > blk_mq_add_to_batch() to translate blk_status_t back into the error > > number for the appropriate check. > > This still looks a bit ugly because of all the conversions to a > blk_status_t just to convert it back to a errno just to check for > a non-zero value (blk_status_to_errno can't return a positive value). > > I suspect simply passing a "bool is_error" might actually be cleaner > than that, Thanks. Hannes made same comment. Wiil do so in v2. > combined with a proper kerneldoc comment for > blk_mq_add_to_batch explaining how to set it? Will add it in v2. I Will send out v2 soon for furhter review.
diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c index d94ef37480bd..31f23f6a8ed0 100644 --- a/drivers/block/null_blk/main.c +++ b/drivers/block/null_blk/main.c @@ -1549,7 +1549,7 @@ static int null_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob) cmd = blk_mq_rq_to_pdu(req); cmd->error = null_process_cmd(cmd, req_op(req), blk_rq_pos(req), blk_rq_sectors(req)); - if (!blk_mq_add_to_batch(req, iob, (__force int) cmd->error, + if (!blk_mq_add_to_batch(req, iob, cmd->error, blk_mq_end_request_batch)) blk_mq_end_request(req, cmd->error); nr++; diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index 6a61ec35f426..74f73cb8becd 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -1207,11 +1207,12 @@ static int virtblk_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob) while ((vbr = virtqueue_get_buf(vq->vq, &len)) != NULL) { struct request *req = blk_mq_rq_from_pdu(vbr); + u8 status = virtblk_vbr_status(vbr); found++; if (!blk_mq_complete_request_remote(req) && - !blk_mq_add_to_batch(req, iob, virtblk_vbr_status(vbr), - virtblk_complete_batch)) + !blk_mq_add_to_batch(req, iob, virtblk_result(status), + virtblk_complete_batch)) virtblk_request_done(req); } diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c index a060f69558e7..ae859f772485 100644 --- a/drivers/nvme/host/apple.c +++ b/drivers/nvme/host/apple.c @@ -599,7 +599,8 @@ static inline void apple_nvme_handle_cqe(struct apple_nvme_queue *q, } if (!nvme_try_complete_req(req, cqe->status, cqe->result) && - !blk_mq_add_to_batch(req, iob, nvme_req(req)->status, + !blk_mq_add_to_batch(req, iob, + nvme_error_status(nvme_req(req)->status), apple_nvme_complete_batch)) apple_nvme_complete_rq(req); } diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 8359d0aa0e44..725f2052bcb2 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -274,7 +274,7 @@ void nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl) nvme_put_ctrl(ctrl); } -static blk_status_t nvme_error_status(u16 status) +blk_status_t nvme_error_status(u16 status) { switch (status & NVME_SCT_SC_MASK) { case NVME_SC_SUCCESS: @@ -315,6 +315,7 @@ static blk_status_t nvme_error_status(u16 status) return BLK_STS_IOERR; } } +EXPORT_SYMBOL_GPL(nvme_error_status); static void nvme_retry_req(struct request *req) { diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 7be92d07430e..4e166da4aa81 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -904,6 +904,7 @@ void nvme_stop_keep_alive(struct nvme_ctrl *ctrl); int nvme_reset_ctrl(struct nvme_ctrl *ctrl); int nvme_reset_ctrl_sync(struct nvme_ctrl *ctrl); int nvme_delete_ctrl(struct nvme_ctrl *ctrl); +blk_status_t nvme_error_status(u16 status); void nvme_queue_scan(struct nvme_ctrl *ctrl); int nvme_get_log(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page, u8 lsp, u8 csi, void *log, size_t size, u64 offset); diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 640590b21728..873564efc346 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -1130,8 +1130,9 @@ static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, trace_nvme_sq(req, cqe->sq_head, nvmeq->sq_tail); if (!nvme_try_complete_req(req, cqe->status, cqe->result) && - !blk_mq_add_to_batch(req, iob, nvme_req(req)->status, - nvme_pci_complete_batch)) + !blk_mq_add_to_batch(req, iob, + nvme_error_status(nvme_req(req)->status), + nvme_pci_complete_batch)) nvme_pci_complete_rq(req); } diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 71f4f0cc3dac..9ba479a52ce7 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -857,7 +857,8 @@ static inline bool blk_mq_is_reserved_rq(struct request *rq) * ->end_io handler. */ static inline bool blk_mq_add_to_batch(struct request *req, - struct io_comp_batch *iob, int ioerror, + struct io_comp_batch *iob, + blk_status_t status, void (*complete)(struct io_comp_batch *)) { /* @@ -874,7 +875,7 @@ static inline bool blk_mq_add_to_batch(struct request *req, if (!blk_rq_is_passthrough(req)) { if (req->end_io) return false; - if (ioerror < 0) + if (blk_status_to_errno(status) < 0) return false; }
Commit 1f47ed294a2b ("block: cleanup and fix batch completion adding conditions") modified the evaluation criteria for the third argument, 'ioerror', in the blk_mq_add_to_batch() function. Initially, the function had checked if 'ioerror' equals zero. Following the commit, it started checking for negative error values, with the presumption that such values, for instance -EIO, would be passed in. However, blk_mq_add_to_batch() callers do not pass negative error values. Instead, they pass status codes defined in various ways: - NVMe PCI and Apple drivers pass NVMe status code - virtio_blk driver passes the virtblk request header status byte - null_blk driver passes blk_status_t These codes are either zero or positive, therefore the revised check fails to function as intended. Specifically, with the NVMe PCI driver, this modification led to the failure of the blktests test case nvme/039. In this test scenario, errors are artificially injected to the NVMe driver, resulting in positive NVMe status codes passed to blk_mq_add_to_batch(), which unexpectedly processes the failed I/O in a batch. Hence the failure. To correct the ioerror check within blk_mq_add_to_batch(), make all callers to uniformly pass the argument as blk_status_t. Modify the callers to translate their specific status codes into blk_status_t. For this translation, export the helper function nvme_error_status(). Adjust blk_mq_add_to_batch() to translate blk_status_t back into the error number for the appropriate check. Fixes: 1f47ed294a2b ("block: cleanup and fix batch completion adding conditions") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> --- drivers/block/null_blk/main.c | 2 +- drivers/block/virtio_blk.c | 5 +++-- drivers/nvme/host/apple.c | 3 ++- drivers/nvme/host/core.c | 3 ++- drivers/nvme/host/nvme.h | 1 + drivers/nvme/host/pci.c | 5 +++-- include/linux/blk-mq.h | 5 +++-- 7 files changed, 15 insertions(+), 9 deletions(-)