
virtio-blk: Fix WARN_ON_ONCE in virtio_queue_rq()

Message ID 20220823145005.26356-1-suwan.kim027@gmail.com (mailing list archive)
State New, archived
Series virtio-blk: Fix WARN_ON_ONCE in virtio_queue_rq()

Commit Message

Suwan Kim Aug. 23, 2022, 2:50 p.m. UTC
If a request fails at virtio_queue_rqs(), it is inserted into the
requeue_list and passed to virtio_queue_rq(). blk_mq_start_request() can
then be called again at virtio_queue_rq() and trigger the WARN_ON_ONCE
shown in the trace below, because the request state was already set to
MQ_RQ_IN_FLIGHT in virtio_queue_rqs() despite the failure.
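
As a rough illustration (a simplified sketch, not the actual block
layer code; the real check lives in blk_mq_start_request() in
block/blk-mq.c and guards the timestamp with queue flags):

	static void blk_mq_start_request_sketch(struct request *rq)
	{
		/* A request that failed in virtio_queue_rqs() is already
		 * MQ_RQ_IN_FLIGHT, so this fires on the retry. */
		WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IDLE);

		rq->io_start_time_ns = ktime_get_ns();	/* I/O accounting */
		WRITE_ONCE(rq->state, MQ_RQ_IN_FLIGHT);
	}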

[    1.890468] ------------[ cut here ]------------
[    1.890776] WARNING: CPU: 2 PID: 122 at block/blk-mq.c:1143
blk_mq_start_request+0x8a/0xe0
[    1.891045] Modules linked in:
[    1.891250] CPU: 2 PID: 122 Comm: journal-offline Not tainted 5.19.0+ #44
[    1.891504] Hardware name: ChromiumOS crosvm, BIOS 0
[    1.891739] RIP: 0010:blk_mq_start_request+0x8a/0xe0
[    1.891961] Code: 12 80 74 22 48 8b 4b 10 8b 89 64 01 00 00 8b 53
20 83 fa ff 75 08 ba 00 00 00 80 0b 53 24 c1 e1 10 09 d1 89 48 34 5b
41 5e c3 <0f> 0b eb b8 65 8b 05 2b 39 b6 7e 89 c0 48 0f a3 05 39 77 5b
01 0f
[    1.892443] RSP: 0018:ffffc900002777b0 EFLAGS: 00010202
[    1.892673] RAX: 0000000000000000 RBX: ffff888004bc0000 RCX: 0000000000000000
[    1.892952] RDX: 0000000000000000 RSI: ffff888003d7c200 RDI: ffff888004bc0000
[    1.893228] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff888004bc0100
[    1.893506] R10: ffffffffffffffff R11: ffffffff8185ca10 R12: ffff888004bc0000
[    1.893797] R13: ffffc90000277900 R14: ffff888004ab2340 R15: ffff888003d86e00
[    1.894060] FS:  00007ffa143a4640(0000) GS:ffff88807dd00000(0000)
knlGS:0000000000000000
[    1.894412] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.894682] CR2: 00005648577d9088 CR3: 00000000053da004 CR4: 0000000000170ee0
[    1.894953] Call Trace:
[    1.895139]  <TASK>
[    1.895303]  virtblk_prep_rq+0x1e5/0x280
[    1.895509]  virtio_queue_rq+0x5c/0x310
[    1.895710]  ? virtqueue_add_sgs+0x95/0xb0
[    1.895905]  ? _raw_spin_unlock_irqrestore+0x16/0x30
[    1.896133]  ? virtio_queue_rqs+0x340/0x390
[    1.896453]  ? sbitmap_get+0xfa/0x220
[    1.896678]  __blk_mq_issue_directly+0x41/0x180
[    1.896906]  blk_mq_plug_issue_direct+0xd8/0x2c0
[    1.897115]  blk_mq_flush_plug_list+0x115/0x180
[    1.897342]  blk_add_rq_to_plug+0x51/0x130
[    1.897543]  blk_mq_submit_bio+0x3a1/0x570
[    1.897750]  submit_bio_noacct_nocheck+0x418/0x520
[    1.897985]  ? submit_bio_noacct+0x1e/0x260
[    1.897989]  ext4_bio_write_page+0x222/0x420
[    1.898000]  mpage_process_page_bufs+0x178/0x1c0
[    1.899451]  mpage_prepare_extent_to_map+0x2d2/0x440
[    1.899603]  ext4_writepages+0x495/0x1020
[    1.899733]  do_writepages+0xcb/0x220
[    1.899871]  ? __seccomp_filter+0x171/0x7e0
[    1.900006]  file_write_and_wait_range+0xcd/0xf0
[    1.900167]  ext4_sync_file+0x72/0x320
[    1.900308]  __x64_sys_fsync+0x66/0xa0
[    1.900449]  do_syscall_64+0x31/0x50
[    1.900595]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[    1.900747] RIP: 0033:0x7ffa16ec96ea
[    1.900883] Code: b8 4a 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3
48 83 ec 18 89 7c 24 0c e8 e3 02 f8 ff 8b 7c 24 0c 89 c2 b8 4a 00 00
00 0f 05 <48> 3d 00 f0 ff ff 77 36 89 d7 89 44 24 0c e8 43 03 f8 ff 8b
44 24
[    1.901302] RSP: 002b:00007ffa143a3ac0 EFLAGS: 00000293 ORIG_RAX:
000000000000004a
[    1.901499] RAX: ffffffffffffffda RBX: 0000560277ec6fe0 RCX: 00007ffa16ec96ea
[    1.901696] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000016
[    1.901884] RBP: 0000560277ec5910 R08: 0000000000000000 R09: 00007ffa143a4640
[    1.902082] R10: 00007ffa16e4d39e R11: 0000000000000293 R12: 00005602773f59e0
[    1.902459] R13: 0000000000000000 R14: 00007fffbfc007ff R15: 00007ffa13ba4000
[    1.902763]  </TASK>
[    1.902877] ---[ end trace 0000000000000000 ]---

This patch defers the call to blk_mq_start_request() until after
virtblk_add_req() succeeds within virtio_queue_rqs(). virtblk_add_req()
is the last preparation step before a request is submitted to the
virtqueue, so this ensures that blk_mq_start_request() is called only
once all preparation has finished.

In virtio_queue_rq(), blk_mq_start_request() is called before
virtblk_add_req() to avoid executing it inside the spinlock.
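
In outline (a sketch of the resulting ordering; the actual hunks are in
the patch at the bottom of this page):

	/* virtio_queue_rqs() batch path: start the request only once
	 * virtblk_add_req() has accepted it. */
	err = virtblk_add_req(vq->vq, vbr);
	if (err) {
		virtblk_unmap_data(req, vbr);
		virtblk_cleanup_cmd(req);
		rq_list_add(requeue_list, req);	/* still idle: safe to retry */
	} else {
		blk_mq_start_request(req);
	}

	/* virtio_queue_rq() path: start before taking the vq lock so the
	 * state update runs outside the spinlock. */
	blk_mq_start_request(req);
	spin_lock_irqsave(&vblk->vqs[qid].lock, flags);
	err = virtblk_add_req(vblk->vqs[qid].vq, vbr);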

Fixes: 0e9911fa768f ("virtio-blk: support mq_ops->queue_rqs()")
Reported-by: Alexandre Courbot <acourbot@chromium.org>
Tested-by: Alexandre Courbot <acourbot@chromium.org>
Signed-off-by: Suwan Kim <suwan.kim027@gmail.com>
---
 drivers/block/virtio_blk.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Christoph Hellwig Aug. 23, 2022, 4:39 p.m. UTC | #1
Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
Stefan Hajnoczi Aug. 23, 2022, 8:56 p.m. UTC | #2
On Tue, Aug 23, 2022 at 11:50:05PM +0900, Suwan Kim wrote:
> @@ -409,6 +409,8 @@ static bool virtblk_add_req_batch(struct virtio_blk_vq *vq,
>  			virtblk_unmap_data(req, vbr);
>  			virtblk_cleanup_cmd(req);
>  			rq_list_add(requeue_list, req);
> +		} else {
> +			blk_mq_start_request(req);
>  		}

The device may see new requests as soon as virtblk_add_req() is called
above. Therefore the device may complete the request before
blk_mq_start_request() is called.

rq->io_start_time_ns = ktime_get_ns() will then be captured after the
request was actually submitted.

I think blk_mq_start_request() needs to be called before
virtblk_add_req().
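
Sketched as a timeline (illustrative only, assuming the batch path as
posted):

	virtblk_add_req(req);		/* device may start executing req */
	/* ... device may even complete req here ... */
	blk_mq_start_request(req);	/* io_start_time_ns captured too late */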

Stefan
Suwan Kim Aug. 24, 2022, 1:16 p.m. UTC | #3
Hi Stefan,

On Wed, Aug 24, 2022 at 5:56 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Tue, Aug 23, 2022 at 11:50:05PM +0900, Suwan Kim wrote:
> > @@ -409,6 +409,8 @@ static bool virtblk_add_req_batch(struct virtio_blk_vq *vq,
> >                       virtblk_unmap_data(req, vbr);
> >                       virtblk_cleanup_cmd(req);
> >                       rq_list_add(requeue_list, req);
> > +             } else {
> > +                     blk_mq_start_request(req);
> >               }
>
> The device may see new requests as soon as virtblk_add_req() is called
> above. Therefore the device may complete the request before
> blk_mq_start_request() is called.
>
> rq->io_start_time_ns = ktime_get_ns() will be after the request was
> actually submitted.
>
> I think blk_mq_start_request() needs to be called before
> virtblk_add_req().
>

But if blk_mq_start_request() is called before virtblk_add_req()
and virtblk_add_req() fails, it can trigger WARN_ON_ONCE() at
virtio_queue_rq().

With regard to the race condition between virtblk_add_req() and
completion, I think the race cannot happen because virtblk_add_req()
holds the vq lock with irqs saved, and the completion side
(virtblk_done, virtblk_poll) also needs to acquire the vq lock.
Moreover, virtblk_done() runs in irq context, so it cannot be
executed until virtblk_add_req() releases the lock.
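
Roughly (a sketch with abbreviated names; the lock is the per-vq
spinlock in struct virtio_blk_vq):

	spin_lock_irqsave(&vq->lock, flags);	/* irqs off on this CPU */
	err = virtblk_add_req(vq->vq, vbr);	/* device may see req now */
	if (!err)
		blk_mq_start_request(req);	/* virtblk_done()/virtblk_poll()
						 * still block on vq->lock */
	spin_unlock_irqrestore(&vq->lock, flags);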

Regards,
Suwan Kim
Stefan Hajnoczi Aug. 24, 2022, 5:32 p.m. UTC | #4
On Wed, Aug 24, 2022 at 10:16:10PM +0900, Kim Suwan wrote:
> Hi Stefan,
> 
> On Wed, Aug 24, 2022 at 5:56 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > [...]
> 
> But if blk_mq_start_request() is called before virtblk_add_req()
> and virtblk_add_req() fails, it can trigger WARN_ON_ONCE() at
> virtio_queue_rq().
> 
> With regard to the race condition between virtblk_add_req() and
> completion, I think that the race condition can not happen because
> virtblk_add_req() holds vq lock with irq saving and completion side
> (virtblk_done, virtblk_poll) need to acquire the vq lock also.
> Moreover, virtblk_done() is irq context so I think it can not be
> executed until virtblk_add_req() releases the lock.

I agree there is no race condition regarding the ordering of
blk_mq_start_request() and request completion. The spinlock prevents
that and I wasn't concerned about that part.

The issue is that the timestamp will be garbage. If we capture the
timestamp during/after the request is executing, then the collected
statistics will be wrong.

Can you look for another solution that doesn't break the timestamp?

FWIW I see that the rq->state state machine allows returning to the idle
state once the request has been started: __blk_mq_requeue_request().

Stefan
Suwan Kim Aug. 25, 2022, 2:49 p.m. UTC | #5
On Thu, Aug 25, 2022 at 2:32 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Wed, Aug 24, 2022 at 10:16:10PM +0900, Kim Suwan wrote:
> > [...]
>
> I agree there is no race condition regarding the ordering of
> blk_mq_start_request() and request completion. The spinlock prevents
> that and I wasn't concerned about that part.
>
> The issue is that the timestamp will be garbage. If we capture the
> timestamp during/after the request is executing, then the collected
> statistics will be wrong.
>
> Can you look for another solution that doesn't break the timestamp?
>
> FWIW I see that the rq->state state machine allows returning to the idle
> state once the request has been started: __blk_mq_requeue_request().

I considered blk_mq_requeue_request() to handle error cases, but
I didn't use it because I think it can make error-path request
processing slower than requeuing a failed request to the plug list.

But there doesn't seem to be any other option that doesn't break
the timestamp.

As you said, I will use __blk_mq_requeue_request() and send a
new patch soon.

To Alexandre,

I will share new diff soon. Could you please test one more time?

Regards,
Suwan Kim
Alexandre Courbot Aug. 26, 2022, 1:41 a.m. UTC | #6
On Thu, Aug 25, 2022 at 11:50 PM Suwan Kim <suwan.kim027@gmail.com> wrote:
>
> On Thu, Aug 25, 2022 at 2:32 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > [...]
>
>
> I considered blk_mq_requeue_request() to handle error cases but
> I didn't use it because I think it can make the error path request
> processing slower than requeuing an error request to plug list again.
>
> But there doesn't seem to be any other option that doesn't break
> the timestamp.
>
> As you said, I will use __blk_mq_requeue_request() and send
> new patch soon.
>
> To Alexandre,
>
> I will share new diff soon. Could you please test one more time?

Absolutely! Thanks for looking into this.
Suwan Kim Aug. 29, 2022, 2:48 a.m. UTC | #7
On Fri, Aug 26, 2022 at 10:41:39AM +0900, Alexandre Courbot wrote:
> On Thu, Aug 25, 2022 at 11:50 PM Suwan Kim <suwan.kim027@gmail.com> wrote:
> > [...]
> 
> Absolutely! Thanks for looking into this.

Hi Alexandre,

Could you test this patch?
If it works, I will send a v2 patch.

Regards,
Suwan Kim

---

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 30255fcaf181..dd9a05174726 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -322,14 +322,14 @@ static blk_status_t virtblk_prep_rq(struct blk_mq_hw_ctx *hctx,
        if (unlikely(status))
                return status;

-       blk_mq_start_request(req);
-
        vbr->sg_table.nents = virtblk_map_data(hctx, req, vbr);
        if (unlikely(vbr->sg_table.nents < 0)) {
                virtblk_cleanup_cmd(req);
                return BLK_STS_RESOURCE;
        }

+       blk_mq_start_request(req);
+
        return BLK_STS_OK;
 }

@@ -391,8 +391,7 @@ static bool virtblk_prep_rq_batch(struct request *req)
 }

 static bool virtblk_add_req_batch(struct virtio_blk_vq *vq,
-                                       struct request **rqlist,
-                                       struct request **requeue_list)
+                                       struct request **rqlist)
 {
        unsigned long flags;
        int err;
@@ -408,7 +407,7 @@ static bool virtblk_add_req_batch(struct virtio_blk_vq *vq,
                if (err) {
                        virtblk_unmap_data(req, vbr);
                        virtblk_cleanup_cmd(req);
-                       rq_list_add(requeue_list, req);
+                       blk_mq_requeue_request(req, true);
                }
        }

@@ -436,7 +435,7 @@ static void virtio_queue_rqs(struct request **rqlist)

                if (!next || req->mq_hctx != next->mq_hctx) {
                        req->rq_next = NULL;
-                       kick = virtblk_add_req_batch(vq, rqlist, &requeue_list);
+                       kick = virtblk_add_req_batch(vq, rqlist);
                        if (kick)
                                virtqueue_notify(vq->vq);
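
With this diff, blk_mq_start_request() still runs before
virtblk_add_req() (it now sits after virtblk_map_data() in
virtblk_prep_rq()), so io_start_time_ns is captured before the device
can see the request, and a virtblk_add_req() failure is unwound with
blk_mq_requeue_request(), which legitimately returns the request to
the idle state.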
Alexandre Courbot Aug. 30, 2022, 8:23 a.m. UTC | #8
Hi Suwan,

On Mon, Aug 29, 2022 at 11:49 AM Suwan Kim <suwan.kim027@gmail.com> wrote:
>
> On Fri, Aug 26, 2022 at 10:41:39AM +0900, Alexandre Courbot wrote:
> > [...]
>
> Hi Alexandre,
>
> Could you test this patch?
> If it works, I will send a v2 patch.

This version is working fine for me!

Cheers,
Alex.

Patch

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 30255fcaf181..73a0620a7cff 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -322,8 +322,6 @@  static blk_status_t virtblk_prep_rq(struct blk_mq_hw_ctx *hctx,
 	if (unlikely(status))
 		return status;
 
-	blk_mq_start_request(req);
-
 	vbr->sg_table.nents = virtblk_map_data(hctx, req, vbr);
 	if (unlikely(vbr->sg_table.nents < 0)) {
 		virtblk_cleanup_cmd(req);
@@ -349,6 +347,8 @@  static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
 	if (unlikely(status))
 		return status;
 
+	blk_mq_start_request(req);
+
 	spin_lock_irqsave(&vblk->vqs[qid].lock, flags);
 	err = virtblk_add_req(vblk->vqs[qid].vq, vbr);
 	if (err) {
@@ -409,6 +409,8 @@  static bool virtblk_add_req_batch(struct virtio_blk_vq *vq,
 			virtblk_unmap_data(req, vbr);
 			virtblk_cleanup_cmd(req);
 			rq_list_add(requeue_list, req);
+		} else {
+			blk_mq_start_request(req);
 		}
 	}
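
(For reference, the patch above is the series as originally posted; the
revised approach using blk_mq_requeue_request(), developed in the
discussion, is shown in comment #7.)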