Message ID | 20210728103518.1221195-1-f.ebner@proxmox.com (mailing list archive) |
---|---|
State | New, archived |
Series | block/io_uring: resubmit when result is -EAGAIN
Cc'ing the maintainers for you. See
https://wiki.qemu.org/Contribute/SubmitAPatch#CC_the_relevant_maintainer

$ ./scripts/get_maintainer.pl -f block/io_uring.c
Aarushi Mehta <mehta.aaru20@gmail.com> (maintainer:Linux io_uring)
Julia Suvorova <jusual@redhat.com> (maintainer:Linux io_uring)
Stefan Hajnoczi <stefanha@redhat.com> (maintainer:Linux io_uring)
Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core)
Max Reitz <mreitz@redhat.com> (supporter:Block layer core)
qemu-block@nongnu.org (open list:Linux io_uring)
qemu-devel@nongnu.org (open list:All patches CC here)

Also Cc'ing Stefano for commit b4e44c9944e ("io_uring: retry
io_uring_submit() if it fails with errno=EINTR").
(Stefano, you might want to add yourself an R: tag in MAINTAINERS.)

On 7/28/21 12:35 PM, Fabian Ebner wrote:
> Quoting from [0]:
>
> Some setups, like SCSI, can throw spurious -EAGAIN off the softirq
> completion path. Normally we expect this to happen inline as part
> of submission, but apparently SCSI has a weird corner case where it
> can happen as part of normal completions.
>
> Host kernels without patch [0] can panic when this happens [1], and
> resubmitting makes the panic more likely. On the other hand, for
> kernels with patch [0], resubmitting ensures that a block job is not
> aborted just because of such spurious errors.
>
> [0]: https://lore.kernel.org/io-uring/20210727165811.284510-3-axboe@kernel.dk/T/#u
>
> [1]:
> #9  [ffffb732000c8b70] asm_exc_page_fault at ffffffffa4800ade
> #10 [ffffb732000c8bf8] io_prep_async_work at ffffffffa3d89c16
> #11 [ffffb732000c8c50] io_rw_reissue at ffffffffa3d8b2e1
> #12 [ffffb732000c8c78] io_complete_rw at ffffffffa3d8baa8
> #13 [ffffb732000c8c98] blkdev_bio_end_io at ffffffffa3d62a80
> #14 [ffffb732000c8cc8] bio_endio at ffffffffa3f4e800
> #15 [ffffb732000c8ce8] dec_pending at ffffffffa432f854
> #16 [ffffb732000c8d30] clone_endio at ffffffffa433170c
> #17 [ffffb732000c8d70] bio_endio at ffffffffa3f4e800
> #18 [ffffb732000c8d90] blk_update_request at ffffffffa3f53a37
> #19 [ffffb732000c8dd0] scsi_end_request at ffffffffa4233a5c
> #20 [ffffb732000c8e08] scsi_io_completion at ffffffffa423432c
> #21 [ffffb732000c8e58] scsi_finish_command at ffffffffa422c527
> #22 [ffffb732000c8e88] scsi_softirq_done at ffffffffa42341e4
>
> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
> ---
>
> I'm new to this code and io_uring, so I don't know what all the
> implications are, but retrying upon EAGAIN does not sound like
> a bad thing to my inexperienced ears.
>
> Some more context, leading up to this patch:
>
> We had some users reporting issues after we switched to using io_uring
> by default. Namely, kernel panics [2] for some, and failing block jobs
> [3] (with a custom backup mechanism we implemented on top of QEMU's
> block layer) for others.
>
> I had luck and managed to reproduce the issue, and it was a failed
> block job about half of the time and a kernel panic the other half.
> When using a host kernel with [0], it's a failed block job all the
> time, and this patch attempts to fix that, by resubmitting instead
> of bubbling up the error.
>
> [2]: https://forum.proxmox.com/threads/kernel-panic-whole-server-crashes-about-every-day.91803/post-404382
> [3]: https://forum.proxmox.com/threads/backup-job-failed-with-err-11-on-2-of-6-vms.92568/
>
> block/io_uring.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/io_uring.c b/block/io_uring.c
> index 00a3ee9fb8..77d162cb24 100644
> --- a/block/io_uring.c
> +++ b/block/io_uring.c
> @@ -165,7 +165,7 @@ static void luring_process_completions(LuringState *s)
>          total_bytes = ret + luringcb->total_read;
>
>          if (ret < 0) {
> -            if (ret == -EINTR) {
> +            if (ret == -EINTR || ret == -EAGAIN) {
>                  luring_resubmit(s, luringcb);
>                  continue;
>              }
>

On Wed, Jul 28, 2021 at 02:09:40PM +0200, Philippe Mathieu-Daudé wrote:
>Cc'ing the maintainers for you. See
>https://wiki.qemu.org/Contribute/SubmitAPatch#CC_the_relevant_maintainer
>
>$ ./scripts/get_maintainer.pl -f block/io_uring.c
>Aarushi Mehta <mehta.aaru20@gmail.com> (maintainer:Linux io_uring)
>Julia Suvorova <jusual@redhat.com> (maintainer:Linux io_uring)
>Stefan Hajnoczi <stefanha@redhat.com> (maintainer:Linux io_uring)
>Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core)
>Max Reitz <mreitz@redhat.com> (supporter:Block layer core)
>qemu-block@nongnu.org (open list:Linux io_uring)
>qemu-devel@nongnu.org (open list:All patches CC here)
>
>Also Cc'ing Stefano for commit b4e44c9944e ("io_uring: retry
>io_uring_submit() if it fails with errno=EINTR").

Thanks Phil!

>(Stefano, you might want to add yourself an R: tag in MAINTAINERS.)

Yep, I'll send a patch for that.

>
>On 7/28/21 12:35 PM, Fabian Ebner wrote:
>> Quoting from [0]:
>>
>> Some setups, like SCSI, can throw spurious -EAGAIN off the softirq
>> completion path. Normally we expect this to happen inline as part
>> of submission, but apparently SCSI has a weird corner case where it
>> can happen as part of normal completions.
>>
>> Host kernels without patch [0] can panic when this happens [1], and
>> resubmitting makes the panic more likely. On the other hand, for
>> kernels with patch [0], resubmitting ensures that a block job is not
>> aborted just because of such spurious errors.
>>
>> [0]: https://lore.kernel.org/io-uring/20210727165811.284510-3-axboe@kernel.dk/T/#u
>>
>> [1]:
>> #9  [ffffb732000c8b70] asm_exc_page_fault at ffffffffa4800ade
>> #10 [ffffb732000c8bf8] io_prep_async_work at ffffffffa3d89c16
>> #11 [ffffb732000c8c50] io_rw_reissue at ffffffffa3d8b2e1
>> #12 [ffffb732000c8c78] io_complete_rw at ffffffffa3d8baa8
>> #13 [ffffb732000c8c98] blkdev_bio_end_io at ffffffffa3d62a80
>> #14 [ffffb732000c8cc8] bio_endio at ffffffffa3f4e800
>> #15 [ffffb732000c8ce8] dec_pending at ffffffffa432f854
>> #16 [ffffb732000c8d30] clone_endio at ffffffffa433170c
>> #17 [ffffb732000c8d70] bio_endio at ffffffffa3f4e800
>> #18 [ffffb732000c8d90] blk_update_request at ffffffffa3f53a37
>> #19 [ffffb732000c8dd0] scsi_end_request at ffffffffa4233a5c
>> #20 [ffffb732000c8e08] scsi_io_completion at ffffffffa423432c
>> #21 [ffffb732000c8e58] scsi_finish_command at ffffffffa422c527
>> #22 [ffffb732000c8e88] scsi_softirq_done at ffffffffa42341e4
>>
>> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
>> ---
>>
>> I'm new to this code and io_uring, so I don't know what all the
>> implications are, but retrying upon EAGAIN does not sound like
>> a bad thing to my inexperienced ears.

Yeah, that doesn't sound bad.
For kernels that don't have the patch applied, I don't think there's
much we can do about it, so this change seems okay to me:

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

On Wed, Jul 28, 2021 at 12:35:18PM +0200, Fabian Ebner wrote:
> Quoting from [0]:
>
> Some setups, like SCSI, can throw spurious -EAGAIN off the softirq
> completion path. Normally we expect this to happen inline as part
> of submission, but apparently SCSI has a weird corner case where it
> can happen as part of normal completions.
>
> Host kernels without patch [0] can panic when this happens [1], and
> resubmitting makes the panic more likely. On the other hand, for
> kernels with patch [0], resubmitting ensures that a block job is not
> aborted just because of such spurious errors.
>
> [0]: https://lore.kernel.org/io-uring/20210727165811.284510-3-axboe@kernel.dk/T/#u
>
> [1]:
> #9  [ffffb732000c8b70] asm_exc_page_fault at ffffffffa4800ade
> #10 [ffffb732000c8bf8] io_prep_async_work at ffffffffa3d89c16
> #11 [ffffb732000c8c50] io_rw_reissue at ffffffffa3d8b2e1
> #12 [ffffb732000c8c78] io_complete_rw at ffffffffa3d8baa8
> #13 [ffffb732000c8c98] blkdev_bio_end_io at ffffffffa3d62a80
> #14 [ffffb732000c8cc8] bio_endio at ffffffffa3f4e800
> #15 [ffffb732000c8ce8] dec_pending at ffffffffa432f854
> #16 [ffffb732000c8d30] clone_endio at ffffffffa433170c
> #17 [ffffb732000c8d70] bio_endio at ffffffffa3f4e800
> #18 [ffffb732000c8d90] blk_update_request at ffffffffa3f53a37
> #19 [ffffb732000c8dd0] scsi_end_request at ffffffffa4233a5c
> #20 [ffffb732000c8e08] scsi_io_completion at ffffffffa423432c
> #21 [ffffb732000c8e58] scsi_finish_command at ffffffffa422c527
> #22 [ffffb732000c8e88] scsi_softirq_done at ffffffffa42341e4
>
> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
> ---
>
> I'm new to this code and io_uring, so I don't know what all the
> implications are, but retrying upon EAGAIN does not sound like
> a bad thing to my inexperienced ears.
>
> Some more context, leading up to this patch:
>
> We had some users reporting issues after we switched to using io_uring
> by default. Namely, kernel panics [2] for some, and failing block jobs
> [3] (with a custom backup mechanism we implemented on top of QEMU's
> block layer) for others.
>
> I had luck and managed to reproduce the issue, and it was a failed
> block job about half of the time and a kernel panic the other half.
> When using a host kernel with [0], it's a failed block job all the
> time, and this patch attempts to fix that, by resubmitting instead
> of bubbling up the error.

Great, thanks for the patch! Some of the relevant information is below
the '---' line and won't be included in the git log. Please tweak the
commit description to focus on:

- io_uring may return spurious -EAGAIN with Linux SCSI
  (The details of how io_uring internally resubmits are not important
  but the fact that spurious -EAGAIN is visible to userspace matters.)
- Resubmitting such requests solves the problem

The kernel panic is not as important in the commit description since
this patch works the same regardless of whether the kernel panics
sometimes or not.

>
> [2]: https://forum.proxmox.com/threads/kernel-panic-whole-server-crashes-about-every-day.91803/post-404382
> [3]: https://forum.proxmox.com/threads/backup-job-failed-with-err-11-on-2-of-6-vms.92568/
>
> block/io_uring.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/io_uring.c b/block/io_uring.c
> index 00a3ee9fb8..77d162cb24 100644
> --- a/block/io_uring.c
> +++ b/block/io_uring.c
> @@ -165,7 +165,7 @@ static void luring_process_completions(LuringState *s)
>          total_bytes = ret + luringcb->total_read;
>
>          if (ret < 0) {
> -            if (ret == -EINTR) {
> +            if (ret == -EINTR || ret == -EAGAIN) {
>                  luring_resubmit(s, luringcb);
>                  continue;
>              }

Please add a comment:

/*
 * Only writev/readv/fsync requests on regular files or host block
 * devices are submitted. Therefore -EAGAIN is not expected but it's
 * known to happen sometimes with Linux SCSI. Submit again and hope the
 * request completes successfully.
 *
 * For more information, see:
 * https://lore.kernel.org/io-uring/20210727165811.284510-3-axboe@kernel.dk/T/#u
 *
 * If the code is changed to submit other types of requests in the
 * future, then this workaround may need to be extended to deal with
 * genuine -EAGAIN results that should not be resubmitted immediately.
 */

Stefan

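[Editor's illustration: the fragment below sketches how the completion-handling
branch from the hunk above might look with the suggested comment folded in.
Indentation is approximated from the posted diff and the surrounding context is
trimmed; this is not the final merged code.]

        if (ret < 0) {
            /*
             * Only writev/readv/fsync requests on regular files or host block
             * devices are submitted. Therefore -EAGAIN is not expected but
             * it's known to happen sometimes with Linux SCSI. Submit again
             * and hope the request completes successfully.
             *
             * For more information, see:
             * https://lore.kernel.org/io-uring/20210727165811.284510-3-axboe@kernel.dk/T/#u
             *
             * If the code is changed to submit other types of requests in
             * the future, then this workaround may need to be extended to
             * deal with genuine -EAGAIN results that should not be
             * resubmitted immediately.
             */
            if (ret == -EINTR || ret == -EAGAIN) {
                luring_resubmit(s, luringcb);
                continue;
            }

Placing the comment directly above the condition keeps the rationale next to
the workaround it justifies, so a future change to the submission path is more
likely to revisit it.
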
diff --git a/block/io_uring.c b/block/io_uring.c
index 00a3ee9fb8..77d162cb24 100644
--- a/block/io_uring.c
+++ b/block/io_uring.c
@@ -165,7 +165,7 @@ static void luring_process_completions(LuringState *s)
         total_bytes = ret + luringcb->total_read;
 
         if (ret < 0) {
-            if (ret == -EINTR) {
+            if (ret == -EINTR || ret == -EAGAIN) {
                 luring_resubmit(s, luringcb);
                 continue;
             }

Quoting from [0]:

Some setups, like SCSI, can throw spurious -EAGAIN off the softirq
completion path. Normally we expect this to happen inline as part
of submission, but apparently SCSI has a weird corner case where it
can happen as part of normal completions.

Host kernels without patch [0] can panic when this happens [1], and
resubmitting makes the panic more likely. On the other hand, for
kernels with patch [0], resubmitting ensures that a block job is not
aborted just because of such spurious errors.

[0]: https://lore.kernel.org/io-uring/20210727165811.284510-3-axboe@kernel.dk/T/#u

[1]:
#9  [ffffb732000c8b70] asm_exc_page_fault at ffffffffa4800ade
#10 [ffffb732000c8bf8] io_prep_async_work at ffffffffa3d89c16
#11 [ffffb732000c8c50] io_rw_reissue at ffffffffa3d8b2e1
#12 [ffffb732000c8c78] io_complete_rw at ffffffffa3d8baa8
#13 [ffffb732000c8c98] blkdev_bio_end_io at ffffffffa3d62a80
#14 [ffffb732000c8cc8] bio_endio at ffffffffa3f4e800
#15 [ffffb732000c8ce8] dec_pending at ffffffffa432f854
#16 [ffffb732000c8d30] clone_endio at ffffffffa433170c
#17 [ffffb732000c8d70] bio_endio at ffffffffa3f4e800
#18 [ffffb732000c8d90] blk_update_request at ffffffffa3f53a37
#19 [ffffb732000c8dd0] scsi_end_request at ffffffffa4233a5c
#20 [ffffb732000c8e08] scsi_io_completion at ffffffffa423432c
#21 [ffffb732000c8e58] scsi_finish_command at ffffffffa422c527
#22 [ffffb732000c8e88] scsi_softirq_done at ffffffffa42341e4

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---

I'm new to this code and io_uring, so I don't know what all the
implications are, but retrying upon EAGAIN does not sound like
a bad thing to my inexperienced ears.

Some more context, leading up to this patch:

We had some users reporting issues after we switched to using io_uring
by default. Namely, kernel panics [2] for some, and failing block jobs
[3] (with a custom backup mechanism we implemented on top of QEMU's
block layer) for others.

I had luck and managed to reproduce the issue, and it was a failed
block job about half of the time and a kernel panic the other half.
When using a host kernel with [0], it's a failed block job all the
time, and this patch attempts to fix that, by resubmitting instead
of bubbling up the error.

[2]: https://forum.proxmox.com/threads/kernel-panic-whole-server-crashes-about-every-day.91803/post-404382
[3]: https://forum.proxmox.com/threads/backup-job-failed-with-err-11-on-2-of-6-vms.92568/

 block/io_uring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
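
[Editor's illustration: for readers who have not used io_uring directly, the
error described above is not returned by a syscall but arrives as a negative
res value in a completion queue entry, and "resubmitting" simply means
preparing and submitting an equivalent SQE again. The standalone liburing
sketch below shows that retry pattern in userspace; it is not QEMU code, and
the device path is a placeholder.]

    /*
     * Minimal resubmit-on-transient-error loop using liburing, analogous in
     * spirit to the change in luring_process_completions().
     * Build with: gcc retry-demo.c -o retry-demo -luring
     */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/uio.h>
    #include <unistd.h>
    #include <liburing.h>

    int main(void)
    {
        struct io_uring ring;
        struct io_uring_cqe *cqe;
        char buf[4096];
        struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
        int fd, ret;

        fd = open("/dev/sdX", O_RDONLY);    /* placeholder block device */
        if (fd < 0) {
            perror("open");
            return 1;
        }

        ret = io_uring_queue_init(8, &ring, 0);
        if (ret < 0) {
            fprintf(stderr, "io_uring_queue_init: %s\n", strerror(-ret));
            close(fd);
            return 1;
        }

        do {
            /*
             * (Re)submit an equivalent readv request. With 8 SQ entries and
             * at most one request in flight, io_uring_get_sqe() cannot fail.
             */
            struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
            io_uring_prep_readv(sqe, fd, &iov, 1, 0);
            io_uring_submit(&ring);

            ret = io_uring_wait_cqe(&ring, &cqe);
            if (ret < 0) {
                fprintf(stderr, "io_uring_wait_cqe: %s\n", strerror(-ret));
                break;
            }
            ret = cqe->res;             /* bytes read, or negative errno */
            io_uring_cqe_seen(&ring, cqe);
            /*
             * Treat a spurious -EAGAIN (or -EINTR) as "try again" instead of
             * reporting an I/O error to the caller.
             */
        } while (ret == -EAGAIN || ret == -EINTR);

        printf("read result: %d\n", ret);
        io_uring_queue_exit(&ring);
        close(fd);
        return ret < 0;
    }

Without such a retry, the negative result would be passed up as an error,
which is how the failing block jobs in report [3] surfaced the problem.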