Message ID | 20180910145616.8598-3-famz@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | virtio-scsi: Fix QEMU hang with vIOMMU and ATS |
On 10/09/2018 16:56, Fam Zheng wrote:
> We have this unwanted call stack:
>
> > ...
> > #13 0x00005586602b7793 in virtio_scsi_handle_cmd_vq
> > #14 0x00005586602b8d66 in virtio_scsi_data_plane_handle_cmd
> > #15 0x00005586602ddab7 in virtio_queue_notify_aio_vq
> > #16 0x00005586602dfc9f in virtio_queue_host_notifier_aio_poll
> > #17 0x00005586607885da in run_poll_handlers_once
> > #18 0x000055866078880e in try_poll_mode
> > #19 0x00005586607888eb in aio_poll
> > #20 0x0000558660784561 in aio_wait_bh_oneshot
> > #21 0x00005586602b9582 in virtio_scsi_dataplane_stop
> > #22 0x00005586605a7110 in virtio_bus_stop_ioeventfd
> > #23 0x00005586605a9426 in virtio_pci_stop_ioeventfd
> > #24 0x00005586605ab808 in virtio_pci_common_write
> > #25 0x0000558660242396 in memory_region_write_accessor
> > #26 0x00005586602425ab in access_with_adjusted_size
> > #27 0x0000558660245281 in memory_region_dispatch_write
> > #28 0x00005586601e008e in flatview_write_continue
> > #29 0x00005586601e01d8 in flatview_write
> > #30 0x00005586601e04de in address_space_write
> > #31 0x00005586601e052f in address_space_rw
> > #32 0x00005586602607f2 in kvm_cpu_exec
> > #33 0x0000558660227148 in qemu_kvm_cpu_thread_fn
> > #34 0x000055866078bde7 in qemu_thread_start
> > #35 0x00007f5784906594 in start_thread
> > #36 0x00007f5784639e6f in clone
>
> Avoid it with the aio_disable_external/aio_enable_external pair, so that
> no vq poll handlers can be called in aio_wait_bh_oneshot.

I don't understand. We are in the vCPU thread, so not in the
AioContext's home thread. Why is aio_wait_bh_oneshot polling rather
than going through the aio_wait_bh path?

Thanks,

Paolo
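For context on the mechanism the commit message relies on: virtqueue ioeventfd and poll handlers are registered with the AioContext as "external" handlers, and aio_disable_external() makes aio_poll() skip them until the matching aio_enable_external(). Below is a minimal sketch of how the pair is meant to bracket the stop path; the helper name and the stop_bh/opaque parameters are placeholders, not actual QEMU code.

```c
/* Sketch only: bracket the dataplane-stop BH with aio_disable_external()/
 * aio_enable_external() so that the nested aio_poll() run by
 * aio_wait_bh_oneshot() cannot dispatch external (ioeventfd/poll) handlers
 * and re-enter the virtqueue.  "stop_bh" and "opaque" are placeholders for
 * the device-specific stop function and its state. */
static void stop_dataplane_quiesced(AioContext *ctx, QEMUBHFunc *stop_bh,
                                    void *opaque)
{
    aio_disable_external(ctx);                 /* external handlers now skipped */
    aio_wait_bh_oneshot(ctx, stop_bh, opaque); /* run stop_bh in ctx and wait   */
    aio_enable_external(ctx);                  /* re-arm ioeventfd/poll handlers */
}
```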
On Tue, 09/11 13:32, Paolo Bonzini wrote:
> On 10/09/2018 16:56, Fam Zheng wrote:
> > We have this unwanted call stack:
> >
> > > [...]
> >
> > Avoid it with the aio_disable_external/aio_enable_external pair, so that
> > no vq poll handlers can be called in aio_wait_bh_oneshot.
>
> I don't understand. We are in the vCPU thread, so not in the
> AioContext's home thread. Why is aio_wait_bh_oneshot polling rather
> than going through the aio_wait_bh path?

What do you mean by 'aio_wait_bh path'? Here is aio_wait_bh_oneshot:

void aio_wait_bh_oneshot(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
{
    AioWaitBHData data = {
        .cb = cb,
        .opaque = opaque,
    };

    assert(qemu_get_current_aio_context() == qemu_get_aio_context());

    aio_bh_schedule_oneshot(ctx, aio_wait_bh, &data);
    AIO_WAIT_WHILE(&data.wait, ctx, !data.done);
}

ctx is qemu_aio_context here, so there's no interaction with IOThread.

This is the full backtrace: https://paste.ubuntu.com/p/Dm7zGzmmRG/

Fam
On 11/09/2018 16:12, Fam Zheng wrote:
> On Tue, 09/11 13:32, Paolo Bonzini wrote:
>> On 10/09/2018 16:56, Fam Zheng wrote:
>>> We have this unwanted call stack:
>>>
>>> [...]
>>>
>>> Avoid it with the aio_disable_external/aio_enable_external pair, so that
>>> no vq poll handlers can be called in aio_wait_bh_oneshot.
>>
>> I don't understand. We are in the vCPU thread, so not in the
>> AioContext's home thread. Why is aio_wait_bh_oneshot polling rather
>> than going through the aio_wait_bh path?
>
> What do you mean by 'aio_wait_bh path'? Here is aio_wait_bh_oneshot:

Sorry, I meant the "atomic_inc(&wait_->num_waiters);" path. But if this
backtrace is obtained without dataplane, that's the answer I was seeking.

> void aio_wait_bh_oneshot(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
> {
>     [...]
>
>     aio_bh_schedule_oneshot(ctx, aio_wait_bh, &data);
>     AIO_WAIT_WHILE(&data.wait, ctx, !data.done);
> }
>
> ctx is qemu_aio_context here, so there's no interaction with IOThread.

In this case, it should be okay to have the reentrancy, what is the bug
that this patch is fixing?

Thanks,

Paolo
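The "atomic_inc(&wait_->num_waiters);" path Paolo mentions is the branch of AIO_WAIT_WHILE() taken when the caller is not in the AioContext's home thread; the other branch polls the context directly, which is what frames #19/#20 of the backtrace show. The following is a condensed paraphrase for orientation, assuming the QEMU 3.0-era aio-wait API; it is not the literal macro from include/block/aio-wait.h.

```c
/* Condensed paraphrase of AIO_WAIT_WHILE(); the waiter bookkeeping,
 * memory barriers and ctx lock handling of the real macro are omitted. */
#define AIO_WAIT_WHILE_SKETCH(wait, ctx, cond) do {                   \
    AioWait *wait_ = (wait);                                          \
    AioContext *ctx_ = (ctx);                                         \
    if (in_aio_context_home_thread(ctx_)) {                           \
        /* Caller owns ctx: poll it until cond becomes false. */      \
        while ((cond)) {                                              \
            aio_poll(ctx_, true);                                     \
        }                                                             \
    } else {                                                          \
        /* Other thread: register as waiter, spin the main loop. */   \
        atomic_inc(&wait_->num_waiters);                              \
        while ((cond)) {                                              \
            aio_poll(qemu_get_aio_context(), true);                   \
        }                                                             \
        atomic_dec(&wait_->num_waiters);                              \
    }                                                                 \
} while (0)
```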
On Tue, 09/11 17:30, Paolo Bonzini wrote:
> On 11/09/2018 16:12, Fam Zheng wrote:
> > On Tue, 09/11 13:32, Paolo Bonzini wrote:
> >> On 10/09/2018 16:56, Fam Zheng wrote:
> >>> We have this unwanted call stack:
> >>>
> >>> [...]
> >>>
> >>> Avoid it with the aio_disable_external/aio_enable_external pair, so that
> >>> no vq poll handlers can be called in aio_wait_bh_oneshot.
> >>
> >> I don't understand. We are in the vCPU thread, so not in the
> >> AioContext's home thread. Why is aio_wait_bh_oneshot polling rather
> >> than going through the aio_wait_bh path?
> >
> > What do you mean by 'aio_wait_bh path'? Here is aio_wait_bh_oneshot:
>
> Sorry, I meant the "atomic_inc(&wait_->num_waiters);" path. But if this
> backtrace is obtained without dataplane, that's the answer I was seeking.
>
> > [...]
> >
> > ctx is qemu_aio_context here, so there's no interaction with IOThread.
>
> In this case, it should be okay to have the reentrancy, what is the bug
> that this patch is fixing?

The same symptom as in the previous patch: virtio_scsi_handle_cmd_vq hangs. The
reason it hangs is fixed by the previous patch, but I don't think it should be
invoked as we're in the middle of virtio_scsi_dataplane_stop(). Applying either
one of the two patches avoids the problem, but this one is more superficial.
What do you think?

Fam
On 12/09/2018 03:31, Fam Zheng wrote:
>>>
>>> ctx is qemu_aio_context here, so there's no interaction with IOThread.
>> In this case, it should be okay to have the reentrancy, what is the bug
>> that this patch is fixing?
> The same symptom as in the previous patch: virtio_scsi_handle_cmd_vq hangs. The
> reason it hangs is fixed by the previous patch, but I don't think it should be
> invoked as we're in the middle of virtio_scsi_dataplane_stop(). Applying either
> one of the two patches avoids the problem, but this one is more superficial.
> What do you think?

I think it's okay if it is invoked. The sequence is first you stop the
vq, then you drain the BlockBackends, then you switch AioContext. All
that matters is the outcome when virtio_scsi_dataplane_stop returns.

Paolo
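The ordering Paolo describes maps onto the stop path roughly as follows. This is a simplified outline, not the actual hw/scsi/virtio-scsi-dataplane.c code; the function names are the ones visible in the patch at the end of this thread, and the final AioContext switch is only indicated by a comment.

```c
/* Simplified outline of the ordering being discussed. */
static void virtio_scsi_dataplane_stop_outline(VirtIOSCSI *s)
{
    /* 1. Stop the vq: run the stop BH in the dataplane context and wait.
     *    A stray handler invocation before this point only submits
     *    requests that step 2 will drain. */
    aio_context_acquire(s->ctx);
    aio_wait_bh_oneshot(s->ctx, virtio_scsi_dataplane_stop_bh, s);
    aio_context_release(s->ctx);

    /* 2. Drain the BlockBackends: no request may remain in flight. */
    blk_drain_all();

    /* 3. Switch the disks back to the main-loop AioContext
     *    (done per device via blk_set_aio_context() in the real code). */
}
```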
On Wed, 09/12 13:11, Paolo Bonzini wrote:
> On 12/09/2018 03:31, Fam Zheng wrote:
> >>>
> >>> ctx is qemu_aio_context here, so there's no interaction with IOThread.
> >> In this case, it should be okay to have the reentrancy, what is the bug
> >> that this patch is fixing?
> > The same symptom as in the previous patch: virtio_scsi_handle_cmd_vq hangs. The
> > reason it hangs is fixed by the previous patch, but I don't think it should be
> > invoked as we're in the middle of virtio_scsi_dataplane_stop(). Applying either
> > one of the two patches avoids the problem, but this one is more superficial.
> > What do you think?
>
> I think it's okay if it is invoked. The sequence is first you stop the
> vq, then you drain the BlockBackends, then you switch AioContext. All
> that matters is the outcome when virtio_scsi_dataplane_stop returns.

Yes, but together with vIOMMU, it also effectively leads to a virtio_error(),
which is not clean. QEMU stderr when this call happens (with patch 1 but not
this patch):

2018-09-12T11:48:10.193023Z qemu-system-x86_64: vtd_iommu_translate: detected translation failure (dev=02:00:00, iova=0x0)
2018-09-12T11:48:10.193044Z qemu-system-x86_64: New fault is not recorded due to compression of faults
2018-09-12T11:48:10.193061Z qemu-system-x86_64: virtio: zero sized buffers are not allowed

Fam
On 12/09/2018 13:50, Fam Zheng wrote:
>> I think it's okay if it is invoked. The sequence is first you stop the
>> vq, then you drain the BlockBackends, then you switch AioContext. All
>> that matters is the outcome when virtio_scsi_dataplane_stop returns.
> Yes, but together with vIOMMU, it also effectively leads to a virtio_error(),
> which is not clean. QEMU stderr when this call happens (with patch 1 but not
> this patch):
>
> 2018-09-12T11:48:10.193023Z qemu-system-x86_64: vtd_iommu_translate: detected translation failure (dev=02:00:00, iova=0x0)
> 2018-09-12T11:48:10.193044Z qemu-system-x86_64: New fault is not recorded due to compression of faults
> 2018-09-12T11:48:10.193061Z qemu-system-x86_64: virtio: zero sized buffers are not allowed

But with iothread, virtio_scsi_dataplane_stop runs in another thread
than the iothread; in that case you still have a race where the iothread
can process the vq before aio_disable_external and print the error.

IIUC the guest has cleared the IOMMU page tables _before_ clearing the
DRIVER_OK bit in the status field. Could this be a guest bug?

Paolo
On Wed, 09/12 14:42, Paolo Bonzini wrote:
> On 12/09/2018 13:50, Fam Zheng wrote:
> >> I think it's okay if it is invoked. The sequence is first you stop the
> >> vq, then you drain the BlockBackends, then you switch AioContext. All
> >> that matters is the outcome when virtio_scsi_dataplane_stop returns.
> > Yes, but together with vIOMMU, it also effectively leads to a virtio_error(),
> > which is not clean. QEMU stderr when this call happens (with patch 1 but not
> > this patch):
> >
> > [...]
>
> But with iothread, virtio_scsi_dataplane_stop runs in another thread
> than the iothread; in that case you still have a race where the iothread
> can process the vq before aio_disable_external and print the error.
>
> IIUC the guest has cleared the IOMMU page tables _before_ clearing the
> DRIVER_OK bit in the status field. Could this be a guest bug?

I'm not sure if it is a bug or not. I think what happens is the device is left
enabled by SeaBIOS, and then reset by the kernel.

Fam
On 13/09/2018 08:03, Fam Zheng wrote:
> On Wed, 09/12 14:42, Paolo Bonzini wrote:
>> On 12/09/2018 13:50, Fam Zheng wrote:
>>> [...]
>>
>> But with iothread, virtio_scsi_dataplane_stop runs in another thread
>> than the iothread; in that case you still have a race where the iothread
>> can process the vq before aio_disable_external and print the error.
>>
>> IIUC the guest has cleared the IOMMU page tables _before_ clearing the
>> DRIVER_OK bit in the status field. Could this be a guest bug?
>
> I'm not sure if it is a bug or not. I think what happens is the device is left
> enabled by SeaBIOS, and then reset by the kernel.

That makes sense, though I'm not sure why QEMU needs to process a
request long after SeaBIOS has left control to Linux. Maybe it's just
that the messages should not go on QEMU stderr, and rather a trace point
should be enough.

Paolo
On 13/09/2018 11:11, Paolo Bonzini wrote:
> On 13/09/2018 08:03, Fam Zheng wrote:
>> On Wed, 09/12 14:42, Paolo Bonzini wrote:
>>> [...]
>>>
>>> IIUC the guest has cleared the IOMMU page tables _before_ clearing the
>>> DRIVER_OK bit in the status field. Could this be a guest bug?
>>
>> I'm not sure if it is a bug or not. I think what happens is the device is left
>> enabled by SeaBIOS, and then reset by the kernel.
>
> That makes sense, though I'm not sure why QEMU needs to process a
> request long after SeaBIOS has left control to Linux. Maybe it's just
> that the messages should not go on QEMU stderr, and rather a trace point
> should be enough.

Aha, it's not that QEMU needs to poll, it's just that polling mode is
enabled, and it decides to do one last iteration. In general the virtio
spec allows the hardware to poll whenever it wants, hence:

1) I'm not sure that translation failures should mark the device as
broken---definitely not when doing polling, possibly not even in
response to the guest "kicking" the virtqueue. Alex, does the PCI spec
say anything about this?

2) translation failures should definitely not print messages to stderr.

Thanks,

Paolo
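The "one last iteration" is possible because the virtqueue's host notifier is registered with the AioContext together with a poll callback and with is_external set, which is exactly what aio_disable_external() masks. Below is a sketch of that registration, assuming the QEMU 3.0-era aio_set_event_notifier() signature; the real wiring goes through virtio_queue_aio_set_host_notifier_handler() in hw/virtio/virtio.c, so treat this as an illustration rather than the actual code.

```c
/* Sketch: how a virtqueue host notifier ends up with a poll callback.
 * virtio_queue_host_notifier_aio_poll is the function seen in frame #16
 * of the backtrace; the read handler shown here is a simplification. */
static void attach_vq_handlers_sketch(AioContext *ctx, VirtQueue *vq)
{
    aio_set_event_notifier(ctx, virtio_queue_get_host_notifier(vq),
                           true, /* is_external: masked by aio_disable_external() */
                           virtio_queue_host_notifier_read,
                           virtio_queue_host_notifier_aio_poll);
}
```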
On Thu, 13 Sep 2018 12:04:34 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 13/09/2018 11:11, Paolo Bonzini wrote:
> > [...]
> >
> > That makes sense, though I'm not sure why QEMU needs to process a
> > request long after SeaBIOS has left control to Linux. Maybe it's just
> > that the messages should not go on QEMU stderr, and rather a trace point
> > should be enough.
>
> Aha, it's not that QEMU needs to poll, it's just that polling mode is
> enabled, and it decides to do one last iteration. In general the virtio
> spec allows the hardware to poll whenever it wants, hence:
>
> 1) I'm not sure that translation failures should mark the device as
> broken---definitely not when doing polling, possibly not even in
> response to the guest "kicking" the virtqueue. Alex, does the PCI spec
> say anything about this?

AFAIK the PCI spec doesn't define anything about the IOMMU or response
to translation failures. Depending on whether it's a read or write,
the device might see an unsupported request or not even be aware of the
error. It's really a platform RAS question whether to have any more
significant response; most don't, but at least one tends to consider
IOMMU faults to be a data integrity issue worth bringing the system down.
We've struggled with handling ongoing DMA generating IOMMU faults
during kexec for a long time, so any sort of marking a device broken
for a fault should be thoroughly considered, especially when a device
could be assigned to a user who can trivially trigger a fault.

> 2) translation failures should definitely not print messages to stderr.

Yep, easy DoS vector for a malicious guest, or malicious userspace
driver within the guest. Thanks,

Alex
On Thu, Sep 13, 2018 at 10:00:43AM -0600, Alex Williamson wrote:
> On Thu, 13 Sep 2018 12:04:34 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> > [...]
> >
> > 1) I'm not sure that translation failures should mark the device as
> > broken---definitely not when doing polling, possibly not even in
> > response to the guest "kicking" the virtqueue. Alex, does the PCI spec
> > say anything about this?
>
> AFAIK the PCI spec doesn't define anything about the IOMMU or response
> to translation failures. [...]
>
> > 2) translation failures should definitely not print messages to stderr.
>
> Yep, easy DoS vector for a malicious guest, or malicious userspace
> driver within the guest. Thanks,

Note that it's using error_report_once() upstream, so it'll only print
once for the whole lifecycle of the QEMU process, and it's still a
tracepoint downstream, so no error will be dumped by default. So AFAIU
it's not a DoS target for either.

I would consider it a good hint for strange bugs, since AFAIU DMA errors
should never occur on well-behaved guests. However, I'd also be fine with
posting a patch to make it an explicit tracepoint again if any of us would
still like it to go away.

Thanks,
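For reference, the two reporting styles being compared look roughly like this. error_report_once() is the real rate-limited helper (it prints at most once per QEMU process lifetime); the trace event name and the translation_failed/bus_num/devfn/iova variables are placeholders, not the actual hw/i386/intel_iommu.c code.

```c
/* Sketch of the two styles discussed above. */
if (translation_failed) {
    /* Upstream style: report the first failure only. */
    error_report_once("%s: detected translation failure "
                      "(dev=%02x:%02x.%x, iova=0x%" PRIx64 ")",
                      __func__, bus_num, PCI_SLOT(devfn), PCI_FUNC(devfn), iova);

    /* Tracepoint style: silent unless the event is enabled at runtime. */
    trace_vtd_translate_fault(bus_num, devfn, iova);
}
```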
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 8c37bd314a..8ab54218c1 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -279,7 +279,9 @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev)
     trace_virtio_blk_data_plane_stop(s);
 
     aio_context_acquire(s->ctx);
+    aio_disable_external(s->ctx);
     aio_wait_bh_oneshot(s->ctx, virtio_blk_data_plane_stop_bh, s);
+    aio_enable_external(s->ctx);
 
     /* Drain and switch bs back to the QEMU main loop */
     blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index b995bab3a2..1452398491 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -208,7 +208,9 @@ void virtio_scsi_dataplane_stop(VirtIODevice *vdev)
     s->dataplane_stopping = true;
 
     aio_context_acquire(s->ctx);
+    aio_disable_external(s->ctx);
     aio_wait_bh_oneshot(s->ctx, virtio_scsi_dataplane_stop_bh, s);
+    aio_enable_external(s->ctx);
     aio_context_release(s->ctx);
 
     blk_drain_all(); /* ensure there are no in-flight requests */