[v8,0/8] memory: prevent dma-reentracy issues

Message ID 20230421142736.2817601-1-alxndr@bu.edu (mailing list archive)

Message

Alexander Bulekov April 21, 2023, 2:27 p.m. UTC
v7 -> v8:
    - Disable reentrancy checks for bcm2835_property's iomem (Patch 7)
    - Cache DeviceState* in the MemoryRegion to avoid dynamic cast for
      each MemoryRegion access. (Patch 1)
    - Make re-entrancy fatal for debug-builds (Patch 8)

v6 -> v7:
    - Fix bad qemu_bh_new_guarded calls found by Thomas (Patch 4)
    - Add an MR-specific flag to disable reentrancy (Patch 5)
    - Disable reentrancy checks for lsi53c895a's RAM-like MR (Patch 6)
    
    Patches 5 and 6 need review. I left the review-tags for Patch 4,
    however a few of the qemu_bh_new_guarded calls have changed.
  
v5 -> v6:
    - Only apply checkpatch checks to code in paths containing "/hw/"
      (/hw/ and include/hw/)
    - Fix a bug in a _guarded call added to hw/block/virtio-blk.c

v4 -> v5:
    - Add corresponding checkpatch checks
    - Save/restore reentrancy-flag when entering/exiting BHs
    - Improve documentation
    - Check object_dynamic_cast return value
    
v3 -> v4: Instead of changing all of the DMA APIs, add an
    optional reentrancy guard to the BH API.
    
v2 -> v3: Bite the bullet and modify the DMA APIs, rather than
    attempting to guess DeviceStates in BHs.
    
These patches aim to solve two types of DMA-reentrancy issues:

1.) mmio -> dma -> mmio case
To solve this, we track whether the device is engaged in io by
checking/setting a reentrancy-guard within APIs used for MMIO access.
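
A minimal sketch of the check/set idea for case 1, not the code from the
patches. ToyGuard, ToyDevice, toy_mmio_write and toy_dma_loopback_write are
hypothetical names invented for this illustration; the handler simply
pretends that its DMA lands back on the device's own MMIO region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not QEMU's real types. */
typedef struct ToyGuard {
    bool engaged_in_io;
} ToyGuard;

typedef struct ToyDevice ToyDevice;
typedef void (*ToyWriteFn)(ToyDevice *dev, uint64_t addr, uint64_t val);

struct ToyDevice {
    ToyGuard guard;
    ToyWriteFn write;   /* the device's MMIO write handler */
    int handled;        /* accesses that ran the handler */
    int rejected;       /* accesses refused as re-entrant */
};

/* Dispatch one MMIO write; returns false if the access was re-entrant. */
bool toy_mmio_write(ToyDevice *dev, uint64_t addr, uint64_t val)
{
    assert(dev && dev->write);
    if (dev->guard.engaged_in_io) {
        /* The device is already inside one of its own handlers. */
        dev->rejected++;
        return false;
    }
    dev->guard.engaged_in_io = true;
    dev->write(dev, addr, val);
    dev->guard.engaged_in_io = false;
    dev->handled++;
    return true;
}

/* A handler whose "DMA" targets the device's own MMIO (mmio -> dma -> mmio). */
void toy_dma_loopback_write(ToyDevice *dev, uint64_t addr, uint64_t val)
{
    (void)val;
    if (addr == 0) {
        toy_mmio_write(dev, 4, 0);   /* refused by the guard */
    }
}
```

With a device whose write handler is toy_dma_loopback_write, a write to
address 0 runs the handler once, and the nested write it triggers is
refused instead of re-entering the handler.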

2.) bh -> dma write -> mmio case
This case is trickier, since we don't have a generic way to associate a
bh with the underlying Device/DeviceState. Thus, this version allows a
device to associate a reentrancy-guard with a bh, when creating it.
(Instead of calling qemu_bh_new, you call qemu_bh_new_guarded)
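
A rough sketch of how a guard attached to a bh can cover case 2, including
the save/restore of the flag around the callback mentioned in the v4 -> v5
notes. This is a toy model under stated assumptions, not the util/async.c
change: ToyBH, toy_bh_new_guarded, toy_bh_call and the other Toy* names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types for illustration; not QEMU's real definitions. */
typedef struct ToyGuard {
    bool engaged_in_io;
} ToyGuard;

typedef void (*ToyBHFunc)(void *opaque);

typedef struct ToyBH {
    ToyBHFunc cb;
    void *opaque;
    ToyGuard *guard;    /* NULL: no guard associated with this bh */
} ToyBH;

typedef struct ToyDevice {
    ToyGuard guard;
    int mmio_writes;    /* MMIO writes that were allowed */
    int blocked;        /* MMIO writes refused as re-entrant */
} ToyDevice;

/* MMIO entry point: refuses access while the device's guard is engaged. */
bool toy_mmio_write(ToyDevice *dev)
{
    if (dev->guard.engaged_in_io) {
        dev->blocked++;
        return false;
    }
    dev->guard.engaged_in_io = true;
    dev->mmio_writes++;
    dev->guard.engaged_in_io = false;
    return true;
}

/* Create a bh that remembers which guard belongs to its device. */
ToyBH toy_bh_new_guarded(ToyBHFunc cb, void *opaque, ToyGuard *guard)
{
    ToyBH bh = { cb, opaque, guard };
    return bh;
}

/* Run the bh: save the flag, engage the guard, call, then restore. */
void toy_bh_call(ToyBH *bh)
{
    bool last = false;

    assert(bh && bh->cb);
    if (bh->guard) {
        last = bh->guard->engaged_in_io;
        bh->guard->engaged_in_io = true;
    }
    bh->cb(bh->opaque);
    if (bh->guard) {
        bh->guard->engaged_in_io = last;
    }
}

/* A bh callback whose "DMA write" hits the device's own MMIO. */
void toy_bh_cb(void *opaque)
{
    toy_mmio_write((ToyDevice *)opaque);
}
```

When the bh is created with the device's guard, the MMIO write issued from
the callback is refused. When the same callback is created with a NULL
guard, the write goes through, which is the gap left for call sites where
the DeviceState was not trivially accessible.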

I replaced most of the qemu_bh_new invocations with the guarded analog,
except for the ones where the DeviceState was not trivially accessible.

Alexander Bulekov (8):
  memory: prevent dma-reentracy issues
  async: Add an optional reentrancy guard to the BH API
  checkpatch: add qemu_bh_new/aio_bh_new checks
  hw: replace most qemu_bh_new calls with qemu_bh_new_guarded
  memory: Allow disabling re-entrancy checking per-MR
  lsi53c895a: disable reentrancy detection for script RAM
  bcm2835_property: disable reentrancy detection for iomem
  memory: abort on re-entrancy in debug builds

 docs/devel/multiple-iothreads.txt |  7 +++++++
 hw/9pfs/xen-9p-backend.c          |  5 ++++-
 hw/block/dataplane/virtio-blk.c   |  3 ++-
 hw/block/dataplane/xen-block.c    |  5 +++--
 hw/char/virtio-serial-bus.c       |  3 ++-
 hw/display/qxl.c                  |  9 ++++++---
 hw/display/virtio-gpu.c           |  6 ++++--
 hw/ide/ahci.c                     |  3 ++-
 hw/ide/ahci_internal.h            |  1 +
 hw/ide/core.c                     |  4 +++-
 hw/misc/bcm2835_property.c        |  7 +++++++
 hw/misc/imx_rngc.c                |  6 ++++--
 hw/misc/macio/mac_dbdma.c         |  2 +-
 hw/net/virtio-net.c               |  3 ++-
 hw/nvme/ctrl.c                    |  6 ++++--
 hw/scsi/lsi53c895a.c              |  6 ++++++
 hw/scsi/mptsas.c                  |  3 ++-
 hw/scsi/scsi-bus.c                |  3 ++-
 hw/scsi/vmw_pvscsi.c              |  3 ++-
 hw/usb/dev-uas.c                  |  3 ++-
 hw/usb/hcd-dwc2.c                 |  3 ++-
 hw/usb/hcd-ehci.c                 |  3 ++-
 hw/usb/hcd-uhci.c                 |  2 +-
 hw/usb/host-libusb.c              |  6 ++++--
 hw/usb/redirect.c                 |  6 ++++--
 hw/usb/xen-usb.c                  |  3 ++-
 hw/virtio/virtio-balloon.c        |  5 +++--
 hw/virtio/virtio-crypto.c         |  3 ++-
 include/block/aio.h               | 18 ++++++++++++++++--
 include/exec/memory.h             |  5 +++++
 include/hw/qdev-core.h            |  7 +++++++
 include/qemu/main-loop.h          |  7 +++++--
 scripts/checkpatch.pl             |  8 ++++++++
 softmmu/memory.c                  | 17 +++++++++++++++++
 softmmu/trace-events              |  1 +
 tests/unit/ptimer-test-stubs.c    |  3 ++-
 util/async.c                      | 21 ++++++++++++++++++++-
 util/main-loop.c                  |  5 +++--
 util/trace-events                 |  1 +
 39 files changed, 171 insertions(+), 41 deletions(-)

Comments

Thomas Huth April 25, 2023, 9:46 a.m. UTC | #1
On 21/04/2023 16.27, Alexander Bulekov wrote:
> v7 -> v8:
>      - Disable reentrancy checks for bcm2835_property's iomem (Patch 7)
>      - Cache DeviceState* in the MemoryRegion to avoid dynamic cast for
>        each MemoryRegion access. (Patch 1)
>      - Make re-entrancy fatal for debug-builds (Patch 8)

  Hi Alexander,

I just put your series into a run with the gitlab-CI and it seems this now 
introduced another failure in one of the avocado tests:

  https://gitlab.com/thuth/qemu/-/jobs/4171448248#L318

The "IbmPrep40pMachine.test_openbios_and_netbsd" test is failing now.

You can reproduce it manually quite easily:

  wget https://archive.netbsd.org/pub/NetBSD-archive/NetBSD-7.1.2/iso/NetBSD-7.1.2-prep.iso

  ./qemu-system-ppc -nographic -M 40p -boot d -cdrom NetBSD-7.1.2-prep.iso

Without your patches, this prints out "NetBSD/prep BOOT, Revision 1.9" in 
the console, but with your patches, the message does not appear anymore.

Could you please have a look?

  Thanks,
   Thomas

Alexander Bulekov April 26, 2023, 4:03 p.m. UTC | #2
On 230425 1146, Thomas Huth wrote:
> On 21/04/2023 16.27, Alexander Bulekov wrote:
> > v7 -> v8:
> >      - Disable reentrancy checks for bcm2835_property's iomem (Patch 7)
> >      - Cache DeviceState* in the MemoryRegion to avoid dynamic cast for
> >        each MemoryRegion access. (Patch 1)
> >      - Make re-entrancy fatal for debug-builds (Patch 8)
> 
>  Hi Alexander,
> 
> I just put your series into a run with the gitlab-CI and it seems this now
> introduced another failure in one of the avocado tests:
> 
>  https://gitlab.com/thuth/qemu/-/jobs/4171448248#L318
> 
> The "IbmPrep40pMachine.test_openbios_and_netbsd" test is failing now.
> 
> You can reproduce it manually quite easily:
> 
>  wget https://archive.netbsd.org/pub/NetBSD-archive/NetBSD-7.1.2/iso/NetBSD-7.1.2-prep.iso
> 
>  ./qemu-system-ppc -nographic -M 40p -boot d -cdrom NetBSD-7.1.2-prep.iso
> 
> Without your patches, this prints out "NetBSD/prep BOOT, Revision 1.9" in
> the console, but with your patches, the message does not appear anymore.

#0  trace_memory_region_reentrant_io  at trace/trace-softmmu.h:335
#1  0x0000555555f3f315 in access_with_adjusted_size  at ../softmmu/memory.c:549
#2  0x0000555555f3f0f6 in memory_region_dispatch_write  at ../softmmu/memory.c:1531
#3  0x0000555555f4e655 in flatview_write_continue  at ../softmmu/physmem.c:2641
#4  0x0000555555f4b30e in flatview_write  at ../softmmu/physmem.c:2683
#5  0x0000555555f4b119 in address_space_write  at ../softmmu/physmem.c:2779
#6  0x0000555555c5a233 in raven_io_write  at ../hw/pci-host/raven.c:194
#7  0x0000555555f3f409 in memory_region_write_accessor  at ../softmmu/memory.c:493
#8  0x0000555555f3f22b in access_with_adjusted_size  at ../softmmu/memory.c:568
#9  0x0000555555f3f0f6 in memory_region_dispatch_write  at ../softmmu/memory.c:1531
#10 0x0000555555fcf74b in io_writex  at ../accel/tcg/cputlb.c:1430
#11 0x0000555555fc33dd in store_helper  at ../accel/tcg/cputlb.c:2454
#12 full_le_stl_mmu  at ../accel/tcg/cputlb.c:2542
#13 0x00007fffa404b78c in code_gen_buffer                                      
#14 0x0000555555fa510d in cpu_tb_exec  at ../accel/tcg/cpu-exec.c:460
#15 0x0000555555fa5ec0 in cpu_loop_exec_tb  at ../accel/tcg/cpu-exec.c:893
#16 cpu_exec_loop  at ../accel/tcg/cpu-exec.c:1013
#17 0x0000555555fa5697 in cpu_exec_setjmp  at ../accel/tcg/cpu-exec.c:1043
#18 0x0000555555fa558e in cpu_exec  at ../accel/tcg/cpu-exec.c:1069
#19 0x0000555555fd1f8f in tcg_cpus_exec  at ../accel/tcg/tcg-accel-ops.c:81
#20 0x0000555555fd2c16 in rr_cpu_thread_fn  at ../accel/tcg/tcg-accel-ops-rr.c:223
#21 0x0000555556156465 in qemu_thread_start  at ../util/qemu-thread-posix.c:541
#22 0x00007ffff6960fd4 in start_thread  at ./nptl/pthread_create.c:442
#23 0x00007ffff69e166c in clone3  at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
gef➤  p ->name
$4 = 0x555557615bd0 "pci-conf-idx"

Raven owns both the raven_io MR and the pci-conf MRs. raven_io needs to
be marked re-entrancy safe.
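
To illustrate why a per-MR opt-out helps here, a toy model of one device
owning two regions, where a write to the outer region is forwarded to the
inner one. This is only a sketch: ToyRegion, toy_region_write and the
disable_reentrancy_guard field as used below are assumptions made for the
example, not the raven or memory.c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; names and layout are assumptions. */
typedef struct ToyDevice {
    bool engaged_in_io;
    uint64_t conf_idx;      /* register behind the inner region */
} ToyDevice;

typedef struct ToyRegion ToyRegion;
struct ToyRegion {
    const char *name;
    ToyDevice *dev;                     /* owning device */
    bool disable_reentrancy_guard;      /* per-region opt-out */
    void (*write)(ToyRegion *mr, uint64_t val);
    ToyRegion *target;                  /* region the outer one forwards to */
};

/* Dispatch a write; returns false if it was refused as re-entrant. */
bool toy_region_write(ToyRegion *mr, uint64_t val)
{
    bool guarded;

    assert(mr && mr->write);
    guarded = mr->dev && !mr->disable_reentrancy_guard;
    if (guarded) {
        if (mr->dev->engaged_in_io) {
            return false;
        }
        mr->dev->engaged_in_io = true;
    }
    mr->write(mr, val);
    if (guarded) {
        mr->dev->engaged_in_io = false;
    }
    return true;
}

/* Inner region: stores the value in the device register. */
void toy_conf_idx_write(ToyRegion *mr, uint64_t val)
{
    mr->dev->conf_idx = val;
}

/* Outer region: forwards the write to another region of the same device. */
void toy_io_write(ToyRegion *mr, uint64_t val)
{
    toy_region_write(mr->target, val);
}
```

With both regions guarded, a write through the outer region never reaches
conf_idx, because the inner access finds the shared device flag already
set. Setting disable_reentrancy_guard on the outer region alone lets the
forwarded write land, while the inner region stays protected.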