Message ID | 20230704080628.852525-1-mnissler@rivosinc.com
---|---
Series | Support message-based DMA in vfio-user server
On 04.07.23 10:06, Mattias Nissler wrote:
> This series adds basic support for message-based DMA in qemu's vfio-user
> server. This is useful for cases where the client does not provide file
> descriptors for accessing system memory via memory mappings. My motivating use
> case is to hook up device models as PCIe endpoints to a hardware design. This
> works by bridging the PCIe transaction layer to vfio-user, and the endpoint
> does not access memory directly, but sends memory request TLPs to the hardware
> design in order to perform DMA.
>
> Note that in addition to the 3 commits included, we also need a
> subprojects/libvfio-user roll to bring in this bugfix:
> https://github.com/nutanix/libvfio-user/commit/bb308a2e8ee9486a4c8b53d8d773f7c8faaeba08
> Stefan, can I ask you to kindly update the
> https://gitlab.com/qemu-project/libvfio-user mirror? I'll be happy to include
> an update to subprojects/libvfio-user.wrap in this series.
>
> Finally, there is some more work required on top of this series to get
> message-based DMA to really work well:
>
> * libvfio-user has a long-standing issue where socket communication gets messed
> up when messages are sent from both ends at the same time. See
> https://github.com/nutanix/libvfio-user/issues/279 for more details. I've
> been engaging there and plan to contribute a fix.
>
> * qemu currently breaks down DMA accesses into chunks of size 8 bytes at
> maximum, each of which will be handled in a separate vfio-user DMA request
> message. This is quite terrible for large DMA accesses, such as when nvme
> reads and writes page-sized blocks for example. Thus, I would like to improve
> qemu to be able to perform larger accesses, at least for indirect memory
> regions. I have something working locally, but since this will likely result
> in more involved surgery and discussion, I am leaving this to be addressed in
> a separate patch.

I remember asking Stefan in the past whether there would be a way to avoid
that mmap dance (and also handle uffd etc. more easily) for vhost-user
(especially, virtiofsd) by only letting QEMU access guest memory. That could
make memory-backend-ram support something like vhost-user, avoiding shared
memory and everything that comes with it (e.g., no KSM, no shared zeropage).

So this series tackles vfio-user; does anybody know what it would take to get
something similar running for vhost-user?
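For concreteness, here is a rough sketch of what the message-based path can look like on the device side, using libvfio-user's scatter-gather API. This is a hand-written illustration, not code from the series: `dma_read` is a hypothetical helper, and exact signatures and return conventions vary across libvfio-user versions.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>      /* PROT_READ */

#include "libvfio-user.h"

/*
 * Hypothetical helper: read `len` bytes of client (guest) memory at DMA
 * address `addr` into `buf`. When the client registered the region with a
 * file descriptor, libvfio-user can service this with a memcpy from the
 * mmap'd view; without one, the read has to go out as a VFIO_USER_DMA_READ
 * message on the socket, with the data carried in the client's reply.
 */
static int dma_read(vfu_ctx_t *ctx, uint64_t addr, void *buf, size_t len)
{
    /* dma_sg_t is opaque to API users; allocate it via dma_sg_size(). */
    dma_sg_t *sg = malloc(dma_sg_size());
    int ret = -1;

    if (sg == NULL) {
        return -1;
    }

    /* Translate the DMA address into (here) a single scatter-gather entry. */
    if (vfu_addr_to_sgl(ctx, (vfu_dma_addr_t)(uintptr_t)addr, len,
                        sg, 1, PROT_READ) == 1) {
        ret = vfu_sgl_read(ctx, sg, 1, buf);  /* 0 on success */
    }

    free(sg);
    return ret;
}
```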
On Tue, Jul 04, 2023 at 01:06:24AM -0700, Mattias Nissler wrote:
> This series adds basic support for message-based DMA in qemu's vfio-user
> server. This is useful for cases where the client does not provide file
> descriptors for accessing system memory via memory mappings. My motivating use
> case is to hook up device models as PCIe endpoints to a hardware design. This
> works by bridging the PCIe transaction layer to vfio-user, and the endpoint
> does not access memory directly, but sends memory request TLPs to the hardware
> design in order to perform DMA.
>
> Note that in addition to the 3 commits included, we also need a
> subprojects/libvfio-user roll to bring in this bugfix:
> https://github.com/nutanix/libvfio-user/commit/bb308a2e8ee9486a4c8b53d8d773f7c8faaeba08
> Stefan, can I ask you to kindly update the
> https://gitlab.com/qemu-project/libvfio-user mirror? I'll be happy to include
> an update to subprojects/libvfio-user.wrap in this series.

Done:
https://gitlab.com/qemu-project/libvfio-user/-/commits/master

Repository mirroring is automated now, so new upstream commits will
appear in the QEMU mirror repository from now on.

> Finally, there is some more work required on top of this series to get
> message-based DMA to really work well:
>
> * libvfio-user has a long-standing issue where socket communication gets messed
> up when messages are sent from both ends at the same time. See
> https://github.com/nutanix/libvfio-user/issues/279 for more details. I've
> been engaging there and plan to contribute a fix.
>
> * qemu currently breaks down DMA accesses into chunks of size 8 bytes at
> maximum, each of which will be handled in a separate vfio-user DMA request
> message. This is quite terrible for large DMA accesses, such as when nvme
> reads and writes page-sized blocks for example. Thus, I would like to improve
> qemu to be able to perform larger accesses, at least for indirect memory
> regions. I have something working locally, but since this will likely result
> in more involved surgery and discussion, I am leaving this to be addressed in
> a separate patch.
>
> Mattias Nissler (3):
>   softmmu: Support concurrent bounce buffers
>   softmmu: Remove DMA unmap notification callback
>   vfio-user: Message-based DMA support
>
>  hw/remote/vfio-user-obj.c |  62 ++++++++++++++++--
>  softmmu/dma-helpers.c     |  28 --------
>  softmmu/physmem.c         | 131 ++++++++------------------------
>  3 files changed, 83 insertions(+), 138 deletions(-)

Sorry for the late review. I was on vacation and am catching up on
emails.

Paolo worked on the QEMU memory API and can give input on how to make
this efficient for large DMA accesses. There is a chance that memory
dispatch with larger sizes will be needed for ENQCMD CPU instruction
emulation too.

Stefan
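To put a number on the chunking issue discussed above: QEMU's dispatch loop clamps every access to an indirect MemoryRegion (one backed by callbacks rather than RAM) to the region's maximum access size, at most 8 bytes, and each resulting chunk becomes one callback invocation. A toy model of that arithmetic (not the actual softmmu/physmem.c code):

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Toy model: each iteration of QEMU's dispatch loop handles at most 8 bytes
 * of an access to an indirect MemoryRegion, and for the vfio-user server's
 * DMA region each iteration turns into one VFIO_USER_DMA_WRITE message on
 * the socket.
 */
int main(void)
{
    uint64_t len = 4096;   /* one page-sized nvme write */
    unsigned messages = 0;

    while (len > 0) {
        uint64_t size = len < 8 ? len : 8;  /* max access size clamp */
        /* real QEMU dispatches one mr->ops->write() call here */
        len -= size;
        messages++;
    }

    printf("4096-byte DMA -> %u vfio-user messages\n", messages);  /* 512 */
    return 0;
}
```

Lifting that clamp for indirect regions is exactly the follow-up work Mattias describes in the cover letter.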
Stefan, I hope you had a great vacation! Thanks for updating the mirror and
your review. Your comments all make sense, and I will address your input when
I find time - just a quick ack now since I'm travelling next week and will be
on vacation the first half of August, so it might be a while.

Thanks,
Mattias

On Thu, Jul 20, 2023 at 8:41 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Tue, Jul 04, 2023 at 01:06:24AM -0700, Mattias Nissler wrote:
> > This series adds basic support for message-based DMA in qemu's vfio-user
> > server. This is useful for cases where the client does not provide file
> > descriptors for accessing system memory via memory mappings. My motivating use
> > case is to hook up device models as PCIe endpoints to a hardware design. This
> > works by bridging the PCIe transaction layer to vfio-user, and the endpoint
> > does not access memory directly, but sends memory request TLPs to the hardware
> > design in order to perform DMA.
> >
> > Note that in addition to the 3 commits included, we also need a
> > subprojects/libvfio-user roll to bring in this bugfix:
> > https://github.com/nutanix/libvfio-user/commit/bb308a2e8ee9486a4c8b53d8d773f7c8faaeba08
> > Stefan, can I ask you to kindly update the
> > https://gitlab.com/qemu-project/libvfio-user mirror? I'll be happy to include
> > an update to subprojects/libvfio-user.wrap in this series.
>
> Done:
> https://gitlab.com/qemu-project/libvfio-user/-/commits/master
>
> Repository mirroring is automated now, so new upstream commits will
> appear in the QEMU mirror repository from now on.
>
> > Finally, there is some more work required on top of this series to get
> > message-based DMA to really work well:
> >
> > * libvfio-user has a long-standing issue where socket communication gets messed
> > up when messages are sent from both ends at the same time. See
> > https://github.com/nutanix/libvfio-user/issues/279 for more details. I've
> > been engaging there and plan to contribute a fix.
> >
> > * qemu currently breaks down DMA accesses into chunks of size 8 bytes at
> > maximum, each of which will be handled in a separate vfio-user DMA request
> > message. This is quite terrible for large DMA accesses, such as when nvme
> > reads and writes page-sized blocks for example. Thus, I would like to improve
> > qemu to be able to perform larger accesses, at least for indirect memory
> > regions. I have something working locally, but since this will likely result
> > in more involved surgery and discussion, I am leaving this to be addressed in
> > a separate patch.
> >
> > Mattias Nissler (3):
> >   softmmu: Support concurrent bounce buffers
> >   softmmu: Remove DMA unmap notification callback
> >   vfio-user: Message-based DMA support
> >
> >  hw/remote/vfio-user-obj.c |  62 ++++++++++++++++--
> >  softmmu/dma-helpers.c     |  28 --------
> >  softmmu/physmem.c         | 131 ++++++++------------------
> >  3 files changed, 83 insertions(+), 138 deletions(-)
>
> Sorry for the late review. I was on vacation and am catching up on
> emails.
>
> Paolo worked on the QEMU memory API and can give input on how to make
> this efficient for large DMA accesses. There is a chance that memory
> dispatch with larger sizes will be needed for ENQCMD CPU instruction
> emulation too.
>
> Stefan
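As background on the first commit in the series, "softmmu: Support concurrent bounce buffers": address_space_map() has to bounce-buffer any mapping that does not target directly accessible RAM, and QEMU historically kept a single global bounce buffer, so only one such mapping could be live at a time. A conceptual sketch of the concurrent approach, with hypothetical names and a made-up budget value (the actual patch lives in softmmu/physmem.c and differs in detail):

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Conceptual sketch: instead of one global, single-user bounce buffer,
 * allocate one buffer per mapping and enforce an overall size budget so a
 * misbehaving device model cannot allocate unbounded memory.
 */
static _Atomic size_t bounce_in_flight;          /* bytes currently allocated */
static const size_t bounce_budget = 64 * 1024;   /* hypothetical limit */

void *bounce_buffer_alloc(size_t len)
{
    /* Reserve our bytes first, then check whether we fit in the budget. */
    size_t used = atomic_fetch_add(&bounce_in_flight, len);
    if (used + len > bounce_budget) {
        atomic_fetch_sub(&bounce_in_flight, len);  /* back off */
        return NULL;                               /* caller retries later */
    }
    return malloc(len);
}

void bounce_buffer_free(void *buf, size_t len)
{
    free(buf);
    atomic_fetch_sub(&bounce_in_flight, len);
}
```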