[0/3] Support message-based DMA in vfio-user server

Message ID: 20230704080628.852525-1-mnissler@rivosinc.com

Message

Mattias Nissler July 4, 2023, 8:06 a.m. UTC
This series adds basic support for message-based DMA in QEMU's vfio-user
server. This is useful for cases where the client does not provide file
descriptors for accessing system memory via memory mappings. My motivating use
case is to hook up device models as PCIe endpoints to a hardware design. This
works by bridging the PCIe transaction layer to vfio-user: the endpoint
does not access memory directly, but sends memory request TLPs to the hardware
design in order to perform DMA.
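
To illustrate the flow from the device model's side: instead of reading guest
memory through an mmap, each access becomes a request/reply round trip on the
vfio-user socket. Below is a minimal sketch assuming libvfio-user's SGL
helpers; the exact names (vfu_addr_to_sgl, vfu_sgl_read, ...) have shifted
between library versions, so treat the API details as approximate:

#include <alloca.h>
#include <sys/mman.h>
#include "libvfio-user.h"

/* Sketch only: read `len` bytes of guest memory at `dma_addr` by sending
 * a VFIO_USER_DMA_READ message rather than touching a mapped region. */
static int dma_read_via_message(vfu_ctx_t *vfu_ctx, vfu_dma_addr_t dma_addr,
                                void *buf, size_t len)
{
    dma_sg_t *sg = alloca(dma_sg_size()); /* dma_sg_t is opaque */

    /* Translate the DMA address into an SG entry; when the client attached
     * no fd to the region, the entry is not mappable and must go by
     * message. */
    if (vfu_addr_to_sgl(vfu_ctx, dma_addr, len, sg, 1, PROT_READ) < 0) {
        return -1;
    }

    /* Sends the DMA read request and blocks for the reply payload. */
    return vfu_sgl_read(vfu_ctx, sg, 1, buf);
}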

Note that in addition to the 3 commits included, we also need a
subprojects/libvfio-user roll to bring in this bugfix:
https://github.com/nutanix/libvfio-user/commit/bb308a2e8ee9486a4c8b53d8d773f7c8faaeba08
Stefan, can I ask you to kindly update the
https://gitlab.com/qemu-project/libvfio-user mirror? I'll be happy to include
an update to subprojects/libvfio-user.wrap in this series.

Finally, there is some more work required on top of this series to get
message-based DMA to really work well:

* libvfio-user has a long-standing issue where socket communication gets messed
  up when messages are sent from both ends at the same time. See
  https://github.com/nutanix/libvfio-user/issues/279 for more details. I've
  been engaging there and plan to contribute a fix.

* QEMU currently breaks down DMA accesses into chunks of at most 8 bytes,
  each of which is handled in a separate vfio-user DMA request message. This
  is quite terrible for large DMA accesses, such as when NVMe reads and
  writes page-sized blocks. Thus, I would like to improve QEMU to be able to
  perform larger accesses, at least for indirect memory regions. I have
  something working locally, but since this will likely result in more
  involved surgery and discussion, I am leaving this to be addressed in a
  separate patch. (A sketch of where the 8-byte limit comes from follows
  after this list.)
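
For context, the 8-byte ceiling is not vfio-user specific: indirect regions
are dispatched through MemoryRegionOps callbacks, and QEMU's memory core
clamps each callback to the ops' max_access_size (at most 8, since the
dispatch works on uint64_t values), so a 4 KiB DMA transfer arrives as 512
separate callbacks. A hypothetical region illustrating the shape; the
my_region_* functions and message_dma_* helpers are made up for
illustration:

#include "qemu/osdep.h"
#include "exec/memory.h"

/* Each callback is invoked with size <= max_access_size; in the vfio-user
 * server, every invocation currently becomes one DMA request message. */
static uint64_t my_region_read(void *opaque, hwaddr addr, unsigned size)
{
    uint64_t val = 0;
    message_dma_read(opaque, addr, &val, size); /* hypothetical helper */
    return val;
}

static void my_region_write(void *opaque, hwaddr addr, uint64_t val,
                            unsigned size)
{
    message_dma_write(opaque, addr, val, size); /* hypothetical helper */
}

static const MemoryRegionOps my_region_ops = {
    .read = my_region_read,
    .write = my_region_write,
    .endianness = DEVICE_LITTLE_ENDIAN,
    /* The dispatch loop splits any larger access into pieces of at most
     * max_access_size bytes, hence the 8-byte chunking today. */
    .valid = { .min_access_size = 1, .max_access_size = 8 },
    .impl  = { .min_access_size = 1, .max_access_size = 8 },
};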

Mattias Nissler (3):
  softmmu: Support concurrent bounce buffers
  softmmu: Remove DMA unmap notification callback
  vfio-user: Message-based DMA support

 hw/remote/vfio-user-obj.c |  62 ++++++++++++++++--
 softmmu/dma-helpers.c     |  28 --------
 softmmu/physmem.c         | 131 ++++++++------------------------------
 3 files changed, 83 insertions(+), 138 deletions(-)

Comments

David Hildenbrand July 4, 2023, 8:20 a.m. UTC | #1
On 04.07.23 10:06, Mattias Nissler wrote:
> This series adds basic support for message-based DMA in QEMU's vfio-user
> server. This is useful for cases where the client does not provide file
> descriptors for accessing system memory via memory mappings. My motivating use
> case is to hook up device models as PCIe endpoints to a hardware design. This
> works by bridging the PCIe transaction layer to vfio-user: the endpoint
> does not access memory directly, but sends memory request TLPs to the hardware
> design in order to perform DMA.
> 
> [...]
> 

I remember asking Stefan in the past whether there would be a way to avoid 
that mmap dance (and also handle uffd etc. more easily) for vhost-user 
(especially virtiofsd) by only letting QEMU access guest memory.

That could make memory-backend-ram support something like vhost-user, 
avoiding shared memory and everything that comes with it (e.g., no 
KSM, no shared zeropage).

So this series tackles vfio-user; does anybody know what it would take 
to get something similar running for vhost-user?
Stefan Hajnoczi July 20, 2023, 6:41 p.m. UTC | #2
On Tue, Jul 04, 2023 at 01:06:24AM -0700, Mattias Nissler wrote:
> [...]
> 
> Note that in addition to the 3 commits included, we also need a
> subprojects/libvfio-user roll to bring in this bugfix:
> https://github.com/nutanix/libvfio-user/commit/bb308a2e8ee9486a4c8b53d8d773f7c8faaeba08
> Stefan, can I ask you to kindly update the
> https://gitlab.com/qemu-project/libvfio-user mirror? I'll be happy to include
> an update to subprojects/libvfio-user.wrap in this series.

Done:
https://gitlab.com/qemu-project/libvfio-user/-/commits/master

Repository mirroring is automated now, so new upstream commits will
appear in the QEMU mirror repository from now on.

> [...]
> 
> * QEMU currently breaks down DMA accesses into chunks of at most 8 bytes,
>   each of which is handled in a separate vfio-user DMA request message. This
>   is quite terrible for large DMA accesses, such as when NVMe reads and
>   writes page-sized blocks. Thus, I would like to improve QEMU to be able to
>   perform larger accesses, at least for indirect memory regions. I have
>   something working locally, but since this will likely result in more
>   involved surgery and discussion, I am leaving this to be addressed in a
>   separate patch.

Sorry for the late review. I was on vacation and am catching up on
emails.

Paolo worked on the QEMU memory API and can give input on how to make
this efficient for large DMA accesses. There is a chance that memory
dispatch with larger sizes will be needed for ENQCMD CPU instruction
emulation too.

Stefan
Mattias Nissler July 20, 2023, 10:10 p.m. UTC | #3
Stefan,

I hope you had a great vacation!

Thanks for updating the mirror and your review. Your comments all make
sense, and I will address your input when I find time - just a quick
ack now since I'm travelling next week and will be on vacation the
first half of August, so it might be a while.

Thanks,
Mattias

On Thu, Jul 20, 2023 at 8:41 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> [...]