[RFC,0/6] KVM: x86: async PF user

Message ID 20241118123948.4796-1-kalyazin@amazon.com (mailing list archive)

Nikita Kalyazin Nov. 18, 2024, 12:39 p.m. UTC
Async PF [1] allows other processes to run on a vCPU while the host
handles a stage-2 fault caused by a process on that vCPU. When using
VM-exit-based stage-2 fault handling [2], async PF functionality is lost
because KVM does not run the vCPU while a fault is being handled, so no
other process can execute on the vCPU. This patch series extends
VM-exit-based stage-2 fault handling with async PF support by letting
userspace, rather than the kernel, handle the faults, hence the "async
PF user" name.

I circulated the idea with Paolo, Sean, David H, and James H at LPC,
and the only concern I heard was about injecting the "page not present"
event via a #PF exception in the CoCo case, where it may not work. In my
implementation, I reused the existing code for doing that, so the async
PF user implementation is on par with the present async PF
implementation in this regard, and support for the CoCo case can be
added separately.

Please note that this series is applied on top of the VM-exit-based
stage-2 fault handling RFC [2].

Implementation

The following workflow is implemented:
 - A process in the guest causes a stage-2 fault.
 - KVM checks whether the fault can be handled asynchronously. If it
   can, KVM prepares VM exit info with the newly added "async PF flag"
   set and an async PF token value corresponding to the fault.
 - Userspace reads the VM exit info and resumes the vCPU immediately.
   Meanwhile it processes the fault.
 - When the fault is resolved, userspace calls a new async ioctl using
   the token to notify KVM.
 - KVM communicates to the guest that the process can be resumed.
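
For illustration only, the token bookkeeping implied by the steps above
can be sketched in plain C as below. This is a standalone simulation,
not code from the series: every name (sketch_vcpu, sketch_exit_info,
sketch_fault, sketch_ready) and the fixed-size pending table are
assumptions made up for this example, and the real exit info layout and
ioctl are defined by the patches themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; none are taken from the series. */
#define SKETCH_MAX_PENDING 8

/* Stand-in for the VM exit info that userspace reads. */
struct sketch_exit_info {
	bool     async_pf;	/* the "async PF flag" */
	uint32_t token;		/* identifies the outstanding fault, 0 = none */
	uint64_t gpa;		/* faulting guest physical address */
};

/* Stand-in for the per-vCPU state kept on the KVM side. */
struct sketch_vcpu {
	uint32_t next_token;
	uint32_t pending[SKETCH_MAX_PENDING];	/* 0 marks a free slot */
	uint32_t ready_token;	/* last token reported to the guest as ready */
};

/*
 * KVM side: on a stage-2 fault, decide whether it can be handled
 * asynchronously and, if so, record a token and raise the flag.
 */
static struct sketch_exit_info sketch_fault(struct sketch_vcpu *v,
					     uint64_t gpa, bool can_async)
{
	struct sketch_exit_info info = { .async_pf = false, .token = 0, .gpa = gpa };

	assert(v != NULL);
	if (!can_async)
		return info;	/* synchronous exit: vCPU stays stopped */

	for (size_t i = 0; i < SKETCH_MAX_PENDING; i++) {
		if (v->pending[i] != 0)
			continue;
		if (++v->next_token == 0)	/* keep 0 reserved for "none" */
			v->next_token = 1;
		v->pending[i] = v->next_token;
		info.async_pf = true;
		info.token = v->pending[i];
		return info;
	}
	return info;	/* table full: fall back to a synchronous exit */
}

/*
 * KVM side of the notification: userspace reports that the fault
 * identified by the token is resolved, so the guest can be told to
 * resume the waiting process. Returns 0 on success, -1 if the token
 * is not outstanding.
 */
static int sketch_ready(struct sketch_vcpu *v, uint32_t token)
{
	assert(v != NULL);
	if (token == 0)
		return -1;
	for (size_t i = 0; i < SKETCH_MAX_PENDING; i++) {
		if (v->pending[i] == token) {
			v->pending[i] = 0;
			v->ready_token = token;
			return 0;
		}
	}
	return -1;
}
```

In this sketch, userspace would call sketch_fault() in place of reading
the exit info, resume the vCPU straight away when async_pf is set, and
call sketch_ready() with the same token once the page is populated. A
repeated or unknown token is rejected, which is one plausible way to
guard against double completion.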

Notes:
 - No changes to the x86 async PF PV interface are required
 - The series does not introduce new dependencies on x86 compared to the
   existing async PF

Testing

Inspired by [3], I built a Firecracker-based setup in which Firecracker
implemented the VM-exit-based fault handling. I observed that a workload
consisting of a CPU-bound thread and a memory-bound thread running
concurrently executed faster with async PF user enabled: with 10 ms-long
fault processing, it was 26% faster.

It is difficult to provide an objective performance comparison between
async PF kernel and async PF user, because async PF user can only work
with VM-exit-based fault handling, which has its own performance
characteristics compared to in-kernel fault handling or UserfaultFD.

The patch series is built on top of the VM-exit-based stage-2 fault
handling RFC [2].

Patch 1 updates documentation to reflect the changes in [2].
Patches 2-6 add the implementation of async PF user.

Questions:
 - Are there any general concerns about the approach?
 - Can we leave the CoCo use case aside for now, or do we need to
   support it straight away?
 - What is the desired level of coupling between async PF and async PF
   user? For now, I kept the coupling to the bare minimum (only the
   PV-related data structure is shared between the two).

[1] https://kvm-forum.qemu.org/2021/sdei_apf_for_arm64_gavin.pdf
[2] https://lore.kernel.org/kvm/CADrL8HUHRMwUPhr7jLLBgD9YLFAnVHc=N-C=8er-x6GUtV97pQ@mail.gmail.com/T/
[3] https://lore.kernel.org/all/20200508032919.52147-1-gshan@redhat.com/

Nikita

Nikita Kalyazin (6):
  Documentation: KVM: add userfault KVM exit flag
  Documentation: KVM: add async pf user doc
  KVM: x86: add async ioctl support
  KVM: trace events: add type argument to async pf
  KVM: x86: async_pf_user: add infrastructure
  KVM: x86: async_pf_user: hook to fault handling and add ioctl

 Documentation/virt/kvm/api.rst  |  35 ++++++
 arch/x86/include/asm/kvm_host.h |  12 +-
 arch/x86/kvm/Kconfig            |   7 ++
 arch/x86/kvm/lapic.c            |   2 +
 arch/x86/kvm/mmu/mmu.c          |  68 ++++++++++-
 arch/x86/kvm/x86.c              | 101 +++++++++++++++-
 arch/x86/kvm/x86.h              |   2 +
 include/linux/kvm_host.h        |  30 +++++
 include/linux/kvm_types.h       |   1 +
 include/trace/events/kvm.h      |  50 +++++---
 include/uapi/linux/kvm.h        |  12 +-
 virt/kvm/Kconfig                |   3 +
 virt/kvm/Makefile.kvm           |   1 +
 virt/kvm/async_pf.c             |   2 +-
 virt/kvm/async_pf_user.c        | 197 ++++++++++++++++++++++++++++++++
 virt/kvm/async_pf_user.h        |  24 ++++
 virt/kvm/kvm_main.c             |  14 +++
 17 files changed, 535 insertions(+), 26 deletions(-)
 create mode 100644 virt/kvm/async_pf_user.c
 create mode 100644 virt/kvm/async_pf_user.h


base-commit: 15f01813426bf9672e2b24a5bac7b861c25de53b