| Message ID | 20241204191349.1730936-7-jthoughton@google.com (mailing list archive) |
|---|---|
| State | New |
| Series | KVM: Introduce KVM Userfault |
Hi James,

On Wed, Dec 04, 2024 at 07:13:41PM +0000, James Houghton wrote:
> Adhering to the requirements of KVM Userfault:
>
> 1. When it is toggled (either on or off), zap the second stage with
>    kvm_arch_flush_shadow_memslot(). This is to (1) respect
>    userfault-ness and (2) to reconstruct block mappings.
> 2. While KVM_MEM_USERFAULT is enabled, restrict new second-stage mappings
>    to be PAGE_SIZE, just like when dirty logging is enabled.
>
> Signed-off-by: James Houghton <jthoughton@google.com>
> ---
> I'm not 100% sure if kvm_arch_flush_shadow_memslot() is correct in
> this case (like if the host does not have S2FWB).

Invalidating the stage-2 entries is of course necessary for correctness
on the !USERFAULT -> USERFAULT transition, and the MMU will do the right
thing regardless of whether hardware implements FEAT_S2FWB.

What I think you may be getting at is that the *performance* implications
are quite worrying without FEAT_S2FWB, due to the storm of CMOs, and I'd
definitely agree with that.

> @@ -2062,6 +2069,20 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>  			       enum kvm_mr_change change)
>  {
>  	bool log_dirty_pages = new && new->flags & KVM_MEM_LOG_DIRTY_PAGES;
> +	u32 changed_flags = (new ? new->flags : 0) ^ (old ? old->flags : 0);
> +
> +	/*
> +	 * If KVM_MEM_USERFAULT changed, drop all the stage-2 mappings so that
> +	 * we can (1) respect userfault-ness or (2) create block mappings.
> +	 */
> +	if ((changed_flags & KVM_MEM_USERFAULT) && change == KVM_MR_FLAGS_ONLY)
> +		kvm_arch_flush_shadow_memslot(kvm, old);

I'd strongly prefer that we make (2) a userspace problem and don't
eagerly invalidate stage-2 mappings on the USERFAULT -> !USERFAULT
change. Having implied user-visible behaviors on ioctls is never good,
and for systems without FEAT_S2FWB you might be better off avoiding the
unmap in the first place.

So, if userspace decides there's a benefit to invalidating the stage-2
MMU, it can just delete + recreate the memslot.
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index ead632ad01b4..d89b4088b580 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -38,6 +38,7 @@ menuconfig KVM
 	select HAVE_KVM_VCPU_RUN_PID_CHANGE
 	select SCHED_INFO
 	select GUEST_PERF_EVENTS if PERF_EVENTS
+	select HAVE_KVM_USERFAULT
 	help
 	  Support hosting virtualized guest machines.
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index a71fe6f6bd90..53cee0bacb75 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1482,7 +1482,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 * logging_active is guaranteed to never be true for VM_PFNMAP
 	 * memslots.
 	 */
-	if (logging_active) {
+	if (logging_active || kvm_memslot_userfault(memslot)) {
 		force_pte = true;
 		vma_shift = PAGE_SHIFT;
 	} else {
@@ -1571,6 +1571,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	mmu_seq = vcpu->kvm->mmu_invalidate_seq;
 	mmap_read_unlock(current->mm);
 
+	if (kvm_gfn_userfault(kvm, memslot, gfn)) {
+		kvm_prepare_memory_fault_exit(vcpu, gfn << PAGE_SHIFT,
+					      PAGE_SIZE, write_fault,
+					      exec_fault, false, true);
+		return -EFAULT;
+	}
+
 	pfn = __kvm_faultin_pfn(memslot, gfn, write_fault ? FOLL_WRITE : 0,
 				&writable, &page);
 	if (pfn == KVM_PFN_ERR_HWPOISON) {
@@ -2062,6 +2069,20 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 			       enum kvm_mr_change change)
 {
 	bool log_dirty_pages = new && new->flags & KVM_MEM_LOG_DIRTY_PAGES;
+	u32 changed_flags = (new ? new->flags : 0) ^ (old ? old->flags : 0);
+
+	/*
+	 * If KVM_MEM_USERFAULT changed, drop all the stage-2 mappings so that
+	 * we can (1) respect userfault-ness or (2) create block mappings.
+	 */
+	if ((changed_flags & KVM_MEM_USERFAULT) && change == KVM_MR_FLAGS_ONLY)
+		kvm_arch_flush_shadow_memslot(kvm, old);
+
+	/*
+	 * Nothing left to do if not toggling dirty logging.
+	 */
+	if (!(changed_flags & KVM_MEM_LOG_DIRTY_PAGES))
+		return;
 
 	/*
 	 * At this point memslot has been committed and there is an
Adhering to the requirements of KVM Userfault:

1. When it is toggled (either on or off), zap the second stage with
   kvm_arch_flush_shadow_memslot(). This is to (1) respect
   userfault-ness and (2) to reconstruct block mappings.
2. While KVM_MEM_USERFAULT is enabled, restrict new second-stage mappings
   to be PAGE_SIZE, just like when dirty logging is enabled.

Signed-off-by: James Houghton <jthoughton@google.com>
---
I'm not 100% sure if kvm_arch_flush_shadow_memslot() is correct in
this case (like if the host does not have S2FWB).
---
 arch/arm64/kvm/Kconfig |  1 +
 arch/arm64/kvm/mmu.c   | 23 ++++++++++++++++++++++-
 2 files changed, 23 insertions(+), 1 deletion(-)