Message ID | 20230729013535.1070024-17-seanjc@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915/gvt: KVM: KVMGT fixes and page-track cleanups | expand |
On 2023/7/29 09:35, Sean Christopherson wrote: > Disallow moving memslots if the VM has external page-track users, i.e. if > KVMGT is being used to expose a virtual GPU to the guest, as KVMGT doesn't > correctly handle moving memory regions. > > Note, this is potential ABI breakage! E.g. userspace could move regions > that aren't shadowed by KVMGT without harming the guest. However, the > only known user of KVMGT is QEMU, and QEMU doesn't move generic memory This change breaks two kvm selftests: - set_memory_region_test; - memslot_perf_test; Please help confirm if the tests/doc needs to be updated, or if the assumption needs to be further clarified. > regions. KVM's own support for moving memory regions was also broken for > multiple years (albeit for an edge case, but arguably moving RAM is > itself an edge case), e.g. see commit edd4fa37baa6 ("KVM: x86: Allocate > new rmap and large page tracking when moving memslot"). > > Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> > Tested-by: Yongwei Ma <yongwei.ma@intel.com> > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/include/asm/kvm_page_track.h | 3 +++ > arch/x86/kvm/mmu/page_track.c | 5 +++++ > arch/x86/kvm/x86.c | 7 +++++++ > 3 files changed, 15 insertions(+) > > diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h > index 8c4d216e3b2b..f744682648e7 100644 > --- a/arch/x86/include/asm/kvm_page_track.h > +++ b/arch/x86/include/asm/kvm_page_track.h > @@ -75,4 +75,7 @@ kvm_page_track_unregister_notifier(struct kvm *kvm, > void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new, > int bytes); > void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot); > + > +bool kvm_page_track_has_external_user(struct kvm *kvm); > + > #endif > diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c > index 891e5cc52b45..e6de9638e560 100644 > --- a/arch/x86/kvm/mmu/page_track.c > +++ b/arch/x86/kvm/mmu/page_track.c > @@ -303,3 +303,8 @@ void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot) > n->track_flush_slot(kvm, slot, n); > srcu_read_unlock(&head->track_srcu, idx); > } > + > +bool kvm_page_track_has_external_user(struct kvm *kvm) > +{ > + return hlist_empty(&kvm->arch.track_notifier_head.track_notifier_list); > +} > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 059571d5abed..4394bb49051f 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -12606,6 +12606,13 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, > struct kvm_memory_slot *new, > enum kvm_mr_change change) > { > + /* > + * KVM doesn't support moving memslots when there are external page > + * trackers attached to the VM, i.e. if KVMGT is in use. > + */ > + if (change == KVM_MR_MOVE && kvm_page_track_has_external_user(kvm)) > + return -EINVAL; > + > if (change == KVM_MR_CREATE || change == KVM_MR_MOVE) { > if ((new->base_gfn + new->npages - 1) > kvm_mmu_max_gfn()) > return -EINVAL;
On Wed, Aug 30, 2023, Like Xu wrote: > On 2023/7/29 09:35, Sean Christopherson wrote: > > Disallow moving memslots if the VM has external page-track users, i.e. if > > KVMGT is being used to expose a virtual GPU to the guest, as KVMGT doesn't > > correctly handle moving memory regions. > > > > Note, this is potential ABI breakage! E.g. userspace could move regions > > that aren't shadowed by KVMGT without harming the guest. However, the > > only known user of KVMGT is QEMU, and QEMU doesn't move generic memory > > This change breaks two kvm selftests: > > - set_memory_region_test; > - memslot_perf_test; It shoudn't. As of this patch, KVM doesn't register itself as a page-track user, i.e. KVMGT is the only remaining caller to kvm_page_track_register_notifier(). Unless I messed up, the only way kvm_page_track_has_external_user() can return true is if KVMGT is attached to the VM. The selftests most definitely don't do anything with KVMGT, so I don't see how they can fail. Are you seeing actually failures? > Please help confirm if the tests/doc needs to be updated, > or if the assumption needs to be further clarified. What assumption? > > regions. KVM's own support for moving memory regions was also broken for > > multiple years (albeit for an edge case, but arguably moving RAM is > > itself an edge case), e.g. see commit edd4fa37baa6 ("KVM: x86: Allocate > > new rmap and large page tracking when moving memslot"). > > > > Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> > > Tested-by: Yongwei Ma <yongwei.ma@intel.com> > > Signed-off-by: Sean Christopherson <seanjc@google.com> > > --- > > arch/x86/include/asm/kvm_page_track.h | 3 +++ > > arch/x86/kvm/mmu/page_track.c | 5 +++++ > > arch/x86/kvm/x86.c | 7 +++++++ > > 3 files changed, 15 insertions(+) > > > > diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h > > index 8c4d216e3b2b..f744682648e7 100644 > > --- a/arch/x86/include/asm/kvm_page_track.h > > +++ b/arch/x86/include/asm/kvm_page_track.h > > @@ -75,4 +75,7 @@ kvm_page_track_unregister_notifier(struct kvm *kvm, > > void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new, > > int bytes); > > void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot); > > + > > +bool kvm_page_track_has_external_user(struct kvm *kvm); > > + > > #endif > > diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c > > index 891e5cc52b45..e6de9638e560 100644 > > --- a/arch/x86/kvm/mmu/page_track.c > > +++ b/arch/x86/kvm/mmu/page_track.c > > @@ -303,3 +303,8 @@ void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot) > > n->track_flush_slot(kvm, slot, n); > > srcu_read_unlock(&head->track_srcu, idx); > > } > > + > > +bool kvm_page_track_has_external_user(struct kvm *kvm) > > +{ > > + return hlist_empty(&kvm->arch.track_notifier_head.track_notifier_list); > > +} > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index 059571d5abed..4394bb49051f 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -12606,6 +12606,13 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, > > struct kvm_memory_slot *new, > > enum kvm_mr_change change) > > { > > + /* > > + * KVM doesn't support moving memslots when there are external page > > + * trackers attached to the VM, i.e. if KVMGT is in use. > > + */ > > + if (change == KVM_MR_MOVE && kvm_page_track_has_external_user(kvm)) > > + return -EINVAL; > > + > > if (change == KVM_MR_CREATE || change == KVM_MR_MOVE) { > > if ((new->base_gfn + new->npages - 1) > kvm_mmu_max_gfn()) > > return -EINVAL;
On 31/8/2023 4:50 am, Sean Christopherson wrote: > On Wed, Aug 30, 2023, Like Xu wrote: >> On 2023/7/29 09:35, Sean Christopherson wrote: >>> Disallow moving memslots if the VM has external page-track users, i.e. if >>> KVMGT is being used to expose a virtual GPU to the guest, as KVMGT doesn't >>> correctly handle moving memory regions. >>> >>> Note, this is potential ABI breakage! E.g. userspace could move regions >>> that aren't shadowed by KVMGT without harming the guest. However, the >>> only known user of KVMGT is QEMU, and QEMU doesn't move generic memory >> >> This change breaks two kvm selftests: >> >> - set_memory_region_test; >> - memslot_perf_test; > > It shoudn't. As of this patch, KVM doesn't register itself as a page-track user, > i.e. KVMGT is the only remaining caller to kvm_page_track_register_notifier(). > Unless I messed up, the only way kvm_page_track_has_external_user() can return > true is if KVMGT is attached to the VM. The selftests most definitely don't do > anything with KVMGT, so I don't see how they can fail. > > Are you seeing actually failures? $ set_memory_region_test Testing KVM_RUN with zero added memory regions Allowed number of memory slots: 32764 Adding slots 0..32763, each memory region with 2048K size Testing MOVE of in-use region, 10 loops ==== Test Assertion Failure ==== lib/kvm_util.c:1163: !ret pid=52788 tid=52788 errno=22 - Invalid argument 1 0x0000000000405ede: vm_mem_region_move at kvm_util.c:1161 2 0x000000000040272a: test_move_memory_region at set_memory_region_test.c:195 3 (inlined by) main at set_memory_region_test.c:412 4 0x00007f087423ad84: ?? ??:0 5 0x00000000004029ed: _start at ??:? KVM_SET_USER_MEMORY_REGION failed ret: -1 errno: 22 slot: 10 new_gpa: 0xbffff000 $ memslot_perf_test Testing map performance with 1 runs, 5 seconds each Memslot count too high for this test, decrease the cap (max is 8209) Testing unmap performance with 1 runs, 5 seconds each Test took 1.698964001s for slot setup + 5.020164088s all iterations Done 43 iterations, avg 0.116748002s each Best runtime result was 0.116748002s per iteration (with 43 iterations) Testing unmap chunked performance with 1 runs, 5 seconds each Test took 1.709885279s for slot setup + 5.028875257s all iterations Done 44 iterations, avg 0.114292619s each Best runtime result was 0.114292619s per iteration (with 44 iterations) Testing move active area performance with 1 runs, 5 seconds each ==== Test Assertion Failure ==== lib/kvm_util.c:1163: !ret pid=52779 tid=52779 errno=22 - Invalid argument 1 0x0000000000406b4e: vm_mem_region_move at kvm_util.c:1161 2 0x0000000000403686: test_memslot_move_loop at memslot_perf_test.c:624 3 0x0000000000402c1c: test_execute at memslot_perf_test.c:828 4 (inlined by) test_loop at memslot_perf_test.c:1039 5 (inlined by) main at memslot_perf_test.c:1115 6 0x00007fe01cc3ad84: ?? ??:0 7 0x0000000000402fdd: _start at ??:? KVM_SET_USER_MEMORY_REGION failed ret: -1 errno: 22 slot: 32763 new_gpa: 0x30010000 At one point I wondered if some of the less common kconfig's were enabled, but the above two test failures could be easily fixed with the following diff: diff --git a/arch/x86/kvm/mmu/page_track.h b/arch/x86/kvm/mmu/page_track.h index 62f98c6c5af3..d4d72ed999b1 100644 --- a/arch/x86/kvm/mmu/page_track.h +++ b/arch/x86/kvm/mmu/page_track.h @@ -32,7 +32,7 @@ void kvm_page_track_delete_slot(struct kvm *kvm, struct kvm_memory_slot *slot); static inline bool kvm_page_track_has_external_user(struct kvm *kvm) { - return hlist_empty(&kvm->arch.track_notifier_head.track_notifier_list); + return !hlist_empty(&kvm->arch.track_notifier_head.track_notifier_list); } #else static inline int kvm_page_track_init(struct kvm *kvm) { return 0; } , so I guess it's pretty obvious what's going on here. > >> Please help confirm if the tests/doc needs to be updated, >> or if the assumption needs to be further clarified. > > What assumption? > >>> regions. KVM's own support for moving memory regions was also broken for >>> multiple years (albeit for an edge case, but arguably moving RAM is >>> itself an edge case), e.g. see commit edd4fa37baa6 ("KVM: x86: Allocate >>> new rmap and large page tracking when moving memslot"). >>> >>> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> >>> Tested-by: Yongwei Ma <yongwei.ma@intel.com> >>> Signed-off-by: Sean Christopherson <seanjc@google.com> >>> --- >>> arch/x86/include/asm/kvm_page_track.h | 3 +++ >>> arch/x86/kvm/mmu/page_track.c | 5 +++++ >>> arch/x86/kvm/x86.c | 7 +++++++ >>> 3 files changed, 15 insertions(+) >>> >>> diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h >>> index 8c4d216e3b2b..f744682648e7 100644 >>> --- a/arch/x86/include/asm/kvm_page_track.h >>> +++ b/arch/x86/include/asm/kvm_page_track.h >>> @@ -75,4 +75,7 @@ kvm_page_track_unregister_notifier(struct kvm *kvm, >>> void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new, >>> int bytes); >>> void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot); >>> + >>> +bool kvm_page_track_has_external_user(struct kvm *kvm); >>> + >>> #endif >>> diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c >>> index 891e5cc52b45..e6de9638e560 100644 >>> --- a/arch/x86/kvm/mmu/page_track.c >>> +++ b/arch/x86/kvm/mmu/page_track.c >>> @@ -303,3 +303,8 @@ void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot) >>> n->track_flush_slot(kvm, slot, n); >>> srcu_read_unlock(&head->track_srcu, idx); >>> } >>> + >>> +bool kvm_page_track_has_external_user(struct kvm *kvm) >>> +{ >>> + return hlist_empty(&kvm->arch.track_notifier_head.track_notifier_list); >>> +} >>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >>> index 059571d5abed..4394bb49051f 100644 >>> --- a/arch/x86/kvm/x86.c >>> +++ b/arch/x86/kvm/x86.c >>> @@ -12606,6 +12606,13 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, >>> struct kvm_memory_slot *new, >>> enum kvm_mr_change change) >>> { >>> + /* >>> + * KVM doesn't support moving memslots when there are external page >>> + * trackers attached to the VM, i.e. if KVMGT is in use. >>> + */ >>> + if (change == KVM_MR_MOVE && kvm_page_track_has_external_user(kvm)) >>> + return -EINVAL; >>> + >>> if (change == KVM_MR_CREATE || change == KVM_MR_MOVE) { >>> if ((new->base_gfn + new->npages - 1) > kvm_mmu_max_gfn()) >>> return -EINVAL;
On Thu, Aug 31, 2023, Like Xu wrote: > On 31/8/2023 4:50 am, Sean Christopherson wrote: > > On Wed, Aug 30, 2023, Like Xu wrote: > > > On 2023/7/29 09:35, Sean Christopherson wrote: > > > > Disallow moving memslots if the VM has external page-track users, i.e. if > > > > KVMGT is being used to expose a virtual GPU to the guest, as KVMGT doesn't > > > > correctly handle moving memory regions. > > > > > > > > Note, this is potential ABI breakage! E.g. userspace could move regions > > > > that aren't shadowed by KVMGT without harming the guest. However, the > > > > only known user of KVMGT is QEMU, and QEMU doesn't move generic memory > > > > > > This change breaks two kvm selftests: > > > > > > - set_memory_region_test; > > > - memslot_perf_test; > > > > It shoudn't. As of this patch, KVM doesn't register itself as a page-track user, > > i.e. KVMGT is the only remaining caller to kvm_page_track_register_notifier(). > > Unless I messed up, the only way kvm_page_track_has_external_user() can return > > true is if KVMGT is attached to the VM. The selftests most definitely don't do > > anything with KVMGT, so I don't see how they can fail. > > > > Are you seeing actually failures? > > $ set_memory_region_test ... > At one point I wondered if some of the less common kconfig's were enabled, > but the above two test failures could be easily fixed with the following diff: Argh, none of the configs I actually ran selftests on selected CONFIG_KVM_EXTERNAL_WRITE_TRACKING=y. > diff --git a/arch/x86/kvm/mmu/page_track.h b/arch/x86/kvm/mmu/page_track.h > index 62f98c6c5af3..d4d72ed999b1 100644 > --- a/arch/x86/kvm/mmu/page_track.h > +++ b/arch/x86/kvm/mmu/page_track.h > @@ -32,7 +32,7 @@ void kvm_page_track_delete_slot(struct kvm *kvm, struct > kvm_memory_slot *slot); > > static inline bool kvm_page_track_has_external_user(struct kvm *kvm) > { > - return hlist_empty(&kvm->arch.track_notifier_head.track_notifier_list); > + return !hlist_empty(&kvm->arch.track_notifier_head.track_notifier_list); > } > #else > static inline int kvm_page_track_init(struct kvm *kvm) { return 0; } > > , so I guess it's pretty obvious what's going on here. Yes. I'll rerun tests today and get the above posted on your behalf (unless you don't want me to do that). Sorry for yet another screw up, and thanks for testing!
diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h index 8c4d216e3b2b..f744682648e7 100644 --- a/arch/x86/include/asm/kvm_page_track.h +++ b/arch/x86/include/asm/kvm_page_track.h @@ -75,4 +75,7 @@ kvm_page_track_unregister_notifier(struct kvm *kvm, void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new, int bytes); void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot); + +bool kvm_page_track_has_external_user(struct kvm *kvm); + #endif diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c index 891e5cc52b45..e6de9638e560 100644 --- a/arch/x86/kvm/mmu/page_track.c +++ b/arch/x86/kvm/mmu/page_track.c @@ -303,3 +303,8 @@ void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot) n->track_flush_slot(kvm, slot, n); srcu_read_unlock(&head->track_srcu, idx); } + +bool kvm_page_track_has_external_user(struct kvm *kvm) +{ + return hlist_empty(&kvm->arch.track_notifier_head.track_notifier_list); +} diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 059571d5abed..4394bb49051f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -12606,6 +12606,13 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, struct kvm_memory_slot *new, enum kvm_mr_change change) { + /* + * KVM doesn't support moving memslots when there are external page + * trackers attached to the VM, i.e. if KVMGT is in use. + */ + if (change == KVM_MR_MOVE && kvm_page_track_has_external_user(kvm)) + return -EINVAL; + if (change == KVM_MR_CREATE || change == KVM_MR_MOVE) { if ((new->base_gfn + new->npages - 1) > kvm_mmu_max_gfn()) return -EINVAL;