KVM: allow NULL writable argument to __kvm_faultin_pfn

Message ID 20250101064928.389504-1-pbonzini@redhat.com (mailing list archive)
State New

Commit Message

Paolo Bonzini Jan. 1, 2025, 6:49 a.m. UTC
kvm_follow_pfn() is able to work with NULL in the .map_writable field
of the homonymous struct.  But __kvm_faultin_pfn() rejects the combo
despite KVM for e500 trying to use it.  Indeed .map_writable is not
particularly useful if the flags include FOLL_WRITE and readonly
guest memory is not supported, so add support to __kvm_faultin_pfn()
for this case.

Fixes: 1c7b627e9306 ("KVM: Add kvm_faultin_pfn() to specifically service guest page faults")
Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: regressions@lists.linux.dev
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 virt/kvm/kvm_main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Sean Christopherson Jan. 6, 2025, 6:57 p.m. UTC | #1
On Wed, Jan 01, 2025, Paolo Bonzini wrote:
> kvm_follow_pfn() is able to work with NULL in the .map_writable field
> of the homonymous struct.  But __kvm_faultin_pfn() rejects the combo
> despite KVM for e500 trying to use it.  Indeed .map_writable is not
> particularly useful if the flags include FOLL_WRITE and readonly
> guest memory is not supported, so add support to __kvm_faultin_pfn()
> for this case.

I would prefer to keep the sanity check to minimize the risk of a page fault
handler not supporting opportunistic write mappings.  e500 is definitely the
odd one out here.

What about adding a dedicated wrapper for getting a writable PFN?  E.g. (untested)

---
 arch/powerpc/kvm/e500_mmu_host.c | 2 +-
 arch/x86/kvm/vmx/vmx.c           | 3 +--
 include/linux/kvm_host.h         | 8 ++++++++
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index e5a145b578a4..2251bb30b8ec 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -444,7 +444,7 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 
 	if (likely(!pfnmap)) {
 		tsize_pages = 1UL << (tsize + 10 - PAGE_SHIFT);
-		pfn = __kvm_faultin_pfn(slot, gfn, FOLL_WRITE, NULL, &page);
+		pfn = kvm_faultin_writable_pfn(slot, gfn, &page);
 		if (is_error_noslot_pfn(pfn)) {
 			if (printk_ratelimit())
 				pr_err("%s: real page not found for gfn %lx\n",
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 893366e53732..7012b583f2e8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6800,7 +6800,6 @@ void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu)
 	struct page *refcounted_page;
 	unsigned long mmu_seq;
 	kvm_pfn_t pfn;
-	bool writable;
 
 	/* Defer reload until vmcs01 is the current VMCS. */
 	if (is_guest_mode(vcpu)) {
@@ -6836,7 +6835,7 @@ void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu)
 	 * controls the APIC-access page memslot, and only deletes the memslot
 	 * if APICv is permanently inhibited, i.e. the memslot won't reappear.
 	 */
-	pfn = __kvm_faultin_pfn(slot, gfn, FOLL_WRITE, &writable, &refcounted_page);
+	pfn = kvm_faultin_writable_pfn(slot, gfn, &refcounted_page);
 	if (is_error_noslot_pfn(pfn))
 		return;
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c788d0bd952a..b0af7c7f99da 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1287,6 +1287,14 @@ static inline kvm_pfn_t kvm_faultin_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
 				 write ? FOLL_WRITE : 0, writable, refcounted_page);
 }
 
+static inline kvm_pfn_t kvm_faultin_writable_pfn(const struct kvm_memory_slot *slot,
+						 gfn_t gfn, struct page **refcounted_page)
+{
+	bool writable;
+
+	return __kvm_faultin_pfn(slot, gfn, FOLL_WRITE, &writable, refcounted_page);
+}
+
 int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 			int len);
 int kvm_read_guest(struct kvm *kvm, gpa_t gpa, void *data, unsigned long len);

base-commit: 2c3412e999738bfd60859c493ff47f5c268814a3
--

Sean Christopherson Jan. 8, 2025, 2:41 p.m. UTC | #2
On Mon, Jan 06, 2025, Sean Christopherson wrote:
> On Wed, Jan 01, 2025, Paolo Bonzini wrote:
> > kvm_follow_pfn() is able to work with NULL in the .map_writable field
> > of the homonymous struct.  But __kvm_faultin_pfn() rejects the combo
> > despite KVM for e500 trying to use it.  Indeed .map_writable is not
> > particularly useful if the flags include FOLL_WRITE and readonly
> > guest memory is not supported, so add support to __kvm_faultin_pfn()
> > for this case.
> 
> I would prefer to keep the sanity check to minimize the risk of a page fault
> handler not supporting opportunistic write mappings.  e500 is definitely the
> odd one out here.

Per a quick chat at PUCK, Paolo is going to try and fix the e500 code to actually
use the @writable param as it's intended.
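
For illustration only, here is a rough userspace sketch of what "using the @writable param" could look like from a caller's point of view: the fault-in helper reports whether the host page may be mapped writable, and the caller picks the mapping permissions from that hint instead of assuming write access. Every name here (model_faultin, model_shadow_perms, the MODEL_PERM_* bits) is hypothetical and is not taken from the e500 code or from any posted fix.

```c
#include <stdbool.h>

/* Hypothetical permission bits for a shadow mapping; not real e500 TLB flags. */
enum { MODEL_PERM_READ = 1 << 0, MODEL_PERM_WRITE = 1 << 1 };

/*
 * Stand-in for the fault-in helper. The real helper resolves a gfn to a pfn;
 * this model only reports, via the out-parameter, whether the host page is
 * writable.
 */
static void model_faultin(bool host_page_writable, bool *writable)
{
	*writable = host_page_writable;
}

/* Caller side: grant write access only if the helper said it is allowed. */
static int model_shadow_perms(bool host_page_writable)
{
	bool writable;
	int perms = MODEL_PERM_READ;

	model_faultin(host_page_writable, &writable);
	if (writable)
		perms |= MODEL_PERM_WRITE;

	return perms;
}
```

In this model a read-only host page yields a read-only mapping, which is the case a caller that passes NULL cannot distinguish.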
Patch

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index de2c11dae231..5177e56fdbd5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2975,10 +2975,11 @@  kvm_pfn_t __kvm_faultin_pfn(const struct kvm_memory_slot *slot, gfn_t gfn,
 		.refcounted_page = refcounted_page,
 	};
 
-	if (WARN_ON_ONCE(!writable || !refcounted_page))
+	if (WARN_ON_ONCE(!refcounted_page))
 		return KVM_PFN_ERR_FAULT;
 
-	*writable = false;
+	if (writable)
+		*writable = false;
 	*refcounted_page = NULL;
 
 	return kvm_follow_pfn(&kfp);
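
As a minimal userspace sketch of the pattern the hunk above introduces (an optional out-parameter that is initialized only when the caller supplies it), the following model may help. It is not kernel code: model_pfn_t, MODEL_PFN_ERR_FAULT, struct model_page and model_faultin_pfn() are hypothetical stand-ins, and the call to kvm_follow_pfn() is replaced by simply returning the PFN that was passed in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types and constants; values are illustrative. */
typedef unsigned long long model_pfn_t;
#define MODEL_PFN_ERR_FAULT ((model_pfn_t)-1)

struct model_page { int unused; };

/* Models the patched check: only refcounted_page is mandatory. */
static model_pfn_t model_faultin_pfn(model_pfn_t backing_pfn, bool *writable,
				     struct model_page **refcounted_page)
{
	if (!refcounted_page)
		return MODEL_PFN_ERR_FAULT;

	/* Initialize the optional out-parameter only when the caller passed one. */
	if (writable)
		*writable = false;
	*refcounted_page = NULL;

	/* Stand-in for kvm_follow_pfn(): hand back the backing PFN unchanged. */
	return backing_pfn;
}

/* Usage: one caller ignores the writable hint, the other asks for it. */
static void model_example(void)
{
	struct model_page *page;
	bool writable;

	/* Caller that passes NULL for writable, as e500 does in the report. */
	assert(model_faultin_pfn(0x1234, NULL, &page) == 0x1234);

	/* Caller that passes both out-parameters. */
	assert(model_faultin_pfn(0x1234, &writable, &page) == 0x1234);
	assert(!writable && page == NULL);
}
```

With the pre-patch check, the first call in model_example() would correspond to the WARN_ON_ONCE() path and return the error PFN instead.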