diff mbox series

[v2,1/5] KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic range

Message ID 20250215011437.1203084-2-seanjc@google.com (mailing list archive)
State New
Headers show
Series KVM: x86/xen: Restrict hypercall MSR index | expand

Commit Message

Sean Christopherson Feb. 15, 2025, 1:14 a.m. UTC
Reject userspace attempts to set the Xen hypercall page MSR to an index
outside of the "standard" virtualization range [0x40000000, 0x4fffffff],
as KVM is not equipped to handle collisions with real MSRs, e.g. KVM
doesn't update MSR interception, conflicts with VMCS/VMCB fields, special
case writes in KVM, etc.

While the MSR index isn't strictly ABI, i.e. can theoretically float to
any value, in practice no known VMM sets the MSR index to anything other
than 0x40000000 or 0x40000200.

Cc: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/uapi/asm/kvm.h | 3 +++
 arch/x86/kvm/xen.c              | 9 +++++++++
 2 files changed, 12 insertions(+)

Comments

David Woodhouse Feb. 15, 2025, 11 a.m. UTC | #1
On 15 February 2025 02:14:33 CET, Sean Christopherson <seanjc@google.com> wrote:
>Reject userspace attempts to set the Xen hypercall page MSR to an index
>outside of the "standard" virtualization range [0x40000000, 0x4fffffff],
>as KVM is not equipped to handle collisions with real MSRs, e.g. KVM
>doesn't update MSR interception, conflicts with VMCS/VMCB fields, special
>case writes in KVM, etc.
>
>While the MSR index isn't strictly ABI, i.e. can theoretically float to
>any value, in practice no known VMM sets the MSR index to anything other
>than 0x40000000 or 0x40000200.
>
>Cc: Joao Martins <joao.m.martins@oracle.com>
>Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
>Reviewed-by: Paul Durrant <paul@xen.org>
>Signed-off-by: Sean Christopherson <seanjc@google.com>
>---
> arch/x86/include/uapi/asm/kvm.h | 3 +++
> arch/x86/kvm/xen.c              | 9 +++++++++
> 2 files changed, 12 insertions(+)
>
>diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
>index 9e75da97bce0..460306b35a4b 100644
>--- a/arch/x86/include/uapi/asm/kvm.h
>+++ b/arch/x86/include/uapi/asm/kvm.h
>@@ -559,6 +559,9 @@ struct kvm_x86_mce {
> #define KVM_XEN_HVM_CONFIG_PVCLOCK_TSC_UNSTABLE	(1 << 7)
> #define KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA	(1 << 8)
> 
>+#define KVM_XEN_MSR_MIN_INDEX			0x40000000u
>+#define KVM_XEN_MSR_MAX_INDEX			0x4fffffffu
>+
> struct kvm_xen_hvm_config {
> 	__u32 flags;
> 	__u32 msr;
>diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
>index a909b817b9c0..5b94825001a7 100644
>--- a/arch/x86/kvm/xen.c
>+++ b/arch/x86/kvm/xen.c
>@@ -1324,6 +1324,15 @@ int kvm_xen_hvm_config(struct kvm *kvm, struct kvm_xen_hvm_config *xhc)
> 	     xhc->blob_size_32 || xhc->blob_size_64))
> 		return -EINVAL;
> 
>+	/*
>+	 * Restrict the MSR to the range that is unofficially reserved for
>+	 * synthetic, virtualization-defined MSRs, e.g. to prevent confusing
>+	 * KVM by colliding with a real MSR that requires special handling.
>+	 */
>+	if (xhc->msr &&
>+	    (xhc->msr < KVM_XEN_MSR_MIN_INDEX || xhc->msr > KVM_XEN_MSR_MAX_INDEX))
>+		return -EINVAL;
>+
> 	mutex_lock(&kvm->arch.xen.xen_lock);
> 
> 	if (xhc->msr && !kvm->arch.xen_hvm_config.msr)

I'd still like to restrict this to ensure it doesn't collide with MSRs that KVM expects to emulate. But that can be a separate patch, as discussed.

This patch should probably have a docs update too.
Sean Christopherson Feb. 18, 2025, 4:33 p.m. UTC | #2
On Sat, Feb 15, 2025, David Woodhouse wrote:
> On 15 February 2025 02:14:33 CET, Sean Christopherson <seanjc@google.com> wrote:
> >diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> >index a909b817b9c0..5b94825001a7 100644
> >--- a/arch/x86/kvm/xen.c
> >+++ b/arch/x86/kvm/xen.c
> >@@ -1324,6 +1324,15 @@ int kvm_xen_hvm_config(struct kvm *kvm, struct kvm_xen_hvm_config *xhc)
> > 	     xhc->blob_size_32 || xhc->blob_size_64))
> > 		return -EINVAL;
> > 
> >+	/*
> >+	 * Restrict the MSR to the range that is unofficially reserved for
> >+	 * synthetic, virtualization-defined MSRs, e.g. to prevent confusing
> >+	 * KVM by colliding with a real MSR that requires special handling.
> >+	 */
> >+	if (xhc->msr &&
> >+	    (xhc->msr < KVM_XEN_MSR_MIN_INDEX || xhc->msr > KVM_XEN_MSR_MAX_INDEX))
> >+		return -EINVAL;
> >+
> > 	mutex_lock(&kvm->arch.xen.xen_lock);
> > 
> > 	if (xhc->msr && !kvm->arch.xen_hvm_config.msr)
> 
> I'd still like to restrict this to ensure it doesn't collide with MSRs that
> KVM expects to emulate. But that can be a separate patch, as discussed.

I think that has to go in userspace.  If KVM adds on-by-default, i.e. unguarded,
conflicting MSR emulation, then KVM will have broken userspace regardless of
whether or not KVM explicitly rejects KVM_XEN_HVM_CONFIG based on emulated MSRs.

If we assume future us are somewhat competent and guard new MSR emulation with a
feature bit, capability, etc., then rejecting KVM_XEN_HVM_CONFIG isn't obviously
better, or even feasible in some cases.  E.g. if the opt-in is done via guest
CPUID, then KVM is stuck because KVM_XEN_HVM_CONFIG can (and generally should?)
be called before vCPUs are even created.  Even if the opt-in is VM-scoped, e.g.
a capabilitiy, there are still ordering issues as userpace would see different
behavior depending on the order between KVM_XEN_HVM_CONFIG and the capability.

And if the MSR emulation is guarded, rejecting KVM_XEN_HVM_CONFIG without a
precise check is undesirable, because KVM would unnecessarily break userspace.

> This patch should probably have a docs update too.

Gah, forgot that.
David Woodhouse Feb. 18, 2025, 6:47 p.m. UTC | #3
On 18 February 2025 17:33:14 CET, Sean Christopherson <seanjc@google.com> wrote:
>On Sat, Feb 15, 2025, David Woodhouse wrote:
>> On 15 February 2025 02:14:33 CET, Sean Christopherson <seanjc@google.com> wrote:
>> >diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
>> >index a909b817b9c0..5b94825001a7 100644
>> >--- a/arch/x86/kvm/xen.c
>> >+++ b/arch/x86/kvm/xen.c
>> >@@ -1324,6 +1324,15 @@ int kvm_xen_hvm_config(struct kvm *kvm, struct kvm_xen_hvm_config *xhc)
>> > 	     xhc->blob_size_32 || xhc->blob_size_64))
>> > 		return -EINVAL;
>> > 
>> >+	/*
>> >+	 * Restrict the MSR to the range that is unofficially reserved for
>> >+	 * synthetic, virtualization-defined MSRs, e.g. to prevent confusing
>> >+	 * KVM by colliding with a real MSR that requires special handling.
>> >+	 */
>> >+	if (xhc->msr &&
>> >+	    (xhc->msr < KVM_XEN_MSR_MIN_INDEX || xhc->msr > KVM_XEN_MSR_MAX_INDEX))
>> >+		return -EINVAL;
>> >+
>> > 	mutex_lock(&kvm->arch.xen.xen_lock);
>> > 
>> > 	if (xhc->msr && !kvm->arch.xen_hvm_config.msr)
>> 
>> I'd still like to restrict this to ensure it doesn't collide with MSRs that
>> KVM expects to emulate. But that can be a separate patch, as discussed.
>
>I think that has to go in userspace.  If KVM adds on-by-default, i.e. unguarded,
>conflicting MSR emulation, then KVM will have broken userspace regardless of
>whether or not KVM explicitly rejects KVM_XEN_HVM_CONFIG based on emulated MSRs.
>
>If we assume future us are somewhat competent and guard new MSR emulation with a
>feature bit, capability, etc., then rejecting KVM_XEN_HVM_CONFIG isn't obviously
>better, or even feasible in some cases.  E.g. if the opt-in is done via guest
>CPUID, then KVM is stuck because KVM_XEN_HVM_CONFIG can (and generally should?)
>be called before vCPUs are even created.  Even if the opt-in is VM-scoped, e.g.
>a capabilitiy, there are still ordering issues as userpace would see different
>behavior depending on the order between KVM_XEN_HVM_CONFIG and the capability.

Well, I just changed QEMU to do KVM_XEN_HVM_CONFIG from the first vCPU init because QEMU doesn't know if it needs to avoid the Hyper-V MSR space at kvm_xen_init() time. But yes, we don't want to depend on ordering either way.
Sean Christopherson Feb. 20, 2025, 6:36 p.m. UTC | #4
On Sat, Feb 15, 2025, David Woodhouse wrote:
> On 15 February 2025 02:14:33 CET, Sean Christopherson <seanjc@google.com> wrote:
> >Reject userspace attempts to set the Xen hypercall page MSR to an index
> >outside of the "standard" virtualization range [0x40000000, 0x4fffffff],
> >as KVM is not equipped to handle collisions with real MSRs, e.g. KVM
> >doesn't update MSR interception, conflicts with VMCS/VMCB fields, special
> >case writes in KVM, etc.
> >
> >While the MSR index isn't strictly ABI, i.e. can theoretically float to
> >any value, in practice no known VMM sets the MSR index to anything other
> >than 0x40000000 or 0x40000200.

...

> This patch should probably have a docs update too.

To avoid sending an entirely new version only to discover I suck at writing docs,
how does this look?

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 2b52eb77e29c..5fe84f2427b5 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1000,6 +1000,10 @@ blobs in userspace.  When the guest writes the MSR, kvm copies one
 page of a blob (32- or 64-bit, depending on the vcpu mode) to guest
 memory.
 
+The MSR index must be in the range [0x40000000, 0x4fffffff], i.e. must reside
+in the range that is unofficially reserved for use by hypervisors.  The min/max
+values are enumerated via KVM_XEN_MSR_MIN_INDEX and KVM_XEN_MSR_MAX_INDEX.
+
 ::
 
   struct kvm_xen_hvm_config {
David Woodhouse Feb. 20, 2025, 7:04 p.m. UTC | #5
On 20 February 2025 19:36:41 CET, Sean Christopherson <seanjc@google.com> wrote:
>On Sat, Feb 15, 2025, David Woodhouse wrote:
>> On 15 February 2025 02:14:33 CET, Sean Christopherson <seanjc@google.com> wrote:
>> >Reject userspace attempts to set the Xen hypercall page MSR to an index
>> >outside of the "standard" virtualization range [0x40000000, 0x4fffffff],
>> >as KVM is not equipped to handle collisions with real MSRs, e.g. KVM
>> >doesn't update MSR interception, conflicts with VMCS/VMCB fields, special
>> >case writes in KVM, etc.
>> >
>> >While the MSR index isn't strictly ABI, i.e. can theoretically float to
>> >any value, in practice no known VMM sets the MSR index to anything other
>> >than 0x40000000 or 0x40000200.
>
>...
>
>> This patch should probably have a docs update too.
>
>To avoid sending an entirely new version only to discover I suck at writing docs,
>how does this look?
>
>diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>index 2b52eb77e29c..5fe84f2427b5 100644
>--- a/Documentation/virt/kvm/api.rst
>+++ b/Documentation/virt/kvm/api.rst
>@@ -1000,6 +1000,10 @@ blobs in userspace.  When the guest writes the MSR, kvm copies one
> page of a blob (32- or 64-bit, depending on the vcpu mode) to guest
> memory.
> 
>+The MSR index must be in the range [0x40000000, 0x4fffffff], i.e. must reside
>+in the range that is unofficially reserved for use by hypervisors.  The min/max
>+values are enumerated via KVM_XEN_MSR_MIN_INDEX and KVM_XEN_MSR_MAX_INDEX.
>+
> ::
> 
>   struct kvm_xen_hvm_config {

LGTM, thanks
diff mbox series

Patch

diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 9e75da97bce0..460306b35a4b 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -559,6 +559,9 @@  struct kvm_x86_mce {
 #define KVM_XEN_HVM_CONFIG_PVCLOCK_TSC_UNSTABLE	(1 << 7)
 #define KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA	(1 << 8)
 
+#define KVM_XEN_MSR_MIN_INDEX			0x40000000u
+#define KVM_XEN_MSR_MAX_INDEX			0x4fffffffu
+
 struct kvm_xen_hvm_config {
 	__u32 flags;
 	__u32 msr;
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index a909b817b9c0..5b94825001a7 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -1324,6 +1324,15 @@  int kvm_xen_hvm_config(struct kvm *kvm, struct kvm_xen_hvm_config *xhc)
 	     xhc->blob_size_32 || xhc->blob_size_64))
 		return -EINVAL;
 
+	/*
+	 * Restrict the MSR to the range that is unofficially reserved for
+	 * synthetic, virtualization-defined MSRs, e.g. to prevent confusing
+	 * KVM by colliding with a real MSR that requires special handling.
+	 */
+	if (xhc->msr &&
+	    (xhc->msr < KVM_XEN_MSR_MIN_INDEX || xhc->msr > KVM_XEN_MSR_MAX_INDEX))
+		return -EINVAL;
+
 	mutex_lock(&kvm->arch.xen.xen_lock);
 
 	if (xhc->msr && !kvm->arch.xen_hvm_config.msr)