diff mbox series

[v2,33/49] KVM: x86: Advertise TSC_DEADLINE_TIMER in KVM_GET_SUPPORTED_CPUID

Message ID 20240517173926.965351-34-seanjc@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86: CPUID overhaul, fixes, and caching | expand

Commit Message

Sean Christopherson May 17, 2024, 5:39 p.m. UTC
Advertise TSC_DEADLINE_TIMER via KVM_GET_SUPPORTED_CPUID when it's
supported in hardware, as the odds of a VMM emulating the local APIC in
userspace, not emulating the TSC deadline timer, _and_ reflecting
KVM_GET_SUPPORTED_CPUID back into KVM_SET_CPUID2 are extremely low.

KVM has _unconditionally_ advertised X2APIC via CPUID since commit
0d1de2d901f4 ("KVM: Always report x2apic as supported feature"), and it
is completely impossible for userspace to emulate X2APIC as KVM doesn't
support forwarding the MSR accesses to userspace.  I.e. KVM has relied on
userspace VMMs to not misreport local APIC capabilities for nearly 13
years.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 Documentation/virt/kvm/api.rst | 9 ++++++---
 arch/x86/kvm/cpuid.c           | 4 ++--
 2 files changed, 8 insertions(+), 5 deletions(-)

Comments

Binbin Wu May 22, 2024, 9:11 a.m. UTC | #1
On 5/18/2024 1:39 AM, Sean Christopherson wrote:
> Advertise TSC_DEADLINE_TIMER via KVM_GET_SUPPORTED_CPUID when it's
> supported in hardware,

But it's using EMUL_F(TSC_DEADLINE_TIMER) below?

>   as the odds of a VMM emulating the local APIC in
> userspace, not emulating the TSC deadline timer, _and_ reflecting
> KVM_GET_SUPPORTED_CPUID back into KVM_SET_CPUID2 are extremely low.
>
> KVM has _unconditionally_ advertised X2APIC via CPUID since commit
> 0d1de2d901f4 ("KVM: Always report x2apic as supported feature"), and it
> is completely impossible for userspace to emulate X2APIC as KVM doesn't
> support forwarding the MSR accesses to userspace.  I.e. KVM has relied on
> userspace VMMs to not misreport local APIC capabilities for nearly 13
> years.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   Documentation/virt/kvm/api.rst | 9 ++++++---
>   arch/x86/kvm/cpuid.c           | 4 ++--
>   2 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 884846282d06..cb744a646de6 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -1804,15 +1804,18 @@ emulate them efficiently. The fields in each entry are defined as follows:
>            the values returned by the cpuid instruction for
>            this function/index combination
>   
> -The TSC deadline timer feature (CPUID leaf 1, ecx[24]) is always returned
> -as false, since the feature depends on KVM_CREATE_IRQCHIP for local APIC
> -support.  Instead it is reported via::
> +x2APIC (CPUID leaf 1, ecx[21) and TSC deadline timer (CPUID leaf 1, ecx[24])
> +may be returned as true, but they depend on KVM_CREATE_IRQCHIP for in-kernel
> +emulation of the local APIC.  TSC deadline timer support is also reported via::
>   
>     ioctl(KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER)
>   
>   if that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
>   feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
>   
> +Enabling x2APIC in KVM_SET_CPUID2 requires KVM_CREATE_IRQCHIP as KVM doesn't
> +support forwarding x2APIC MSR accesses to userspace, i.e. KVM does not support
> +emulating x2APIC in userspace.
>   
>   4.47 KVM_PPC_GET_PVINFO
>   -----------------------
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 699ce4261e9c..d1f427284ccc 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -680,8 +680,8 @@ void kvm_set_cpu_caps(void)
>   		F(FMA) | F(CX16) | 0 /* xTPR Update */ | F(PDCM) |
>   		F(PCID) | 0 /* Reserved, DCA */ | F(XMM4_1) |
>   		F(XMM4_2) | EMUL_F(X2APIC) | F(MOVBE) | F(POPCNT) |
> -		0 /* Reserved*/ | F(AES) | F(XSAVE) | 0 /* OSXSAVE */ | F(AVX) |
> -		F(F16C) | F(RDRAND)
> +		EMUL_F(TSC_DEADLINE_TIMER) | F(AES) | F(XSAVE) |
> +		0 /* OSXSAVE */ | F(AVX) | F(F16C) | F(RDRAND)
>   	);
>   
>   	kvm_cpu_cap_init(CPUID_1_EDX,
Sean Christopherson May 28, 2024, 3:21 p.m. UTC | #2
On Wed, May 22, 2024, Binbin Wu wrote:
> 
> 
> On 5/18/2024 1:39 AM, Sean Christopherson wrote:
> > Advertise TSC_DEADLINE_TIMER via KVM_GET_SUPPORTED_CPUID when it's
> > supported in hardware,
> 
> But it's using EMUL_F(TSC_DEADLINE_TIMER) below?

Doh, yeah, the changelog is wrong.  KVM always emulates TSC_DEADLINE_TIMER.

Thanks!
Maxim Levitsky July 5, 2024, 2:04 a.m. UTC | #3
On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> Advertise TSC_DEADLINE_TIMER via KVM_GET_SUPPORTED_CPUID when it's
> supported in hardware, as the odds of a VMM emulating the local APIC in
> userspace, not emulating the TSC deadline timer, _and_ reflecting
> KVM_GET_SUPPORTED_CPUID back into KVM_SET_CPUID2 are extremely low.
> 
> KVM has _unconditionally_ advertised X2APIC via CPUID since commit
> 0d1de2d901f4 ("KVM: Always report x2apic as supported feature"), and it
> is completely impossible for userspace to emulate X2APIC as KVM doesn't
> support forwarding the MSR accesses to userspace.  I.e. KVM has relied on
> userspace VMMs to not misreport local APIC capabilities for nearly 13
> years.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  Documentation/virt/kvm/api.rst | 9 ++++++---
>  arch/x86/kvm/cpuid.c           | 4 ++--
>  2 files changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 884846282d06..cb744a646de6 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -1804,15 +1804,18 @@ emulate them efficiently. The fields in each entry are defined as follows:
>           the values returned by the cpuid instruction for
>           this function/index combination
>  
> -The TSC deadline timer feature (CPUID leaf 1, ecx[24]) is always returned
> -as false, since the feature depends on KVM_CREATE_IRQCHIP for local APIC
> -support.  Instead it is reported via::
> +x2APIC (CPUID leaf 1, ecx[21) and TSC deadline timer (CPUID leaf 1, ecx[24])
> +may be returned as true, but they depend on KVM_CREATE_IRQCHIP for in-kernel
> +emulation of the local APIC.  TSC deadline timer support is also reported via::
>  
>    ioctl(KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER)
>  
>  if that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
>  feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
>  
> +Enabling x2APIC in KVM_SET_CPUID2 requires KVM_CREATE_IRQCHIP as KVM doesn't
> +support forwarding x2APIC MSR accesses to userspace, i.e. KVM does not support
> +emulating x2APIC in userspace.
>  
>  4.47 KVM_PPC_GET_PVINFO
>  -----------------------
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 699ce4261e9c..d1f427284ccc 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -680,8 +680,8 @@ void kvm_set_cpu_caps(void)
>  		F(FMA) | F(CX16) | 0 /* xTPR Update */ | F(PDCM) |
>  		F(PCID) | 0 /* Reserved, DCA */ | F(XMM4_1) |
>  		F(XMM4_2) | EMUL_F(X2APIC) | F(MOVBE) | F(POPCNT) |
> -		0 /* Reserved*/ | F(AES) | F(XSAVE) | 0 /* OSXSAVE */ | F(AVX) |
> -		F(F16C) | F(RDRAND)
> +		EMUL_F(TSC_DEADLINE_TIMER) | F(AES) | F(XSAVE) |
> +		0 /* OSXSAVE */ | F(AVX) | F(F16C) | F(RDRAND)
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_1_EDX,

Hi,

I have a mixed feeling about this.

First of all KVM_GET_SUPPORTED_CPUID documentation explicitly states that it returns bits
that are supported in *default* configuration
TSC_DEADLINE_TIMER and arguably X2APIC are only supported after enabling various caps,
e.g not default configuration.

However, since X2APIC also in KVM_GET_SUPPORTED_CPUID (also wrongly IMHO), for consistency it does make
sense to add TSC_DEADLINE_TIMER as well.


I do think that we need at least to update the documentation of KVM_GET_SUPPORTED_CPUID
and KVM_GET_EMULATED_CPUID, as I state in a review of a later patch.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky
Sean Christopherson July 9, 2024, 7:28 p.m. UTC | #4
On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> >  4.47 KVM_PPC_GET_PVINFO
> >  -----------------------
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 699ce4261e9c..d1f427284ccc 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -680,8 +680,8 @@ void kvm_set_cpu_caps(void)
> >  		F(FMA) | F(CX16) | 0 /* xTPR Update */ | F(PDCM) |
> >  		F(PCID) | 0 /* Reserved, DCA */ | F(XMM4_1) |
> >  		F(XMM4_2) | EMUL_F(X2APIC) | F(MOVBE) | F(POPCNT) |
> > -		0 /* Reserved*/ | F(AES) | F(XSAVE) | 0 /* OSXSAVE */ | F(AVX) |
> > -		F(F16C) | F(RDRAND)
> > +		EMUL_F(TSC_DEADLINE_TIMER) | F(AES) | F(XSAVE) |
> > +		0 /* OSXSAVE */ | F(AVX) | F(F16C) | F(RDRAND)
> >  	);
> >  
> >  	kvm_cpu_cap_init(CPUID_1_EDX,
> 
> Hi,
> 
> I have a mixed feeling about this.
> 
> First of all KVM_GET_SUPPORTED_CPUID documentation explicitly states that it
> returns bits that are supported in *default* configuration TSC_DEADLINE_TIMER
> and arguably X2APIC are only supported after enabling various caps, e.g not
> default configuration.

Another side topic, in the near future, I think we should push to make an in-kernel
local APIC a hard requirement.  AFAIK, userspace local APIC gets no meaningful
test coverage, and IIRC we have known bugs where a userspace APIC doesn't work
as it should, e.g. commit 6550c4df7e50 ("KVM: nVMX: Fix interrupt window request
with "Acknowledge interrupt on exit"").

> However, since X2APIC also in KVM_GET_SUPPORTED_CPUID (also wrongly IMHO),
> for consistency it does make sense to add TSC_DEADLINE_TIMER as well.
> 
> I do think that we need at least to update the documentation of KVM_GET_SUPPORTED_CPUID
> and KVM_GET_EMULATED_CPUID, as I state in a review of a later patch.

+1
Maxim Levitsky July 24, 2024, 6 p.m. UTC | #5
On Tue, 2024-07-09 at 12:28 -0700, Sean Christopherson wrote:
> On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> > On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> > >  4.47 KVM_PPC_GET_PVINFO
> > >  -----------------------
> > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > > index 699ce4261e9c..d1f427284ccc 100644
> > > --- a/arch/x86/kvm/cpuid.c
> > > +++ b/arch/x86/kvm/cpuid.c
> > > @@ -680,8 +680,8 @@ void kvm_set_cpu_caps(void)
> > >  		F(FMA) | F(CX16) | 0 /* xTPR Update */ | F(PDCM) |
> > >  		F(PCID) | 0 /* Reserved, DCA */ | F(XMM4_1) |
> > >  		F(XMM4_2) | EMUL_F(X2APIC) | F(MOVBE) | F(POPCNT) |
> > > -		0 /* Reserved*/ | F(AES) | F(XSAVE) | 0 /* OSXSAVE */ | F(AVX) |
> > > -		F(F16C) | F(RDRAND)
> > > +		EMUL_F(TSC_DEADLINE_TIMER) | F(AES) | F(XSAVE) |
> > > +		0 /* OSXSAVE */ | F(AVX) | F(F16C) | F(RDRAND)
> > >  	);
> > >  
> > >  	kvm_cpu_cap_init(CPUID_1_EDX,
> > 
> > Hi,
> > 
> > I have a mixed feeling about this.
> > 
> > First of all KVM_GET_SUPPORTED_CPUID documentation explicitly states that it
> > returns bits that are supported in *default* configuration TSC_DEADLINE_TIMER
> > and arguably X2APIC are only supported after enabling various caps, e.g not
> > default configuration.
> 
> Another side topic, in the near future, I think we should push to make an in-kernel
> local APIC a hard requirement. 

I vote yes, with my both hands for this, but I am sure that this will for sure break at least some
userspace and/or some misconfigured qemu instances.

>  AFAIK, userspace local APIC gets no meaningful
> test coverage, and IIRC we have known bugs where a userspace APIC doesn't work
> as it should, e.g. commit 6550c4df7e50 ("KVM: nVMX: Fix interrupt window request
> with "Acknowledge interrupt on exit"").
> 
> > However, since X2APIC also in KVM_GET_SUPPORTED_CPUID (also wrongly IMHO),
> > for consistency it does make sense to add TSC_DEADLINE_TIMER as well.
> > 
> > I do think that we need at least to update the documentation of KVM_GET_SUPPORTED_CPUID
> > and KVM_GET_EMULATED_CPUID, as I state in a review of a later patch.
> 
> +1
> 


Best regards,
	Maxim Levitsky
diff mbox series

Patch

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 884846282d06..cb744a646de6 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1804,15 +1804,18 @@  emulate them efficiently. The fields in each entry are defined as follows:
          the values returned by the cpuid instruction for
          this function/index combination
 
-The TSC deadline timer feature (CPUID leaf 1, ecx[24]) is always returned
-as false, since the feature depends on KVM_CREATE_IRQCHIP for local APIC
-support.  Instead it is reported via::
+x2APIC (CPUID leaf 1, ecx[21) and TSC deadline timer (CPUID leaf 1, ecx[24])
+may be returned as true, but they depend on KVM_CREATE_IRQCHIP for in-kernel
+emulation of the local APIC.  TSC deadline timer support is also reported via::
 
   ioctl(KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER)
 
 if that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
 feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
 
+Enabling x2APIC in KVM_SET_CPUID2 requires KVM_CREATE_IRQCHIP as KVM doesn't
+support forwarding x2APIC MSR accesses to userspace, i.e. KVM does not support
+emulating x2APIC in userspace.
 
 4.47 KVM_PPC_GET_PVINFO
 -----------------------
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 699ce4261e9c..d1f427284ccc 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -680,8 +680,8 @@  void kvm_set_cpu_caps(void)
 		F(FMA) | F(CX16) | 0 /* xTPR Update */ | F(PDCM) |
 		F(PCID) | 0 /* Reserved, DCA */ | F(XMM4_1) |
 		F(XMM4_2) | EMUL_F(X2APIC) | F(MOVBE) | F(POPCNT) |
-		0 /* Reserved*/ | F(AES) | F(XSAVE) | 0 /* OSXSAVE */ | F(AVX) |
-		F(F16C) | F(RDRAND)
+		EMUL_F(TSC_DEADLINE_TIMER) | F(AES) | F(XSAVE) |
+		0 /* OSXSAVE */ | F(AVX) | F(F16C) | F(RDRAND)
 	);
 
 	kvm_cpu_cap_init(CPUID_1_EDX,