
kvm: RDTSCP on AMD

Message ID 20160706124438.GB7300@pd.tnic (mailing list archive)
State New, archived

Commit Message

Borislav Petkov July 6, 2016, 12:44 p.m. UTC
Hi guys,

how about this below to enable RDTSCP emulation on AMD? IOW, I'm staring
at

  33b5e8c03ae7 ("target-i386: Disable rdtscp on Opteron_G* CPU models")

in the qemu repo.

It seems to work here, RDTSCP in the guest gives me node and cpu as
vsyscall_set_cpu() in the guest kernel has set them.
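
To illustrate what "node and cpu" means here, a minimal user-space sketch of
splitting the value RDTSCP returns in ECX. It assumes the encoding Linux uses
for TSC_AUX (cpu number in the low 12 bits, node number above them). The
helper names are made up for this example and are not part of the patch.

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtscp() */

/* Assumption: Linux packs TSC_AUX as (node << 12) | cpu. */
#define AUX_CPU_BITS 12
#define AUX_CPU_MASK ((1u << AUX_CPU_BITS) - 1)

struct cpu_node {
    unsigned cpu;
    unsigned node;
};

/* Build the value a kernel would store in MSR_TSC_AUX (hypothetical helper). */
static inline uint32_t encode_tsc_aux(unsigned cpu, unsigned node)
{
    return ((uint32_t)node << AUX_CPU_BITS) | (cpu & AUX_CPU_MASK);
}

/* Split a TSC_AUX value back into cpu and node (hypothetical helper). */
static inline struct cpu_node decode_tsc_aux(uint32_t aux)
{
    struct cpu_node cn = { aux & AUX_CPU_MASK, aux >> AUX_CPU_BITS };
    return cn;
}

/* Execute RDTSCP: returns the 64-bit TSC and decodes ECX into *cn. */
static inline uint64_t read_tsc_and_cpu(struct cpu_node *cn)
{
    unsigned int aux;
    uint64_t tsc = __rdtscp(&aux);

    *cn = decode_tsc_aux(aux);
    return tsc;
}
```

Running read_tsc_and_cpu() inside the guest and comparing the result with
sched_getcpu() would be one way to repeat the check described above.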

Thoughts?

(Below is the simple qemu diff reenabling RDTSCP)

---

Comments

Paolo Bonzini July 6, 2016, 1:01 p.m. UTC | #1
On 06/07/2016 14:44, Borislav Petkov wrote:
> Hi guys,
> 
> how about this below to enable RDTSCP emulation on AMD? IOW, I'm staring
> at
> 
>   33b5e8c03ae7 ("target-i386: Disable rdtscp on Opteron_G* CPU models")
> 
> in the qemu repo.
> 
> It seems to work here, RDTSCP in the guest gives me node and cpu as
> vsyscall_set_cpu() in the guest kernel has set them.
> 
> Thoughts?
> 
> (Below is the simple qemu diff reenabling RDTSCP)
> 
> @@ -3919,6 +3935,7 @@ static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = {
>  	[SVM_EXIT_STGI]				= stgi_interception,
>  	[SVM_EXIT_CLGI]				= clgi_interception,
>  	[SVM_EXIT_SKINIT]			= skinit_interception,
> +	[SVM_EXIT_RDTSCP]			= rdtscp_interception,
>  	[SVM_EXIT_WBINVD]                       = wbinvd_interception,
>  	[SVM_EXIT_MONITOR]			= monitor_interception,
>  	[SVM_EXIT_MWAIT]			= mwait_interception,

Nothing is needed in the kernel actually.  You can skip the intercept
by running the guest with MSR_TSC_AUX set to the guest's expected value.
Which KVM does, except that it's botched so I need to apply the
patch in https://lkml.org/lkml/2016/4/13/802.
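
A toy model of that idea, as a sketch only: the "MSR" is a plain variable and
every name below is hypothetical (this is not the code in svm.c). The point is
that if the guest's value is loaded into the MSR before guest entry and the
host's value is restored on exit, an unintercepted RDTSCP already returns the
right ECX.

```c
#include <stdint.h>

/* Stand-in for the physical MSR_TSC_AUX; real code would use rdmsr/wrmsr. */
static uint32_t hw_tsc_aux;

struct vcpu_model {
    uint32_t guest_tsc_aux;   /* value the guest expects to read back */
    uint32_t host_tsc_aux;    /* host value saved across guest execution */
};

/* Before entering the guest: save the host value, load the guest value. */
static inline void model_enter_guest(struct vcpu_model *v)
{
    v->host_tsc_aux = hw_tsc_aux;
    hw_tsc_aux = v->guest_tsc_aux;
}

/* After leaving the guest: put the host value back. */
static inline void model_exit_guest(struct vcpu_model *v)
{
    hw_tsc_aux = v->host_tsc_aux;
}

/* What an unintercepted RDTSCP would place in ECX at this moment. */
static inline uint32_t model_rdtscp_ecx(void)
{
    return hw_tsc_aux;
}
```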

> ---
> 
> qemu diff:
> 
> ---
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 3bd3cfc3ad16..aa6d0d027d00 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c

This is not enough because it's missing some backwards compatibility
gunk (similar to the include/hw/i386/pc.h parts of 33b5e8c03ae), but
it's enough for a proof of concept and to discuss it.

The main issue with this is that it would force a lockstep update of
QEMU and kernel, which we try to avoid.  I'm not sure if we have a
solution for this problem.  Eduardo?

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Eduardo Habkost July 6, 2016, 5:34 p.m. UTC | #2
On Wed, Jul 06, 2016 at 03:01:04PM +0200, Paolo Bonzini wrote:
> On 06/07/2016 14:44, Borislav Petkov wrote:
> > Hi guys,
> > 
> > how about this below to enable RDTSCP emulation on AMD? IOW, I'm staring
> > at
> > 
> >   33b5e8c03ae7 ("target-i386: Disable rdtscp on Opteron_G* CPU models")
> > 
> > in the qemu repo.
> > 
> > It seems to work here, RDTSCP in the guest gives me node and cpu as
> > vsyscall_set_cpu() in the guest kernel has set them.
> > 
> > Thoughts?
> > 
> > (Below is the simple qemu diff reenabling RDTSCP)
> > 
> > @@ -3919,6 +3935,7 @@ static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = {
> >  	[SVM_EXIT_STGI]				= stgi_interception,
> >  	[SVM_EXIT_CLGI]				= clgi_interception,
> >  	[SVM_EXIT_SKINIT]			= skinit_interception,
> > +	[SVM_EXIT_RDTSCP]			= rdtscp_interception,
> >  	[SVM_EXIT_WBINVD]                       = wbinvd_interception,
> >  	[SVM_EXIT_MONITOR]			= monitor_interception,
> >  	[SVM_EXIT_MWAIT]			= mwait_interception,
> 
> Nothing is needed in the kernel actually.  You can skip the intercept
> by running the guest with MSR_TSC_AUX set to the guest's expected value.
>  Which KVM does, except that it's botched so I need to apply the
> patch in https://lkml.org/lkml/2016/4/13/802.

Do you mean -cpu Opteron_G*,+rdtscp will be buggy on Linux v4.5?
(v4.5 reports rdtscp as supported in GET_SUPPORTED_CPUID)

Can we do something to make QEMU detect the buggy kernel before
allowing rdtscp to be enabled, or should we just tell people to
upgrade their kernel?
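
For context, a rough sketch of the kind of check that is possible from the
reported CPUID bits alone. RDTSCP is CPUID.80000001H:EDX bit 27; the function
names are invented for this example and are not QEMU's. Since v4.5 already
reports the bit as supported, a mask check like this cannot by itself tell a
buggy kernel from a fixed one, which is why the question above matters.

```c
#include <stdbool.h>
#include <stdint.h>

/* RDTSCP feature flag: CPUID.80000001H:EDX bit 27. */
#define EXT2_RDTSCP (1u << 27)

/* Bits the CPU model asks for that the host side did not report. */
static inline uint32_t missing_features(uint32_t requested, uint32_t supported)
{
    return requested & ~supported;
}

/* True only if the model wants RDTSCP and the host side reports it. */
static inline bool rdtscp_usable(uint32_t requested, uint32_t supported)
{
    return (requested & EXT2_RDTSCP) && (supported & EXT2_RDTSCP);
}
```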

> 
> > ---
> > 
> > qemu diff:
> > 
> > ---
> > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > index 3bd3cfc3ad16..aa6d0d027d00 100644
> > --- a/target-i386/cpu.c
> > +++ b/target-i386/cpu.c
> 
> This is not enough because it's missing some backwards compatibility
> gunk (similar to the include/hw/i386/pc.h parts of 33b5e8c03ae), but
> it's enough for a proof of concept and to discuss it.
> 
> The main issue with this is that it would force a lockstep update of
> QEMU and kernel, which we try to avoid.  I'm not sure if we have a
> solution for this problem.  Eduardo?

We don't. Either we make QEMU require a newer kernel, or we need
a new CPU model. :(

Paolo Bonzini July 6, 2016, 9:27 p.m. UTC | #3
On 06/07/2016 19:34, Eduardo Habkost wrote:
>> > Nothing is needed in the kernel actually.  You can skip the intercept
>> > by running the guest with MSR_TSC_AUX set to the guest's expected value.
>> >  Which KVM does, except that it's botched so I need to apply the
>> > patch in https://lkml.org/lkml/2016/4/13/802.
> Do you mean -cpu Opteron_G*,+rdtscp will be buggy on Linux v4.5?
> (v4.5 reports rdtscp as supported in GET_SUPPORTED_CPUID)
> 
> Can we do something to make QEMU detect the buggy kernel before
> allowing rdtscp to be enabled, or should we just tell people to
> upgrade their kernel?

We usually just tell people to use the latest stable kernel.

Adding new CPU models is not a big deal, in fact it's almost easier than
getting compat properties right. :)

Paolo

Patch

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 16ef31b87452..5a238f5402f5 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1131,6 +1131,7 @@  static void init_vmcb(struct vcpu_svm *svm)
 	set_intercept(svm, INTERCEPT_STGI);
 	set_intercept(svm, INTERCEPT_CLGI);
 	set_intercept(svm, INTERCEPT_SKINIT);
+	set_intercept(svm, INTERCEPT_RDTSCP);
 	set_intercept(svm, INTERCEPT_WBINVD);
 	set_intercept(svm, INTERCEPT_MONITOR);
 	set_intercept(svm, INTERCEPT_MWAIT);
@@ -3009,6 +3010,20 @@  static int skinit_interception(struct vcpu_svm *svm)
 	return 1;
 }
 
+static int rdtscp_interception(struct vcpu_svm *svm)
+{
+	u64 tsc;
+
+	tsc = kvm_scale_tsc(&svm->vcpu, rdtsc()) + svm->vmcb->control.tsc_offset;
+
+	kvm_register_write(&svm->vcpu, VCPU_REGS_RAX, tsc & 0xffffffff);
+	kvm_register_write(&svm->vcpu, VCPU_REGS_RDX, tsc >> 32);
+	kvm_register_write(&svm->vcpu, VCPU_REGS_RCX, svm->tsc_aux);
+
+	skip_emulated_instruction(&svm->vcpu);
+	return 1;
+}
+
 static int wbinvd_interception(struct vcpu_svm *svm)
 {
 	kvm_emulate_wbinvd(&svm->vcpu);
@@ -3919,6 +3935,7 @@  static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = {
 	[SVM_EXIT_STGI]				= stgi_interception,
 	[SVM_EXIT_CLGI]				= clgi_interception,
 	[SVM_EXIT_SKINIT]			= skinit_interception,
+	[SVM_EXIT_RDTSCP]			= rdtscp_interception,
 	[SVM_EXIT_WBINVD]                       = wbinvd_interception,
 	[SVM_EXIT_MONITOR]			= monitor_interception,
 	[SVM_EXIT_MWAIT]			= mwait_interception,
---

qemu diff:

---
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 3bd3cfc3ad16..aa6d0d027d00 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -1332,9 +1332,8 @@  static X86CPUDefinition builtin_x86_defs[] = {
             CPUID_DE | CPUID_FP87,
         .features[FEAT_1_ECX] =
             CPUID_EXT_CX16 | CPUID_EXT_SSE3,
-        /* Missing: CPUID_EXT2_RDTSCP */
         .features[FEAT_8000_0001_EDX] =
-            CPUID_EXT2_LM | CPUID_EXT2_FXSR |
+            CPUID_EXT2_LM | CPUID_EXT2_FXSR | CPUID_EXT2_RDTSCP |
             CPUID_EXT2_MMX | CPUID_EXT2_NX | CPUID_EXT2_PSE36 |
             CPUID_EXT2_PAT | CPUID_EXT2_CMOV | CPUID_EXT2_MCA |
             CPUID_EXT2_PGE | CPUID_EXT2_MTRR | CPUID_EXT2_SYSCALL |
@@ -1362,9 +1361,8 @@  static X86CPUDefinition builtin_x86_defs[] = {
         .features[FEAT_1_ECX] =
             CPUID_EXT_POPCNT | CPUID_EXT_CX16 | CPUID_EXT_MONITOR |
             CPUID_EXT_SSE3,
-        /* Missing: CPUID_EXT2_RDTSCP */
         .features[FEAT_8000_0001_EDX] =
-            CPUID_EXT2_LM | CPUID_EXT2_FXSR |
+            CPUID_EXT2_LM | CPUID_EXT2_FXSR | CPUID_EXT2_RDTSCP |
             CPUID_EXT2_MMX | CPUID_EXT2_NX | CPUID_EXT2_PSE36 |
             CPUID_EXT2_PAT | CPUID_EXT2_CMOV | CPUID_EXT2_MCA |
             CPUID_EXT2_PGE | CPUID_EXT2_MTRR | CPUID_EXT2_SYSCALL |
@@ -1395,9 +1393,8 @@  static X86CPUDefinition builtin_x86_defs[] = {
             CPUID_EXT_POPCNT | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
             CPUID_EXT_CX16 | CPUID_EXT_SSSE3 | CPUID_EXT_PCLMULQDQ |
             CPUID_EXT_SSE3,
-        /* Missing: CPUID_EXT2_RDTSCP */
         .features[FEAT_8000_0001_EDX] =
-            CPUID_EXT2_LM |
+            CPUID_EXT2_LM | CPUID_EXT2_RDTSCP |
             CPUID_EXT2_PDPE1GB | CPUID_EXT2_FXSR | CPUID_EXT2_MMX |
             CPUID_EXT2_NX | CPUID_EXT2_PSE36 | CPUID_EXT2_PAT |
             CPUID_EXT2_CMOV | CPUID_EXT2_MCA | CPUID_EXT2_PGE |
@@ -1431,9 +1428,8 @@  static X86CPUDefinition builtin_x86_defs[] = {
             CPUID_EXT_AES | CPUID_EXT_POPCNT | CPUID_EXT_SSE42 |
             CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_FMA |
             CPUID_EXT_SSSE3 | CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3,
-        /* Missing: CPUID_EXT2_RDTSCP */
         .features[FEAT_8000_0001_EDX] =
-            CPUID_EXT2_LM |
+            CPUID_EXT2_LM | CPUID_EXT2_RDTSCP |
             CPUID_EXT2_PDPE1GB | CPUID_EXT2_FXSR | CPUID_EXT2_MMX |
             CPUID_EXT2_NX | CPUID_EXT2_PSE36 | CPUID_EXT2_PAT |
             CPUID_EXT2_CMOV | CPUID_EXT2_MCA | CPUID_EXT2_PGE |