Message ID: 20160603060944.17373-2-haozhong.zhang@intel.com (mailing list archive)
State: New, archived
2016-06-03 14:09+0800, Haozhong Zhang:
> This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they
> will be injected to only one VCPU rather than broadcast to all
> VCPUs. As KVM reports LMCE support on Intel platforms, this features is
> only available on Intel platforms.
>
> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> ---
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> @@ -2786,6 +2798,20 @@ static void mce_init(X86CPU *cpu)
>          && (cenv->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
>             (CPUID_MCE | CPUID_MCA)) {
>          cenv->mcg_cap = MCE_CAP_DEF | MCE_BANKS_DEF;
> +
> +        if (cpu->enable_lmce) {
> +            if (lmce_supported()) {
> +                cenv->mcg_cap |= MCG_LMCE_P;
> +                cenv->msr_ia32_feature_control |=
> +                    MSR_IA32_FEATURE_CONTROL_LMCE |
> +                    MSR_IA32_FEATURE_CONTROL_LOCKED;

Locking right from the start breaks nested KVM, because nested relies on
setting VMXON feature from inside of the guest.

Do we keep it unlocked, or move everything into QEMU?

(The latter seems simpler.)

> +            } else {
> +                error_report("Warning: KVM unavailable or not support LMCE, "
> +                             "LMCE disabled");
> +                cpu->enable_lmce = false;
> +            }
> +        }
> +
>          cenv->mcg_ctl = ~(uint64_t)0;
>          for (bank = 0; bank < MCE_BANKS_DEF; bank++) {
>              cenv->mce_banks[bank * 4] = ~(uint64_t)0;
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Haozhong Zhang <haozhong.zhang@intel.com> wrote:

>This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they
>will be injected to only one VCPU rather than broadcast to all
>VCPUs. As KVM reports LMCE support on Intel platforms, this features is
>only available on Intel platforms.
>
>Signed-off-by: Ashok Raj <ashok.raj@intel.com>
>Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
>---
>Cc: Paolo Bonzini <pbonzini@redhat.com>
>Cc: Richard Henderson <rth@twiddle.net>
>Cc: Eduardo Habkost <ehabkost@redhat.com>
>Cc: Marcelo Tosatti <mtosatti@redhat.com>
>Cc: Boris Petkov <bp@suse.de>
>Cc: kvm@vger.kernel.org
>Cc: Tony Luck <tony.luck@intel.com>
>Cc: Andi Kleen <andi.kleen@intel.com>
>---
> target-i386/cpu.c | 26 ++++++++++++++++++++++++++
> target-i386/cpu.h | 13 ++++++++++++-
> target-i386/kvm.c | 35 +++++++++++++++++++++++++++++++----
> 3 files changed, 69 insertions(+), 5 deletions(-)

...

>@@ -2786,6 +2798,20 @@ static void mce_init(X86CPU *cpu)
>         && (cenv->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
>            (CPUID_MCE | CPUID_MCA)) {
>         cenv->mcg_cap = MCE_CAP_DEF | MCE_BANKS_DEF;
>+
>+        if (cpu->enable_lmce) {
>+            if (lmce_supported()) {
>+                cenv->mcg_cap |= MCG_LMCE_P;
>+                cenv->msr_ia32_feature_control |=
>+                    MSR_IA32_FEATURE_CONTROL_LMCE |
>+                    MSR_IA32_FEATURE_CONTROL_LOCKED;
>+            } else {
>+                error_report("Warning: KVM unavailable or not support
>LMCE, "
>+                             "LMCE disabled");

"... or LMCE not supported..."

Also, do not split the string for easier grepping.
Haozhong Zhang <haozhong.zhang@intel.com> wrote:

>This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they
>will be injected to only one VCPU rather than broadcast to all
>VCPUs. As KVM reports LMCE support on Intel platforms, this features is
>only available on Intel platforms.
>
>Signed-off-by: Ashok Raj <ashok.raj@intel.com>
>Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
>---
>Cc: Paolo Bonzini <pbonzini@redhat.com>
>Cc: Richard Henderson <rth@twiddle.net>
>Cc: Eduardo Habkost <ehabkost@redhat.com>
>Cc: Marcelo Tosatti <mtosatti@redhat.com>
>Cc: Boris Petkov <bp@suse.de>
>Cc: kvm@vger.kernel.org
>Cc: Tony Luck <tony.luck@intel.com>
>Cc: Andi Kleen <andi.kleen@intel.com>
>---
> target-i386/cpu.c | 26 ++++++++++++++++++++++++++
> target-i386/cpu.h | 13 ++++++++++++-
> target-i386/kvm.c | 35 +++++++++++++++++++++++++++++++----
> 3 files changed, 69 insertions(+), 5 deletions(-)

...

>@@ -1173,6 +1182,8 @@ struct X86CPU {
>     */
>    bool enable_pmu;
>
>+    bool enable_lmce;

That struct would go fat pretty fast if it grows a bool per CPU
feature. Perhaps a more clever, a-bit-per-featurebit scheme would be in
order.
On Sat, Jun 04, 2016 at 12:34:39PM +0200, Boris Petkov wrote:
> Haozhong Zhang <haozhong.zhang@intel.com> wrote:
> 
> >This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they
> >will be injected to only one VCPU rather than broadcast to all
> >VCPUs. As KVM reports LMCE support on Intel platforms, this features is
> >only available on Intel platforms.
> >
> >Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> >Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> >---
> >Cc: Paolo Bonzini <pbonzini@redhat.com>
> >Cc: Richard Henderson <rth@twiddle.net>
> >Cc: Eduardo Habkost <ehabkost@redhat.com>
> >Cc: Marcelo Tosatti <mtosatti@redhat.com>
> >Cc: Boris Petkov <bp@suse.de>
> >Cc: kvm@vger.kernel.org
> >Cc: Tony Luck <tony.luck@intel.com>
> >Cc: Andi Kleen <andi.kleen@intel.com>
> >---
> > target-i386/cpu.c | 26 ++++++++++++++++++++++++++
> > target-i386/cpu.h | 13 ++++++++++++-
> > target-i386/kvm.c | 35 +++++++++++++++++++++++++++++++----
> > 3 files changed, 69 insertions(+), 5 deletions(-)
> 
> ...
> 
> >@@ -1173,6 +1182,8 @@ struct X86CPU {
> >     */
> >    bool enable_pmu;
> >
> >+    bool enable_lmce;
> 
> That struct would go fat pretty fast if it grows a bool per CPU
> feature. Perhaps a more clever, a-bit-per-featurebit scheme
> would be in order.

We already have X86CPU.features, but it's specific for feature
flags appearing on CPUID. We can eventually extend
FeatureWord/FeatureWordInfo/x86_cpu_get_supported_feature_word()
to represent features that don't appear directly on CPUID.
On 06/03/16 17:57, Radim Krčmář wrote:
> 2016-06-03 14:09+0800, Haozhong Zhang:
> > This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they
> > will be injected to only one VCPU rather than broadcast to all
> > VCPUs. As KVM reports LMCE support on Intel platforms, this features is
> > only available on Intel platforms.
> > 
> > Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> > ---
> > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > @@ -2786,6 +2798,20 @@ static void mce_init(X86CPU *cpu)
> >          && (cenv->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
> >             (CPUID_MCE | CPUID_MCA)) {
> >          cenv->mcg_cap = MCE_CAP_DEF | MCE_BANKS_DEF;
> > +
> > +        if (cpu->enable_lmce) {
> > +            if (lmce_supported()) {
> > +                cenv->mcg_cap |= MCG_LMCE_P;
> > +                cenv->msr_ia32_feature_control |=
> > +                    MSR_IA32_FEATURE_CONTROL_LMCE |
> > +                    MSR_IA32_FEATURE_CONTROL_LOCKED;
> 
> Locking right from the start breaks nested KVM, because nested relies on
> setting VMXON feature from inside of the guest.
> 
> Do we keep it unlocked, or move everything into QEMU?
> 
> (The latter seems simpler.)
> 

Setting guest MSR_IA32_FEATURE_CONTROL is not necessary here; it is
instead the guest BIOS/OS's duty to enable and lock the corresponding
features. I'll remove this in the next version.

Thanks,
Haozhong
On 06/04/16 12:15, Boris Petkov wrote:
> Haozhong Zhang <haozhong.zhang@intel.com> wrote:
> 
> >This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they
> >will be injected to only one VCPU rather than broadcast to all
> >VCPUs. As KVM reports LMCE support on Intel platforms, this features is
> >only available on Intel platforms.
> >
> >Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> >Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> >---
> >Cc: Paolo Bonzini <pbonzini@redhat.com>
> >Cc: Richard Henderson <rth@twiddle.net>
> >Cc: Eduardo Habkost <ehabkost@redhat.com>
> >Cc: Marcelo Tosatti <mtosatti@redhat.com>
> >Cc: Boris Petkov <bp@suse.de>
> >Cc: kvm@vger.kernel.org
> >Cc: Tony Luck <tony.luck@intel.com>
> >Cc: Andi Kleen <andi.kleen@intel.com>
> >---
> > target-i386/cpu.c | 26 ++++++++++++++++++++++++++
> > target-i386/cpu.h | 13 ++++++++++++-
> > target-i386/kvm.c | 35 +++++++++++++++++++++++++++++++----
> > 3 files changed, 69 insertions(+), 5 deletions(-)
> 
> ...
> 
> >@@ -2786,6 +2798,20 @@ static void mce_init(X86CPU *cpu)
> >         && (cenv->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
> >            (CPUID_MCE | CPUID_MCA)) {
> >         cenv->mcg_cap = MCE_CAP_DEF | MCE_BANKS_DEF;
> >+
> >+        if (cpu->enable_lmce) {
> >+            if (lmce_supported()) {
> >+                cenv->mcg_cap |= MCG_LMCE_P;
> >+                cenv->msr_ia32_feature_control |=
> >+                    MSR_IA32_FEATURE_CONTROL_LMCE |
> >+                    MSR_IA32_FEATURE_CONTROL_LOCKED;
> >+            } else {
> >+                error_report("Warning: KVM unavailable or not support
> >LMCE, "
> >+                             "LMCE disabled");
> 
> "... or LMCE not supported..."
> 

will change

> Also, do not split the string for easier grepping.
> 

OK, that was to avoid exceeding 80 characters per line.

Thanks,
Haozhong
On 06/04/16 12:34, Boris Petkov wrote:
> Haozhong Zhang <haozhong.zhang@intel.com> wrote:
> 
> >This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they
> >will be injected to only one VCPU rather than broadcast to all
> >VCPUs. As KVM reports LMCE support on Intel platforms, this features is
> >only available on Intel platforms.
> >
> >Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> >Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> >---
> >Cc: Paolo Bonzini <pbonzini@redhat.com>
> >Cc: Richard Henderson <rth@twiddle.net>
> >Cc: Eduardo Habkost <ehabkost@redhat.com>
> >Cc: Marcelo Tosatti <mtosatti@redhat.com>
> >Cc: Boris Petkov <bp@suse.de>
> >Cc: kvm@vger.kernel.org
> >Cc: Tony Luck <tony.luck@intel.com>
> >Cc: Andi Kleen <andi.kleen@intel.com>
> >---
> > target-i386/cpu.c | 26 ++++++++++++++++++++++++++
> > target-i386/cpu.h | 13 ++++++++++++-
> > target-i386/kvm.c | 35 +++++++++++++++++++++++++++++++----
> > 3 files changed, 69 insertions(+), 5 deletions(-)
> 
> ...
> 
> >@@ -1173,6 +1182,8 @@ struct X86CPU {
> >     */
> >    bool enable_pmu;
> >
> >+    bool enable_lmce;
> 
> That struct would go fat pretty fast if it grows a bool per CPU feature. Perhaps a more clever, a-bit-per-featurebit scheme would be in order.

OK, I'll use a 64-bit integer for current and future features.

Thanks,
Haozhong
On 06/04/16 18:03, Eduardo Habkost wrote:
> On Sat, Jun 04, 2016 at 12:34:39PM +0200, Boris Petkov wrote:
> > Haozhong Zhang <haozhong.zhang@intel.com> wrote:
> > 
> > >This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they
> > >will be injected to only one VCPU rather than broadcast to all
> > >VCPUs. As KVM reports LMCE support on Intel platforms, this features is
> > >only available on Intel platforms.
> > >
> > >Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> > >Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> > >---
> > >Cc: Paolo Bonzini <pbonzini@redhat.com>
> > >Cc: Richard Henderson <rth@twiddle.net>
> > >Cc: Eduardo Habkost <ehabkost@redhat.com>
> > >Cc: Marcelo Tosatti <mtosatti@redhat.com>
> > >Cc: Boris Petkov <bp@suse.de>
> > >Cc: kvm@vger.kernel.org
> > >Cc: Tony Luck <tony.luck@intel.com>
> > >Cc: Andi Kleen <andi.kleen@intel.com>
> > >---
> > > target-i386/cpu.c | 26 ++++++++++++++++++++++++++
> > > target-i386/cpu.h | 13 ++++++++++++-
> > > target-i386/kvm.c | 35 +++++++++++++++++++++++++++++++----
> > > 3 files changed, 69 insertions(+), 5 deletions(-)
> > 
> > ...
> > 
> > >@@ -1173,6 +1182,8 @@ struct X86CPU {
> > >     */
> > >    bool enable_pmu;
> > >
> > >+    bool enable_lmce;
> > 
> > That struct would go fat pretty fast if it grows a bool per CPU
> > feature. Perhaps a more clever, a-bit-per-featurebit scheme
> > would be in order.
> 
> We already have X86CPU.features, but it's specific for feature
> flags appearing on CPUID. We can eventually extend
> FeatureWord/FeatureWordInfo/x86_cpu_get_supported_feature_word()
> to represent features that don't appear directly on CPUID.
> 

You mean CPUX86State.features?

enable_lmce is also used in patch 2 to avoid migrating from
lmce-enabled qemu to lmce-disabled qemu, so I don't think
CPUX86State.features is the correct place for enable_lmce.

Or, we may append one or more bit words to X86CPU.filtered_features
for enable_pmu, enable_lmce and future features, e.g.

modified   target-i386/cpu.h
@@ -455,6 +455,14 @@ typedef enum FeatureWord {
 
 typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 
+typedef enum ExtFeatureBit {
+    EXT_FEAT_PMU,
+    EXT_FEAT_LMCE,
+    EXT_FEATURE_BITS,
+} ExtFeatureBit;
+
+#define EXT_FEATURE_WORDS ((EXT_FEATURE_BITS + 31) / 32)
+
+#define EXT_FEATURE_FILTER_ADD(words, feat)              \
+    do {                                                 \
+        uint32_t *__words = (words);                     \
+        int __idx = (feat) / 32;                         \
+        int __oft = (feat) % 32;                         \
+        __words[__idx + FEATURE_WORDS] |= (1 << __oft);  \
+    } while (0)
+
+#define EXT_FEATURE_FILTER_REMOVE(words, feat)           \
+    do {                                                 \
+        uint32_t *__words = (words);                     \
+        int __idx = (feat) / 32;                         \
+        int __oft = (feat) % 32;                         \
+        __words[__idx + FEATURE_WORDS] &= ~(1 << __oft); \
+    } while (0)
+
+#define EXT_FEATURE_FILTERED(words, feat)                \
+    ({                                                   \
+        uint32_t *__words = (words);                     \
+        int __idx = (feat) / 32;                         \
+        int __oft = (feat) % 32;                         \
+        __words[__idx + FEATURE_WORDS] & (1 << __oft);   \
+    })
+
 /* cpuid_features bits */
 #define CPUID_FP87 (1U << 0)
 #define CPUID_VME  (1U << 1)
@@ -1173,21 +1181,7 @@ struct X86CPU {
-    /* Features that were filtered out because of missing host capabilities */
-    uint32_t filtered_features[FEATURE_WORDS];
+    /* Features that were filtered out because of missing host capabilities or
+       being disabled by default */
+    uint32_t filtered_features[FEATURE_WORDS + EXT_FEATURE_WORDS];
-
-    /* Enable PMU CPUID bits. This can't be enabled by default yet because
-     * it doesn't have ABI stability guarantees, as it passes all PMU CPUID
-     * bits returned by GET_SUPPORTED_CPUID (that depend on host CPU and kernel
-     * capabilities) directly to the guest.
-     */
-    bool enable_pmu;
-
-    /* Enable LMCE support which is set via cpu option 'lmce=on/off'. LMCE is
-     * disabled by default to avoid breaking the migration between QEMU with
-     * different LMCE support. Only migrating between QEMU with the same LMCE
-     * support is allowed.
-     */
-    bool enable_lmce;

Every place using X86CPU.enable_pmu and .enable_lmce is then replaced
by the above macros EXT_FEATURE_FILTER_ADD, EXT_FEATURE_FILTER_REMOVE
and EXT_FEATURE_FILTERED. Of course, those bits in
X86CPU.filtered_features have the opposite meaning to the original
enable_lmce and enable_pmu.


Haozhong
On 06/07/16 17:41, Haozhong Zhang wrote:
> On 06/04/16 18:03, Eduardo Habkost wrote:
> > On Sat, Jun 04, 2016 at 12:34:39PM +0200, Boris Petkov wrote:
> > > Haozhong Zhang <haozhong.zhang@intel.com> wrote:
> > > 
> > > >This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they
> > > >will be injected to only one VCPU rather than broadcast to all
> > > >VCPUs. As KVM reports LMCE support on Intel platforms, this features is
> > > >only available on Intel platforms.
> > > >
> > > >Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> > > >Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> > > >---
> > > >Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > >Cc: Richard Henderson <rth@twiddle.net>
> > > >Cc: Eduardo Habkost <ehabkost@redhat.com>
> > > >Cc: Marcelo Tosatti <mtosatti@redhat.com>
> > > >Cc: Boris Petkov <bp@suse.de>
> > > >Cc: kvm@vger.kernel.org
> > > >Cc: Tony Luck <tony.luck@intel.com>
> > > >Cc: Andi Kleen <andi.kleen@intel.com>
> > > >---
> > > > target-i386/cpu.c | 26 ++++++++++++++++++++++++++
> > > > target-i386/cpu.h | 13 ++++++++++++-
> > > > target-i386/kvm.c | 35 +++++++++++++++++++++++++++++++----
> > > > 3 files changed, 69 insertions(+), 5 deletions(-)
> > > 
> > > ...
> > > 
> > > >@@ -1173,6 +1182,8 @@ struct X86CPU {
> > > >     */
> > > >    bool enable_pmu;
> > > >
> > > >+    bool enable_lmce;
> > > 
> > > That struct would go fat pretty fast if it grows a bool per CPU
> > > feature. Perhaps a more clever, a-bit-per-featurebit scheme
> > > would be in order.
> > 
> > We already have X86CPU.features, but it's specific for feature
> > flags appearing on CPUID. We can eventually extend
> > FeatureWord/FeatureWordInfo/x86_cpu_get_supported_feature_word()
> > to represent features that don't appear directly on CPUID.
> 
> You mean CPUX86State.features?
> 
> enable_lmce is also used in patch 2 to avoid migrating from
> lmce-enabled qemu to lmce-disabled qemu, so I don't think
> CPUX86State.features is the correct place for enable_lmce.
> 

Sorry, I got things wrong: I thought that CPUX86State.features on the
destination is overwritten by the source one in the migration. As long
as it's in fact not, we may extend CPUX86State.features in a way similar
to the one I proposed for X86CPU.filtered_features, to include features
that do not appear directly in cpuid.

Haozhong

> Or, we may append one or more bit words to X86CPU.filtered_features
> for enable_pmu, enable_lmce and future features, e.g.
> 
> modified   target-i386/cpu.h
> @@ -455,6 +455,14 @@ typedef enum FeatureWord {
>  
>  typedef uint32_t FeatureWordArray[FEATURE_WORDS];
>  
> +typedef enum ExtFeatureBit {
> +    EXT_FEAT_PMU,
> +    EXT_FEAT_LMCE,
> +    EXT_FEATURE_BITS,
> +} ExtFeatureBit;
> +
> +#define EXT_FEATURE_WORDS ((EXT_FEATURE_BITS + 31) / 32)
> +
> +#define EXT_FEATURE_FILTER_ADD(words, feat)              \
> +    do {                                                 \
> +        uint32_t *__words = (words);                     \
> +        int __idx = (feat) / 32;                         \
> +        int __oft = (feat) % 32;                         \
> +        __words[__idx + FEATURE_WORDS] |= (1 << __oft);  \
> +    } while (0)
> +
> +#define EXT_FEATURE_FILTER_REMOVE(words, feat)           \
> +    do {                                                 \
> +        uint32_t *__words = (words);                     \
> +        int __idx = (feat) / 32;                         \
> +        int __oft = (feat) % 32;                         \
> +        __words[__idx + FEATURE_WORDS] &= ~(1 << __oft); \
> +    } while (0)
> +
> +#define EXT_FEATURE_FILTERED(words, feat)                \
> +    ({                                                   \
> +        uint32_t *__words = (words);                     \
> +        int __idx = (feat) / 32;                         \
> +        int __oft = (feat) % 32;                         \
> +        __words[__idx + FEATURE_WORDS] & (1 << __oft);   \
> +    })
> +
>  /* cpuid_features bits */
>  #define CPUID_FP87 (1U << 0)
>  #define CPUID_VME  (1U << 1)
> @@ -1173,21 +1181,7 @@ struct X86CPU {
> -    /* Features that were filtered out because of missing host capabilities */
> -    uint32_t filtered_features[FEATURE_WORDS];
> +    /* Features that were filtered out because of missing host capabilities or
> +       being disabled by default */
> +    uint32_t filtered_features[FEATURE_WORDS + EXT_FEATURE_WORDS];
> -
> -    /* Enable PMU CPUID bits. This can't be enabled by default yet because
> -     * it doesn't have ABI stability guarantees, as it passes all PMU CPUID
> -     * bits returned by GET_SUPPORTED_CPUID (that depend on host CPU and kernel
> -     * capabilities) directly to the guest.
> -     */
> -    bool enable_pmu;
> -
> -    /* Enable LMCE support which is set via cpu option 'lmce=on/off'. LMCE is
> -     * disabled by default to avoid breaking the migration between QEMU with
> -     * different LMCE support. Only migrating between QEMU with the same LMCE
> -     * support is allowed.
> -     */
> -    bool enable_lmce;
> 
> Every place using X86CPU.enable_pmu and .enable_lmce is then replaced
> by the above macros EXT_FEATURE_FILTER_ADD, EXT_FEATURE_FILTER_REMOVE
> and EXT_FEATURE_FILTERED. Of course, those bits in
> X86CPU.filtered_features have the opposite meaning to the original
> enable_lmce and enable_pmu.
> 
> 
> Haozhong
On Fri, Jun 03, 2016 at 02:09:43PM +0800, Haozhong Zhang wrote:
[...]
> +
> +        if (cpu->enable_lmce) {
> +            if (lmce_supported()) {
> +                cenv->mcg_cap |= MCG_LMCE_P;
> +                cenv->msr_ia32_feature_control |=
> +                    MSR_IA32_FEATURE_CONTROL_LMCE |
> +                    MSR_IA32_FEATURE_CONTROL_LOCKED;
> +            } else {
> +                error_report("Warning: KVM unavailable or not support LMCE, "
> +                             "LMCE disabled");
> +                cpu->enable_lmce = false;

Please don't do that. If the user explicitly asked for LMCE, you
should refuse to start if the host doesn't have the required
capabilities.

> +            }
> +        }
> +
>          cenv->mcg_ctl = ~(uint64_t)0;
>          for (bank = 0; bank < MCE_BANKS_DEF; bank++) {
>              cenv->mce_banks[bank * 4] = ~(uint64_t)0;
[...]
On 06/07/16 17:10, Eduardo Habkost wrote:
> On Fri, Jun 03, 2016 at 02:09:43PM +0800, Haozhong Zhang wrote:
> [...]
> > +
> > +        if (cpu->enable_lmce) {
> > +            if (lmce_supported()) {
> > +                cenv->mcg_cap |= MCG_LMCE_P;
> > +                cenv->msr_ia32_feature_control |=
> > +                    MSR_IA32_FEATURE_CONTROL_LMCE |
> > +                    MSR_IA32_FEATURE_CONTROL_LOCKED;
> > +            } else {
> > +                error_report("Warning: KVM unavailable or not support LMCE, "
> > +                             "LMCE disabled");
> > +                cpu->enable_lmce = false;
> 
> Please don't do that. If the user explicitly asked for LMCE, you
> should refuse to start if the host doesn't have the required
> capabilities.
> 

OK, I'll change in the next version.

Thanks,
Haozhong

> > +            }
> > +        }
> > +
> >          cenv->mcg_ctl = ~(uint64_t)0;
> >          for (bank = 0; bank < MCE_BANKS_DEF; bank++) {
> >              cenv->mce_banks[bank * 4] = ~(uint64_t)0;
> [...]
> 
> -- 
> Eduardo
On 03/06/2016 17:57, Radim Krčmář wrote:
>> > +                cenv->msr_ia32_feature_control |=
>> > +                    MSR_IA32_FEATURE_CONTROL_LMCE |
>> > +                    MSR_IA32_FEATURE_CONTROL_LOCKED;
> Locking right from the start breaks nested KVM, because nested relies on
> setting VMXON feature from inside of the guest.
> 
> Do we keep it unlocked, or move everything into QEMU?
> 
> (The latter seems simpler.)

I think it should be moved into the firmware, with QEMU publishing the
desired setting via fw_cfg. The same as what is done in real hardware,
that's the KVM mantra. :)

For v4 it's okay to just remove this.

Paolo
On 05/06/2016 17:41, Haozhong Zhang wrote:
> On 06/04/16 12:34, Boris Petkov wrote:
>> Haozhong Zhang <haozhong.zhang@intel.com> wrote:
>>
>>> This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they
>>> will be injected to only one VCPU rather than broadcast to all
>>> VCPUs. As KVM reports LMCE support on Intel platforms, this features is
>>> only available on Intel platforms.
>>>
>>> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
>>> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
>>> ---
>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>> Cc: Richard Henderson <rth@twiddle.net>
>>> Cc: Eduardo Habkost <ehabkost@redhat.com>
>>> Cc: Marcelo Tosatti <mtosatti@redhat.com>
>>> Cc: Boris Petkov <bp@suse.de>
>>> Cc: kvm@vger.kernel.org
>>> Cc: Tony Luck <tony.luck@intel.com>
>>> Cc: Andi Kleen <andi.kleen@intel.com>
>>> ---
>>>  target-i386/cpu.c | 26 ++++++++++++++++++++++++++
>>>  target-i386/cpu.h | 13 ++++++++++++-
>>>  target-i386/kvm.c | 35 +++++++++++++++++++++++++++++++----
>>>  3 files changed, 69 insertions(+), 5 deletions(-)
>>
>> ...
>>
>>> @@ -1173,6 +1182,8 @@ struct X86CPU {
>>>      */
>>>     bool enable_pmu;
>>>
>>> +    bool enable_lmce;
>>
>> That struct would go fat pretty fast if it grows a bool per CPU feature. Perhaps a more clever, a-bit-per-featurebit scheme would be in order.
> 
> OK, I'll use a 64-bit integer for current and future features.

No, please keep this as is for now. It can be refactored later.

Paolo
On 06/08/16 13:34, Paolo Bonzini wrote:
> 
> 
> On 05/06/2016 17:41, Haozhong Zhang wrote:
> > On 06/04/16 12:34, Boris Petkov wrote:
> >> Haozhong Zhang <haozhong.zhang@intel.com> wrote:
> >>
> >>> This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they
> >>> will be injected to only one VCPU rather than broadcast to all
> >>> VCPUs. As KVM reports LMCE support on Intel platforms, this features is
> >>> only available on Intel platforms.
> >>>
> >>> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> >>> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> >>> ---
> >>> Cc: Paolo Bonzini <pbonzini@redhat.com>
> >>> Cc: Richard Henderson <rth@twiddle.net>
> >>> Cc: Eduardo Habkost <ehabkost@redhat.com>
> >>> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> >>> Cc: Boris Petkov <bp@suse.de>
> >>> Cc: kvm@vger.kernel.org
> >>> Cc: Tony Luck <tony.luck@intel.com>
> >>> Cc: Andi Kleen <andi.kleen@intel.com>
> >>> ---
> >>>  target-i386/cpu.c | 26 ++++++++++++++++++++++++++
> >>>  target-i386/cpu.h | 13 ++++++++++++-
> >>>  target-i386/kvm.c | 35 +++++++++++++++++++++++++++++++----
> >>>  3 files changed, 69 insertions(+), 5 deletions(-)
> >>
> >> ...
> >>
> >>> @@ -1173,6 +1182,8 @@ struct X86CPU {
> >>>      */
> >>>     bool enable_pmu;
> >>>
> >>> +    bool enable_lmce;
> >>
> >> That struct would go fat pretty fast if it grows a bool per CPU feature. Perhaps a more clever, a-bit-per-featurebit scheme would be in order.
> > 
> > OK, I'll use a 64-bit integer for current and future features.
> 
> No, please keep this as is for now. It can be refactored later.
> 

OK

Thanks,
Haozhong
On 06/08/16 13:32, Paolo Bonzini wrote:
> 
> 
> On 03/06/2016 17:57, Radim Krčmář wrote:
> >> > +                cenv->msr_ia32_feature_control |=
> >> > +                    MSR_IA32_FEATURE_CONTROL_LMCE |
> >> > +                    MSR_IA32_FEATURE_CONTROL_LOCKED;
> > Locking right from the start breaks nested KVM, because nested relies on
> > setting VMXON feature from inside of the guest.
> > 
> > Do we keep it unlocked, or move everything into QEMU?
> > 
> > (The latter seems simpler.)
> 
> I think it should be moved into the firmware, with QEMU publishing the
> desired setting via fw_cfg. The same as what is done in real hardware,
> that's the KVM mantra. :)
> 
> For v4 it's okay to just remove this.
> 
> Paolo

Currently, only VMX bits (bit 1 & 2), LMCE bit (bit 20) as well as
lock bit (bit 0) in MSR_IA32_FEATURE_CONTROL are used for guest. The
availability of features indicated by those bits (except the lock bit)
can be discovered from cpuid and other MSR, so it looks not necessary
to publish them via fw_cfg. Or do you have other concerns?

Thanks,
Haozhong
On 13/06/2016 09:55, Haozhong Zhang wrote:
> Currently, only VMX bits (bit 1 & 2), LMCE bit (bit 20) as well as
> lock bit (bit 0) in MSR_IA32_FEATURE_CONTROL are used for guest. The
> availability of features indicated by those bits (except the lock bit)
> can be discovered from cpuid and other MSR, so it looks not necessary
> to publish them via fw_cfg. Or do you have other concerns?

I would prefer to avoid having to change the firmware (SeaBIOS and OVMF)
every time a new bit is added. Using fw_cfg makes it possible to
develop the feature in the firmware once and for all.

Paolo
On 06/13/16 10:33, Paolo Bonzini wrote:
> 
> 
> On 13/06/2016 09:55, Haozhong Zhang wrote:
> > Currently, only VMX bits (bit 1 & 2), LMCE bit (bit 20) as well as
> > lock bit (bit 0) in MSR_IA32_FEATURE_CONTROL are used for guest. The
> > availability of features indicated by those bits (except the lock bit)
> > can be discovered from cpuid and other MSR, so it looks not necessary
> > to publish them via fw_cfg. Or do you have other concerns?
> 
> I would prefer to avoid having to change the firmware (SeaBIOS and OVMF)
> every time a new bit is added. Using fw_cfg makes it possible to
> develop the feature in the firmware once and for all.
> 

Thanks for the explanation! Is it proper to add a key in fw_cfg for
this purpose, e.g FW_CFG_MSR_FEATURE_CONTROL, to pass bits that are
supposed to be set by firmware?

Thanks,
Haozhong
On 13/06/2016 12:01, Haozhong Zhang wrote:
> > I would prefer to avoid having to change the firmware (SeaBIOS and OVMF)
> > every time a new bit is added. Using fw_cfg makes it possible to
> > develop the feature in the firmware once and for all.
> 
> Thanks for the explanation! Is it proper to add a key in fw_cfg for
> this purpose, e.g FW_CFG_MSR_FEATURE_CONTROL to pass bits that are
> supposed to be set by firmware?

We usually add new files now, so it would be a new file named
"etc/msr-feature-control".

Thanks,

Paolo
On 06/13/16 12:07, Paolo Bonzini wrote:
> 
> 
> On 13/06/2016 12:01, Haozhong Zhang wrote:
> > > I would prefer to avoid having to change the firmware (SeaBIOS and OVMF)
> > > every time a new bit is added. Using fw_cfg makes it possible to
> > > develop the feature in the firmware once and for all.
> > 
> > Thanks for the explanation! Is it proper to add a key in fw_cfg for
> > this purpose, e.g FW_CFG_MSR_FEATURE_CONTROL to pass bits that are
> > supposed to be set by firmware?
> 
> We usually add new files now, so it would be a new file named
> "etc/msr-feature-control".
> 

Got it, thanks!

Haozhong
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 895a386..9b4dbab 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2777,6 +2777,18 @@ static void x86_cpu_machine_reset_cb(void *opaque)
 }
 #endif
 
+static bool lmce_supported(void)
+{
+    uint64_t mce_cap;
+
+    if (!kvm_enabled() ||
+        kvm_ioctl(kvm_state, KVM_X86_GET_MCE_CAP_SUPPORTED, &mce_cap) < 0) {
+        return false;
+    }
+
+    return !!(mce_cap & MCG_LMCE_P);
+}
+
 static void mce_init(X86CPU *cpu)
 {
     CPUX86State *cenv = &cpu->env;
@@ -2786,6 +2798,20 @@ static void mce_init(X86CPU *cpu)
         && (cenv->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
            (CPUID_MCE | CPUID_MCA)) {
         cenv->mcg_cap = MCE_CAP_DEF | MCE_BANKS_DEF;
+
+        if (cpu->enable_lmce) {
+            if (lmce_supported()) {
+                cenv->mcg_cap |= MCG_LMCE_P;
+                cenv->msr_ia32_feature_control |=
+                    MSR_IA32_FEATURE_CONTROL_LMCE |
+                    MSR_IA32_FEATURE_CONTROL_LOCKED;
+            } else {
+                error_report("Warning: KVM unavailable or not support LMCE, "
+                             "LMCE disabled");
+                cpu->enable_lmce = false;
+            }
+        }
+
         cenv->mcg_ctl = ~(uint64_t)0;
         for (bank = 0; bank < MCE_BANKS_DEF; bank++) {
             cenv->mce_banks[bank * 4] = ~(uint64_t)0;
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 0426459..2d411ba 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -292,6 +292,7 @@
 #define MCG_CTL_P       (1ULL<<8)   /* MCG_CAP register available */
 #define MCG_SER_P       (1ULL<<24)  /* MCA recovery/new status bits */
+#define MCG_LMCE_P      (1ULL<<27)  /* Local Machine Check Supported */
 
 #define MCE_CAP_DEF     (MCG_CTL_P|MCG_SER_P)
 #define MCE_BANKS_DEF   10
@@ -301,6 +302,9 @@
 #define MCG_STATUS_RIPV (1ULL<<0)   /* restart ip valid */
 #define MCG_STATUS_EIPV (1ULL<<1)   /* ip points to correct instruction */
 #define MCG_STATUS_MCIP (1ULL<<2)   /* machine check in progress */
+#define MCG_STATUS_LMCE (1ULL<<3)   /* Local MCE signaled */
+
+#define MCG_EXT_CTL_LMCE_EN (1ULL<<0) /* Local MCE enabled */
 
 #define MCI_STATUS_VAL  (1ULL<<63)  /* valid error */
 #define MCI_STATUS_OVER (1ULL<<62)  /* previous errors lost */
@@ -325,6 +329,8 @@
 #define MSR_IA32_APICBASE_ENABLE        (1<<11)
 #define MSR_IA32_APICBASE_BASE          (0xfffffU<<12)
 #define MSR_IA32_FEATURE_CONTROL        0x0000003a
+#define MSR_IA32_FEATURE_CONTROL_LOCKED (1<<0)
+#define MSR_IA32_FEATURE_CONTROL_LMCE   (1<<20)
 #define MSR_TSC_ADJUST                  0x0000003b
 #define MSR_IA32_TSCDEADLINE            0x6e0
@@ -343,6 +349,7 @@
 #define MSR_MCG_CAP                     0x179
 #define MSR_MCG_STATUS                  0x17a
 #define MSR_MCG_CTL                     0x17b
+#define MSR_MCG_EXT_CTL                 0x4d0
 
 #define MSR_P6_EVNTSEL0                 0x186
@@ -1011,7 +1018,6 @@ typedef struct CPUX86State {
     uint64_t mcg_status;
     uint64_t msr_ia32_misc_enable;
-    uint64_t msr_ia32_feature_control;
 
     uint64_t msr_fixed_ctr_ctrl;
     uint64_t msr_global_ctrl;
@@ -1104,8 +1110,11 @@ typedef struct CPUX86State {
     int64_t user_tsc_khz; /* for sanity check only */
     void *kvm_xsave_buf;
 
+    uint64_t msr_ia32_feature_control;
+
     uint64_t mcg_cap;
     uint64_t mcg_ctl;
+    uint64_t mcg_ext_ctl;
     uint64_t mce_banks[MCE_BANKS_DEF*4];
     uint64_t tsc_aux;
@@ -1173,6 +1182,8 @@ struct X86CPU {
      */
     bool enable_pmu;
 
+    bool enable_lmce;
+
     /* in order to simplify APIC support, we leave this pointer to the user */
     struct DeviceState *apic_state;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index abf50e6..ea442b3 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -107,6 +107,8 @@ static int has_xsave;
 static int has_xcrs;
 static int has_pit_state2;
 
+static bool has_msr_mcg_ext_ctl;
+
 int kvm_has_pit_state2(void)
 {
     return has_pit_state2;
@@ -378,10 +380,12 @@ static int kvm_get_mce_cap_supported(KVMState *s, uint64_t *mce_cap,
 
 static void kvm_mce_inject(X86CPU *cpu, hwaddr paddr, int code)
 {
+    CPUState *cs = CPU(cpu);
     CPUX86State *env = &cpu->env;
     uint64_t status = MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_EN |
                       MCI_STATUS_MISCV | MCI_STATUS_ADDRV | MCI_STATUS_S;
     uint64_t mcg_status = MCG_STATUS_MCIP;
+    int flags = 0;
 
     if (code == BUS_MCEERR_AR) {
         status |= MCI_STATUS_AR | 0x134;
@@ -390,10 +394,19 @@ static void kvm_mce_inject(X86CPU *cpu, hwaddr paddr, int code)
         status |= 0xc0;
         mcg_status
|= MCG_STATUS_RIPV; } + + flags = cpu_x86_support_mca_broadcast(env) ? MCE_INJECT_BROADCAST : 0; + /* We need to read back the value of MSR_EXT_MCG_CTL that was set by the + * guest kernel back into env->mcg_ext_ctl. + */ + cpu_synchronize_state(cs); + if (env->mcg_ext_ctl & MCG_EXT_CTL_LMCE_EN) { + mcg_status |= MCG_STATUS_LMCE; + flags = 0; + } + cpu_x86_inject_mce(NULL, cpu, 9, status, mcg_status, paddr, - (MCM_ADDR_PHYS << 6) | 0xc, - cpu_x86_support_mca_broadcast(env) ? - MCE_INJECT_BROADCAST : 0); + (MCM_ADDR_PHYS << 6) | 0xc, flags); } static void hardware_memory_error(void) @@ -878,7 +891,12 @@ int kvm_arch_init_vcpu(CPUState *cs) c = cpuid_find_entry(&cpuid_data.cpuid, 1, 0); if (c) { has_msr_feature_control = !!(c->ecx & CPUID_EXT_VMX) || - !!(c->ecx & CPUID_EXT_SMX); + !!(c->ecx & CPUID_EXT_SMX) || + !!(env->mcg_cap & MCG_LMCE_P); + } + + if (has_msr_feature_control && (env->mcg_cap & MCG_LMCE_P)) { + has_msr_mcg_ext_ctl = true; } c = cpuid_find_entry(&cpuid_data.cpuid, 0x80000007, 0); @@ -1702,6 +1720,9 @@ static int kvm_put_msrs(X86CPU *cpu, int level) kvm_msr_entry_add(cpu, MSR_MCG_STATUS, env->mcg_status); kvm_msr_entry_add(cpu, MSR_MCG_CTL, env->mcg_ctl); + if (has_msr_mcg_ext_ctl) { + kvm_msr_entry_add(cpu, MSR_MCG_EXT_CTL, env->mcg_ext_ctl); + } for (i = 0; i < (env->mcg_cap & 0xff) * 4; i++) { kvm_msr_entry_add(cpu, MSR_MC0_CTL + i, env->mce_banks[i]); } @@ -2005,6 +2026,9 @@ static int kvm_get_msrs(X86CPU *cpu) if (env->mcg_cap) { kvm_msr_entry_add(cpu, MSR_MCG_STATUS, 0); kvm_msr_entry_add(cpu, MSR_MCG_CTL, 0); + if (has_msr_mcg_ext_ctl) { + kvm_msr_entry_add(cpu, MSR_MCG_EXT_CTL, 0); + } for (i = 0; i < (env->mcg_cap & 0xff) * 4; i++) { kvm_msr_entry_add(cpu, MSR_MC0_CTL + i, 0); } @@ -2133,6 +2157,9 @@ static int kvm_get_msrs(X86CPU *cpu) case MSR_MCG_CTL: env->mcg_ctl = msrs[i].data; break; + case MSR_MCG_EXT_CTL: + env->mcg_ext_ctl = msrs[i].data; + break; case MSR_IA32_MISC_ENABLE: env->msr_ia32_misc_enable = msrs[i].data; break;
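For context, the guest-side handshake this patch enables can be sketched as
follows. This is an illustrative user-space sketch with mocked MSR state, not
part of the patch: only the bit definitions are taken from the diff above, and
the variable and function names are hypothetical stand-ins for the guest
kernel's rdmsr/wrmsr accesses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit definitions mirrored from target-i386/cpu.h in this patch */
#define MCG_LMCE_P                      (1ULL << 27)
#define MCG_EXT_CTL_LMCE_EN             (1ULL << 0)
#define MSR_IA32_FEATURE_CONTROL_LOCKED (1ULL << 0)
#define MSR_IA32_FEATURE_CONTROL_LMCE   (1ULL << 20)

/* Mocked MSR state modeling what QEMU exposes once lmce is enabled;
 * a real guest would use rdmsr/wrmsr instead. */
static uint64_t msr_feature_control =
    MSR_IA32_FEATURE_CONTROL_LMCE | MSR_IA32_FEATURE_CONTROL_LOCKED;
static uint64_t msr_mcg_cap = MCG_LMCE_P;
static uint64_t msr_mcg_ext_ctl;

/* The guest may opt in to LMCE only when the platform advertises it in
 * both IA32_FEATURE_CONTROL and IA32_MCG_CAP; it then sets LMCE_EN in
 * IA32_MCG_EXT_CTL, which kvm_mce_inject() reads back to decide between
 * local delivery and broadcast. */
static bool guest_enable_lmce(void)
{
    if (!(msr_feature_control & MSR_IA32_FEATURE_CONTROL_LMCE) ||
        !(msr_mcg_cap & MCG_LMCE_P)) {
        return false;
    }
    msr_mcg_ext_ctl |= MCG_EXT_CTL_LMCE_EN;  /* wrmsr(MSR_MCG_EXT_CTL) */
    return true;
}
```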