[v4,1/3] target-i386: KVM: add basic Intel LMCE support

Message ID 20160616060621.30422-2-haozhong.zhang@intel.com (mailing list archive)
State New, archived

Commit Message

Haozhong Zhang June 16, 2016, 6:06 a.m. UTC
From: Ashok Raj <ashok.raj@intel.com>

This patch adds support for injecting SRAR and SRAO as LMCE, i.e. they
are injected to only one VCPU rather than broadcast to all VCPUs. As KVM
reports LMCE support on Intel platforms, this feature is only available
on Intel platforms.

LMCE is disabled by default and can be enabled/disabled by cpu option
'lmce=on/off'.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
[Haozhong: Enable LMCE only on Intel platforms
           Disable LMCE by default and add a cpu option 'lmce'
	   Disable LMCE if missing KVM support
	   Remove MCG_LMCE_P from MCE_CAP_DEF
	   Minor code style changes]
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 target-i386/cpu.c | 23 +++++++++++++++++++++++
 target-i386/cpu.h | 12 ++++++++++++
 target-i386/kvm.c | 35 +++++++++++++++++++++++++++++++----
 3 files changed, 66 insertions(+), 4 deletions(-)
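
The injection policy described in the commit message can be sketched as a small standalone function. This is only an illustration, not the patch code: `plan_mce_injection`, `struct mce_plan`, and `SKETCH_INJECT_BROADCAST` are hypothetical names, and the broadcast flag value is a stand-in for QEMU's `MCE_INJECT_BROADCAST`. The bit positions are the ones defined in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MCG_STATUS_MCIP         (1ULL << 2)  /* machine check in progress */
#define MCG_STATUS_LMCE         (1ULL << 3)  /* local MCE signaled */
#define MCG_EXT_CTL_LMCE_EN     (1ULL << 0)  /* guest has enabled LMCE */
#define SKETCH_INJECT_BROADCAST 1            /* stand-in for MCE_INJECT_BROADCAST */

/* Hypothetical result type: what to put in MCG_STATUS and how to inject. */
struct mce_plan {
    uint64_t mcg_status;
    int flags;
};

/*
 * Decide between broadcast and local injection.
 * mcg_ext_ctl:  guest-written value of MSR_MCG_EXT_CTL
 * broadcast_ok: whether the CPU model supports MCA broadcast
 */
static struct mce_plan plan_mce_injection(uint64_t mcg_ext_ctl,
                                          bool broadcast_ok)
{
    struct mce_plan plan = {
        .mcg_status = MCG_STATUS_MCIP,
        .flags = broadcast_ok ? SKETCH_INJECT_BROADCAST : 0,
    };

    if (mcg_ext_ctl & MCG_EXT_CTL_LMCE_EN) {
        /* Guest opted in: mark the event as local and do not broadcast. */
        plan.mcg_status |= MCG_STATUS_LMCE;
        plan.flags = 0;
    }
    return plan;
}
```

With LMCE enabled by the guest, the event goes to a single VCPU even when the CPU model would otherwise broadcast.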

Comments

Paolo Bonzini June 16, 2016, 9:50 a.m. UTC | #1
On 16/06/2016 08:06, Haozhong Zhang wrote:
> +            if (!lmce_supported()) {
> +                error_setg(&local_err, "KVM unavailable or LMCE not supported");
> +                error_propagate(&error_abort, local_err);
> +            }

Aborts should never be triggered by user input.  The error instead
should propagate from mce_init to its caller with a new errp argument
(i.e. error_setg(errp, "KVM unavailable or LMCE not supported")).

x86_cpu_realizefn can pass &local_err and check the outcome through
local_err != NULL.  See the existing call to x86_cpu_apic_create, right
above the call to mce_init.
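
A minimal sketch of the error-propagation shape suggested above, assuming simplified stand-ins for QEMU's Error API. `Error` here is a toy struct, and `sketch_error_set`, `mce_init_sketch`, and `realize_sketch` are hypothetical names used only for illustration. The real code would use error_setg() and error_propagate().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MCG_LMCE_P (1ULL << 27)  /* bit position as defined in the patch */

/* Toy stand-in for QEMU's Error object; not the real API. */
typedef struct Error {
    const char *msg;
} Error;

/* One static slot is enough for a sketch; real code allocates the error. */
static Error sketch_error_storage;

static void sketch_error_set(Error **errp, const char *msg)
{
    if (errp) {
        sketch_error_storage.msg = msg;
        *errp = &sketch_error_storage;
    }
}

/* Reports failure through errp instead of aborting. */
static void mce_init_sketch(bool enable_lmce, bool lmce_supported,
                            uint64_t *mcg_cap, Error **errp)
{
    if (enable_lmce) {
        if (!lmce_supported) {
            sketch_error_set(errp, "KVM unavailable or LMCE not supported");
            return;
        }
        *mcg_cap |= MCG_LMCE_P;
    }
}

/* Caller in the style of a realize function: pass &local_err, then check it. */
static bool realize_sketch(bool enable_lmce, bool lmce_supported,
                           uint64_t *mcg_cap, const char **msg_out)
{
    Error *local_err = NULL;

    mce_init_sketch(enable_lmce, lmce_supported, mcg_cap, &local_err);
    if (local_err != NULL) {
        *msg_out = local_err->msg;  /* real code would propagate to errp */
        return false;
    }
    return true;
}
```

The point of the pattern is that a bad 'lmce=on' request becomes a reportable realize failure rather than an abort.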

> @@ -878,7 +891,12 @@ int kvm_arch_init_vcpu(CPUState *cs)
>      c = cpuid_find_entry(&cpuid_data.cpuid, 1, 0);
>      if (c) {
>          has_msr_feature_control = !!(c->ecx & CPUID_EXT_VMX) ||
> -                                  !!(c->ecx & CPUID_EXT_SMX);
> +                                  !!(c->ecx & CPUID_EXT_SMX) ||
> +                                  !!(env->mcg_cap & MCG_LMCE_P);

This part is wrong; env->mcg_cap is independent from CPUID[1].ECX.

> +    }
> +
> +    if (has_msr_feature_control && (env->mcg_cap & MCG_LMCE_P)) {

Don't test has_msr_feature_control here, instead set it to true inside
the "if".
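
One way to read the two comments above together, as a standalone sketch. `derive_msr_flags` and `struct msr_flags` are hypothetical names, and the VMX/SMX bit positions (CPUID.1:ECX bits 5 and 6) are written out here as an assumption about the QEMU definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_EXT_VMX (1U << 5)    /* CPUID.1:ECX bit 5 (assumed to match QEMU) */
#define CPUID_EXT_SMX (1U << 6)    /* CPUID.1:ECX bit 6 (assumed to match QEMU) */
#define MCG_LMCE_P    (1ULL << 27) /* bit position as defined in the patch */

/* Hypothetical holder for the two has_msr_* flags discussed above. */
struct msr_flags {
    bool feature_control;
    bool mcg_ext_ctl;
};

static struct msr_flags derive_msr_flags(bool have_cpuid1, uint32_t ecx,
                                         uint64_t mcg_cap)
{
    struct msr_flags f = { false, false };

    /* The CPUID-derived part depends only on CPUID[1].ECX. */
    if (have_cpuid1) {
        f.feature_control = (ecx & CPUID_EXT_VMX) || (ecx & CPUID_EXT_SMX);
    }

    /* mcg_cap is independent of CPUID: set both flags inside this "if". */
    if (mcg_cap & MCG_LMCE_P) {
        f.feature_control = true;
        f.mcg_ext_ctl = true;
    }
    return f;
}
```

Written this way, LMCE enables both MSRs even when the CPUID leaf is absent or advertises neither VMX nor SMX.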

> +        has_msr_mcg_ext_ctl = true;
>      }
>  
>      c = cpuid_find_entry(&cpuid_data.cpuid, 0x80000007, 0);

Which silicon has LMCE?  We may want to enable the property for some CPU
models.  Apart from this, the patch is pretty much okay.

Paolo
Haozhong Zhang June 16, 2016, 10:16 a.m. UTC | #2
On 06/16/16 11:50, Paolo Bonzini wrote:
> 
> 
> On 16/06/2016 08:06, Haozhong Zhang wrote:
> > +            if (!lmce_supported()) {
> > +                error_setg(&local_err, "KVM unavailable or LMCE not supported");
> > +                error_propagate(&error_abort, local_err);
> > +            }
> 
> Aborts should never be triggered by user input.  The error instead
> should propagate from mce_init to its caller with a new errp argument
> (i.e. error_setg(errp, "KVM unavailable or LMCE not supported")).
> 
> x86_cpu_realizefn can pass &local_err and check the outcome through
> local_err != NULL.  See the existing call to x86_cpu_apic_create, right
> above the call to mce_init.
>

Ah yes, I'll pass that local_err into mce_init() in the next version.

> > @@ -878,7 +891,12 @@ int kvm_arch_init_vcpu(CPUState *cs)
> >      c = cpuid_find_entry(&cpuid_data.cpuid, 1, 0);
> >      if (c) {
> >          has_msr_feature_control = !!(c->ecx & CPUID_EXT_VMX) ||
> > -                                  !!(c->ecx & CPUID_EXT_SMX);
> > +                                  !!(c->ecx & CPUID_EXT_SMX) ||
> > +                                  !!(env->mcg_cap & MCG_LMCE_P);
> 
> This part is wrong; env->mcg_cap is independent from CPUID[1].ECX.
>

Along with your next comment, I'll set it in the next if.

> > +    }
> > +
> > +    if (has_msr_feature_control && (env->mcg_cap & MCG_LMCE_P)) {
> 
> Don't test has_msr_feature_control here, instead set it to true inside
> the "if".
> 
> > +        has_msr_mcg_ext_ctl = true;
> >      }
> >  
> >      c = cpuid_find_entry(&cpuid_data.cpuid, 0x80000007, 0);
> 
> Which silicon has LMCE?  We may want to enable the property for some CPU
> models.  Apart from this, the patch is pretty much okay.
>

Skylake-EX

Thanks,
Haozhong
Paolo Bonzini June 16, 2016, 10:23 a.m. UTC | #3
On 16/06/2016 12:16, Haozhong Zhang wrote:
> > 
> > > +        has_msr_mcg_ext_ctl = true;
> > >      }
> > >  
> > >      c = cpuid_find_entry(&cpuid_data.cpuid, 0x80000007, 0);
> > 
> > Which silicon has LMCE?  We may want to enable the property for some CPU
> > models.  Apart from this, the patch is pretty much okay.
>
> Skylake-EX

... However, all virtual CPUs can use LMCE because the rendez-vous is
done in the host.  Is this correct?

Paolo
Haozhong Zhang June 16, 2016, 10:34 a.m. UTC | #4
On 06/16/16 12:23, Paolo Bonzini wrote:
> 
> 
> On 16/06/2016 12:16, Haozhong Zhang wrote:
> > > 
> > > > +        has_msr_mcg_ext_ctl = true;
> > > >      }
> > > >  
> > > >      c = cpuid_find_entry(&cpuid_data.cpuid, 0x80000007, 0);
> > > 
> > > Which silicon has LMCE?  We may want to enable the property for some CPU
> > > models.  Apart from this, the patch is pretty much okay.
> >
> > Skylake-EX
> 
> ... However, all virtual CPUs can use LMCE because the rendez-vous is
> done in the host.  Is this correct?
>

Yes, if it does not confuse the guest which sees LMCE available on
lower end or earlier CPUs (though I think someone would feel
happy). Or do we add it only to qemu64 and kvm64?

Haozhong
Paolo Bonzini June 16, 2016, 10:42 a.m. UTC | #5
On 16/06/2016 12:34, Haozhong Zhang wrote:
> On 06/16/16 12:23, Paolo Bonzini wrote:
>>
>>
>> On 16/06/2016 12:16, Haozhong Zhang wrote:
>>>>
>>>>> +        has_msr_mcg_ext_ctl = true;
>>>>>      }
>>>>>  
>>>>>      c = cpuid_find_entry(&cpuid_data.cpuid, 0x80000007, 0);
>>>>
>>>> Which silicon has LMCE?  We may want to enable the property for some CPU
>>>> models.  Apart from this, the patch is pretty much okay.
>>>
>>> Skylake-EX
>>
>> ... However, all virtual CPUs can use LMCE because the rendez-vous is
>> done in the host.  Is this correct?
>>
> 
> Yes, if it does not confuse the guest which sees LMCE available on
> lower end or earlier CPUs (though I think someone would feel
> happy).

Yes, that's what I expect too.  No confusion, and some happiness. :)

> Or do we add it only to qemu64 and kvm64?

I'm not sure where to add it, actually. :(  Let's wait for Eduardo's
opinion.

Paolo
Eduardo Habkost June 16, 2016, 6:05 p.m. UTC | #6
On Thu, Jun 16, 2016 at 12:42:19PM +0200, Paolo Bonzini wrote:
> 
> 
> On 16/06/2016 12:34, Haozhong Zhang wrote:
> > On 06/16/16 12:23, Paolo Bonzini wrote:
> >>
> >>
> >> On 16/06/2016 12:16, Haozhong Zhang wrote:
> >>>>
> >>>>> +        has_msr_mcg_ext_ctl = true;
> >>>>>      }
> >>>>>  
> >>>>>      c = cpuid_find_entry(&cpuid_data.cpuid, 0x80000007, 0);
> >>>>
> >>>> Which silicon has LMCE?  We may want to enable the property for some CPU
> >>>> models.  Apart from this, the patch is pretty much okay.
> >>>
> >>> Skylake-EX
> >>
> >> ... However, all virtual CPUs can use LMCE because the rendez-vous is
> >> done in the host.  Is this correct?
> >>
> > 
> > Yes, if it does not confuse the guest which sees LMCE available on
> > lower end or earlier CPUs (though I think someone would feel
> > happy).
> 
> Yes, that's what I expect too.  No confusion, and some happiness. :)
> 
> > Or do we add it only to qemu64 and kvm64?
> 
> I'm not sure where to add it, actually. :(  Let's wait for Eduardo's
> opinion.

Unfortunately we can't enable it by default to any existing CPU
model, or we break the machine-type-runnability rule
(machine-type version changes in an existing runnable VM
configuration should never make the VM not runnable in the same
host).

If one day we decide that QEMU as a whole can require a newer
kernel, then we can enable it by default on all CPU models.

What's the first kernel release where LMCE is enabled in
KVM_X86_GET_MCE_CAP_SUPPORTED, BTW?
Paolo Bonzini June 16, 2016, 6:17 p.m. UTC | #7
On 16/06/2016 20:05, Eduardo Habkost wrote:
> On Thu, Jun 16, 2016 at 12:42:19PM +0200, Paolo Bonzini wrote:
>>
>>
>> On 16/06/2016 12:34, Haozhong Zhang wrote:
>>> On 06/16/16 12:23, Paolo Bonzini wrote:
>>>>
>>>>
>>>> On 16/06/2016 12:16, Haozhong Zhang wrote:
>>>>>>
>>>>>>> +        has_msr_mcg_ext_ctl = true;
>>>>>>>      }
>>>>>>>  
>>>>>>>      c = cpuid_find_entry(&cpuid_data.cpuid, 0x80000007, 0);
>>>>>>
>>>>>> Which silicon has LMCE?  We may want to enable the property for some CPU
>>>>>> models.  Apart from this, the patch is pretty much okay.
>>>>>
>>>>> Skylake-EX
>>>>
>>>> ... However, all virtual CPUs can use LMCE because the rendez-vous is
>>>> done in the host.  Is this correct?
>>>>
>>>
>>> Yes, if it does not confuse the guest which sees LMCE available on
>>> lower end or earlier CPUs (though I think someone would feel
>>> happy).
>>
>> Yes, that's what I expect too.  No confusion, and some happiness. :)
>>
>>> Or do we add it only to qemu64 and kvm64?
>>
>> I'm not sure where to add it, actually. :(  Let's wait for Eduardo's
>> opinion.
> 
> Unfortunately we can't enable it by default to any existing CPU
> model, or we break the machine-type-runnability rule
> (machine-type version changes in an existing runnable VM
> configuration should never make the VM not runnable in the same
> host).
> 
> If one day we decide that QEMU as a whole can require a newer
> kernel, then we can enable it by default on all CPU models.
> 
> What's the first kernel release where LMCE is enabled in
> KVM_X86_GET_MCE_CAP_SUPPORTED, BTW?

It will presumably be 4.8.

Paolo
Eduardo Habkost June 16, 2016, 7:37 p.m. UTC | #8
On Thu, Jun 16, 2016 at 02:06:19PM +0800, Haozhong Zhang wrote:
> From: Ashok Raj <ashok.raj@intel.com>
> 
> This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they
> are injected to only one VCPU rather than broadcast to all VCPUs. As KVM
> reports LMCE support on Intel platforms, this features is only available
> on Intel platforms.
> 
> LMCE is disabled by default and can be enabled/disabled by cpu option
> 'lmce=on/off'.
> 
> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> [Haozhong: Enable LMCE only on Intel platforms
>            Disable LMCE by default and add a cpu option 'lmce'
> 	   Disable LMCE if missing KVM support
> 	   Remove MCG_LMCE_P from MCE_CAP_DEF
> 	   Minor code style changes]

You are mixing tabs and spaces above.

> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> ---
>  target-i386/cpu.c | 23 +++++++++++++++++++++++
>  target-i386/cpu.h | 12 ++++++++++++
>  target-i386/kvm.c | 35 +++++++++++++++++++++++++++++++----
>  3 files changed, 66 insertions(+), 4 deletions(-)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 895a386..bd35db2 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -2777,15 +2777,37 @@ static void x86_cpu_machine_reset_cb(void *opaque)
>  }
>  #endif
>  
> +static bool lmce_supported(void)
> +{
> +    uint64_t mce_cap;
> +
> +    if (!kvm_enabled() ||
> +        kvm_ioctl(kvm_state, KVM_X86_GET_MCE_CAP_SUPPORTED, &mce_cap) < 0) {
> +        return false;
> +    }
> +
> +    return !!(mce_cap & MCG_LMCE_P);
> +}
> +
>  static void mce_init(X86CPU *cpu)
>  {
>      CPUX86State *cenv = &cpu->env;
>      unsigned int bank;
> +    Error *local_err = NULL;
>  
>      if (((cenv->cpuid_version >> 8) & 0xf) >= 6
>          && (cenv->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
>              (CPUID_MCE | CPUID_MCA)) {
>          cenv->mcg_cap = MCE_CAP_DEF | MCE_BANKS_DEF;
> +
> +        if (cpu->enable_lmce) {
> +            if (!lmce_supported()) {
> +                error_setg(&local_err, "KVM unavailable or LMCE not supported");
> +                error_propagate(&error_abort, local_err);
> +            }
> +            cenv->mcg_cap |= MCG_LMCE_P;
> +        }
> +

This duplicates the existing check in kvm_arch_init_vcpu(). The
difference is that the existing code is KVM-specific and doesn't
stop initialization when capabilities are missing. We can unify
them into a single mcg_cap-checking function as a follow-up.
Haozhong Zhang June 17, 2016, 1:26 a.m. UTC | #9
On 06/16/16 16:37, Eduardo Habkost wrote:
> On Thu, Jun 16, 2016 at 02:06:19PM +0800, Haozhong Zhang wrote:
> > From: Ashok Raj <ashok.raj@intel.com>
> > 
> > This patch adds the support to inject SRAR and SRAO as LMCE, i.e. they
> > are injected to only one VCPU rather than broadcast to all VCPUs. As KVM
> > reports LMCE support on Intel platforms, this features is only available
> > on Intel platforms.
> > 
> > LMCE is disabled by default and can be enabled/disabled by cpu option
> > 'lmce=on/off'.
> > 
> > Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> > [Haozhong: Enable LMCE only on Intel platforms
> >            Disable LMCE by default and add a cpu option 'lmce'
> > 	   Disable LMCE if missing KVM support
> > 	   Remove MCG_LMCE_P from MCE_CAP_DEF
> > 	   Minor code style changes]
>
> You are mixing tabs and spaces above.
>

Oops, I missed the tabs in the commit message. Will fix.

> > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> > ---
> >  target-i386/cpu.c | 23 +++++++++++++++++++++++
> >  target-i386/cpu.h | 12 ++++++++++++
> >  target-i386/kvm.c | 35 +++++++++++++++++++++++++++++++----
> >  3 files changed, 66 insertions(+), 4 deletions(-)
> > 
> > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > index 895a386..bd35db2 100644
> > --- a/target-i386/cpu.c
> > +++ b/target-i386/cpu.c
> > @@ -2777,15 +2777,37 @@ static void x86_cpu_machine_reset_cb(void *opaque)
> >  }
> >  #endif
> >  
> > +static bool lmce_supported(void)
> > +{
> > +    uint64_t mce_cap;
> > +
> > +    if (!kvm_enabled() ||
> > +        kvm_ioctl(kvm_state, KVM_X86_GET_MCE_CAP_SUPPORTED, &mce_cap) < 0) {
> > +        return false;
> > +    }
> > +
> > +    return !!(mce_cap & MCG_LMCE_P);
> > +}
> > +
> >  static void mce_init(X86CPU *cpu)
> >  {
> >      CPUX86State *cenv = &cpu->env;
> >      unsigned int bank;
> > +    Error *local_err = NULL;
> >  
> >      if (((cenv->cpuid_version >> 8) & 0xf) >= 6
> >          && (cenv->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
> >              (CPUID_MCE | CPUID_MCA)) {
> >          cenv->mcg_cap = MCE_CAP_DEF | MCE_BANKS_DEF;
> > +
> > +        if (cpu->enable_lmce) {
> > +            if (!lmce_supported()) {
> > +                error_setg(&local_err, "KVM unavailable or LMCE not supported");
> > +                error_propagate(&error_abort, local_err);
> > +            }
> > +            cenv->mcg_cap |= MCG_LMCE_P;
> > +        }
> > +
> 
> This duplicates the existing check in kvm_arch_init_vcpu(). The
> difference is that the existing code is KVM-specific and doesn't
> stop initialization when capabilities are missing. We can unify
> them into a single mcg_cap-checking function as a follow-up.
>

If I reuse the existing MCE capability check in kvm_arch_init_vcpu(),
is it reasonable to change it to stop initialization when capabilities
are missing? Or should we stop only for missing newly added capabilities
(e.g. LMCE) in order to keep backwards compatibility?

Thanks,
Haozhong
Eduardo Habkost June 17, 2016, 4:20 p.m. UTC | #10
On Fri, Jun 17, 2016 at 09:26:57AM +0800, Haozhong Zhang wrote:
[...]
> > >  static void mce_init(X86CPU *cpu)
> > >  {
> > >      CPUX86State *cenv = &cpu->env;
> > >      unsigned int bank;
> > > +    Error *local_err = NULL;
> > >  
> > >      if (((cenv->cpuid_version >> 8) & 0xf) >= 6
> > >          && (cenv->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
> > >              (CPUID_MCE | CPUID_MCA)) {
> > >          cenv->mcg_cap = MCE_CAP_DEF | MCE_BANKS_DEF;
> > > +
> > > +        if (cpu->enable_lmce) {
> > > +            if (!lmce_supported()) {
> > > +                error_setg(&local_err, "KVM unavailable or LMCE not supported");
> > > +                error_propagate(&error_abort, local_err);
> > > +            }
> > > +            cenv->mcg_cap |= MCG_LMCE_P;
> > > +        }
> > > +
> > 
> > This duplicates the existing check in kvm_arch_init_vcpu(). The
> > difference is that the existing code is KVM-specific and doesn't
> > stop initialization when capabilities are missing. We can unify
> > them into a single mcg_cap-checking function as a follow-up.
> >
> 
> If I reuse the existing MCE capability check in kvm_arch_init_vcpu(),
> is it reasonable to make change to stop initialization if missing
> capabilities? Or should we stop only for missing newly added capabilities
> (e.g. LMCE) in order to keep backwards compatibility?

Ideally, yes. But in practice we need to check that we won't break
existing setups that were working. If all kernel versions we care
about always report MCG_CTL_P|MCG_SER_P + 10 banks as supported, we can
make all bits mandatory.

I need to re-read the thread where kvm_get_mce_cap_supported() was
discussed, to refresh my memory.
Haozhong Zhang June 20, 2016, 2:04 a.m. UTC | #11
On 06/17/16 13:20, Eduardo Habkost wrote:
> On Fri, Jun 17, 2016 at 09:26:57AM +0800, Haozhong Zhang wrote:
> [...]
> > > >  static void mce_init(X86CPU *cpu)
> > > >  {
> > > >      CPUX86State *cenv = &cpu->env;
> > > >      unsigned int bank;
> > > > +    Error *local_err = NULL;
> > > >  
> > > >      if (((cenv->cpuid_version >> 8) & 0xf) >= 6
> > > >          && (cenv->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
> > > >              (CPUID_MCE | CPUID_MCA)) {
> > > >          cenv->mcg_cap = MCE_CAP_DEF | MCE_BANKS_DEF;
> > > > +
> > > > +        if (cpu->enable_lmce) {
> > > > +            if (!lmce_supported()) {
> > > > +                error_setg(&local_err, "KVM unavailable or LMCE not supported");
> > > > +                error_propagate(&error_abort, local_err);
> > > > +            }
> > > > +            cenv->mcg_cap |= MCG_LMCE_P;
> > > > +        }
> > > > +
> > > 
> > > This duplicates the existing check in kvm_arch_init_vcpu(). The
> > > difference is that the existing code is KVM-specific and doesn't
> > > stop initialization when capabilities are missing. We can unify
> > > them into a single mcg_cap-checking function as a follow-up.
> > >
> > 
> > If I reuse the existing MCE capability check in kvm_arch_init_vcpu(),
> > is it reasonable to make change to stop initialization if missing
> > capabilities? Or should we stop only for missing newly added capabilities
> > (e.g. LMCE) in order to keep backwards compatibility?
> 
> Ideally, yes. But in practice we need to check if we won't break
> existing setups that were working. If all kernel versions we care
> about always MCG_CTL_P|MCG_SER_P + 10 banks as supported, we can
> make all bits mandatory.
>

Let's stop only for LMCE in this patch series. Other bits may be
changed in future after the kernel support is clarified.
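
The "stop only for LMCE" policy could look roughly like the sketch below. `filter_mcg_cap` is a hypothetical helper, not an existing QEMU function. Silently dropping the other unsupported bits is an assumption based on the discussion above, not a description of the current code.

```c
#include <stdint.h>

#define MCG_CTL_P  (1ULL << 8)
#define MCG_SER_P  (1ULL << 24)
#define MCG_LMCE_P (1ULL << 27)

/*
 * requested: capability bits QEMU wants to expose to the guest
 * supported: capability bits reported by the kernel
 * Returns 0 and stores the usable bits in *out, or -1 when LMCE was
 * requested but is not supported (the only fatal case in this sketch).
 */
static int filter_mcg_cap(uint64_t requested, uint64_t supported,
                          uint64_t *out)
{
    uint64_t missing = requested & ~supported;

    if (missing & MCG_LMCE_P) {
        return -1;  /* newly added capability: stop initialization */
    }

    /* Older capabilities: keep backwards compatibility by dropping them. */
    *out = requested & supported;
    return 0;
}
```

Making the remaining bits mandatory later would only require widening the mask tested against `missing`.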

Thanks,
Haozhong

> I need to re-read the thread were kvm_get_mce_cap_supported() was
> discussed, to refresh my memory.
>
> -- 
> Eduardo
Patch

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 895a386..bd35db2 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2777,15 +2777,37 @@  static void x86_cpu_machine_reset_cb(void *opaque)
 }
 #endif
 
+static bool lmce_supported(void)
+{
+    uint64_t mce_cap;
+
+    if (!kvm_enabled() ||
+        kvm_ioctl(kvm_state, KVM_X86_GET_MCE_CAP_SUPPORTED, &mce_cap) < 0) {
+        return false;
+    }
+
+    return !!(mce_cap & MCG_LMCE_P);
+}
+
 static void mce_init(X86CPU *cpu)
 {
     CPUX86State *cenv = &cpu->env;
     unsigned int bank;
+    Error *local_err = NULL;
 
     if (((cenv->cpuid_version >> 8) & 0xf) >= 6
         && (cenv->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
             (CPUID_MCE | CPUID_MCA)) {
         cenv->mcg_cap = MCE_CAP_DEF | MCE_BANKS_DEF;
+
+        if (cpu->enable_lmce) {
+            if (!lmce_supported()) {
+                error_setg(&local_err, "KVM unavailable or LMCE not supported");
+                error_propagate(&error_abort, local_err);
+            }
+            cenv->mcg_cap |= MCG_LMCE_P;
+        }
+
         cenv->mcg_ctl = ~(uint64_t)0;
         for (bank = 0; bank < MCE_BANKS_DEF; bank++) {
             cenv->mce_banks[bank * 4] = ~(uint64_t)0;
@@ -3206,6 +3228,7 @@  static Property x86_cpu_properties[] = {
     DEFINE_PROP_UINT32("xlevel", X86CPU, env.cpuid_xlevel, 0),
     DEFINE_PROP_UINT32("xlevel2", X86CPU, env.cpuid_xlevel2, 0),
     DEFINE_PROP_STRING("hv-vendor-id", X86CPU, hyperv_vendor_id),
+    DEFINE_PROP_BOOL("lmce", X86CPU, enable_lmce, false),
     DEFINE_PROP_END_OF_LIST()
 };
 
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 0426459..f0cb04f 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -292,6 +292,7 @@ 
 
 #define MCG_CTL_P       (1ULL<<8)   /* MCG_CAP register available */
 #define MCG_SER_P       (1ULL<<24) /* MCA recovery/new status bits */
+#define MCG_LMCE_P      (1ULL<<27) /* Local Machine Check Supported */
 
 #define MCE_CAP_DEF     (MCG_CTL_P|MCG_SER_P)
 #define MCE_BANKS_DEF   10
@@ -301,6 +302,9 @@ 
 #define MCG_STATUS_RIPV (1ULL<<0)   /* restart ip valid */
 #define MCG_STATUS_EIPV (1ULL<<1)   /* ip points to correct instruction */
 #define MCG_STATUS_MCIP (1ULL<<2)   /* machine check in progress */
+#define MCG_STATUS_LMCE (1ULL<<3)   /* Local MCE signaled */
+
+#define MCG_EXT_CTL_LMCE_EN (1ULL<<0) /* Local MCE enabled */
 
 #define MCI_STATUS_VAL   (1ULL<<63)  /* valid error */
 #define MCI_STATUS_OVER  (1ULL<<62)  /* previous errors lost */
@@ -343,6 +347,7 @@ 
 #define MSR_MCG_CAP                     0x179
 #define MSR_MCG_STATUS                  0x17a
 #define MSR_MCG_CTL                     0x17b
+#define MSR_MCG_EXT_CTL                 0x4d0
 
 #define MSR_P6_EVNTSEL0                 0x186
 
@@ -1106,6 +1111,7 @@  typedef struct CPUX86State {
 
     uint64_t mcg_cap;
     uint64_t mcg_ctl;
+    uint64_t mcg_ext_ctl;
     uint64_t mce_banks[MCE_BANKS_DEF*4];
 
     uint64_t tsc_aux;
@@ -1173,6 +1179,12 @@  struct X86CPU {
      */
     bool enable_pmu;
 
+    /* LMCE support can be enabled/disabled via cpu option 'lmce=on/off'. It is
+     * disabled by default to avoid breaking migration between QEMU with
+     * different LMCE configurations.
+     */
+    bool enable_lmce;
+
     /* in order to simplify APIC support, we leave this pointer to the
        user */
     struct DeviceState *apic_state;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index abf50e6..ea442b3 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -107,6 +107,8 @@  static int has_xsave;
 static int has_xcrs;
 static int has_pit_state2;
 
+static bool has_msr_mcg_ext_ctl;
+
 int kvm_has_pit_state2(void)
 {
     return has_pit_state2;
@@ -378,10 +380,12 @@  static int kvm_get_mce_cap_supported(KVMState *s, uint64_t *mce_cap,
 
 static void kvm_mce_inject(X86CPU *cpu, hwaddr paddr, int code)
 {
+    CPUState *cs = CPU(cpu);
     CPUX86State *env = &cpu->env;
     uint64_t status = MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_EN |
                       MCI_STATUS_MISCV | MCI_STATUS_ADDRV | MCI_STATUS_S;
     uint64_t mcg_status = MCG_STATUS_MCIP;
+    int flags = 0;
 
     if (code == BUS_MCEERR_AR) {
         status |= MCI_STATUS_AR | 0x134;
@@ -390,10 +394,19 @@  static void kvm_mce_inject(X86CPU *cpu, hwaddr paddr, int code)
         status |= 0xc0;
         mcg_status |= MCG_STATUS_RIPV;
     }
+
+    flags = cpu_x86_support_mca_broadcast(env) ? MCE_INJECT_BROADCAST : 0;
+    /* Read the value of MSR_MCG_EXT_CTL that was set by the guest kernel
+     * back into env->mcg_ext_ctl.
+     */
+    cpu_synchronize_state(cs);
+    if (env->mcg_ext_ctl & MCG_EXT_CTL_LMCE_EN) {
+        mcg_status |= MCG_STATUS_LMCE;
+        flags = 0;
+    }
+
     cpu_x86_inject_mce(NULL, cpu, 9, status, mcg_status, paddr,
-                       (MCM_ADDR_PHYS << 6) | 0xc,
-                       cpu_x86_support_mca_broadcast(env) ?
-                       MCE_INJECT_BROADCAST : 0);
+                       (MCM_ADDR_PHYS << 6) | 0xc, flags);
 }
 
 static void hardware_memory_error(void)
@@ -878,7 +891,12 @@  int kvm_arch_init_vcpu(CPUState *cs)
     c = cpuid_find_entry(&cpuid_data.cpuid, 1, 0);
     if (c) {
         has_msr_feature_control = !!(c->ecx & CPUID_EXT_VMX) ||
-                                  !!(c->ecx & CPUID_EXT_SMX);
+                                  !!(c->ecx & CPUID_EXT_SMX) ||
+                                  !!(env->mcg_cap & MCG_LMCE_P);
+    }
+
+    if (has_msr_feature_control && (env->mcg_cap & MCG_LMCE_P)) {
+        has_msr_mcg_ext_ctl = true;
     }
 
     c = cpuid_find_entry(&cpuid_data.cpuid, 0x80000007, 0);
@@ -1702,6 +1720,9 @@  static int kvm_put_msrs(X86CPU *cpu, int level)
 
         kvm_msr_entry_add(cpu, MSR_MCG_STATUS, env->mcg_status);
         kvm_msr_entry_add(cpu, MSR_MCG_CTL, env->mcg_ctl);
+        if (has_msr_mcg_ext_ctl) {
+            kvm_msr_entry_add(cpu, MSR_MCG_EXT_CTL, env->mcg_ext_ctl);
+        }
         for (i = 0; i < (env->mcg_cap & 0xff) * 4; i++) {
             kvm_msr_entry_add(cpu, MSR_MC0_CTL + i, env->mce_banks[i]);
         }
@@ -2005,6 +2026,9 @@  static int kvm_get_msrs(X86CPU *cpu)
     if (env->mcg_cap) {
         kvm_msr_entry_add(cpu, MSR_MCG_STATUS, 0);
         kvm_msr_entry_add(cpu, MSR_MCG_CTL, 0);
+        if (has_msr_mcg_ext_ctl) {
+            kvm_msr_entry_add(cpu, MSR_MCG_EXT_CTL, 0);
+        }
         for (i = 0; i < (env->mcg_cap & 0xff) * 4; i++) {
             kvm_msr_entry_add(cpu, MSR_MC0_CTL + i, 0);
         }
@@ -2133,6 +2157,9 @@  static int kvm_get_msrs(X86CPU *cpu)
         case MSR_MCG_CTL:
             env->mcg_ctl = msrs[i].data;
             break;
+        case MSR_MCG_EXT_CTL:
+            env->mcg_ext_ctl = msrs[i].data;
+            break;
         case MSR_IA32_MISC_ENABLE:
             env->msr_ia32_misc_enable = msrs[i].data;
             break;