Message ID | 20170217063936.13208-17-haozhong.zhang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote: > --- a/xen/arch/x86/cpu/mcheck/mcaction.c > +++ b/xen/arch/x86/cpu/mcheck/mcaction.c > @@ -88,17 +88,19 @@ mc_memerr_dhandler(struct mca_binfo *binfo, > goto vmce_failed; > } > > - if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL ) > + vmce_vcpuid = global->mc_vcpuid; > + if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && > + (vmce_vcpuid == -1 || What is this -1 matching up with? Or to say it differently, this needs a #define used at both producing and consuming sides. > + global->mc_domid != bank->mc_domid || I don't understand this check. > --- a/xen/arch/x86/cpu/mcheck/vmce.c > +++ b/xen/arch/x86/cpu/mcheck/vmce.c > @@ -444,14 +444,21 @@ static int vcpu_fill_mc_msrs(struct vcpu *v, uint64_t mcg_status, > } > > int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d, > - uint64_t gstatus, bool broadcast) > + uint64_t gstatus, int vmce_vcpuid) > { > struct vcpu *v = d->vcpu[0]; > + bool broadcast = (vmce_vcpuid == VMCE_INJECT_BROADCAST); > int ret; > > if ( mc_bank->mc_domid == (uint16_t)~0 ) > return -EINVAL; > > + if ( (gstatus & MCG_STATUS_LMCE) && !broadcast ) > + v = d->vcpu[vmce_vcpuid]; While the ID looks to be coming from a trustworthy source, I'd still prefer if there was at least an ASSERT() for it to be in range. > + if ( broadcast ) > + gstatus &= ~MCG_STATUS_LMCE; Please combine the two if()s: if ( broadcast ) ... else if ( gstatus & MCG_STATUS_LMCE ) ... Jan
On 02/22/17 08:48 -0700, Jan Beulich wrote: > >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote: > > --- a/xen/arch/x86/cpu/mcheck/mcaction.c > > +++ b/xen/arch/x86/cpu/mcheck/mcaction.c > > @@ -88,17 +88,19 @@ mc_memerr_dhandler(struct mca_binfo *binfo, > > goto vmce_failed; > > } > > > > - if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL ) > > + vmce_vcpuid = global->mc_vcpuid; > > + if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && > > + (vmce_vcpuid == -1 || > > What is this -1 matching up with? Or to say it differently, this needs a > #define used at both producing and consuming sides. It comes from mca_init_global > > > + global->mc_domid != bank->mc_domid || > > I don't understand this check. It serves for MCE injection test (i.e. xen-mceinj). The check is always false for LMCE that comes from the real hardware error. But for xen-mceinj, which may specify both the injected domain and the injected physical CPU, it's hard for xen-mceinj to ensure, when the injection really happens, the injected CPU is still the one on which a vCPU of the injected domain was running. So I add this check to avoid injecting to the wrong domain. > > > --- a/xen/arch/x86/cpu/mcheck/vmce.c > > +++ b/xen/arch/x86/cpu/mcheck/vmce.c > > @@ -444,14 +444,21 @@ static int vcpu_fill_mc_msrs(struct vcpu *v, uint64_t mcg_status, > > } > > > > int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d, > > - uint64_t gstatus, bool broadcast) > > + uint64_t gstatus, int vmce_vcpuid) > > { > > struct vcpu *v = d->vcpu[0]; > > + bool broadcast = (vmce_vcpuid == VMCE_INJECT_BROADCAST); > > int ret; > > > > if ( mc_bank->mc_domid == (uint16_t)~0 ) > > return -EINVAL; > > > > + if ( (gstatus & MCG_STATUS_LMCE) && !broadcast ) > > + v = d->vcpu[vmce_vcpuid]; > > While the ID looks to be coming from a trustworthy source, I'd > still prefer if there was at least an ASSERT() for it to be in range. > will do > > + if ( broadcast ) > > + gstatus &= ~MCG_STATUS_LMCE; > > Please combine the two if()s: > > if ( broadcast ) > ... > else if ( gstatus & MCG_STATUS_LMCE ) > ... > ditto Thanks, Haozhong
>>> On 23.02.17 at 05:48, <haozhong.zhang@intel.com> wrote: > On 02/22/17 08:48 -0700, Jan Beulich wrote: >> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote: >> > --- a/xen/arch/x86/cpu/mcheck/mcaction.c >> > +++ b/xen/arch/x86/cpu/mcheck/mcaction.c >> > @@ -88,17 +88,19 @@ mc_memerr_dhandler(struct mca_binfo *binfo, >> > goto vmce_failed; >> > } >> > >> > - if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL ) >> > + vmce_vcpuid = global->mc_vcpuid; >> > + if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && >> > + (vmce_vcpuid == -1 || >> >> What is this -1 matching up with? Or to say it differently, this needs a >> #define used at both producing and consuming sides. > > It comes from mca_init_global Now that's why I did add the second sentence: I understand that, but prior to this patch there was (in the hypervisor) only a producing side. That's bad enough, because it leaves the consumer (outside of the hypervisor) at the mercy of this value remaining at that hard coded -1, and doesn't allow for making the connection. The latest with you adding a consumer in the hypervisor, the values should become connectable to one another by using a suitable #define on both sides. Quite likely that #define should actually become part of the public interface, so that users of the value have a proper item to compare against. Having to explain this made me look at the type of the field, btw: It's uint16_t, so the comparison above is always false. The situation looks to be even worse for mc_domid, so I think I'll prepare a patch (where I then may include the mc_vcpuid aspect right away). Jan
diff --git a/xen/arch/x86/cpu/mcheck/mcaction.c b/xen/arch/x86/cpu/mcheck/mcaction.c index 90c68ff..3410bfd 100644 --- a/xen/arch/x86/cpu/mcheck/mcaction.c +++ b/xen/arch/x86/cpu/mcheck/mcaction.c @@ -88,17 +88,19 @@ mc_memerr_dhandler(struct mca_binfo *binfo, goto vmce_failed; } - if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL ) + vmce_vcpuid = global->mc_vcpuid; + if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && + (vmce_vcpuid == -1 || + global->mc_domid != bank->mc_domid || + !(global->mc_gstatus & MCG_STATUS_LMCE) || + !d->vcpu[vmce_vcpuid]->arch.vmce.lmce_enabled) ) vmce_vcpuid = VMCE_INJECT_BROADCAST; - else - vmce_vcpuid = global->mc_vcpuid; bank->mc_addr = gfn << PAGE_SHIFT | (bank->mc_addr & (PAGE_SIZE -1 )); - /* TODO: support injecting LMCE */ if ( fill_vmsr_data(bank, d, - global->mc_gstatus & ~MCG_STATUS_LMCE, - vmce_vcpuid == VMCE_INJECT_BROADCAST) == -1 ) + global->mc_gstatus, + vmce_vcpuid) == -1 ) { mce_printk(MCE_QUIET, "Fill vMCE# data for DOM%d " "failed\n", bank->mc_domid); diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c index 1278839..2a4d3f0 100644 --- a/xen/arch/x86/cpu/mcheck/vmce.c +++ b/xen/arch/x86/cpu/mcheck/vmce.c @@ -444,14 +444,21 @@ static int vcpu_fill_mc_msrs(struct vcpu *v, uint64_t mcg_status, } int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d, - uint64_t gstatus, bool broadcast) + uint64_t gstatus, int vmce_vcpuid) { struct vcpu *v = d->vcpu[0]; + bool broadcast = (vmce_vcpuid == VMCE_INJECT_BROADCAST); int ret; if ( mc_bank->mc_domid == (uint16_t)~0 ) return -EINVAL; + if ( (gstatus & MCG_STATUS_LMCE) && !broadcast ) + v = d->vcpu[vmce_vcpuid]; + + if ( broadcast ) + gstatus &= ~MCG_STATUS_LMCE; + ret = vcpu_fill_mc_msrs(v, gstatus, mc_bank->mc_status, mc_bank->mc_addr, mc_bank->mc_misc); if ( ret || !broadcast ) diff --git a/xen/arch/x86/cpu/mcheck/vmce.h b/xen/arch/x86/cpu/mcheck/vmce.h index 74f6381..2797e00 100644 --- a/xen/arch/x86/cpu/mcheck/vmce.h +++ b/xen/arch/x86/cpu/mcheck/vmce.h @@ -17,7 +17,7 @@ int vmce_amd_rdmsr(const struct vcpu *, uint32_t msr, uint64_t *val); int vmce_amd_wrmsr(struct vcpu *, uint32_t msr, uint64_t val); int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d, - uint64_t gstatus, bool broadcast); + uint64_t gstatus, int vmce_vcpuid); #define VMCE_INJECT_BROADCAST (-1) int inject_vmce(struct domain *d, int vcpu);
Inject LMCE to guest if the host MCE is LMCE and the affected vcpu is known. Otherwise, broadcast MCE to all vcpus on Intel host. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> --- Cc: Christoph Egger <chegger@amazon.de> Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> --- xen/arch/x86/cpu/mcheck/mcaction.c | 14 ++++++++------ xen/arch/x86/cpu/mcheck/vmce.c | 9 ++++++++- xen/arch/x86/cpu/mcheck/vmce.h | 2 +- 3 files changed, 17 insertions(+), 8 deletions(-)