Message ID | 20170217063936.13208-13-haozhong.zhang@intel.com (mailing list archive)
---|---
State | New, archived
>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> --- a/xen/arch/x86/cpu/mcheck/barrier.c
> +++ b/xen/arch/x86/cpu/mcheck/barrier.c
> @@ -20,7 +20,7 @@ void mce_barrier_enter(struct mce_softirq_barrier *bar)
>  {
>      int gen;
>
> -    if (!mce_broadcast)
> +    if ( !mce_broadcast || __get_cpu_var(lmce_in_process) )

this_cpu() please instead of __get_cpu_var() (which we should get
rid of rather sooner than later).

> @@ -462,6 +474,7 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
>      uint64_t gstatus;
>      mctelem_cookie_t mctc = NULL;
>      struct mca_summary bs;
> +    bool *lmce_in_process = &__get_cpu_var(lmce_in_process);
>
>      mce_spin_lock(&mce_logout_lock);
>
> @@ -505,6 +518,8 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
>      }
>      mce_spin_unlock(&mce_logout_lock);
>
> +    *lmce_in_process = bs.lmce;

You don't need a new local variable for this.

> @@ -1709,6 +1724,7 @@ static void mce_softirq(void)
>  {
>      int cpu = smp_processor_id();
>      unsigned int workcpu;
> +    bool lmce = per_cpu(lmce_in_process, cpu);

Is this flag valid to be looked at anymore at this point in time? MCIP
was cleared a lot earlier, so there may well have been a 2nd #MC
in between. In any event you again don't need the local variable
here, afaict.

Jan
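For reference, a sketch of what the two spots commented on above might look like with the suggestions applied. These are only the affected lines, not a posted revision; everything not shown stays as in the patch under review:

```c
/* barrier.c, mce_barrier_enter()/mce_barrier_exit(): this_cpu() instead
 * of __get_cpu_var() for the per-CPU flag introduced by the patch. */
    if ( !mce_broadcast || this_cpu(lmce_in_process) )
        return;

/* mce.c, mcheck_cmn_handler(): no pointer-to-per-CPU local variable;
 * assign the flag directly after dropping mce_logout_lock. */
    this_cpu(lmce_in_process) = bs.lmce;
```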
On 02/22/17 06:53 -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> > --- a/xen/arch/x86/cpu/mcheck/barrier.c
> > +++ b/xen/arch/x86/cpu/mcheck/barrier.c
> > @@ -20,7 +20,7 @@ void mce_barrier_enter(struct mce_softirq_barrier *bar)
> >  {
> >      int gen;
> >
> > -    if (!mce_broadcast)
> > +    if ( !mce_broadcast || __get_cpu_var(lmce_in_process) )
>
> this_cpu() please instead of __get_cpu_var() (which we should get
> rid of rather sooner than later).

will use this_cpu()

> > @@ -462,6 +474,7 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
> >      uint64_t gstatus;
> >      mctelem_cookie_t mctc = NULL;
> >      struct mca_summary bs;
> > +    bool *lmce_in_process = &__get_cpu_var(lmce_in_process);
> >
> >      mce_spin_lock(&mce_logout_lock);
> >
> > @@ -505,6 +518,8 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
> >      }
> >      mce_spin_unlock(&mce_logout_lock);
> >
> > +    *lmce_in_process = bs.lmce;
>
> You don't need a new local variable for this.
>
> > @@ -1709,6 +1724,7 @@ static void mce_softirq(void)
> >  {
> >      int cpu = smp_processor_id();
> >      unsigned int workcpu;
> > +    bool lmce = per_cpu(lmce_in_process, cpu);
>
> Is this flag valid to be looked at anymore at this point in time? MCIP
> was cleared a lot earlier, so there may well have been a 2nd #MC
> in between. In any event you again don't need the local variable
> here, afaict.

A non-LMCE #MC coming in between does not cause a problem. As
lmce_in_process on all CPUs is cleared, mce_softirq() on all CPUs
will sync with each other as before, and finally one of them will
handle the pending LMCE.

I think the problem is that one flag is not enough, rather than that
it is not needed. A single lmce_in_process flag misses the following
case:
1) mcheck_cmn_handler() first handles a non-LMCE #MC on CPU#n and
   raises MACHINE_CHECK_SOFTIRQ.
2) Before mce_softirq() gets a chance to run on CPU#n, another LMCE
   comes to CPU#n. Then mcheck_cmn_handler() sets lmce_in_process on
   CPU#n and raises MACHINE_CHECK_SOFTIRQ again.
3) mce_softirq() finally gets a chance to run on CPU#n. It sees
   lmce_in_process is set and consequently handles the pending MCEs on
   CPU#n w/o waiting for other CPUs. However, one of the pending MCEs
   is not an LMCE.

So I'm considering introducing another per-CPU flag "mce_in_process"
to indicate whether a non-LMCE #MC is pending for the softirq handler:
1) When a non-LMCE #MC comes to CPU#n, mcheck_cmn_handler() sets the
   mce_in_process flag on CPU#n.
2) When an LMCE comes to CPU#n, mcheck_cmn_handler() sets the
   lmce_in_process flag on CPU#n.
3) When mce_softirq() starts, it clears the lmce_in_process flag if
   mce_in_process is set, so it will not handle a non-LMCE #MC w/o
   waiting for other CPUs.
4) mce_softirq() clears both flags after exiting all MC barriers.

Haozhong
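The proposal above can be modeled in isolation. Below is a minimal, self-contained sketch of the two-flag protocol; it is not Xen code, only the flag names (mce_in_process, lmce_in_process) and the step numbering are taken from the proposal, everything else is illustrative:

```c
/* Standalone model of the proposed two-flag scheme (not Xen code). */
#include <stdbool.h>
#include <stdio.h>

/* Per-CPU flags; in Xen these would be DEFINE_PER_CPU(bool, ...). */
static bool mce_in_process;   /* a broadcast #MC is pending for the softirq */
static bool lmce_in_process;  /* an LMCE is pending for the softirq */

/* Steps 1/2: the #MC handler records which kind of event is pending. */
static void model_mcheck_cmn_handler(bool is_lmce)
{
    if ( is_lmce )
        lmce_in_process = true;
    else
        mce_in_process = true;
}

/* Steps 3/4: the softirq handler decides whether it must rendezvous
 * with the other CPUs, then clears both flags. */
static void model_mce_softirq(void)
{
    /* Step 3: an earlier broadcast #MC forces the synchronized path,
     * even if an LMCE arrived afterwards. */
    if ( mce_in_process )
        lmce_in_process = false;

    if ( lmce_in_process )
        printf("handle local telemetry only, skip the MCE barriers\n");
    else
        printf("wait in the MCE barriers, process all CPUs' telemetry\n");

    /* Step 4: clear both flags after leaving the barriers. */
    mce_in_process = lmce_in_process = false;
}

int main(void)
{
    /* The problematic ordering from the discussion: a broadcast #MC,
     * then an LMCE, before the softirq gets a chance to run. */
    model_mcheck_cmn_handler(false);
    model_mcheck_cmn_handler(true);
    model_mce_softirq();            /* takes the synchronized path */

    model_mcheck_cmn_handler(true); /* a pure LMCE ... */
    model_mce_softirq();            /* ... may skip the barriers */
    return 0;
}
```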
>>> On 23.02.17 at 04:06, <haozhong.zhang@intel.com> wrote:
> On 02/22/17 06:53 -0700, Jan Beulich wrote:
>> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>> > @@ -1709,6 +1724,7 @@ static void mce_softirq(void)
>> >  {
>> >      int cpu = smp_processor_id();
>> >      unsigned int workcpu;
>> > +    bool lmce = per_cpu(lmce_in_process, cpu);
>>
>> Is this flag valid to be looked at anymore at this point in time? MCIP
>> was cleared a lot earlier, so there may well have been a 2nd #MC
>> in between. In any event you again don't need the local variable
>> here, afaict.
>
> A non-LMCE #MC coming in between does not cause a problem. As
> lmce_in_process on all CPUs is cleared, mce_softirq() on all CPUs
> will sync with each other as before, and finally one of them will
> handle the pending LMCE.
>
> I think the problem is that one flag is not enough, rather than that
> it is not needed. A single lmce_in_process flag misses the following
> case:
> 1) mcheck_cmn_handler() first handles a non-LMCE #MC on CPU#n and
>    raises MACHINE_CHECK_SOFTIRQ.
> 2) Before mce_softirq() gets a chance to run on CPU#n, another LMCE
>    comes to CPU#n. Then mcheck_cmn_handler() sets lmce_in_process on
>    CPU#n and raises MACHINE_CHECK_SOFTIRQ again.
> 3) mce_softirq() finally gets a chance to run on CPU#n. It sees
>    lmce_in_process is set and consequently handles the pending MCEs on
>    CPU#n w/o waiting for other CPUs. However, one of the pending MCEs
>    is not an LMCE.
>
> So I'm considering introducing another per-CPU flag "mce_in_process"
> to indicate whether a non-LMCE #MC is pending for the softirq handler:
> 1) When a non-LMCE #MC comes to CPU#n, mcheck_cmn_handler() sets the
>    mce_in_process flag on CPU#n.
> 2) When an LMCE comes to CPU#n, mcheck_cmn_handler() sets the
>    lmce_in_process flag on CPU#n.
> 3) When mce_softirq() starts, it clears the lmce_in_process flag if
>    mce_in_process is set, so it will not handle a non-LMCE #MC w/o
>    waiting for other CPUs.
> 4) mce_softirq() clears both flags after exiting all MC barriers.

I'm afraid I still don't see how a static flag can deal with an
arbitrary number of #MC-s arriving prior to the softirq handler
getting a chance to run after the very first one came in.

Jan
On 02/23/17 00:42 -0700, Jan Beulich wrote:
> >>> On 23.02.17 at 04:06, <haozhong.zhang@intel.com> wrote:
> > On 02/22/17 06:53 -0700, Jan Beulich wrote:
> >> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> >> > @@ -1709,6 +1724,7 @@ static void mce_softirq(void)
> >> >  {
> >> >      int cpu = smp_processor_id();
> >> >      unsigned int workcpu;
> >> > +    bool lmce = per_cpu(lmce_in_process, cpu);
> >>
> >> Is this flag valid to be looked at anymore at this point in time? MCIP
> >> was cleared a lot earlier, so there may well have been a 2nd #MC
> >> in between. In any event you again don't need the local variable
> >> here, afaict.
> >
> > A non-LMCE #MC coming in between does not cause a problem. As
> > lmce_in_process on all CPUs is cleared, mce_softirq() on all CPUs
> > will sync with each other as before, and finally one of them will
> > handle the pending LMCE.
> >
> > I think the problem is that one flag is not enough, rather than that
> > it is not needed. A single lmce_in_process flag misses the following
> > case:
> > 1) mcheck_cmn_handler() first handles a non-LMCE #MC on CPU#n and
> >    raises MACHINE_CHECK_SOFTIRQ.
> > 2) Before mce_softirq() gets a chance to run on CPU#n, another LMCE
> >    comes to CPU#n. Then mcheck_cmn_handler() sets lmce_in_process on
> >    CPU#n and raises MACHINE_CHECK_SOFTIRQ again.
> > 3) mce_softirq() finally gets a chance to run on CPU#n. It sees
> >    lmce_in_process is set and consequently handles the pending MCEs on
> >    CPU#n w/o waiting for other CPUs. However, one of the pending MCEs
> >    is not an LMCE.
> >
> > So I'm considering introducing another per-CPU flag "mce_in_process"
> > to indicate whether a non-LMCE #MC is pending for the softirq handler:
> > 1) When a non-LMCE #MC comes to CPU#n, mcheck_cmn_handler() sets the
> >    mce_in_process flag on CPU#n.
> > 2) When an LMCE comes to CPU#n, mcheck_cmn_handler() sets the
> >    lmce_in_process flag on CPU#n.
> > 3) When mce_softirq() starts, it clears the lmce_in_process flag if
> >    mce_in_process is set, so it will not handle a non-LMCE #MC w/o
> >    waiting for other CPUs.
> > 4) mce_softirq() clears both flags after exiting all MC barriers.
>
> I'm afraid I still don't see how a static flag can deal with an
> arbitrary number of #MC-s arriving prior to the softirq handler
> getting a chance to run after the very first one came in.
>

mce_barrier_enter/exit need to be adapted to check both flags to
decide whether they need to wait for others:

 void mce_barrier_enter/exit(struct mce_softirq_barrier *bar)
 {
     int gen;

-    if (!mce_broadcast)
+    if ( !mce_broadcast ||
+         (!this_cpu(mce_in_process) && this_cpu(lmce_in_process)) )
         return;

When the MCE softirq handler finally starts to run, regardless of how
many #MCs have happened before,
1) if it sees only the lmce_in_process flag set, all pending MCEs in
   this_cpu(mctctl.pending) are LMCEs, so the handler does not need to
   wait for others, and mce_barrier_enter/exit allows it to proceed;
2) if it sees the mce_in_process flag set, all or some pending MCEs in
   this_cpu(mctctl.pending) are non-LMCE, i.e. other CPUs have received
   them as well, so the MCE softirq handler should wait for others, and
   mce_barrier_enter/exit does wait.

Haozhong
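Spelled out as a function, the adapted barrier guard might look as follows. This is a sketch, not a posted revision: it combines the snippet above with the earlier request to use this_cpu(), and assumes the proposed mce_in_process flag exists alongside lmce_in_process; mce_barrier_exit() would get the same guard.

```c
/* Sketch of the adapted barrier entry. mce_in_process is the proposed
 * second per-CPU flag; lmce_in_process comes from the patch under review. */
void mce_barrier_enter(struct mce_softirq_barrier *bar)
{
    int gen;

    /* Skip the rendezvous when #MCs are not broadcast at all, or when
     * the only events pending on this CPU are local MCEs (no broadcast
     * #MC has been recorded for the softirq handler). */
    if ( !mce_broadcast ||
         (!this_cpu(mce_in_process) && this_cpu(lmce_in_process)) )
        return;

    /* ... the rest of the entry protocol (ingen/outgen accounting)
     * continues unchanged, as in barrier.c ... */
}
```

Per step 4 of the proposal, both flags would then be cleared after the last mce_barrier_exit() in mce_softirq().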
```diff
diff --git a/xen/arch/x86/cpu/mcheck/barrier.c b/xen/arch/x86/cpu/mcheck/barrier.c
index 5dce1fb..869fd20 100644
--- a/xen/arch/x86/cpu/mcheck/barrier.c
+++ b/xen/arch/x86/cpu/mcheck/barrier.c
@@ -20,7 +20,7 @@ void mce_barrier_enter(struct mce_softirq_barrier *bar)
 {
     int gen;
 
-    if (!mce_broadcast)
+    if ( !mce_broadcast || __get_cpu_var(lmce_in_process) )
         return;
     atomic_inc(&bar->ingen);
     gen = atomic_read(&bar->outgen);
@@ -38,7 +38,7 @@ void mce_barrier_exit(struct mce_softirq_barrier *bar)
 {
     int gen;
 
-    if ( !mce_broadcast )
+    if ( !mce_broadcast || __get_cpu_var(lmce_in_process) )
         return;
     atomic_inc(&bar->outgen);
     gen = atomic_read(&bar->ingen);
diff --git a/xen/arch/x86/cpu/mcheck/mcaction.c b/xen/arch/x86/cpu/mcheck/mcaction.c
index 8b2b834..90c68ff 100644
--- a/xen/arch/x86/cpu/mcheck/mcaction.c
+++ b/xen/arch/x86/cpu/mcheck/mcaction.c
@@ -95,7 +95,9 @@ mc_memerr_dhandler(struct mca_binfo *binfo,
                 bank->mc_addr = gfn << PAGE_SHIFT |
                   (bank->mc_addr & (PAGE_SIZE -1 ));
 
-                if ( fill_vmsr_data(bank, d, global->mc_gstatus,
+                /* TODO: support injecting LMCE */
+                if ( fill_vmsr_data(bank, d,
+                                    global->mc_gstatus & ~MCG_STATUS_LMCE,
                                     vmce_vcpuid == VMCE_INJECT_BROADCAST) == -1 )
                 {
                     mce_printk(MCE_QUIET, "Fill vMCE# data for DOM%d "
diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 95a9da3..2d69222 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -42,6 +42,17 @@ DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, poll_bankmask);
 DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, no_cmci_banks);
 DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, mce_clear_banks);
 
+/*
+ * Flag to indicate whether the current MCE on this CPU is a LMCE.
+ *
+ * The MCE handler should set/clear this flag before entering any MCE
+ * barriers and raising MCE softirq. MCE barriers rely on this flag to
+ * decide whether they need to wait for other CPUs. MCE softirq handler
+ * relies on this flag to decide whether it needs to handle pending
+ * MCEs on other CPUs.
+ */
+DEFINE_PER_CPU(bool, lmce_in_process);
+
 static void intpose_init(void);
 static void mcinfo_clear(struct mc_info *);
 struct mca_banks *mca_allbanks;
@@ -399,6 +410,7 @@ mcheck_mca_logout(enum mca_source who, struct mca_banks *bankmask,
         sp->errcnt = errcnt;
         sp->ripv = (gstatus & MCG_STATUS_RIPV) != 0;
         sp->eipv = (gstatus & MCG_STATUS_EIPV) != 0;
+        sp->lmce = (gstatus & MCG_STATUS_LMCE) != 0;
         sp->uc = uc;
         sp->pcc = pcc;
         sp->recoverable = recover;
@@ -462,6 +474,7 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
     uint64_t gstatus;
     mctelem_cookie_t mctc = NULL;
     struct mca_summary bs;
+    bool *lmce_in_process = &__get_cpu_var(lmce_in_process);
 
     mce_spin_lock(&mce_logout_lock);
 
@@ -505,6 +518,8 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
     }
     mce_spin_unlock(&mce_logout_lock);
 
+    *lmce_in_process = bs.lmce;
+
     mce_barrier_enter(&mce_trap_bar);
     if ( mctc != NULL && mce_urgent_action(regs, mctc))
         cpumask_set_cpu(smp_processor_id(), &mce_fatal_cpus);
@@ -1709,6 +1724,7 @@ static void mce_softirq(void)
 {
     int cpu = smp_processor_id();
     unsigned int workcpu;
+    bool lmce = per_cpu(lmce_in_process, cpu);
 
     mce_printk(MCE_VERBOSE, "CPU%d enter softirq\n", cpu);
 
@@ -1738,9 +1754,12 @@ static void mce_softirq(void)
         /* Step1: Fill DOM0 LOG buffer, vMCE injection buffer and
          * vMCE MSRs virtualization buffer
          */
-        for_each_online_cpu(workcpu) {
-            mctelem_process_deferred(workcpu, mce_delayed_action);
-        }
+        if ( lmce )
+            mctelem_process_deferred(cpu, mce_delayed_action);
+        else
+            for_each_online_cpu(workcpu) {
+                mctelem_process_deferred(workcpu, mce_delayed_action);
+            }
 
         /* Step2: Send Log to DOM0 through vIRQ */
         if (dom0_vmce_enabled()) {
diff --git a/xen/arch/x86/cpu/mcheck/mce.h b/xen/arch/x86/cpu/mcheck/mce.h
index 2f4e7a4..2c033af 100644
--- a/xen/arch/x86/cpu/mcheck/mce.h
+++ b/xen/arch/x86/cpu/mcheck/mce.h
@@ -110,12 +110,15 @@ struct mca_summary {
     bool_t uc; /* UC flag */
     bool_t pcc; /* PCC flag */
     bool_t recoverable; /* software error recoverable flag */
+    bool_t lmce; /* LMCE flag (Intel specific) */
 };
 
 DECLARE_PER_CPU(struct mca_banks *, poll_bankmask);
 DECLARE_PER_CPU(struct mca_banks *, no_cmci_banks);
 DECLARE_PER_CPU(struct mca_banks *, mce_clear_banks);
 
+DECLARE_PER_CPU(bool, lmce_in_process);
+
 extern bool_t cmci_support;
 extern bool_t is_mc_panic;
 extern bool_t mce_broadcast;
diff --git a/xen/arch/x86/cpu/mcheck/x86_mca.h b/xen/arch/x86/cpu/mcheck/x86_mca.h
index e25d619..322b7d4 100644
--- a/xen/arch/x86/cpu/mcheck/x86_mca.h
+++ b/xen/arch/x86/cpu/mcheck/x86_mca.h
@@ -42,7 +42,9 @@
 #define MCG_STATUS_RIPV 0x0000000000000001ULL
 #define MCG_STATUS_EIPV 0x0000000000000002ULL
 #define MCG_STATUS_MCIP 0x0000000000000004ULL
-/* Bits 3-63 are reserved */
+#define MCG_STATUS_LMCE 0x0000000000000008ULL /* Intel specific */
+/* Bits 3-63 are reserved on CPU not supporting LMCE */
+/* Bits 4-63 are reserved on CPU supporting LMCE */
 
 /* Bitfield of MSR_K8_MCi_STATUS registers */
 /* MCA error code */
```
LMCE is sent to only one CPU thread, so the MCE handler, barriers and
softirq handler should proceed without waiting for other CPUs when
handling an LMCE.

Note that LMCE is still broadcast to all vcpus as a regular MCE on
Intel CPUs right now.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/barrier.c  |  4 ++--
 xen/arch/x86/cpu/mcheck/mcaction.c |  4 +++-
 xen/arch/x86/cpu/mcheck/mce.c      | 25 ++++++++++++++++++++++---
 xen/arch/x86/cpu/mcheck/mce.h      |  3 +++
 xen/arch/x86/cpu/mcheck/x86_mca.h  |  4 +++-
 5 files changed, 33 insertions(+), 7 deletions(-)