Message ID | 20201118151552.1412-2-gabriele.paoloni@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | x86/MCE: some minor fixes |
On Wed, Nov 18, 2020 at 03:15:49PM +0000, Gabriele Paoloni wrote:
> Currently if mce_end() fails no_way_out is set equal to worst.
> worst is the worst severirty that was found in the MCA banks
                     ^^^^^^^^^

Please introduce a spellchecker into your patch creation workflow.

> associated to the current CPU; however at this point no_way_out
             ^
             with

> could be already set by mca_start() by looking at all severities

I think you mean "could have been already set" here

> of all CPUs that entered the MCE handler.
> if mce_end() fails we first check if no_way_out is already set and

Please use passive voice in your commit message: no "we" or "I", etc.

Also, pls start new sentences with a capital letter and end them with a
fullstop.

> if so we stick to it, otherwise we use the local worst value

So basically you're trying to say here that no_way_out might have been
already set and other CPUs could overwrite it and that should not
happen.

Is that what you mean?

> Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/kernel/cpu/mce/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 4102b866e7c0..b990892c6766 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -1385,7 +1385,7 @@ noinstr void do_machine_check(struct pt_regs *regs)
>  		 */
>  		if (!lmce) {
>  			if (mce_end(order) < 0)
> -				no_way_out = worst >= MCE_PANIC_SEVERITY;
> +				no_way_out = no_way_out ? no_way_out : worst >= MCE_PANIC_SEVERITY;

I had to stare at this a bit to figure out what you're doing. So how
about simplifying this:

				if (!no_way_out)
					no_way_out = worst >= MCE_PANIC_SEVERITY;

?

Thx.
Hi Boris

> -----Original Message-----
> From: Borislav Petkov <bp@alien8.de>
> Sent: Friday, November 20, 2020 6:08 PM
> To: Paoloni, Gabriele <gabriele.paoloni@intel.com>
> Cc: Luck, Tony <tony.luck@intel.com>; tglx@linutronix.de;
> mingo@redhat.com; x86@kernel.org; hpa@zytor.com;
> linux-edac@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-safety@lists.elisa.tech
> Subject: Re: [PATCH 1/4] x86/mce: do not overwrite no_way_out if
> mce_end() fails
>
> On Wed, Nov 18, 2020 at 03:15:49PM +0000, Gabriele Paoloni wrote:
> > Currently if mce_end() fails no_way_out is set equal to worst.
> > worst is the worst severirty that was found in the MCA banks
>                        ^^^^^^^^^
>
> Please introduce a spellchecker into your patch creation workflow.
>
> > associated to the current CPU; however at this point no_way_out
>                ^
>                with
>
> > could be already set by mca_start() by looking at all severities
>
> I think you mean "could have been already set" here
>
> > of all CPUs that entered the MCE handler.
> > if mce_end() fails we first check if no_way_out is already set and
>
> Please use passive voice in your commit message: no "we" or "I", etc.
>
> Also, pls start new sentences with a capital letter and end them with a
> fullstop.

Sorry about the grammar errors above, I'll pay more attention in future

>
> > if so we stick to it, otherwise we use the local worst value
>
> So basically you're trying to say here that no_way_out might have been
> already set and other CPUs could overwrite it and that should not
> happen.
>
> Is that what you mean?

I mean that on this CPU thread at this point mce_start() already cached
global_nwo and hence could accumulate fatal severities of other CPUs.

Now here if mce_end() fails we only consider the local 'worst' severity
and we overwrite those already cached.

>
> > Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
> > Reviewed-by: Tony Luck <tony.luck@intel.com>
> > ---
> >  arch/x86/kernel/cpu/mce/core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> > index 4102b866e7c0..b990892c6766 100644
> > --- a/arch/x86/kernel/cpu/mce/core.c
> > +++ b/arch/x86/kernel/cpu/mce/core.c
> > @@ -1385,7 +1385,7 @@ noinstr void do_machine_check(struct pt_regs *regs)
> >  		 */
> >  		if (!lmce) {
> >  			if (mce_end(order) < 0)
> > -				no_way_out = worst >= MCE_PANIC_SEVERITY;
> > +				no_way_out = no_way_out ? no_way_out : worst >= MCE_PANIC_SEVERITY;
>
> I had to stare at this a bit to figure out what you're doing. So how
> about simplifying this:
>
> 				if (!no_way_out)
> 					no_way_out = worst >= MCE_PANIC_SEVERITY;

Yes that works as well improving readability.
If ok I will fix the grammar and rewrite this code in v2.

Many Thanks
Gab

>
> ?
>
> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

---------------------------------------------------------------------
INTEL CORPORATION ITALIA S.p.A. con unico socio
Sede: Milanofiori Palazzo E 4
CAP 20094 Assago (MI)
Capitale Sociale Euro 104.000,00 interamente versato
Partita I.V.A. e Codice Fiscale 04236760155
Repertorio Economico Amministrativo n. 997124
Registro delle Imprese di Milano nr. 183983/5281/33
Soggetta ad attivita' di direzione e coordinamento di
INTEL CORPORATION, USA

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
On Wed, Nov 18, 2020 at 03:15:49PM +0000, Gabriele Paoloni wrote:
> Currently if mce_end() fails no_way_out is set equal to worst.
> worst is the worst severirty that was found in the MCA banks
> associated to the current CPU; however at this point no_way_out
> could be already set by mca_start() by looking at all severities
> of all CPUs that entered the MCE handler.
> if mce_end() fails we first check if no_way_out is already set and
> if so we stick to it, otherwise we use the local worst value
>
> Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>

Also, this very likely wants Cc: stable, I'd say, considering the
severity.

Thx.
On Fri, Nov 20, 2020 at 05:31:32PM +0000, Paoloni, Gabriele wrote:
> I mean that on this CPU thread at this point mce_start() already cached
> global_nwo and hence could accumulate fatal severities of other CPUs.
>
> Now here if mce_end() fails we only consider the local 'worst' severity
> and we overwrite those already cached.

Yap, we're on the same page. :)

> If ok I will fix the grammar and rewrite this code in v2.

Sure, lemme go through the rest first.

Thx.
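The scenario agreed on above can be made concrete with a small user-space model. This is only a sketch and not kernel code: every `sketch_*` name, the threshold value 7, and the use of a plain array for per-CPU severities are assumptions made for illustration. It models the cached value as "some CPU reported a fatal severity" and compares what a CPU ends up with after a failed mce_end(), before and after the patch.

```c
#include <stddef.h>

/* Hypothetical threshold for this sketch; the kernel uses its own severity enum. */
enum { SKETCH_PANIC_SEVERITY = 7 };

/*
 * Model of the value cached at rendezvous start: nonzero if any CPU
 * that entered the handler saw a severity at or above the threshold.
 */
static int sketch_global_nwo(const int *worst_per_cpu, size_t ncpus)
{
	int nwo = 0;

	for (size_t i = 0; i < ncpus; i++)
		nwo |= worst_per_cpu[i] >= SKETCH_PANIC_SEVERITY;
	return nwo;
}

/* Before the patch: only the local severity counts, the cached value is lost. */
static int sketch_nwo_old(int cached_nwo, int local_worst)
{
	(void)cached_nwo;
	return local_worst >= SKETCH_PANIC_SEVERITY;
}

/* After the patch: a cached fatal verdict is kept, else fall back to local. */
static int sketch_nwo_new(int cached_nwo, int local_worst)
{
	return cached_nwo ? cached_nwo : local_worst >= SKETCH_PANIC_SEVERITY;
}
```

With per-CPU severities { 1, 9, 0 }, the cached value is 1. For the first CPU (local severity 1), sketch_nwo_old() yields 0 and so drops the fatal verdict, while sketch_nwo_new() keeps 1.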
[...]

> Also, this very likely wants Cc: stable, I'd say, considering the
> severity.

Sure, will add stable in v2.

Thanks
Gab

>
> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
On Fri, Nov 20, 2020 at 06:33:42PM +0100, Borislav Petkov wrote:
> Sure, lemme go through the rest first.
Done, thx.
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 4102b866e7c0..b990892c6766 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1385,7 +1385,7 @@ noinstr void do_machine_check(struct pt_regs *regs)
 		 */
 		if (!lmce) {
 			if (mce_end(order) < 0)
-				no_way_out = worst >= MCE_PANIC_SEVERITY;
+				no_way_out = no_way_out ? no_way_out : worst >= MCE_PANIC_SEVERITY;
 		} else {
 			/*
 			 * If there was a fatal machine check we should have
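The ternary in this hunk and the guarded form suggested in review should behave identically. The following user-space sketch checks that over a small range of inputs. It is an illustration under assumptions, not kernel code: the `sketch_*` names and the threshold value 7 are hypothetical, and no_way_out is modelled as a plain int.

```c
/* Hypothetical threshold for this sketch; the kernel uses its own severity enum. */
enum { SKETCH_PANIC_SEVERITY = 7 };

/* Form used in the patch as posted. */
static int sketch_nwo_ternary(int no_way_out, int worst)
{
	no_way_out = no_way_out ? no_way_out : worst >= SKETCH_PANIC_SEVERITY;
	return no_way_out;
}

/* Form suggested in review: only touch no_way_out when it is still clear. */
static int sketch_nwo_guarded(int no_way_out, int worst)
{
	if (!no_way_out)
		no_way_out = worst >= SKETCH_PANIC_SEVERITY;
	return no_way_out;
}

/* Returns 1 if both forms agree for every combination tried, else 0. */
static int sketch_forms_agree(void)
{
	for (int nwo = 0; nwo <= 2; nwo++)
		for (int worst = 0; worst <= 2 * SKETCH_PANIC_SEVERITY; worst++)
			if (sketch_nwo_ternary(nwo, worst) != sketch_nwo_guarded(nwo, worst))
				return 0;
	return 1;
}
```

Both forms leave an already set no_way_out untouched and compute the local comparison only when it is zero, so the choice between them is purely about readability.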