Message ID | 20250115073640.77099-4-nik.borisov@suse.com (mailing list archive) |
---|---|
State | New |
Series | Make mce_notify_irq() static |
> From: Nikolay Borisov <nik.borisov@suse.com>
> Sent: Wednesday, January 15, 2025 3:37 PM
> To: linux-edac@vger.kernel.org
> Cc: x86@kernel.org; linux-kernel@vger.kernel.org; bp@alien8.de; Nikolay
> Borisov <nik.borisov@suse.com>
> Subject: [RESEND PATCH 3/3] x86/mce: Make mce_notify_irq() depend on
> CONFIG_X86_MCELOG_LEGACY
>
> mce_notify_irq() really depends on the legacy mcelog being enabled as
> otherwise mce_work_trigger() will never schedule the trigger work as
> mce_helper can't be set unless CONFIG_X86_MCELOG_LEGACY is defined.
>
> Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
> ---
>  arch/x86/kernel/cpu/mce/core.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 89625ff79c3b..b21aa1494da0 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -591,6 +591,7 @@ EXPORT_SYMBOL_GPL(mce_is_correctable);
>   */
>  static int mce_notify_irq(void)
>  {
> +#ifdef CONFIG_X86_MCELOG_LEGACY
>  	/* Not more than two messages every minute */
>  	static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
>
> @@ -602,7 +603,7 @@ static int mce_notify_irq(void)
>

The message printed inside this function should not depend on
CONFIG_X86_MCELOG_LEGACY. User-space tools/scripts might look for this
message to detect machine events. It is also useful for debugging purposes.

		if (__ratelimit(&ratelimit))
			pr_info(HW_ERR "Machine check events logged\n");

> 		return 1;
> 	}
> -
> +#endif
> 	return 0;
> }
>
> --
> 2.43.0
>
On 15.01.25 at 15:45, Zhuo, Qiuxu wrote:
>> From: Nikolay Borisov <nik.borisov@suse.com>
>> Sent: Wednesday, January 15, 2025 3:37 PM
>> To: linux-edac@vger.kernel.org
>> Cc: x86@kernel.org; linux-kernel@vger.kernel.org; bp@alien8.de; Nikolay
>> Borisov <nik.borisov@suse.com>
>> Subject: [RESEND PATCH 3/3] x86/mce: Make mce_notify_irq() depend on
>> CONFIG_X86_MCELOG_LEGACY
>>
>> mce_notify_irq() really depends on the legacy mcelog being enabled as
>> otherwise mce_work_trigger() will never schedule the trigger work as
>> mce_helper can't be set unless CONFIG_X86_MCELOG_LEGACY is defined.
>>
>> Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
>> ---
>>  arch/x86/kernel/cpu/mce/core.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
>> index 89625ff79c3b..b21aa1494da0 100644
>> --- a/arch/x86/kernel/cpu/mce/core.c
>> +++ b/arch/x86/kernel/cpu/mce/core.c
>> @@ -591,6 +591,7 @@ EXPORT_SYMBOL_GPL(mce_is_correctable);
>>   */
>>  static int mce_notify_irq(void)
>>  {
>> +#ifdef CONFIG_X86_MCELOG_LEGACY
>>  	/* Not more than two messages every minute */
>>  	static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
>>
>> @@ -602,7 +603,7 @@ static int mce_notify_irq(void)
>>
>
> The message printed inside this function should not depend on
> CONFIG_X86_MCELOG_LEGACY. User-space tools/scripts might look for this
> message to detect machine events. It is also useful for debugging purposes.

The thing is if MCELOG_LEGACY is turned off then mce_work_trigger is a
noop, hence nothing is really logged which makes this message somewhat
bogus. After all the early handler's job is to log to userspace, if we
don't log anything no need to spam the kernel log.

>
> 		if (__ratelimit(&ratelimit))
> 			pr_info(HW_ERR "Machine check events logged\n");
>
>> 		return 1;
>> 	}
>> -
>> +#endif
>> 	return 0;
>> }
>>
>> --
>> 2.43.0
>>
>
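The argument above can be illustrated with a small user-space sketch. It is not the kernel code: `SKETCH_MCELOG_LEGACY`, `sketch_work_trigger()`, `sketch_notify_irq()`, `need_notify` and `trigger_runs` are all hypothetical stand-ins, and rate limiting is left out. It only shows the shape of the pattern under discussion, assuming the trigger helper compiles to an empty stub when the option is off.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical switch standing in for CONFIG_X86_MCELOG_LEGACY.
 * Uncomment to model a kernel built with the legacy mcelog device.
 */
/* #define SKETCH_MCELOG_LEGACY 1 */

int trigger_runs;   /* how many times the trigger work would be scheduled */
bool need_notify;   /* hypothetical "events are pending" flag */

#ifdef SKETCH_MCELOG_LEGACY
/* With the option on, the helper does real work. */
static void sketch_work_trigger(void)
{
	trigger_runs++;
}
#else
/* With the option off, the helper is an empty stub (a no-op). */
static inline void sketch_work_trigger(void)
{
}
#endif

/* Mirrors the patched shape: the whole body is compiled out when off. */
int sketch_notify_irq(void)
{
#ifdef SKETCH_MCELOG_LEGACY
	if (need_notify) {
		need_notify = false;
		sketch_work_trigger();
		puts("Machine check events logged");
		return 1;
	}
#endif
	return 0;
}
```

With the switch left undefined, `sketch_notify_irq()` returns 0 without printing or touching the pending flag, which is the behaviour the patch proposes for kernels built without the legacy mcelog.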
> From: Nikolay Borisov <nik.borisov@suse.com>
> [...]
> >> --- a/arch/x86/kernel/cpu/mce/core.c
> >> +++ b/arch/x86/kernel/cpu/mce/core.c
> >> @@ -591,6 +591,7 @@ EXPORT_SYMBOL_GPL(mce_is_correctable);
> >>   */
> >>  static int mce_notify_irq(void)
> >>  {
> >> +#ifdef CONFIG_X86_MCELOG_LEGACY
> >>  	/* Not more than two messages every minute */
> >>  	static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
> >>
> >> @@ -602,7 +603,7 @@ static int mce_notify_irq(void)
> >>
> >
> > The message printed inside this function should not depend on
> > CONFIG_X86_MCELOG_LEGACY. User-space tools/scripts might look for
> > this message to detect machine events. It is also useful for
> > debugging purposes.
>
> The thing is if MCELOG_LEGACY is turned off then mce_work_trigger is a
> noop, hence nothing is really logged which makes this message somewhat
> bogus. After all the early handler's job is to log to userspace, if we
> don't log anything no need to spam the kernel log.

Currently, some customers have reported that the Intel EDAC driver didn't
report errors on some memory DIMMs. The print message here helped me
confirm whether the MCE event originated from the x86/mce code or if the
MCE event was lost somewhere in the EDAC driver.

IMHO, it would be better to keep this print message here, or update it a
bit like below if !CONFIG_X86_MCELOG_LEGACY:

	pr_info(HW_ERR "Machine check events generated\n");

Thanks!
-Qiuxu
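One way to read the suggestion above is that the notification stays in place and only its wording follows the build option. A minimal user-space sketch of that reading follows. `SKETCH_MCELOG_LEGACY`, `SKETCH_NOTIFY_MSG` and `sketch_notify_message()` are hypothetical names, and in the kernel the string would go through `pr_info(HW_ERR ...)` under the existing rate limit rather than being returned.

```c
/*
 * Hypothetical switch standing in for CONFIG_X86_MCELOG_LEGACY.
 * Uncomment to model a kernel built with the legacy mcelog device.
 */
/* #define SKETCH_MCELOG_LEGACY 1 */

#ifdef SKETCH_MCELOG_LEGACY
/* Events are handed to the legacy mcelog device, so "logged" is accurate. */
#define SKETCH_NOTIFY_MSG "Machine check events logged"
#else
/* Nothing is logged to user space, but an event was still raised. */
#define SKETCH_NOTIFY_MSG "Machine check events generated"
#endif

/* Returns the text that would be printed for the current build. */
const char *sketch_notify_message(void)
{
	return SKETCH_NOTIFY_MSG;
}
```

This keeps a marker in the kernel log for debugging (for example, to tell whether an event reached the x86/mce code before the EDAC driver) while avoiding the claim that something was logged when the legacy path is compiled out.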
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 89625ff79c3b..b21aa1494da0 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -591,6 +591,7 @@ EXPORT_SYMBOL_GPL(mce_is_correctable);
  */
 static int mce_notify_irq(void)
 {
+#ifdef CONFIG_X86_MCELOG_LEGACY
 	/* Not more than two messages every minute */
 	static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);

@@ -602,7 +603,7 @@ static int mce_notify_irq(void)
 		return 1;
 	}
-
+#endif
 	return 0;
 }
mce_notify_irq() really depends on the legacy mcelog being enabled as
otherwise mce_work_trigger() will never schedule the trigger work as
mce_helper can't be set unless CONFIG_X86_MCELOG_LEGACY is defined.

Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
---
 arch/x86/kernel/cpu/mce/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--
2.43.0