[RESEND,3/3] x86/mce: Make mce_notify_irq() depend on CONFIG_X86_MCELOG_LEGACY

Message ID 20250115073640.77099-4-nik.borisov@suse.com (mailing list archive)
State New
Series Make mce_notify_irq() static

Commit Message

Nikolay Borisov Jan. 15, 2025, 7:36 a.m. UTC
mce_notify_irq() only does useful work when the legacy mcelog is enabled:
otherwise mce_work_trigger() never schedules the trigger work, because
mce_helper can't be set unless CONFIG_X86_MCELOG_LEGACY is defined.

Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
---
 arch/x86/kernel/cpu/mce/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--
2.43.0

Comments

Zhuo, Qiuxu Jan. 15, 2025, 1:45 p.m. UTC | #1
> From: Nikolay Borisov <nik.borisov@suse.com>
> Sent: Wednesday, January 15, 2025 3:37 PM
> To: linux-edac@vger.kernel.org
> Cc: x86@kernel.org; linux-kernel@vger.kernel.org; bp@alien8.de; Nikolay
> Borisov <nik.borisov@suse.com>
> Subject: [RESEND PATCH 3/3] x86/mce: Make mce_notify_irq() depend on
> CONFIG_X86_MCELOG_LEGACY
> 
> mce_notify_irq() really depends on the legacy mcelog being enabled as
> otherwise mce_work_trigger() will never schedule the trigger work as
> mce_helper can't be set unless CONFIG_X86_MCELOG_LEGACY is defined.
> 
> Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
> ---
>  arch/x86/kernel/cpu/mce/core.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 89625ff79c3b..b21aa1494da0 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -591,6 +591,7 @@ EXPORT_SYMBOL_GPL(mce_is_correctable);
>   */
>  static int mce_notify_irq(void)
>  {
> +#ifdef CONFIG_X86_MCELOG_LEGACY
>  	/* Not more than two messages every minute */
>  	static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
> 
> @@ -602,7 +603,7 @@ static int mce_notify_irq(void)
> 

The message printed inside this function should not depend on 
CONFIG_X86_MCELOG_LEGACY.  User-space tools/scripts might look for this
message to detect machine events. It is also useful for debugging purposes.

   if (__ratelimit(&ratelimit))
       pr_info(HW_ERR "Machine check events logged\n");

>  		return 1;
>  	}
> -
> +#endif
>  	return 0;
>  }
> 
> --
> 2.43.0
>
Nikolay Borisov Jan. 15, 2025, 3:02 p.m. UTC | #2
On 15.01.25 15:45, Zhuo, Qiuxu wrote:
>> From: Nikolay Borisov <nik.borisov@suse.com>
>> Sent: Wednesday, January 15, 2025 3:37 PM
>> To: linux-edac@vger.kernel.org
>> Cc: x86@kernel.org; linux-kernel@vger.kernel.org; bp@alien8.de; Nikolay
>> Borisov <nik.borisov@suse.com>
>> Subject: [RESEND PATCH 3/3] x86/mce: Make mce_notify_irq() depend on
>> CONFIG_X86_MCELOG_LEGACY
>>
>> mce_notify_irq() really depends on the legacy mcelog being enabled as
>> otherwise mce_work_trigger() will never schedule the trigger work as
>> mce_helper can't be set unless CONFIG_X86_MCELOG_LEGACY is defined.
>>
>> Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
>> ---
>>   arch/x86/kernel/cpu/mce/core.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
>> index 89625ff79c3b..b21aa1494da0 100644
>> --- a/arch/x86/kernel/cpu/mce/core.c
>> +++ b/arch/x86/kernel/cpu/mce/core.c
>> @@ -591,6 +591,7 @@ EXPORT_SYMBOL_GPL(mce_is_correctable);
>>    */
>>   static int mce_notify_irq(void)
>>   {
>> +#ifdef CONFIG_X86_MCELOG_LEGACY
>>   	/* Not more than two messages every minute */
>>   	static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
>>
>> @@ -602,7 +603,7 @@ static int mce_notify_irq(void)
>>
> 
> The message printed inside this function should not depend on
> CONFIG_X86_MCELOG_LEGACY.  User-space tools/scripts might look for this
> message to detect machine events. It is also useful for debugging purposes.

The thing is, if MCELOG_LEGACY is turned off then mce_work_trigger() is a
no-op, so nothing is really logged, which makes this message somewhat
bogus. After all, the early handler's job is to log to userspace; if we
don't log anything, there is no need to spam the kernel log.
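
To make that argument concrete, here is a rough userspace model, not the
kernel code. It assumes what is described above: the trigger only queues work
when a helper is set, and it compiles to an empty stub without legacy mcelog.
MODEL_MCELOG_LEGACY, model_helper, model_work_trigger() and model_notify() are
made-up names for illustration, and rate limiting is left out.

```c
#include <stdio.h>

/* Hypothetical stand-in for CONFIG_X86_MCELOG_LEGACY (0 = legacy mcelog off). */
#ifndef MODEL_MCELOG_LEGACY
#define MODEL_MCELOG_LEGACY 0
#endif

static int trigger_scheduled;	/* times the modelled trigger work was queued */
static int messages_printed;	/* times the "events logged" line was emitted */

#if MODEL_MCELOG_LEGACY
/* Assumption: the helper path can only be set in a legacy build. */
static char model_helper[128];

static void model_work_trigger(void)
{
	if (model_helper[0])
		trigger_scheduled++;
}
#else
/* Assumption: without legacy mcelog the trigger is an empty stub. */
static void model_work_trigger(void)
{
}
#endif

/* Model of the notify path: returns 1 if pending events were handled. */
static int model_notify(int events_pending)
{
	if (!events_pending)
		return 0;

	model_work_trigger();
	printf("Machine check events logged\n");
	messages_printed++;
	return 1;
}
```

With MODEL_MCELOG_LEGACY left at 0, a call to model_notify(1) bumps
messages_printed while trigger_scheduled stays at 0: the message claims
something was logged although no trigger work was ever queued.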

> 
>     if (__ratelimit(&ratelimit))
>         pr_info(HW_ERR "Machine check events logged\n");
> 
>>   		return 1;
>>   	}
>> -
>> +#endif
>>   	return 0;
>>   }
>>
>> --
>> 2.43.0
>>
>
Zhuo, Qiuxu Jan. 24, 2025, 10:43 a.m. UTC | #3
> From: Nikolay Borisov <nik.borisov@suse.com>
> [...]
> >> --- a/arch/x86/kernel/cpu/mce/core.c
> >> +++ b/arch/x86/kernel/cpu/mce/core.c
> >> @@ -591,6 +591,7 @@ EXPORT_SYMBOL_GPL(mce_is_correctable);
> >>    */
> >>   static int mce_notify_irq(void)
> >>   {
> >> +#ifdef CONFIG_X86_MCELOG_LEGACY
> >>   	/* Not more than two messages every minute */
> >>   	static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
> >>
> >> @@ -602,7 +603,7 @@ static int mce_notify_irq(void)
> >>
> >
> > The message printed inside this function should not depend on
> > CONFIG_X86_MCELOG_LEGACY.  User-space tools/scripts might look for
> > this message to detect machine events. It is also useful for debugging
> > purposes.
> 
> The thing is if MCELOG_LEGACY is turned off then mce_work_trigger is a
> noop, hence nothing is really logged which makes this message somewhat
> bogus. After all the early handler's job is to log to userspace, if we don't log
> anything no need to spam the kernel log.

Some customers have recently reported that the Intel EDAC driver did not
report errors on some memory DIMMs. This message helped me confirm whether
the MCE event originated from the x86/mce code or was lost somewhere in the
EDAC driver.

IMHO, it would be better to keep this message, or to reword it slightly, as
below, when CONFIG_X86_MCELOG_LEGACY is not set:

   pr_info(HW_ERR "Machine check events generated\n");

Thanks!
-Qiuxu
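
One way the suggestion above could be expressed, sketched as a standalone
userspace snippet rather than as a patch against core.c: the wording is chosen
at compile time and the message is emitted in both configurations.
MODEL_MCELOG_LEGACY and model_notify_message() are hypothetical names; only the
two message strings come from this thread.

```c
/* Hypothetical stand-in for CONFIG_X86_MCELOG_LEGACY (0 = legacy mcelog off). */
#ifndef MODEL_MCELOG_LEGACY
#define MODEL_MCELOG_LEGACY 0
#endif

/*
 * Pick the notification text at compile time: "logged" only when the
 * legacy mcelog path exists, "generated" otherwise.
 */
static const char *model_notify_message(void)
{
#if MODEL_MCELOG_LEGACY
	return "Machine check events logged";
#else
	return "Machine check events generated";
#endif
}
```

This keeps a line in the kernel log for tools and for debugging, without
claiming that anything was handed to a userspace logger when legacy mcelog is
compiled out.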

Patch

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 89625ff79c3b..b21aa1494da0 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -591,6 +591,7 @@  EXPORT_SYMBOL_GPL(mce_is_correctable);
  */
 static int mce_notify_irq(void)
 {
+#ifdef CONFIG_X86_MCELOG_LEGACY
 	/* Not more than two messages every minute */
 	static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);

@@ -602,7 +603,7 @@  static int mce_notify_irq(void)

 		return 1;
 	}
-
+#endif
 	return 0;
 }