[v3,2/2] x86/mce/dev-mcelog: Fix updating kflags in AMD systems

Message ID 20200903234531.162484-3-Smita.KoralahalliChannabasappa@amd.com (mailing list archive)
State New, archived
Series Decode raw MSR values of MCA registers in BERT

Commit Message

Smita Koralahalli Sept. 3, 2020, 11:45 p.m. UTC
The mcelog utility is not commonly used on AMD systems. Therefore, errors
logged only by the dev_mce_log() notifier will be missed. This may occur
if the EDAC modules are not loaded in which case it's preferable to print
the error record by the default notifier.

However, the mce->kflags bit set by the dev_mce_log() notifier makes the
default notifier skip over the errors, assuming they were processed by
dev_mce_log().

Do not update kflags in the dev_mce_log() notifier on AMD systems.

Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
---
Link:
https://lkml.kernel.org/r/20200828203332.11129-3-Smita.KoralahalliChannabasappa@amd.com

v3:
	No change
v2:
	No change
---
 arch/x86/kernel/cpu/mce/dev-mcelog.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Borislav Petkov Sept. 14, 2020, 3:34 p.m. UTC | #1
On Thu, Sep 03, 2020 at 06:45:31PM -0500, Smita Koralahalli wrote:
> The mcelog utility is not commonly used on AMD systems. Therefore, errors
> logged only by the dev_mce_log() notifier will be missed. This may occur
> if the EDAC modules are not loaded in which case it's preferable to print
> the error record by the default notifier.
> 
> However, the mce->kflags bit set by the dev_mce_log() notifier makes the
> default notifier skip over the errors, assuming they were processed by
> dev_mce_log().
> 
> Do not update kflags in the dev_mce_log() notifier on AMD systems.
> 
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
> Link:
> https://lkml.kernel.org/r/20200828203332.11129-3-Smita.KoralahalliChannabasappa@amd.com
> 
> v3:
> 	No change
> v2:
> 	No change
> ---
>  arch/x86/kernel/cpu/mce/dev-mcelog.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/mce/dev-mcelog.c b/arch/x86/kernel/cpu/mce/dev-mcelog.c
> index 03e51053592a..100fbeebdc72 100644
> --- a/arch/x86/kernel/cpu/mce/dev-mcelog.c
> +++ b/arch/x86/kernel/cpu/mce/dev-mcelog.c
> @@ -67,7 +67,9 @@ static int dev_mce_log(struct notifier_block *nb, unsigned long val,
>  unlock:
>  	mutex_unlock(&mce_chrdev_read_mutex);
>  
> -	mce->kflags |= MCE_HANDLED_MCELOG;
> +	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
> +		mce->kflags |= MCE_HANDLED_MCELOG;
> +
>  	return NOTIFY_OK;
>  }
>  
> -- 

This one is not related to your 1/2 so it sounds to me like I should
take this one now, independently?

Koralahalli Channabasappa, Smita Sept. 14, 2020, 9:20 p.m. UTC | #2
On 9/14/20 10:34 AM, Borislav Petkov wrote:

> On Thu, Sep 03, 2020 at 06:45:31PM -0500, Smita Koralahalli wrote:
>> The mcelog utility is not commonly used on AMD systems. Therefore, errors
>> logged only by the dev_mce_log() notifier will be missed. This may occur
>> if the EDAC modules are not loaded in which case it's preferable to print
>> the error record by the default notifier.
>>
>> However, the mce->kflags bit set by the dev_mce_log() notifier makes the
>> default notifier skip over the errors, assuming they were processed by
>> dev_mce_log().
>>
>> Do not update kflags in the dev_mce_log() notifier on AMD systems.
>>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
>> ---
>> Link:
>> https://lkml.kernel.org/r/20200828203332.11129-3-Smita.KoralahalliChannabasappa@amd.com
>>
>> v3:
>> 	No change
>> v2:
>> 	No change
>> ---
>>   arch/x86/kernel/cpu/mce/dev-mcelog.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/mce/dev-mcelog.c b/arch/x86/kernel/cpu/mce/dev-mcelog.c
>> index 03e51053592a..100fbeebdc72 100644
>> --- a/arch/x86/kernel/cpu/mce/dev-mcelog.c
>> +++ b/arch/x86/kernel/cpu/mce/dev-mcelog.c
>> @@ -67,7 +67,9 @@ static int dev_mce_log(struct notifier_block *nb, unsigned long val,
>>   unlock:
>>   	mutex_unlock(&mce_chrdev_read_mutex);
>>   
>> -	mce->kflags |= MCE_HANDLED_MCELOG;
>> +	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
>> +		mce->kflags |= MCE_HANDLED_MCELOG;
>> +
>>   	return NOTIFY_OK;
>>   }
>>   
>> -- 
> This one is not related to your 1/2 so it sounds to me like I should
> take this one now, independently?

Yes, this can be taken independently. I just tagged it along as I came
across the issue of missing error logs while trying to print error
records in the previous patch.

Thanks,
Smita

Patch

diff --git a/arch/x86/kernel/cpu/mce/dev-mcelog.c b/arch/x86/kernel/cpu/mce/dev-mcelog.c
index 03e51053592a..100fbeebdc72 100644
--- a/arch/x86/kernel/cpu/mce/dev-mcelog.c
+++ b/arch/x86/kernel/cpu/mce/dev-mcelog.c
@@ -67,7 +67,9 @@  static int dev_mce_log(struct notifier_block *nb, unsigned long val,
 unlock:
 	mutex_unlock(&mce_chrdev_read_mutex);
 
-	mce->kflags |= MCE_HANDLED_MCELOG;
+	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
+		mce->kflags |= MCE_HANDLED_MCELOG;
+
 	return NOTIFY_OK;
 }
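
Illustration (not part of the patch): the interaction described in the
commit message can be modeled in user space. This is only a sketch under
stated assumptions. The names model_dev_mce_log(),
model_default_notifier_prints(), record_is_printed(), struct mce_record,
enum cpu_vendor and HANDLED_MCELOG are hypothetical stand-ins, and the
default notifier is assumed to print a record only when no notifier has
set a bit in kflags, which is how the commit message describes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for boot_cpu_data.x86_vendor. */
enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD };

/* Hypothetical stand-in for MCE_HANDLED_MCELOG; the bit position is arbitrary. */
#define HANDLED_MCELOG (1u << 0)

/* Reduced model of struct mce: only the field this patch touches. */
struct mce_record {
	unsigned int kflags;	/* bits set by notifiers that claimed the record */
};

/* Models dev_mce_log() after the patch: claim the record only on non-AMD. */
static void model_dev_mce_log(struct mce_record *m, enum cpu_vendor vendor)
{
	assert(m != NULL);
	if (vendor != VENDOR_AMD)
		m->kflags |= HANDLED_MCELOG;
}

/*
 * Models the default notifier (assumption: it prints only records that
 * no earlier notifier has claimed). Returns true if it would print.
 */
static bool model_default_notifier_prints(const struct mce_record *m)
{
	assert(m != NULL);
	return m->kflags == 0;
}

/* Runs the two-notifier chain and reports whether the record gets printed. */
static bool record_is_printed(enum cpu_vendor vendor)
{
	struct mce_record m = { .kflags = 0 };

	model_dev_mce_log(&m, vendor);
	return model_default_notifier_prints(&m);
}
```

In this model record_is_printed(VENDOR_AMD) is true and
record_is_printed(VENDOR_INTEL) is false: the record still reaches the
default notifier's print path on AMD, while other vendors keep the
previous behavior of treating it as handled by mcelog.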