diff mbox series

[v2,2/5] x86/mce/inject: Warn the user on a not set valid bit in MCA_STATUS

Message ID 20211019233641.140275-3-Smita.KoralahalliChannabasappa@amd.com (mailing list archive)
State New, archived
Headers show
Series x86/mce: Handle error injection failures in mce-inject module | expand

Commit Message

Smita Koralahalli Oct. 19, 2021, 11:36 p.m. UTC
MCA handlers check the valid bit in each status register (MCA_STATUS[Val])
and will likely ignore signatures if the valid bit is not set.

Warn the user if the valid bit is not set before doing error injection.

Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
---
v2:
	Added a warning statement instead of setting the valid bit.
---
 arch/x86/kernel/cpu/mce/inject.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Luck, Tony Oct. 20, 2021, 3:06 p.m. UTC | #1
+	if (!(i_mce.status & MCI_STATUS_VAL))
+		pr_warn("Handlers might ignore signatures with Val=0 in MCA_STATUS\n");

I don't think there is any "might" about this. All code paths start by checking for MCI_STATUS_VAL
and skipping if it isn't set.

s/might/will/

-Tony
Borislav Petkov Oct. 26, 2021, 10:02 a.m. UTC | #2
On Tue, Oct 19, 2021 at 06:36:38PM -0500, Smita Koralahalli wrote:
> MCA handlers check the valid bit in each status register (MCA_STATUS[Val])
> and will likely ignore signatures if the valid bit is not set.
> 
> Warn the user if the valid bit is not set before doing error injection.
> 
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
> v2:
> 	Added a warning statement instead of setting the valid bit.
> ---
>  arch/x86/kernel/cpu/mce/inject.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
> index 601efd104bb4..a993dc3d0333 100644
> --- a/arch/x86/kernel/cpu/mce/inject.c
> +++ b/arch/x86/kernel/cpu/mce/inject.c
> @@ -487,6 +487,9 @@ static void do_inject(void)
>  
>  	i_mce.tsc = rdtsc_ordered();
>  
> +	if (!(i_mce.status & MCI_STATUS_VAL))
> +		pr_warn("Handlers might ignore signatures with Val=0 in MCA_STATUS\n");
> +
>  	if (i_mce.misc)
>  		i_mce.status |= MCI_STATUS_MISCV;
>  
> -- 

So what's the real reason for this?

You've injected and you didn't get any feedback and were wondering why?

If handlers ignore !VAL MCEs, why don't you simply set it
unconditionally on entry to do_inject()?
Koralahalli Channabasappa, Smita Oct. 26, 2021, 4:58 p.m. UTC | #3
Hi Boris,

On 10/26/21 5:02 AM, Borislav Petkov wrote:

> On Tue, Oct 19, 2021 at 06:36:38PM -0500, Smita Koralahalli wrote:
>> MCA handlers check the valid bit in each status register (MCA_STATUS[Val])
>> and will likely ignore signatures if the valid bit is not set.
>>
>> Warn the user if the valid bit is not set before doing error injection.
>>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
>> ---
>> v2:
>> 	Added a warning statement instead of setting the valid bit.
>> ---
>>   arch/x86/kernel/cpu/mce/inject.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
>> index 601efd104bb4..a993dc3d0333 100644
>> --- a/arch/x86/kernel/cpu/mce/inject.c
>> +++ b/arch/x86/kernel/cpu/mce/inject.c
>> @@ -487,6 +487,9 @@ static void do_inject(void)
>>   
>>   	i_mce.tsc = rdtsc_ordered();
>>   
>> +	if (!(i_mce.status & MCI_STATUS_VAL))
>> +		pr_warn("Handlers might ignore signatures with Val=0 in MCA_STATUS\n");
>> +
>>   	if (i_mce.misc)
>>   		i_mce.status |= MCI_STATUS_MISCV;
>>   
>> -- 
> So what's the real reason for this?
>
> You've injected and you didn't get any feedback and were wondering why?
>
> If handlers ignore !VAL MCEs, why don't you simply set it
> unconditionally on entry to do_inject()?

Like how it was done here?
https://lkml.kernel.org/r/20210915232739.6367-3-Smita.KoralahalliChannabasappa@amd.com

Thanks,
Borislav Petkov Oct. 26, 2021, 5:15 p.m. UTC | #4
On Tue, Oct 26, 2021 at 11:58:58AM -0500, Koralahalli Channabasappa, Smita wrote:
> Like how it was done here?
> https://lkml.kernel.org/r/20210915232739.6367-3-Smita.KoralahalliChannabasappa@amd.com

Whoops, sorry about that.

So let's analyze this properly - there are two cases:

1. warn if VAL=0: what does that bring us? The person doing the injection
will simply have to set the valid bit and repeat the injection.

I guess "maybe the user wants to inject with Val not set" doesn't make a
whole lot of sense because nothing will happen - error will get ignored.

So we can do all the warning we want - it will be useless and in some
cases the user might not even see it.

So it sounds to me like setting the valid bit directly makes a lot more
sense.

2. Automatically set VAL=1 to correct any VAL=0 injections.

Yes, we force the VAL bit to 1 and that is not what the user injected
but the user injecting with VAL=0 will get ignored, i.e., it will be
pointless.

So we "help" here and set the valid bit.

Anything else I'm missing?

Sorry again for being back'n'forth on this.
Koralahalli Channabasappa, Smita Oct. 26, 2021, 6:53 p.m. UTC | #5
On 10/26/21 12:15 PM, Borislav Petkov wrote:

> On Tue, Oct 26, 2021 at 11:58:58AM -0500, Koralahalli Channabasappa, Smita wrote:
> Whoops, sorry about that.
>
> So let's analyze this properly - there are two cases:
>
> 1. warn if VAL=0: what does that bring us? The person doing the injection
> will simply have to set the valid bit and repeat the injection.
>
> I guess "maybe the user wants to inject with Val not set" doesn't make a
> whole lot of sense because nothing will happen - error will get ignored.
>
> So we can do all the warning we want - it will be useless and in some
> cases the user might not even see it.
>
> So it sounds to me like setting the valid bit directly makes a lot more
> sense.
>
> 2. Automatically set VAL=1 to correct any VAL=0 injections.
>
> Yes, we force the VAL bit to 1 and that is not what the user injected
> but the user injecting with VAL=0 will get ignored, i.e., it will be
> pointless.
>
> So we "help" here and set the valid bit.
>
> Anything else I'm missing?
>
> Sorry again for being back'n'forth on this.

Right. We are correcting VAL=0 injections made by the user by setting
valid bit unconditionally.

Not a problem. I could have broken this down in the comments before
coming up with this patch.

I will make the necessary changes and set the valid bit in the next
revision of the patch series.

Thanks,
Smita.
Borislav Petkov Oct. 26, 2021, 8:25 p.m. UTC | #6
On Tue, Oct 26, 2021 at 01:53:32PM -0500, Koralahalli Channabasappa, Smita wrote:
> Not a problem. I could have broken this down in the comments before
> coming up with this patch.

No worries - sometimes proposing the wrong thing even helps in deciding
faster what makes sense and what not. :-)

> I will make the necessary changes and set the valid bit in the next
> revision of the patch series.

Thx.
diff mbox series

Patch

diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index 601efd104bb4..a993dc3d0333 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -487,6 +487,9 @@  static void do_inject(void)
 
 	i_mce.tsc = rdtsc_ordered();
 
+	if (!(i_mce.status & MCI_STATUS_VAL))
+		pr_warn("Handlers might ignore signatures with Val=0 in MCA_STATUS\n");
+
 	if (i_mce.misc)
 		i_mce.status |= MCI_STATUS_MISCV;