[v2,07/16] x86/mce/amd: Simplify DFR handler setup

Message ID 20240404151359.47970-8-yazen.ghannam@amd.com (mailing list archive)
State New
Series MCA Updates

Commit Message

Yazen Ghannam April 4, 2024, 3:13 p.m. UTC
AMD systems with the SUCCOR feature can send an APIC LVT interrupt for
deferred errors. The LVT offset is 0x2 by convention, i.e. this is the
default as listed in hardware documentation.

However, the MCA registers may list a different LVT offset for this
interrupt. The kernel should honor the value from the hardware.

Simplify the enable flow by using the hardware-provided value. Any
conflicts will be caught by setup_APIC_eilvt(). Conflicts on production
systems can be handled as quirks, if needed.

Also, rename the function using a "verb-first" style.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---

Notes:
    Link:
    https://lkml.kernel.org/r/20231118193248.1296798-12-yazen.ghannam@amd.com
    
    v1->v2:
    * No change.

 arch/x86/kernel/cpu/mce/amd.c | 33 ++++++++++-----------------------
 1 file changed, 10 insertions(+), 23 deletions(-)
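
Pieced together from the hunks at the bottom of this page, the new enable flow
reads roughly as follows. This is only a sketch for orientation; the unchanged
tail of the function (writing the possibly-updated interrupt type back to
MSR_CU_DEF_ERR) is not visible in the diff and is assumed here.

    static void enable_deferred_error_interrupt(void)
    {
            u32 low = 0, high = 0, def_new;

            /* Nothing to do without the SUCCOR (deferred error) feature. */
            if (!mce_flags.succor)
                    return;

            if (rdmsr_safe(MSR_CU_DEF_ERR, &low, &high))
                    return;

            /* The deferred error LVT offset lives in bits [7:4] of MSR_CU_DEF_ERR. */
            def_new = (low & MASK_DEF_LVTOFF) >> 4;

            /* Trust the hardware value; setup_APIC_eilvt() reports any conflict. */
            if (setup_APIC_eilvt(def_new, DEFERRED_ERROR_VECTOR, APIC_EILVT_MSG_FIX, 0))
                    return;

            deferred_error_int_vector = amd_deferred_error_interrupt;

            if (!mce_flags.smca)
                    low = (low & ~MASK_DEF_INT_TYPE) | DEF_INT_TYPE_APIC;

            /* Assumed, unchanged context: write the interrupt type back. */
            wrmsr(MSR_CU_DEF_ERR, low, high);
    }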

Comments

Borislav Petkov April 24, 2024, 7:06 p.m. UTC | #1
On Thu, Apr 04, 2024 at 10:13:50AM -0500, Yazen Ghannam wrote:
> AMD systems with the SUCCOR feature can send an APIC LVT interrupt for
> deferred errors. The LVT offset is 0x2 by convention, i.e. this is the
> default as listed in hardware documentation.
> 
> However, the MCA registers may list a different LVT offset for this
> interrupt. The kernel should honor the value from the hardware.

There's this "may" thing again.

Is this enablement for some future hw too or do you really trust the
value in MSR_CU_DEF_ERR is programmed correctly in all cases?

> Simplify the enable flow by using the hardware-provided value. Any
> conflicts will be caught by setup_APIC_eilvt(). Conflicts on production
> systems can be handled as quirks, if needed.

Well, which systems support succor?

I'd like to test this on them before we face all the quirkery. :)

That area has been plagued by hw snafus if you look at
setup_APIC_eilvt() and talk to uncle Robert. :-P

> @@ -595,17 +584,15 @@ static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c)
>  	if (rdmsr_safe(MSR_CU_DEF_ERR, &low, &high))
>  		return;
>  
> +	/*
> +	 * Trust the value from hardware.
> +	 * If there's a conflict, then setup_APIC_eilvt() will throw an error.
> +	 */
>  	def_new = (low & MASK_DEF_LVTOFF) >> 4;
> -	if (!(low & MASK_DEF_LVTOFF)) {
> -		pr_err(FW_BUG "Your BIOS is not setting up LVT offset 0x2 for deferred error IRQs correctly.\n");
> -		def_new = DEF_LVT_OFF;
> -		low = (low & ~MASK_DEF_LVTOFF) | (DEF_LVT_OFF << 4);
> -	}
> +	if (setup_APIC_eilvt(def_new, DEFERRED_ERROR_VECTOR, APIC_EILVT_MSG_FIX, 0))
> +		return;
>  
> -	def_offset = setup_APIC_deferred_error(def_offset, def_new);
> -	if ((def_offset == def_new) &&
> -	    (deferred_error_int_vector != amd_deferred_error_interrupt))
> -		deferred_error_int_vector = amd_deferred_error_interrupt;

There was a reason for that - deferred_error_int_vector is a global var
and you're calling enable_deferred_error_interrupt() on each CPU.

> +	deferred_error_int_vector = amd_deferred_error_interrupt;
>  
>  	if (!mce_flags.smca)
>  		low = (low & ~MASK_DEF_INT_TYPE) | DEF_INT_TYPE_APIC;

Thx.
Yazen Ghannam April 25, 2024, 2:12 p.m. UTC | #2
On 4/24/2024 3:06 PM, Borislav Petkov wrote:
> On Thu, Apr 04, 2024 at 10:13:50AM -0500, Yazen Ghannam wrote:
>> AMD systems with the SUCCOR feature can send an APIC LVT interrupt for
>> deferred errors. The LVT offset is 0x2 by convention, i.e. this is the
>> default as listed in hardware documentation.
>>
>> However, the MCA registers may list a different LVT offset for this
>> interrupt. The kernel should honor the value from the hardware.
> 
> There's this "may" thing again.
> 

Right, I should say "the microarchitecture allows it". :)

> Is this enablement for some future hw too or do you really trust the
> value in MSR_CU_DEF_ERR is programmed correctly in all cases?
> 

I trust the value from hardware.

The intention here is to simplify the code for general maintenance and to make
later patches easier.

>> Simplify the enable flow by using the hardware-provided value. Any
>> conflicts will be caught by setup_APIC_eilvt(). Conflicts on production
>> systems can be handled as quirks, if needed.
> 
> Well, which systems support succor?
> 
> I'd like to test this on them before we face all the quirkery. :)
> 

All Zen/SMCA systems. I don't recall any issues in this area.

Some later Family 15h systems (Carrizo?) had it. But I don't know if it was
used in production. It was slightly before my time.

> That area has been plagued by hw snafus if you look at
> setup_APIC_eilvt() and talk to uncle Robert. :-P
>

Right, I found this:
27afdf2008da ("apic, x86: Use BIOS settings for IBS and MCE threshold
interrupt LVT offsets")

Which is basically the same idea: use what is in the register.

But it looks like there was an issue with IBS on Family 10h.

Is this the only case of a real issue? If so, then why apply this method to
the THR and DFR interrupts?

Robert, were there any other issues?

>> @@ -595,17 +584,15 @@ static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c)
>>  	if (rdmsr_safe(MSR_CU_DEF_ERR, &low, &high))
>>  		return;
>>  
>> +	/*
>> +	 * Trust the value from hardware.
>> +	 * If there's a conflict, then setup_APIC_eilvt() will throw an error.
>> +	 */
>>  	def_new = (low & MASK_DEF_LVTOFF) >> 4;
>> -	if (!(low & MASK_DEF_LVTOFF)) {
>> -		pr_err(FW_BUG "Your BIOS is not setting up LVT offset 0x2 for deferred error IRQs correctly.\n");
>> -		def_new = DEF_LVT_OFF;
>> -		low = (low & ~MASK_DEF_LVTOFF) | (DEF_LVT_OFF << 4);
>> -	}
>> +	if (setup_APIC_eilvt(def_new, DEFERRED_ERROR_VECTOR, APIC_EILVT_MSG_FIX, 0))
>> +		return;
>>  
>> -	def_offset = setup_APIC_deferred_error(def_offset, def_new);
>> -	if ((def_offset == def_new) &&
>> -	    (deferred_error_int_vector != amd_deferred_error_interrupt))
>> -		deferred_error_int_vector = amd_deferred_error_interrupt;
> 
> There was a reason for that - deferred_error_int_vector is a global var
> and you're calling enable_deferred_error_interrupt() on each CPU.
>

Right, and all CPUs should use the same APIC LVT offset. If they differ, then
setup_APIC_eilvt() will fail above and return.

Why check "if X != Y, then X = Y"? Why not just unconditionally do "X = Y"?

>> +	deferred_error_int_vector = amd_deferred_error_interrupt;
>>  
>>  	if (!mce_flags.smca)
>>  		low = (low & ~MASK_DEF_INT_TYPE) | DEF_INT_TYPE_APIC;
> 
> Thx.
> 

Thanks,
Yazen
Borislav Petkov April 29, 2024, 12:59 p.m. UTC | #3
On Thu, Apr 25, 2024 at 10:12:44AM -0400, Yazen Ghannam wrote:
> I trust the value from hardware.
> 
> The intention here is to simplify the code for general maintenance and to make
> later patches easier.

There's this BIOS thing which programs those and throws a wrench in all
our trusting in the hw.

> All Zen/SMCA systems. I don't recall any issues in this area.
> 
> Some later Family 15h systems (Carrizo?) had it. But I don't know if it was
> used in production. It was slightly before my time.

Yeah, already solved in the previous mail.

> Right, I found this:
> 27afdf2008da ("apic, x86: Use BIOS settings for IBS and MCE threshold
> interrupt LVT offsets")
> 
> Which is basically the same idea: use what is in the register.
> 
> But it looks like there was an issue with IBS on Family 10h.

Yap, and it was pretty blatant AFAIR.

> Is this the only case of a real issue?

I don't remember anything else, but I'm not excluding that there are
others.

> If so, then why apply this method to the THR and DFR interrupts?

Meaning what exactly? You want to trust the hw for THR and DFR and let
the others use this offset reservation we're doing?

> Right, and all CPUs should use the same APIC LVT offset. If they differ, then
> setup_APIC_eilvt() will fail above and return.
> 
> Why check "if X != Y, then X = Y"? Why not just unconditionally do "X = Y"?

Why unconditionally do the assignment if it is already assigned?

I don't think x86 does store tearing so that we get deferred interrupt
on some core while some other core writes the same function pointer in
there but why even risk it if it can be avoided with a simple test?
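
For reference, the two variants being weighed here, both taken from the hunk
quoted above (the first is the current code, the second is what the patch
proposes):

    /*
     * Current: store the shared function pointer at most once, even though
     * this enable routine runs on every CPU.
     */
    if ((def_offset == def_new) &&
        (deferred_error_int_vector != amd_deferred_error_interrupt))
            deferred_error_int_vector = amd_deferred_error_interrupt;

    /*
     * Proposed: every CPU unconditionally stores the same value. The question
     * above is whether such a store could ever be observed torn by a deferred
     * error interrupt landing on another CPU at the same time.
     */
    deferred_error_int_vector = amd_deferred_error_interrupt;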
Yazen Ghannam April 29, 2024, 1:56 p.m. UTC | #4
On 4/29/2024 8:59 AM, Borislav Petkov wrote:
> On Thu, Apr 25, 2024 at 10:12:44AM -0400, Yazen Ghannam wrote:
>> I trust the value from hardware.
>>
>> The intention here is to simplify the code for general maintenance and to make
>> later patches easier.
> 
> There's this BIOS thing which programs those and throws a wrench in all
> our trusting in the hw.
> 
>> All Zen/SMCA systems. I don't recall any issues in this area.
>>
>> Some later Family 15h systems (Carrizo?) had it. But I don't know if it was
>> used in production. It was slightly before my time.
> 
> Yeah, already solved in the previous mail.
> 
>> Right, I found this:
>> 27afdf2008da ("apic, x86: Use BIOS settings for IBS and MCE threshold
>> interrupt LVT offsets")
>>
>> Which is basically the same idea: use what is in the register.
>>
>> But it looks like there was an issue with IBS on Family 10h.
> 
> Yap, and it was pretty blatant AFAIR.
> 
>> Is this the only case of a real issue?
> 
> I don't remember anything else, but I'm not excluding that there are
> others.
> 
>> If so, then why apply this method to the THR and DFR interrupts?
> 
> Meaning what exactly? You want to trust the hw for THR and DFR and let
> the others use this offset reservation we're doing?
> 

Right, I mean we should do things the simpler way unless there's a real issue
to address.

>> Right, and all CPUs should use the same APIC LVT offset. If they differ, then
>> setup_APIC_eilvt() will fail above and return.
>>
>> Why check "if X != Y, then X = Y"? Why not just unconditionally do "X = Y"?
> 
> Why unconditionally do the assignment if it is already assigned?
> 
> I don't think x86 does store tearing so that we get deferred interrupt
> on some core while some other core writes the same function pointer in
> there but why even risk it if it can be avoided with a simple test?
> 

I'm not opposed to this, but I don't understand what is at risk.

Is it that the function pointer may not be written atomically? So even if we
write it again with the same value, a concurrent interrupt on another core may
see a partially updated (corrupt) pointer?

intel_init_cmci() does not do this check. So is it more at risk, or is the AMD
code just more cautious?

Again I'm not against the current code. I just think we should simplify it, if
possible.

Thanks,
Yazen
Borislav Petkov April 29, 2024, 2:12 p.m. UTC | #5
On Mon, Apr 29, 2024 at 09:56:56AM -0400, Yazen Ghannam wrote:
> Right, I mean we should do things the simpler way unless there's a real issue
> to address.

You need to pay attention to past issues before you go, simplify it and
break it again.

> I'm not opposed to this, but I don't understand what is at risk.
> 
> Is it that the function pointer may not be written atomically? So even if we
> write it again with the same value, a concurrent interrupt on another core may
> see a partially updated (corrupt) pointer?

Yes, it won't happen, they say as it is guaranteed by the
architecture. But I've heard those "promises".

> intel_init_cmci() does not do this check. So is it more at risk, or is the AMD
> code just more cautious?
> 
> Again I'm not against the current code. I just think we should simplify it, if
> possible.

So in looking at the INTR_CFG MSR, I think we should do a function which
does MCA init stuff only on the BSP exactly for things like that.

There you can set the interrupt handler pointer, the INTR_CFG MSR and so
on. And we don't have such a function and I've needed a function like that
in the past.

And just for the general goal of not doing ugly code which should run
only once but is run per-CPU just because our infrastructure doesn't
allow it.

Wanna give that a try?

Thx.
Yazen Ghannam April 29, 2024, 2:25 p.m. UTC | #6
On 4/29/2024 10:12 AM, Borislav Petkov wrote:
> On Mon, Apr 29, 2024 at 09:56:56AM -0400, Yazen Ghannam wrote:
>> Right, I mean we should do things the simpler way unless there's a real issue
>> to address.
> 
> You need to pay attention to past issues before you go, simplify it and
> break it again.
> 

I completely agree. I haven't seen evidence of an issue yet for the DFR case
though. Which is why I thought it'd be safe to do some clean up.

>> I'm not opposed to this, but I don't understand what is at risk.
>>
>> Is it that the function pointer may not be written atomically? So even if we
>> write it again with the same value, a concurrent interrupt on another core may
>> see a partially updated (corrupt) pointer?
> 
> Yes, it won't happen, they say as it is guaranteed by the
> architecture. But I've heard those "promises".
> 
>> intel_init_cmci() does not do this check. So is it more at risk, or is the AMD
>> code just more cautious?
>>
>> Again I'm not against the current code. I just think we should simplify it, if
>> possible.
> 
> So in looking at the INTR_CFG MSR, I think we should do a function which
> does MCA init stuff only on the BSP exactly for things like that.
> 
> There you can set the interrupt handler pointer, the INTR_CFG MSR and so
> on. And we don't have such a function and I've needed a function like that
> in the past.
> 
> And just for the general goal of not doing ugly code which should run
> only once but is run per-CPU just because our infrastructure doesn't
> allow it.
> 
> Wanna give that a try?
> 
> Thx.
> 

Yep, "MCA init cleanup" is another thing on my TODO list.

The BSP still completely finishes init before the APs, correct? I recall some
effort to make CPU init more parallel, but I haven't been following it.

Thanks,
Yazen
Robert Richter April 29, 2024, 6:34 p.m. UTC | #7
On 25.04.24 10:12:44, Yazen Ghannam wrote:
> On 4/24/2024 3:06 PM, Borislav Petkov wrote:
> > On Thu, Apr 04, 2024 at 10:13:50AM -0500, Yazen Ghannam wrote:
> >> AMD systems with the SUCCOR feature can send an APIC LVT interrupt for
> >> deferred errors. The LVT offset is 0x2 by convention, i.e. this is the
> >> default as listed in hardware documentation.
> >>
> >> However, the MCA registers may list a different LVT offset for this
> >> interrupt. The kernel should honor the value from the hardware.
> > 
> > There's this "may" thing again.
> > 
> 
> Right, I should say "the microarchitecture allows it". :)
> 
> > Is this enablement for some future hw too or do you really trust the
> > value in MSR_CU_DEF_ERR is programmed correctly in all cases?
> > 
> 
> I trust the value from hardware.
> 
> The intention here is to simplify the code for general maintenance and to make
> later patches easier.
> 
> >> Simplify the enable flow by using the hardware-provided value. Any
> >> conflicts will be caught by setup_APIC_eilvt(). Conflicts on production
> >> systems can be handled as quirks, if needed.
> > 
> > Well, which systems support succor?
> > 
> > I'd like to test this on them before we face all the quirkery. :)
> > 
> 
> All Zen/SMCA systems. I don't recall any issues in this area.
> 
> Some later Family 15h systems (Carrizo?) had it. But I don't know if it was
> used in production. It was slightly before my time.
> 
> > That area has been plagued by hw snafus if you look at
> > setup_APIC_eilvt() and talk to uncle Robert. :-P
> >
> 
> Right, I found this:
> 27afdf2008da ("apic, x86: Use BIOS settings for IBS and MCE threshold
> interrupt LVT offsets")
> 
> Which is basically the same idea: use what is in the register.
> 
> But it looks like there was an issue with IBS on Family 10h.

After looking a while into it, I think the issue was the following:

The IBS offset was not enabled by firmware, but MCE already was (due to
earlier setup). And mce was (maybe) not enabled on all cpus, only on one
cpu per socket. The IBS vector should be enabled on all cpus. Now,
firmware allocated offset 1 for mce (instead of offset 0 as on k8).
This caused the hardcoded value (offset 1 for IBS) to be already taken.
Also, hardcoded values couldn't be used at all as that would not have
worked on k8 (for mce). Another issue was finding the next free offset,
as you couldn't examine just the current cpu. So even if the offset was
available on the current cpu, another cpu might have that offset already
in use. Yet another problem was that the programmed offsets for mce and
ibs overlapped each other and the kernel had to reassign them (the ibs
offset).

I hope I remember correctly here with all the details.

Thanks,

-Robert
Borislav Petkov April 30, 2024, 1:47 p.m. UTC | #8
On Mon, Apr 29, 2024 at 10:25:02AM -0400, Yazen Ghannam wrote:
> Yep, "MCA init cleanup" is another thing on my TODO list.

Right, so looking at the code, early_identify_cpu()/identify_boot_cpu()
could call a

mca_bsp_init()

or so which does the once, vendor-agnostic setup.

identify_cpu->mcheck_cpu_init()

already does both the BSP and AP per-CPU init.

With such a split, I think we'll have all the proper places to do setup
at the right time without hackery.
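
A very rough sketch of that split (mca_bsp_init() is only the name proposed
above; the call sites and contents are assumptions, not existing code):

    /*
     * Boot CPU only, called once from early_identify_cpu()/identify_boot_cpu():
     * vendor-agnostic, set-it-exactly-once MCA state.
     */
    void __init mca_bsp_init(struct cpuinfo_x86 *c)
    {
            /* Pick the deferred error / threshold interrupt handler pointers. */
            /* Program one-time MSRs such as the SMCA interrupt config MSR.    */
    }

    /* identify_cpu() -> mcheck_cpu_init() stays the per-CPU (BSP + AP) init. */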

> The BSP still completely finishes init before the APs, correct?

Yes.

> I recall some effort to make CPU init more parallel, but I haven't
> been following it.

I think you mean parallel hotplug but that's for the APs only.

Thx.
Borislav Petkov April 30, 2024, 6:06 p.m. UTC | #9
On Mon, Apr 29, 2024 at 08:34:37PM +0200, Robert Richter wrote:
> After looking a while into it, I think the issue was the following:
> 
> The IBS offset was not enabled by firmware, but MCE already was (due to
> earlier setup). And mce was (maybe) not enabled on all cpus, only on one
> cpu per socket. The IBS vector should be enabled on all cpus. Now,
> firmware allocated offset 1 for mce (instead of offset 0 as on k8).
> This caused the hardcoded value (offset 1 for IBS) to be already taken.
> Also, hardcoded values couldn't be used at all as that would not have
> worked on k8 (for mce). Another issue was finding the next free offset,
> as you couldn't examine just the current cpu. So even if the offset was
> available on the current cpu, another cpu might have that offset already
> in use. Yet another problem was that the programmed offsets for mce and
> ibs overlapped each other and the kernel had to reassign them (the ibs
> offset).
> 
> I hope I remember correctly here with all the details.

I think you're remembering it correct because after I read this, a very
very old and dormant brain cell did light up in my head and said, oh
yeah, that definitely rings a bell!

:-P

Yazen, this is the type of mess I was talking about.
Yazen Ghannam May 2, 2024, 4:02 p.m. UTC | #10
On 4/30/24 2:06 PM, Borislav Petkov wrote:
> On Mon, Apr 29, 2024 at 08:34:37PM +0200, Robert Richter wrote:
>> After looking a while into it, I think the issue was the following:
>>
>> The IBS offset was not enabled by firmware, but MCE already was (due to
>> earlier setup). And mce was (maybe) not enabled on all cpus, only on one
>> cpu per socket. The IBS vector should be enabled on all cpus. Now,
>> firmware allocated offset 1 for mce (instead of offset 0 as on k8).
>> This caused the hardcoded value (offset 1 for IBS) to be already taken.
>> Also, hardcoded values couldn't be used at all as that would not have
>> worked on k8 (for mce). Another issue was finding the next free offset,
>> as you couldn't examine just the current cpu. So even if the offset was
>> available on the current cpu, another cpu might have that offset already
>> in use. Yet another problem was that the programmed offsets for mce and
>> ibs overlapped each other and the kernel had to reassign them (the ibs
>> offset).
>>
>> I hope I remember correctly here with all the details.
> 
> I think you're remembering it correct because after I read this, a very
> very old and dormant brain cell did light up in my head and said, oh
> yeah, that definitely rings a bell!
> 
> :-P
> 
> Yazen, this is the type of mess I was talking about.
>

Yep, I see what you mean. Definitely a pain :/

So is this the only known issue? And was it encountered in production
systems? Were/are people using IBS on K8 (Family Fh) systems? I know
that perf got support at this time, but do people still use it?

Just as an example, this project has Family 10h as the earliest supported.
https://github.com/jlgreathouse/AMD_IBS_Toolkit

My thinking is that we can simplify the code if there are no practical
issues. And we can address any reported issues as they come.

If you think that's okay, then I can continue with this particular clean
up. If not, then at least we have some more context here.

I'm sure there will be more topics like this when redoing the MCA init path.

:)

Thanks,
Yazen
Robert Richter May 2, 2024, 6:48 p.m. UTC | #11
On 02.05.24 12:02:02, Yazen Ghannam wrote:
> On 4/30/24 2:06 PM, Borislav Petkov wrote:
> > On Mon, Apr 29, 2024 at 08:34:37PM +0200, Robert Richter wrote:
> > > After looking a while into it, I think the issue was the following:
> > > 
> > > The IBS offset was not enabled by firmware, but MCE already was (due to
> > > earlier setup). And mce was (maybe) not enabled on all cpus, only on one
> > > cpu per socket. The IBS vector should be enabled on all cpus. Now,
> > > firmware allocated offset 1 for mce (instead of offset 0 as on k8).
> > > This caused the hardcoded value (offset 1 for IBS) to be already taken.
> > > Also, hardcoded values couldn't be used at all as that would not have
> > > worked on k8 (for mce). Another issue was finding the next free offset,
> > > as you couldn't examine just the current cpu. So even if the offset was
> > > available on the current cpu, another cpu might have that offset already
> > > in use. Yet another problem was that the programmed offsets for mce and
> > > ibs overlapped each other and the kernel had to reassign them (the ibs
> > > offset).
> > > 
> > > I hope I remember correctly here with all the details.
> > 
> > I think you're remembering it correct because after I read this, a very
> > very old and dormant brain cell did light up in my head and said, oh
> > yeah, that definitely rings a bell!
> > 
> > :-P
> > 
> > Yazen, this is the type of mess I was talking about.
> > 
> 
> Yep, I see what you mean. Definitely a pain :/
> 
> So is this the only known issue? And was it encountered in production
> systems? Were/are people using IBS on K8 (Family Fh) systems? I know
> that perf got support at this time, but do people still use it?
> 
> Just as an example, this project has Family 10h as the earliest supported.
> https://github.com/jlgreathouse/AMD_IBS_Toolkit

No, IBS was introduced with 10h, but the eilvt offset already came
with k8, though with only one entry. That affected production systems
and BIOSes.

> 
> My thinking is that we can simplify the code if there are no practical
> issues. And we can address any reported issues as they come.
> 
> If you think that's okay, then I can continue with this particular clean
> up. If not, then at least we have some more context here.

The general approach to use the preprogrammed offsets looks good to
me. Though, existing code [1] checks the offset and reapplies a
hardcoded value of 2 if it is zero. I don't know the history of
this. However, it would be great if that code could be simplified.

-Robert

[1] commit 24fd78a81f6d ("x86/mce/amd: Introduce deferred error interrupt handler")

> 
> I'm sure there will be more topics like this when redoing the MCA init path.
> 
> :)
> 
> Thanks,
> Yazen
Borislav Petkov May 4, 2024, 2:37 p.m. UTC | #12
On Thu, May 02, 2024 at 08:48:34PM +0200, Robert Richter wrote:
> The general approach to use the preprogrammed offsets looks good to
> me. Though, existing code [1] checks the offset and reapplies a
> hardcoded value of 2 if it is zero. I don't know the history of
> this. However, it would be great if that code could be simplified.

Whatever we do, the best approach is to leave old hw as it is - you
don't want to "rework" all this and break some obscure old configuration
which only one user has and debugging on it is hard, to say the least.

So for everyone's sake, do a cutoff at Zen, leave the pre-Zen machines
support as it is and go wild with Zen and SMCA and so on, where we can
still test on all kinds of hw.

And as always, ask yourself whether any "simplification" you're thinking
of is even worth the effort and is not just code churn without any real
advantages.

Thx.

Patch

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index e8e78d91082b..32628a30a5c1 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -48,7 +48,6 @@ 
 #define MSR_CU_DEF_ERR		0xC0000410
 #define MASK_DEF_LVTOFF		0x000000F0
 #define MASK_DEF_INT_TYPE	0x00000006
-#define DEF_LVT_OFF		0x2
 #define DEF_INT_TYPE_APIC	0x2
 #define INTR_TYPE_APIC			0x1
 
@@ -575,19 +574,9 @@  static int setup_APIC_mce_threshold(int reserved, int new)
 	return reserved;
 }
 
-static int setup_APIC_deferred_error(int reserved, int new)
+static void enable_deferred_error_interrupt(void)
 {
-	if (reserved < 0 && !setup_APIC_eilvt(new, DEFERRED_ERROR_VECTOR,
-					      APIC_EILVT_MSG_FIX, 0))
-		return new;
-
-	return reserved;
-}
-
-static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c)
-{
-	u32 low = 0, high = 0;
-	int def_offset = -1, def_new;
+	u32 low = 0, high = 0, def_new;
 
 	if (!mce_flags.succor)
 		return;
@@ -595,17 +584,15 @@  static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c)
 	if (rdmsr_safe(MSR_CU_DEF_ERR, &low, &high))
 		return;
 
+	/*
+	 * Trust the value from hardware.
+	 * If there's a conflict, then setup_APIC_eilvt() will throw an error.
+	 */
 	def_new = (low & MASK_DEF_LVTOFF) >> 4;
-	if (!(low & MASK_DEF_LVTOFF)) {
-		pr_err(FW_BUG "Your BIOS is not setting up LVT offset 0x2 for deferred error IRQs correctly.\n");
-		def_new = DEF_LVT_OFF;
-		low = (low & ~MASK_DEF_LVTOFF) | (DEF_LVT_OFF << 4);
-	}
+	if (setup_APIC_eilvt(def_new, DEFERRED_ERROR_VECTOR, APIC_EILVT_MSG_FIX, 0))
+		return;
 
-	def_offset = setup_APIC_deferred_error(def_offset, def_new);
-	if ((def_offset == def_new) &&
-	    (deferred_error_int_vector != amd_deferred_error_interrupt))
-		deferred_error_int_vector = amd_deferred_error_interrupt;
+	deferred_error_int_vector = amd_deferred_error_interrupt;
 
 	if (!mce_flags.smca)
 		low = (low & ~MASK_DEF_INT_TYPE) | DEF_INT_TYPE_APIC;
@@ -771,7 +758,7 @@  void mce_amd_feature_init(struct cpuinfo_x86 *c)
 	u32 low = 0, high = 0, address = 0;
 	int offset = -1;
 
-	deferred_error_interrupt_enable(c);
+	enable_deferred_error_interrupt();
 
 	for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) {
 		if (mce_flags.smca)