Message ID: 20240404151359.47970-9-yazen.ghannam@amd.com (mailing list archive)
State:      New
Series:     MCA Updates
On Thu, Apr 04, 2024 at 10:13:51AM -0500, Yazen Ghannam wrote:
> -/* Deferred error settings */
> +/* MCA Interrupt Configuration register, one per CPU */

SMCA?

> #define MSR_CU_DEF_ERR 0xC0000410
> -#define MASK_DEF_LVTOFF 0x000000F0
> -#define MASK_DEF_INT_TYPE 0x00000006
> -#define DEF_INT_TYPE_APIC 0x2
> +#define MSR_MCA_INTR_CFG 0xC0000410

You do see those other MSRs' prefixes, right?

MSR_AMD64_SMCA_...

Is this one not part of the SMCA arch?

> +#define INTR_CFG_DFR_LVT_OFFSET GENMASK_ULL(7, 4)
> +#define INTR_CFG_LEGACY_DFR_INTR_TYPE GENMASK_ULL(2, 1)
> #define INTR_TYPE_APIC 0x1

Ditto for its bit(s) names.

> +static u64 get_mca_intr_cfg(void)
> +{
> +	u64 mca_intr_cfg;
> +
> +	if (!mce_flags.succor)
> +		return 0;
> +
> +	if (rdmsrl_safe(MSR_MCA_INTR_CFG, &mca_intr_cfg))
> +		return 0;
> +
> +	return mca_intr_cfg;
> +}

This is an overkill. If we add a function for every MSR we're reading...

Do this differently: prepare the value you're writing back into the
INTR_CFG MSR once, save it into mca_intr_cfg and then write it on each
core at the end of enable_deferred_error_interrupt().

And make u64 mca_intr_cfg static global to amd.c so that you can refer
to it from outside of the functions and then you don't have to pass it
around as a function param.

Thx.
On 4/29/2024 9:12 AM, Borislav Petkov wrote:
> On Thu, Apr 04, 2024 at 10:13:51AM -0500, Yazen Ghannam wrote:
>> -/* Deferred error settings */
>> +/* MCA Interrupt Configuration register, one per CPU */
>
> SMCA?
>

No, it is part of SUCCOR. The old define is above: MSR_CU_DEF_ERR.

This is how it is listed in the PPR:

MSRC000_0410 [MCA Interrupt Configuration] (Core::X86::Msr::McaIntrCfg)

>> +#define INTR_CFG_DFR_LVT_OFFSET GENMASK_ULL(7, 4)
>> +#define INTR_CFG_LEGACY_DFR_INTR_TYPE GENMASK_ULL(2, 1)
>> #define INTR_TYPE_APIC 0x1
>
> Ditto for its bit(s) names.
>

Okay.

>> +static u64 get_mca_intr_cfg(void)
>> +{
>> +	u64 mca_intr_cfg;
>> +
>> +	if (!mce_flags.succor)
>> +		return 0;
>> +
>> +	if (rdmsrl_safe(MSR_MCA_INTR_CFG, &mca_intr_cfg))
>> +		return 0;
>> +
>> +	return mca_intr_cfg;
>> +}
>
> This is an overkill. If we add a function for every MSR we're reading...
>
> Do this differently: prepare the value you're writing back into the
> INTR_CFG MSR once, save it into mca_intr_cfg and then write it on each
> core at the end of enable_deferred_error_interrupt().
>
> And make u64 mca_intr_cfg static global to amd.c so that you can refer
> to it from outside of the functions and then you don't have to pass it
> around as a function param.
>
> Thx.
>

Good idea. In fact, we can treat this register as read-only, since we will
only handle (SUCCOR && SMCA) systems. The only need to write this register
would be on !SMCA systems.

We need to assume that the register value will be identical for all CPUs.
This is the expectation, but I'll add a comment to highlight this.

Also, we don't need the entire register. We just need the LVT offset fields
which are 4 bits each.

Thanks,
Yazen
On Mon, Apr 29, 2024 at 10:18:59AM -0400, Yazen Ghannam wrote:
> Good idea. In fact, we can treat this register as read-only, since we will
> only handle (SUCCOR && SMCA) systems. The only need to write this register
> would be on !SMCA systems.
>
> We need to assume that the register value will be identical for all CPUs.
> This is the expectation, but I'll add a comment to highlight this.
>
> Also, we don't need the entire register. We just need the LVT offset fields
> which are 4 bits each.

Yes, you could read out and sanity-check only the LVT offsets and make sure
they're the same across all cores to make sure BIOS hasn't dropped the ball
in programming them.

But see my previous mail too: I think we should leave pre-Zen as it is and
do cleanups and simplifications only for Zen and newer.

Thx.
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 32628a30a5c1..f59f4a1c9b21 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -44,11 +44,11 @@
 #define MASK_BLKPTR_LO    0xFF000000
 #define MCG_XBLK_ADDR     0xC0000400
 
-/* Deferred error settings */
+/* MCA Interrupt Configuration register, one per CPU */
 #define MSR_CU_DEF_ERR    0xC0000410
-#define MASK_DEF_LVTOFF   0x000000F0
-#define MASK_DEF_INT_TYPE 0x00000006
-#define DEF_INT_TYPE_APIC 0x2
+#define MSR_MCA_INTR_CFG              0xC0000410
+#define INTR_CFG_DFR_LVT_OFFSET       GENMASK_ULL(7, 4)
+#define INTR_CFG_LEGACY_DFR_INTR_TYPE GENMASK_ULL(2, 1)
 #define INTR_TYPE_APIC    0x1
 
 /* Scalable MCA: */
@@ -574,30 +574,30 @@ static int setup_APIC_mce_threshold(int reserved, int new)
 	return reserved;
 }
 
-static void enable_deferred_error_interrupt(void)
+static void enable_deferred_error_interrupt(u64 mca_intr_cfg)
 {
-	u32 low = 0, high = 0, def_new;
+	u8 dfr_offset;
 
-	if (!mce_flags.succor)
-		return;
-
-	if (rdmsr_safe(MSR_CU_DEF_ERR, &low, &high))
+	if (!mca_intr_cfg)
 		return;
 
 	/*
 	 * Trust the value from hardware.
 	 * If there's a conflict, then setup_APIC_eilvt() will throw an error.
 	 */
-	def_new = (low & MASK_DEF_LVTOFF) >> 4;
-	if (setup_APIC_eilvt(def_new, DEFERRED_ERROR_VECTOR, APIC_EILVT_MSG_FIX, 0))
+	dfr_offset = FIELD_GET(INTR_CFG_DFR_LVT_OFFSET, mca_intr_cfg);
+	if (setup_APIC_eilvt(dfr_offset, DEFERRED_ERROR_VECTOR, APIC_EILVT_MSG_FIX, 0))
 		return;
 
 	deferred_error_int_vector = amd_deferred_error_interrupt;
 
-	if (!mce_flags.smca)
-		low = (low & ~MASK_DEF_INT_TYPE) | DEF_INT_TYPE_APIC;
+	if (mce_flags.smca)
+		return;
 
-	wrmsr(MSR_CU_DEF_ERR, low, high);
+	mca_intr_cfg &= ~INTR_CFG_LEGACY_DFR_INTR_TYPE;
+	mca_intr_cfg |= FIELD_PREP(INTR_CFG_LEGACY_DFR_INTR_TYPE, INTR_TYPE_APIC);
+
+	wrmsrl(MSR_MCA_INTR_CFG, mca_intr_cfg);
 }
 
 static u32 smca_get_block_address(unsigned int bank, unsigned int block,
@@ -751,14 +751,28 @@ static void disable_err_thresholding(struct cpuinfo_x86 *c, unsigned int bank)
 	wrmsrl(MSR_K7_HWCR, hwcr);
 }
 
+static u64 get_mca_intr_cfg(void)
+{
+	u64 mca_intr_cfg;
+
+	if (!mce_flags.succor)
+		return 0;
+
+	if (rdmsrl_safe(MSR_MCA_INTR_CFG, &mca_intr_cfg))
+		return 0;
+
+	return mca_intr_cfg;
+}
+
 /* cpu init entry point, called from mce.c with preempt off */
 void mce_amd_feature_init(struct cpuinfo_x86 *c)
 {
 	unsigned int bank, block, cpu = smp_processor_id();
+	u64 mca_intr_cfg = get_mca_intr_cfg();
 	u32 low = 0, high = 0, address = 0;
 	int offset = -1;
 
-	enable_deferred_error_interrupt();
+	enable_deferred_error_interrupt(mca_intr_cfg);
 
 	for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) {
 		if (mce_flags.smca)
Switch to bitops to help with clarity. Also, avoid an unnecessary wrmsr()
for SMCA systems.

Use the updated name for MSR 0xC000_0410 to match the documentation for
Family 0x17 and later systems.

This MSR is used for setting up both Deferred and MCA Thresholding
interrupts on current systems. So read it once during init and pass to
functions that need it. Start with the Deferred error interrupt case.
The MCA Thresholding interrupt case will be handled during refactoring.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
Notes:
    Link: https://lkml.kernel.org/r/20231118193248.1296798-13-yazen.ghannam@amd.com

    v1->v2:
    * Remove invalid SMCA check in get_mca_intr_cfg(). (Yazen)

 arch/x86/kernel/cpu/mce/amd.c | 46 +++++++++++++++++++++++------------
 1 file changed, 30 insertions(+), 16 deletions(-)