Message ID | 20210630152828.162659-5-nchatrad@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/edac/amd64: Add support for noncpu nodes | expand |
On Wed, Jun 30, 2021 at 08:58:25PM +0530, Naveen Krishna Chatradhi wrote: > On AMD systems with SMCA banks on NONCPU nodes, the node id information > is available in the InstanceHI[47:44] of the IPID register. The doesn't read well to me. I saw this as saying "bits 47:44 of the InstanceHi register". Also, the name of the field is "InstanceIdHi" in the documentation. I think it'd be more clear to say "available in MCA_IPID[47:44] (InstanceIdHi)" or something similar. > > Signed-off-by: Muralidhara M K <muralimk@amd.com> > Signed-off-by: Naveen Krishna Chatradhi <nchatrad@amd.com> > --- > drivers/edac/mce_amd.c | 15 +++++++++++++-- > 1 file changed, 13 insertions(+), 2 deletions(-) > > diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c > index 27d56920b469..364dfb6e359d 100644 > --- a/drivers/edac/mce_amd.c > +++ b/drivers/edac/mce_amd.c > @@ -1049,6 +1049,7 @@ static void decode_smca_error(struct mce *m) > enum smca_bank_types bank_type; > const char *ip_name; > u8 xec = XEC(m->status, xec_mask); > + u32 node_id = 0; Why u32? Why not u16 to match topology_die_id() or int to match decode_dram_ecc()? > > if (m->bank >= ARRAY_SIZE(smca_banks)) > return; > @@ -1072,8 +1073,18 @@ static void decode_smca_error(struct mce *m) > if (xec < smca_mce_descs[bank_type].num_descs) > pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]); > > - if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc) > - decode_dram_ecc(topology_die_id(m->extcpu), m); > + /* > + * SMCA_UMC_V2 is used on the noncpu nodes, extract the node id > + * from the InstanceHI[47:44] of the IPID register. > + */ > + if (bank_type == SMCA_UMC_V2 && xec == 0) > + node_id = ((m->ipid >> 44) & 0xF); > + > + if (bank_type == SMCA_UMC && xec == 0) > + node_id = topology_die_id(m->extcpu); > + > + if (decode_dram_ecc) > + decode_dram_ecc(node_id, m); If decode_dram_ecc() is set, then this will call it on every MCA error that comes in. Rather we only want to call it on DRAM ECC errors. You could do something like this: if (decode_dram_ecc && xec == 0) { u32 node_id = 0; if (bank_type == SMCA_UMC) node_id = XXX; else if (bank_type == SMCA_UMC_V2) node_id = YYY; else return; decode_dram_ecc(node_id, m); } This is just an example. Maybe you can save an indentation level by negating those conditions and returning early, etc. Thanks, Yazen
[Public] Hi Yazen -----Original Message----- From: Ghannam, Yazen <Yazen.Ghannam@amd.com> Sent: Thursday, July 29, 2021 10:03 PM To: Chatradhi, Naveen Krishna <NaveenKrishna.Chatradhi@amd.com> Cc: linux-edac@vger.kernel.org; x86@kernel.org; linux-kernel@vger.kernel.org; bp@alien8.de; mingo@redhat.com; mchehab@kernel.org Subject: Re: [PATCH 4/7] EDAC/mce_amd: extract node id from InstanceHi in IPID On Wed, Jun 30, 2021 at 08:58:25PM +0530, Naveen Krishna Chatradhi wrote: > On AMD systems with SMCA banks on NONCPU nodes, the node id > information is available in the InstanceHI[47:44] of the IPID register. The doesn't read well to me. I saw this as saying "bits 47:44 of the InstanceHi register". Also, the name of the field is "InstanceIdHi" in the documentation. I think it'd be more clear to say "available in MCA_IPID[47:44] (InstanceIdHi)" or something similar. [naveenk:] Modified the commit message > > Signed-off-by: Muralidhara M K <muralimk@amd.com> > Signed-off-by: Naveen Krishna Chatradhi <nchatrad@amd.com> > --- > drivers/edac/mce_amd.c | 15 +++++++++++++-- > 1 file changed, 13 insertions(+), 2 deletions(-) > > diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c index > 27d56920b469..364dfb6e359d 100644 > --- a/drivers/edac/mce_amd.c > +++ b/drivers/edac/mce_amd.c > @@ -1049,6 +1049,7 @@ static void decode_smca_error(struct mce *m) > enum smca_bank_types bank_type; > const char *ip_name; > u8 xec = XEC(m->status, xec_mask); > + u32 node_id = 0; Why u32? Why not u16 to match topology_die_id() or int to match decode_dram_ecc()? [naveenk:] Done, used int. > > if (m->bank >= ARRAY_SIZE(smca_banks)) > return; > @@ -1072,8 +1073,18 @@ static void decode_smca_error(struct mce *m) > if (xec < smca_mce_descs[bank_type].num_descs) > pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]); > > - if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc) > - decode_dram_ecc(topology_die_id(m->extcpu), m); > + /* > + * SMCA_UMC_V2 is used on the noncpu nodes, extract the node id > + * from the InstanceHI[47:44] of the IPID register. > + */ > + if (bank_type == SMCA_UMC_V2 && xec == 0) > + node_id = ((m->ipid >> 44) & 0xF); > + > + if (bank_type == SMCA_UMC && xec == 0) > + node_id = topology_die_id(m->extcpu); > + > + if (decode_dram_ecc) > + decode_dram_ecc(node_id, m); If decode_dram_ecc() is set, then this will call it on every MCA error that comes in. Rather we only want to call it on DRAM ECC errors. You could do something like this: if (decode_dram_ecc && xec == 0) { u32 node_id = 0; if (bank_type == SMCA_UMC) node_id = XXX; else if (bank_type == SMCA_UMC_V2) node_id = YYY; else return; decode_dram_ecc(node_id, m); } This is just an example. Maybe you can save an indentation level by negating those conditions and returning early, etc. [naveenk:] modified the ladder. Thanks, Yazen [naveenk:] Thank you.
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c index 27d56920b469..364dfb6e359d 100644 --- a/drivers/edac/mce_amd.c +++ b/drivers/edac/mce_amd.c @@ -1049,6 +1049,7 @@ static void decode_smca_error(struct mce *m) enum smca_bank_types bank_type; const char *ip_name; u8 xec = XEC(m->status, xec_mask); + u32 node_id = 0; if (m->bank >= ARRAY_SIZE(smca_banks)) return; @@ -1072,8 +1073,18 @@ static void decode_smca_error(struct mce *m) if (xec < smca_mce_descs[bank_type].num_descs) pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]); - if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc) - decode_dram_ecc(topology_die_id(m->extcpu), m); + /* + * SMCA_UMC_V2 is used on the noncpu nodes, extract the node id + * from the InstanceHI[47:44] of the IPID register. + */ + if (bank_type == SMCA_UMC_V2 && xec == 0) + node_id = ((m->ipid >> 44) & 0xF); + + if (bank_type == SMCA_UMC && xec == 0) + node_id = topology_die_id(m->extcpu); + + if (decode_dram_ecc) + decode_dram_ecc(node_id, m); } static inline void amd_decode_err_code(u16 ec)