diff mbox series

[4/7] EDAC/mce_amd: extract node id from InstanceHi in IPID

Message ID 20210630152828.162659-5-nchatrad@amd.com (mailing list archive)
State New, archived
Headers show
Series x86/edac/amd64: Add support for noncpu nodes | expand

Commit Message

Naveen Krishna Chatradhi June 30, 2021, 3:28 p.m. UTC
On AMD systems with SMCA banks on NONCPU nodes, the node id information
is available in the InstanceHI[47:44] of the IPID register.

Signed-off-by: Muralidhara M K <muralimk@amd.com>
Signed-off-by: Naveen Krishna Chatradhi <nchatrad@amd.com>
---
 drivers/edac/mce_amd.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

Comments

Yazen Ghannam July 29, 2021, 4:32 p.m. UTC | #1
On Wed, Jun 30, 2021 at 08:58:25PM +0530, Naveen Krishna Chatradhi wrote:
> On AMD systems with SMCA banks on NONCPU nodes, the node id information
> is available in the InstanceHI[47:44] of the IPID register.

The doesn't read well to me. I saw this as saying "bits 47:44 of the
InstanceHi register". Also, the name of the field is "InstanceIdHi" in
the documentation.

I think it'd be more clear to say "available in MCA_IPID[47:44]
(InstanceIdHi)" or something similar. 

> 
> Signed-off-by: Muralidhara M K <muralimk@amd.com>
> Signed-off-by: Naveen Krishna Chatradhi <nchatrad@amd.com>
> ---
>  drivers/edac/mce_amd.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
> index 27d56920b469..364dfb6e359d 100644
> --- a/drivers/edac/mce_amd.c
> +++ b/drivers/edac/mce_amd.c
> @@ -1049,6 +1049,7 @@ static void decode_smca_error(struct mce *m)
>  	enum smca_bank_types bank_type;
>  	const char *ip_name;
>  	u8 xec = XEC(m->status, xec_mask);
> +	u32 node_id = 0;

Why u32? Why not u16 to match topology_die_id() or int to match
decode_dram_ecc()?

>  
>  	if (m->bank >= ARRAY_SIZE(smca_banks))
>  		return;
> @@ -1072,8 +1073,18 @@ static void decode_smca_error(struct mce *m)
>  	if (xec < smca_mce_descs[bank_type].num_descs)
>  		pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]);
>  
> -	if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)
> -		decode_dram_ecc(topology_die_id(m->extcpu), m);
> +	/*
> +	 * SMCA_UMC_V2 is used on the noncpu nodes, extract the node id
> +	 * from the InstanceHI[47:44] of the IPID register.
> +	 */
> +	if (bank_type == SMCA_UMC_V2 && xec == 0)
> +		node_id = ((m->ipid >> 44) & 0xF);
> +
> +	if (bank_type == SMCA_UMC && xec == 0)
> +		node_id = topology_die_id(m->extcpu);
> +
> +	if (decode_dram_ecc)
> +		decode_dram_ecc(node_id, m);

If decode_dram_ecc() is set, then this will call it on every MCA error
that comes in. Rather we only want to call it on DRAM ECC errors.

You could do something like this:

	if (decode_dram_ecc && xec == 0) {
		u32 node_id = 0;

		if (bank_type == SMCA_UMC)
			node_id = XXX;
		else if (bank_type == SMCA_UMC_V2)
			node_id = YYY;
		else
			return;

		decode_dram_ecc(node_id, m);
	}

This is just an example. Maybe you can save an indentation level by
negating those conditions and returning early, etc.

Thanks,
Yazen
Chatradhi, Naveen Krishna Aug. 10, 2021, 12:45 p.m. UTC | #2
[Public]

Hi Yazen

-----Original Message-----
From: Ghannam, Yazen <Yazen.Ghannam@amd.com> 
Sent: Thursday, July 29, 2021 10:03 PM
To: Chatradhi, Naveen Krishna <NaveenKrishna.Chatradhi@amd.com>
Cc: linux-edac@vger.kernel.org; x86@kernel.org; linux-kernel@vger.kernel.org; bp@alien8.de; mingo@redhat.com; mchehab@kernel.org
Subject: Re: [PATCH 4/7] EDAC/mce_amd: extract node id from InstanceHi in IPID

On Wed, Jun 30, 2021 at 08:58:25PM +0530, Naveen Krishna Chatradhi wrote:
> On AMD systems with SMCA banks on NONCPU nodes, the node id 
> information is available in the InstanceHI[47:44] of the IPID register.

The doesn't read well to me. I saw this as saying "bits 47:44 of the InstanceHi register". Also, the name of the field is "InstanceIdHi" in the documentation.

I think it'd be more clear to say "available in MCA_IPID[47:44] (InstanceIdHi)" or something similar. 
[naveenk:] Modified the commit message

> 
> Signed-off-by: Muralidhara M K <muralimk@amd.com>
> Signed-off-by: Naveen Krishna Chatradhi <nchatrad@amd.com>
> ---
>  drivers/edac/mce_amd.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c index 
> 27d56920b469..364dfb6e359d 100644
> --- a/drivers/edac/mce_amd.c
> +++ b/drivers/edac/mce_amd.c
> @@ -1049,6 +1049,7 @@ static void decode_smca_error(struct mce *m)
>  	enum smca_bank_types bank_type;
>  	const char *ip_name;
>  	u8 xec = XEC(m->status, xec_mask);
> +	u32 node_id = 0;

Why u32? Why not u16 to match topology_die_id() or int to match decode_dram_ecc()?
[naveenk:] Done, used int.

>  
>  	if (m->bank >= ARRAY_SIZE(smca_banks))
>  		return;
> @@ -1072,8 +1073,18 @@ static void decode_smca_error(struct mce *m)
>  	if (xec < smca_mce_descs[bank_type].num_descs)
>  		pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]);
>  
> -	if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)
> -		decode_dram_ecc(topology_die_id(m->extcpu), m);
> +	/*
> +	 * SMCA_UMC_V2 is used on the noncpu nodes, extract the node id
> +	 * from the InstanceHI[47:44] of the IPID register.
> +	 */
> +	if (bank_type == SMCA_UMC_V2 && xec == 0)
> +		node_id = ((m->ipid >> 44) & 0xF);
> +
> +	if (bank_type == SMCA_UMC && xec == 0)
> +		node_id = topology_die_id(m->extcpu);
> +
> +	if (decode_dram_ecc)
> +		decode_dram_ecc(node_id, m);

If decode_dram_ecc() is set, then this will call it on every MCA error that comes in. Rather we only want to call it on DRAM ECC errors.

You could do something like this:

	if (decode_dram_ecc && xec == 0) {
		u32 node_id = 0;

		if (bank_type == SMCA_UMC)
			node_id = XXX;
		else if (bank_type == SMCA_UMC_V2)
			node_id = YYY;
		else
			return;

		decode_dram_ecc(node_id, m);
	}

This is just an example. Maybe you can save an indentation level by negating those conditions and returning early, etc.
[naveenk:] modified the ladder.

Thanks,
Yazen
[naveenk:] Thank you.
diff mbox series

Patch

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 27d56920b469..364dfb6e359d 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1049,6 +1049,7 @@  static void decode_smca_error(struct mce *m)
 	enum smca_bank_types bank_type;
 	const char *ip_name;
 	u8 xec = XEC(m->status, xec_mask);
+	u32 node_id = 0;
 
 	if (m->bank >= ARRAY_SIZE(smca_banks))
 		return;
@@ -1072,8 +1073,18 @@  static void decode_smca_error(struct mce *m)
 	if (xec < smca_mce_descs[bank_type].num_descs)
 		pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]);
 
-	if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)
-		decode_dram_ecc(topology_die_id(m->extcpu), m);
+	/*
+	 * SMCA_UMC_V2 is used on the noncpu nodes, extract the node id
+	 * from the InstanceHI[47:44] of the IPID register.
+	 */
+	if (bank_type == SMCA_UMC_V2 && xec == 0)
+		node_id = ((m->ipid >> 44) & 0xF);
+
+	if (bank_type == SMCA_UMC && xec == 0)
+		node_id = topology_die_id(m->extcpu);
+
+	if (decode_dram_ecc)
+		decode_dram_ecc(node_id, m);
 }
 
 static inline void amd_decode_err_code(u16 ec)