diff mbox series

[3/3] x86/mce: Use mce_prep_record() helpers for apei_smca_report_x86_error()

Message ID 20240521125434.1555845-4-yazen.ghannam@amd.com (mailing list archive)
State New, archived
Headers show
Series Rework mce_setup() | expand

Commit Message

Yazen Ghannam May 21, 2024, 12:54 p.m. UTC
Current AMD systems can report MCA errors using the ACPI Boot Error
Record Table (BERT). The BERT entries for MCA errors will be an x86
Common Platform Error Record (CPER) with an MSR register context that
matches the MCAX/SMCA register space.

However, the BERT will not necessarily be processed on the CPU that
reported the MCA errors. Therefore, the correct CPU number needs to be
determined and the information saved in struct mce.

The CPU number is determined by searching all possible CPUs for a Local
APIC ID matching the value in the x86 CPER.

Use the newly defined mce_prep_record_*() helpers to get the correct
data.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
 arch/x86/kernel/cpu/mce/apei.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

Comments

Borislav Petkov May 29, 2024, 5:28 p.m. UTC | #1
On Tue, May 21, 2024 at 07:54:34AM -0500, Yazen Ghannam wrote:
> Current AMD systems can report MCA errors using the ACPI Boot Error
> Record Table (BERT). The BERT entries for MCA errors will be an x86
> Common Platform Error Record (CPER) with an MSR register context that
> matches the MCAX/SMCA register space.
> 
> However, the BERT will not necessarily be processed on the CPU that
> reported the MCA errors. Therefore, the correct CPU number needs to be
> determined and the information saved in struct mce.
> 
> The CPU number is determined by searching all possible CPUs for a Local
> APIC ID matching the value in the x86 CPER.

You're explaining the code again. :)

>  	for_each_possible_cpu(cpu) {
> -		if (cpu_data(cpu).topo.initial_apicid == lapic_id) {
> -			m.extcpu = cpu;
> -			m.socketid = cpu_data(m.extcpu).topo.pkg_id;
> +		if (cpu_data(cpu).topo.initial_apicid == lapic_id)
>  			break;
> -		}
>  	}
>  
> -	m.apicid = lapic_id;
> +	if (!cpu_possible(cpu))
> +		return -EINVAL;

What's that test for? You just iterated over the possible CPUs using
"cpu" as the iterator there...
Yazen Ghannam June 3, 2024, 2:34 p.m. UTC | #2
On 5/29/24 1:28 PM, Borislav Petkov wrote:
> On Tue, May 21, 2024 at 07:54:34AM -0500, Yazen Ghannam wrote:
>> Current AMD systems can report MCA errors using the ACPI Boot Error
>> Record Table (BERT). The BERT entries for MCA errors will be an x86
>> Common Platform Error Record (CPER) with an MSR register context that
>> matches the MCAX/SMCA register space.
>>
>> However, the BERT will not necessarily be processed on the CPU that
>> reported the MCA errors. Therefore, the correct CPU number needs to be
>> determined and the information saved in struct mce.
>>
>> The CPU number is determined by searching all possible CPUs for a Local
>> APIC ID matching the value in the x86 CPER.
> 
> You're explaining the code again. :)
> 

One day I'll break this habit. Thanks again for the reminder. :)

>>  	for_each_possible_cpu(cpu) {
>> -		if (cpu_data(cpu).topo.initial_apicid == lapic_id) {
>> -			m.extcpu = cpu;
>> -			m.socketid = cpu_data(m.extcpu).topo.pkg_id;
>> +		if (cpu_data(cpu).topo.initial_apicid == lapic_id)
>>  			break;
>> -		}
>>  	}
>>  
>> -	m.apicid = lapic_id;
>> +	if (!cpu_possible(cpu))
>> +		return -EINVAL;
> 
> What's that test for? You just iterated over the possible CPUs using
> "cpu" as the iterator there...
> 

This is to catch the case where there was no break from the loop.

If there is no match during the iterator, then "cpu" will be equal to
nr_cpu_ids.

I wanted to use a helper that goes with with the iterator rather than
checking against nr_cpu_ids directly.

Thanks,
Yazen
Borislav Petkov June 3, 2024, 4:55 p.m. UTC | #3
On Mon, Jun 03, 2024 at 10:34:10AM -0400, Yazen Ghannam wrote:
> One day I'll break this habit. Thanks again for the reminder. :)

Sure, np. :-)

> >>  	for_each_possible_cpu(cpu) {
> >> -		if (cpu_data(cpu).topo.initial_apicid == lapic_id) {
> >> -			m.extcpu = cpu;
> >> -			m.socketid = cpu_data(m.extcpu).topo.pkg_id;
> >> +		if (cpu_data(cpu).topo.initial_apicid == lapic_id)
> >>  			break;
> >> -		}
> >>  	}
> >>  
> >> -	m.apicid = lapic_id;
> >> +	if (!cpu_possible(cpu))
> >> +		return -EINVAL;
> > 
> > What's that test for? You just iterated over the possible CPUs using
> > "cpu" as the iterator there...
> > 
> 
> This is to catch the case where there was no break from the loop.

If the CPU is possible != whether there was a apicid match.

Here's how you do that and I'd let you figure out why yours doesn't
always work:

diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
index 0cbadfaf2400..3885fe05f01e 100644
--- a/arch/x86/kernel/cpu/mce/apei.c
+++ b/arch/x86/kernel/cpu/mce/apei.c
@@ -66,6 +66,7 @@ EXPORT_SYMBOL_GPL(apei_mce_report_mem_error);
 int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 {
 	const u64 *i_mce = ((const u64 *) (ctx_info + 1));
+	bool apicid_found = false;
 	unsigned int cpu;
 	struct mce m;
 
@@ -98,11 +99,13 @@ int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 		return -EINVAL;
 
 	for_each_possible_cpu(cpu) {
-		if (cpu_data(cpu).topo.initial_apicid == lapic_id)
+		if (cpu_data(cpu).topo.initial_apicid == lapic_id) {
+			apicid_found = true;
 			break;
+		}
 	}
 
-	if (!cpu_possible(cpu))
+	if (!apicid_found)
 		return -EINVAL;
 
 	mce_prep_record_common(&m);


Thx.
Yazen Ghannam June 14, 2024, 9:47 p.m. UTC | #4
On Mon, Jun 03, 2024 at 06:55:30PM +0200, Borislav Petkov wrote:
> On Mon, Jun 03, 2024 at 10:34:10AM -0400, Yazen Ghannam wrote:

[...]

> > This is to catch the case where there was no break from the loop.
> 
> If the CPU is possible != whether there was a apicid match.
> 
> Here's how you do that and I'd let you figure out why yours doesn't
> always work:
>

I don't see why it won't work. If there is no break, then the iterator
ends by setting the variable past the last valid value.

For example, I ran this on a system with 512 CPUs:

        unsigned int cpu;

	/* Loops over CPUs 0-511. */
        for_each_possible_cpu(cpu)
                pr_info("loop: cpu=%d\n", cpu);

	/* CPU is now set to 512. */
        pr_info("final: cpu=%d\n", cpu);

	/* CPU 512 is not possible. */
        pr_info("CPU %d is %s possible\n", cpu, cpu_possible(cpu) ? "" : "not");

But...I like your suggestion as it is much more explicit. And I might be
missing something. :/

> diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
> index 0cbadfaf2400..3885fe05f01e 100644
> --- a/arch/x86/kernel/cpu/mce/apei.c
> +++ b/arch/x86/kernel/cpu/mce/apei.c
> @@ -66,6 +66,7 @@ EXPORT_SYMBOL_GPL(apei_mce_report_mem_error);
>  int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
>  {
>  	const u64 *i_mce = ((const u64 *) (ctx_info + 1));
> +	bool apicid_found = false;
>  	unsigned int cpu;
>  	struct mce m;
>  
> @@ -98,11 +99,13 @@ int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
>  		return -EINVAL;
>  
>  	for_each_possible_cpu(cpu) {
> -		if (cpu_data(cpu).topo.initial_apicid == lapic_id)
> +		if (cpu_data(cpu).topo.initial_apicid == lapic_id) {
> +			apicid_found = true;
>  			break;
> +		}
>  	}
>  
> -	if (!cpu_possible(cpu))
> +	if (!apicid_found)
>  		return -EINVAL;
>  
>  	mce_prep_record_common(&m);
> 
>

Would you like me to send another revision with this change? Do you have
any other comments?

Thanks,
Yazen
Borislav Petkov June 14, 2024, 10:44 p.m. UTC | #5
On Fri, Jun 14, 2024 at 05:47:36PM -0400, Yazen Ghannam wrote:
> I don't see why it won't work. If there is no break, then the iterator
> ends by setting the variable past the last valid value.
> 
> For example, I ran this on a system with 512 CPUs:
> 
>         unsigned int cpu;
> 
> 	/* Loops over CPUs 0-511. */
>         for_each_possible_cpu(cpu)
>                 pr_info("loop: cpu=%d\n", cpu);
> 
> 	/* CPU is now set to 512. */
>         pr_info("final: cpu=%d\n", cpu);
> 
> 	/* CPU 512 is not possible. */
>         pr_info("CPU %d is %s possible\n", cpu, cpu_possible(cpu) ? "" : "not");
> 
> But...I like your suggestion as it is much more explicit. And I might be
> missing something. :/

I can think of at least three:

* CPU topology and the initial_apicid sometimes can get programmed wrong by the
* FW. Nothing new.

* nr_cpus= - you can enable less CPUs than actually physically present so an MCE
on a CPU which is not enabled by Linux will be -EINVAL

* possible_cpus= - pretty much the same thing

But I haven't actually tried them - am just looking at the code.

And yes, with the apicid_found boolean it is perfectly clear what's going on.

And looking at

  convert_apicid_to_cpu()

which already does that loop, we probably should talk to tglx whether we can
simply export that helper.

And better yet if he's done some more helpful caching of the reverse mapping:
apicid to CPU number. As part of the topology rewrite. Because then we don't
need the loop at all.

Thx.
Yazen Ghannam June 18, 2024, 3:11 p.m. UTC | #6
On Sat, Jun 15, 2024 at 12:44:20AM +0200, Borislav Petkov wrote:
> On Fri, Jun 14, 2024 at 05:47:36PM -0400, Yazen Ghannam wrote:
> > I don't see why it won't work. If there is no break, then the iterator
> > ends by setting the variable past the last valid value.
> > 
> > For example, I ran this on a system with 512 CPUs:
> > 
> >         unsigned int cpu;
> > 
> > 	/* Loops over CPUs 0-511. */
> >         for_each_possible_cpu(cpu)
> >                 pr_info("loop: cpu=%d\n", cpu);
> > 
> > 	/* CPU is now set to 512. */
> >         pr_info("final: cpu=%d\n", cpu);
> > 
> > 	/* CPU 512 is not possible. */
> >         pr_info("CPU %d is %s possible\n", cpu, cpu_possible(cpu) ? "" : "not");
> > 
> > But...I like your suggestion as it is much more explicit. And I might be
> > missing something. :/
> 
> I can think of at least three:
> 
> * CPU topology and the initial_apicid sometimes can get programmed wrong by the
> * FW. Nothing new.
> 
> * nr_cpus= - you can enable less CPUs than actually physically present so an MCE
> on a CPU which is not enabled by Linux will be -EINVAL
> 
> * possible_cpus= - pretty much the same thing
> 
> But I haven't actually tried them - am just looking at the code.
> 
> And yes, with the apicid_found boolean it is perfectly clear what's going on.
> 
> And looking at
> 
>   convert_apicid_to_cpu()
> 
> which already does that loop, we probably should talk to tglx whether we can
> simply export that helper.
> 
> And better yet if he's done some more helpful caching of the reverse mapping:
> apicid to CPU number. As part of the topology rewrite. Because then we don't
> need the loop at all.
>

Agreed. Here's another option: topo_lookup_cpuid()

Thanks,
Yazen
Yazen Ghannam June 18, 2024, 5:24 p.m. UTC | #7
On Tue, Jun 18, 2024 at 11:11:59AM -0400, Ghannam, Yazen wrote:
> On Sat, Jun 15, 2024 at 12:44:20AM +0200, Borislav Petkov wrote:

[...]

> > 
> > And looking at
> > 
> >   convert_apicid_to_cpu()
> > 
> > which already does that loop, we probably should talk to tglx whether we can
> > simply export that helper.
> > 
> > And better yet if he's done some more helpful caching of the reverse mapping:
> > apicid to CPU number. As part of the topology rewrite. Because then we don't
> > need the loop at all.
> >
> 
> Agreed. Here's another option: topo_lookup_cpuid()
>

What do you think of the following example?

Thanks,
Yazen

----

diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index abe3a8f22cbd..d8c3c3c818bc 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -171,11 +171,17 @@ static inline unsigned int topology_num_threads_per_package(void)
 
 #ifdef CONFIG_X86_LOCAL_APIC
 int topology_get_logical_id(u32 apicid, enum x86_topology_domains at_level);
+int topology_get_cpunr(u32 apic_id);
 #else
 static inline int topology_get_logical_id(u32 apicid, enum x86_topology_domains at_level)
 {
 	return 0;
 }
+
+static inline int topology_get_cpunr(u32 apic_id)
+{
+	return -ENODEV;
+}
 #endif
 
 #ifdef CONFIG_SMP
diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
index 0cbadfaf2400..415cae8d69bf 100644
--- a/arch/x86/kernel/cpu/mce/apei.c
+++ b/arch/x86/kernel/cpu/mce/apei.c
@@ -66,8 +66,8 @@ EXPORT_SYMBOL_GPL(apei_mce_report_mem_error);
 int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 {
 	const u64 *i_mce = ((const u64 *) (ctx_info + 1));
-	unsigned int cpu;
 	struct mce m;
+	int cpu;
 
 	if (!boot_cpu_has(X86_FEATURE_SMCA))
 		return -EINVAL;
@@ -97,13 +97,9 @@ int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 	if (ctx_info->reg_arr_size < 48)
 		return -EINVAL;
 
-	for_each_possible_cpu(cpu) {
-		if (cpu_data(cpu).topo.initial_apicid == lapic_id)
-			break;
-	}
-
-	if (!cpu_possible(cpu))
-		return -EINVAL;
+	cpu = topology_get_cpunr(lapic_id);
+	if (cpu < 0)
+		return cpu;
 
 	mce_prep_record_common(&m);
 	mce_prep_record_per_cpu(cpu, &m);
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 621a151ccf7d..fc74578ee3bd 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -95,7 +95,15 @@ static inline u32 topo_apicid(u32 apicid, enum x86_topology_domains dom)
 	return apicid & (UINT_MAX << x86_topo_system.dom_shifts[dom - 1]);
 }
 
-static int topo_lookup_cpuid(u32 apic_id)
+/**
+ * topology_get_cpunr - Retrieve the CPU number for the given APIC ID
+ * @apic_id:		The APIC ID for which to lookup the CPU number
+ *
+ * Returns:
+ *  - >= 0:	The CPU number for the given APIC ID
+ *  - -ENODEV:	@apic_id does not match any known CPU
+ */
+int topology_get_cpunr(u32 apic_id)
 {
 	int i;
 
@@ -106,10 +114,11 @@ static int topo_lookup_cpuid(u32 apic_id)
 	}
 	return -ENODEV;
 }
+EXPORT_SYMBOL_GPL(topology_get_cpunr);
 
 static __init int topo_get_cpunr(u32 apic_id)
 {
-	int cpu = topo_lookup_cpuid(apic_id);
+	int cpu = topology_get_cpunr(apic_id);
 
 	if (cpu >= 0)
 		return cpu;
@@ -388,7 +397,7 @@ int topology_hotplug_apic(u32 apic_id, u32 acpi_id)
 	if (!test_bit(apic_id, apic_maps[TOPO_SMT_DOMAIN].map))
 		return -ENODEV;
 
-	cpu = topo_lookup_cpuid(apic_id);
+	cpu = topology_get_cpunr(apic_id);
 	if (cpu < 0)
 		return -ENOSPC;
diff mbox series

Patch

diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
index 8f509c8a4e98..0cbadfaf2400 100644
--- a/arch/x86/kernel/cpu/mce/apei.c
+++ b/arch/x86/kernel/cpu/mce/apei.c
@@ -97,20 +97,17 @@  int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 	if (ctx_info->reg_arr_size < 48)
 		return -EINVAL;
 
-	mce_prep_record(&m);
-
-	m.extcpu = -1;
-	m.socketid = -1;
-
 	for_each_possible_cpu(cpu) {
-		if (cpu_data(cpu).topo.initial_apicid == lapic_id) {
-			m.extcpu = cpu;
-			m.socketid = cpu_data(m.extcpu).topo.pkg_id;
+		if (cpu_data(cpu).topo.initial_apicid == lapic_id)
 			break;
-		}
 	}
 
-	m.apicid = lapic_id;
+	if (!cpu_possible(cpu))
+		return -EINVAL;
+
+	mce_prep_record_common(&m);
+	mce_prep_record_per_cpu(cpu, &m);
+
 	m.bank = (ctx_info->msr_addr >> 4) & 0xFF;
 	m.status = *i_mce;
 	m.addr = *(i_mce + 1);