diff mbox

[RFC,v4,2/3] acpi: apei: Rename ghes_severity() to ghes_cper_severity()

Message ID 20180430213358.8319-2-mr.nuke.me@gmail.com (mailing list archive)
State RFC, archived
Headers show

Commit Message

Alex G. April 30, 2018, 9:33 p.m. UTC
ghes_severity() is a misnomer in this case, as it implies the severity
of the entire GHES structure. Instead, it maps one CPER value to a
monotonically increasing number. ghes_cper_severity() is clearer.

Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
---
 drivers/acpi/apei/ghes.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

Comments

Shiju Jose May 4, 2018, 11:56 a.m. UTC | #1
Hi Alex,

> -----Original Message-----
> From: Alexandru Gagniuc [mailto:mr.nuke.me@gmail.com]
> Sent: 30 April 2018 22:34
> To: bp@alien8.de
> Cc: alex_gagniuc@dellteam.com; austin_bolen@dell.com;
> shyam_iyer@dell.com; Alexandru Gagniuc; Rafael J. Wysocki; Len Brown;
> Tony Luck; Mauro Carvalho Chehab; Robert Moore; Erik Schmauss; Tyler
> Baicar; Will Deacon; James Morse; Shiju Jose; Jonathan (Zhixiong) Zhang;
> gengdongjiu; linux-acpi@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-edac@vger.kernel.org; devel@acpica.org
> Subject: [RFC PATCH v4 2/3] acpi: apei: Rename ghes_severity() to
> ghes_cper_severity()
> 
> ghes_severity() is a misnomer in this case, as it implies the severity
> of the entire GHES structure. Instead, it maps one CPER value to a
> monotonically increasing number. ghes_cper_severity() is clearer.
> 
> Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
> ---
>  drivers/acpi/apei/ghes.c | 17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index f9b53a6f55f3..c9f1971333c1 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -271,7 +271,7 @@ static void ghes_fini(struct ghes *ghes)
>  		unmap_gen_v2(ghes);
>  }
> 
> -static inline int ghes_severity(int severity)
> +static inline int ghes_cper_severity(int severity)

[...]
>  	else
>  		ratelimit = &ratelimit_uncorrected;
> @@ -705,9 +705,8 @@ static int ghes_proc(struct ghes *ghes)
>  	if (rc)
>  		goto out;
> 
> -	if (ghes_severity(ghes->estatus->error_severity) >=
> GHES_SEV_PANIC) {
> +	if (ghes_cper_severity(ghes->estatus->error_severity) >=
> GHES_SEV_PANIC)
>  		__ghes_panic(ghes);

PCIe AER fatal errors result panic here.
I think ghes_cper_severity to be replaced with ghes_severity in the ghes_proc function as well in the patch 
"acpi: apei: Do not panic() on PCIe errors reported through GHES"?
> -	}
> 
[...]
> 2.14.3

Thanks,
Shiju

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex G. May 4, 2018, 11:33 p.m. UTC | #2
On 05/04/2018 06:56 AM, Shiju Jose wrote:
> Hi Alex,

Hi

>> -----Original Message-----
>> From: Alexandru Gagniuc [mailto:mr.nuke.me@gmail.com]

[snip]

>> -static inline int ghes_severity(int severity)
>> +static inline int ghes_cper_severity(int severity)
> 
> [...]
>>   	else
>>   		ratelimit = &ratelimit_uncorrected;
>> @@ -705,9 +705,8 @@ static int ghes_proc(struct ghes *ghes)
>>   	if (rc)
>>   		goto out;
>>
>> -	if (ghes_severity(ghes->estatus->error_severity) >=
>> GHES_SEV_PANIC) {
>> +	if (ghes_cper_severity(ghes->estatus->error_severity) >=
>> GHES_SEV_PANIC)
>>   		__ghes_panic(ghes);
> 
> PCIe AER fatal errors result panic here.
> I think ghes_cper_severity to be replaced with ghes_severity in the ghes_proc function as well in the patch
> "acpi: apei: Do not panic() on PCIe errors reported through GHES"?

Hmm.
ghes_proc calls ghes_do_proc, which is not irq safe. So the entire 
concern we had in v1 about deferring to a less restrictive context is 
moot in this case.

ghes_proc is related, but a little beyond the scope of this series. I'd 
love to fix all cases, but I'd prefer someone who has specific interests 
take ownership of changing ghes_proc.

Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Borislav Petkov May 11, 2018, 3:39 p.m. UTC | #3
On Mon, Apr 30, 2018 at 04:33:51PM -0500, Alexandru Gagniuc wrote:
> ghes_severity() is a misnomer in this case, as it implies the severity
> of the entire GHES structure. Instead, it maps one CPER value to a
> monotonically increasing number.

... as opposed to CPER severity which is something else or what is this
formulation trying to express?
Alex G. May 11, 2018, 3:45 p.m. UTC | #4
On 05/11/2018 10:39 AM, Borislav Petkov wrote:
> On Mon, Apr 30, 2018 at 04:33:51PM -0500, Alexandru Gagniuc wrote:
>> ghes_severity() is a misnomer in this case, as it implies the severity
>> of the entire GHES structure. Instead, it maps one CPER value to a
>> monotonically increasing number.
> 
> ... as opposed to CPER severity which is something else or what is this
> formulation trying to express?
> 

CPER madness goes like this:
	0 - Recoverable
	1 - Fatal
	2 - Corrected
	3 - None

As you can see, the numbering was created by crackmonkeys. GHES_* is an
internal enum that goes up in order of severity, as you'd expect.

If you're confused, you're not alone. I've seen several commit messages
that get this terminology wrong.

Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Borislav Petkov May 11, 2018, 3:58 p.m. UTC | #5
On Fri, May 11, 2018 at 10:45:49AM -0500, Alex G. wrote:
> 
> 
> On 05/11/2018 10:39 AM, Borislav Petkov wrote:
> > On Mon, Apr 30, 2018 at 04:33:51PM -0500, Alexandru Gagniuc wrote:
> >> ghes_severity() is a misnomer in this case, as it implies the severity
> >> of the entire GHES structure. Instead, it maps one CPER value to a
> >> monotonically increasing number.
> > 
> > ... as opposed to CPER severity which is something else or what is this
> > formulation trying to express?
> > 
> 
> CPER madness goes like this:

Let's slow down first. Why is it a "CPER madness"? Maybe this is clear
in your head but I'm not in it.

> 	0 - Recoverable
> 	1 - Fatal
> 	2 - Corrected
> 	3 - None

If you're quoting this:

enum {
        CPER_SEV_RECOVERABLE,
        CPER_SEV_FATAL,
        CPER_SEV_CORRECTED,
        CPER_SEV_INFORMATIONAL,
};

that last 3 is informational.

> As you can see, the numbering was created by crackmonkeys. GHES_* is an
> internal enum that goes up in order of severity, as you'd expect.

So what are you trying to tell me - that those CPER numbers are not
increasing?!

Why does that even matter?
Alex G. May 11, 2018, 4:12 p.m. UTC | #6
On 05/11/2018 10:58 AM, Borislav Petkov wrote:
> On Fri, May 11, 2018 at 10:45:49AM -0500, Alex G. wrote:
>>
>>
>> On 05/11/2018 10:39 AM, Borislav Petkov wrote:
>>> On Mon, Apr 30, 2018 at 04:33:51PM -0500, Alexandru Gagniuc wrote:
>>>> ghes_severity() is a misnomer in this case, as it implies the severity
>>>> of the entire GHES structure. Instead, it maps one CPER value to a
>>>> monotonically increasing number.
>>>
>>> ... as opposed to CPER severity which is something else or what is this
>>> formulation trying to express?
>>>
>>
>> CPER madness goes like this:
> 
> Let's slow down first. Why is it a "CPER madness"? Maybe this is clear
> in your head but I'm not in it.
> 
>> 	0 - Recoverable
>> 	1 - Fatal
>> 	2 - Corrected
>> 	3 - None
> 
> If you're quoting this:

I'm quoting ACPI 6.2, Table 18-381 Generic Error Data Entry, though I'm
certain they got that from the efi spec.

> enum {
>         CPER_SEV_RECOVERABLE,
>         CPER_SEV_FATAL,
>         CPER_SEV_CORRECTED,
>         CPER_SEV_INFORMATIONAL,
> };
> 
> that last 3 is informational.
> 
>> As you can see, the numbering was created by crackmonkeys. GHES_* is an
>> internal enum that goes up in order of severity, as you'd expect.
> 
> So what are you trying to tell me - that those CPER numbers are not
> increasing?!
> 
> Why does that even matter?

Because the GHES structure uses CPER values, but all the code is written
to use GHES_SEV_ values. GHES_SEV_ is a made up enum, specifically for
linux.

Sure, the return in ghes_sec_pcie_severity() should say
GHES_SEV_RECOVERABLE, but that is a Freudian slip rather than
intentional typing. Thank you for catching that :)

Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Borislav Petkov May 11, 2018, 4:19 p.m. UTC | #7
On Fri, May 11, 2018 at 11:12:24AM -0500, Alex G. wrote:
> Because the GHES structure uses CPER values, but all the code is written
> to use GHES_SEV_ values. GHES_SEV_ is a made up enum, specifically for
> linux.

Again, what does that even matter?

They're defines in both cases. The *actual* value means shit.

Ah, I see it:

	...
		sec_sev = ghes_sec_pcie_severity(gdata);

	worst_sev = max(worst_sev, sec_sev);


Yeah, no, you can't do that. No apples and oranges comparisons.
Alex G. May 11, 2018, 5:03 p.m. UTC | #8
On 05/11/2018 11:19 AM, Borislav Petkov wrote:
> On Fri, May 11, 2018 at 11:12:24AM -0500, Alex G. wrote:
>> Because the GHES structure uses CPER values, but all the code is written
>> to use GHES_SEV_ values. GHES_SEV_ is a made up enum, specifically for
>> linux.
> 
> Again, what does that even matter?

I will shorten the commit message.


> They're defines in both cases. The *actual* value means shit.
> 
> Ah, I see it:
> 
> 	...
> 		sec_sev = ghes_sec_pcie_severity(gdata);
> 
> 	worst_sev = max(worst_sev, sec_sev);
> 
> 
> Yeah, no, you can't do that. No apples and oranges comparisons.

That was a mistake on my part. Despite my godlike ability to produce
great fixes, I am, in fact, human.

Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index f9b53a6f55f3..c9f1971333c1 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -271,7 +271,7 @@  static void ghes_fini(struct ghes *ghes)
 		unmap_gen_v2(ghes);
 }
 
-static inline int ghes_severity(int severity)
+static inline int ghes_cper_severity(int severity)
 {
 	switch (severity) {
 	case CPER_SEV_INFORMATIONAL:
@@ -388,7 +388,7 @@  static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int
 #ifdef CONFIG_ACPI_APEI_MEMORY_FAILURE
 	unsigned long pfn;
 	int flags = -1;
-	int sec_sev = ghes_severity(gdata->error_severity);
+	int sec_sev = ghes_cper_severity(gdata->error_severity);
 	struct cper_sec_mem_err *mem_err = acpi_hest_get_payload(gdata);
 
 	if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
@@ -468,10 +468,10 @@  static void ghes_do_proc(struct ghes *ghes,
 	guid_t *fru_id = &NULL_UUID_LE;
 	char *fru_text = "";
 
-	sev = ghes_severity(estatus->error_severity);
+	sev = ghes_cper_severity(estatus->error_severity);
 	apei_estatus_for_each_section(estatus, gdata) {
 		sec_type = (guid_t *)gdata->section_type;
-		sec_sev = ghes_severity(gdata->error_severity);
+		sec_sev = ghes_cper_severity(gdata->error_severity);
 		if (gdata->validation_bits & CPER_SEC_VALID_FRU_ID)
 			fru_id = (guid_t *)gdata->fru_id;
 
@@ -512,7 +512,7 @@  static void __ghes_print_estatus(const char *pfx,
 	char pfx_seq[64];
 
 	if (pfx == NULL) {
-		if (ghes_severity(estatus->error_severity) <=
+		if (ghes_cper_severity(estatus->error_severity) <=
 		    GHES_SEV_CORRECTED)
 			pfx = KERN_WARNING;
 		else
@@ -534,7 +534,7 @@  static int ghes_print_estatus(const char *pfx,
 	static DEFINE_RATELIMIT_STATE(ratelimit_uncorrected, 5*HZ, 2);
 	struct ratelimit_state *ratelimit;
 
-	if (ghes_severity(estatus->error_severity) <= GHES_SEV_CORRECTED)
+	if (ghes_cper_severity(estatus->error_severity) <= GHES_SEV_CORRECTED)
 		ratelimit = &ratelimit_corrected;
 	else
 		ratelimit = &ratelimit_uncorrected;
@@ -705,9 +705,8 @@  static int ghes_proc(struct ghes *ghes)
 	if (rc)
 		goto out;
 
-	if (ghes_severity(ghes->estatus->error_severity) >= GHES_SEV_PANIC) {
+	if (ghes_cper_severity(ghes->estatus->error_severity) >= GHES_SEV_PANIC)
 		__ghes_panic(ghes);
-	}
 
 	if (!ghes_estatus_cached(ghes->estatus)) {
 		if (ghes_print_estatus(NULL, ghes->generic, ghes->estatus))
@@ -945,7 +944,7 @@  static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 			ret = NMI_HANDLED;
 		}
 
-		sev = ghes_severity(ghes->estatus->error_severity);
+		sev = ghes_cper_severity(ghes->estatus->error_severity);
 		if (sev >= GHES_SEV_PANIC) {
 			oops_begin();
 			ghes_print_queued_estatus();