PCI/AER: update AER status string print to match other AER logs

Message ID 1508254922-30925-1-git-send-email-tbaicar@codeaurora.org (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas

Commit Message

Tyler Baicar Oct. 17, 2017, 3:42 p.m. UTC
Currently the AER driver uses cper_print_bits() to print the AER status
string. This causes the status string to not include the proper PCI device
name prefix that the other AER prints include. Also, it has a different
print level than all the other AER prints.

Update the AER driver to print the AER status string with the proper string
prefix and proper print level.

Previous log example:

e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
Receiver Error, Bad TLP
e1000e 0003:01:00.1: aer_layer=Physical Layer, aer_agent=Receiver ID
pcieport 0003:00:00.0: aer_status: 0x00001000, aer_mask: 0x0000e000
Replay Timer Timeout
pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID

New log:

e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
e1000e 0003:01:00.1: Receiver Error
e1000e 0003:01:00.1: Bad TLP
e1000e 0003:01:00.1: aer_layer=Physical Layer, aer_agent=Receiver ID
pcieport 0003:00:00.0: aer_status: 0x00001000, aer_mask: 0x0000e000
pcieport 0003:00:00.0: Replay Timer Timeout
pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID

Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
---
 drivers/pci/pcie/aer/aerdrv_errprint.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

Comments

David Laight Oct. 17, 2017, 4 p.m. UTC | #1
From: Tyler Baicar
> Sent: 17 October 2017 16:42
> Currently the AER driver uses cper_print_bits() to print the AER status
> string. This causes the status string to not include the proper PCI device
> name prefix that the other AER prints include. Also, it has a different
> print level than all the other AER prints.
> 
> Update the AER driver to print the AER status string with the proper string
> prefix and proper print level.
> 
> Previous log example:
> 
> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
> Receiver Error, Bad TLP
...
> New log:
> 
> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
> e1000e 0003:01:00.1: Receiver Error
> e1000e 0003:01:00.1: Bad TLP

Wouldn't it be better to manage to print the above all on 1 line?

...
> index 54c4b69..b718daa 100644
> --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
> +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
> @@ -206,6 +206,19 @@ void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
>  }
> 
>  #ifdef CONFIG_ACPI_APEI_PCIEAER
> +void dev_print_bits(struct pci_dev *dev, unsigned int bits,
> +		    const char * const strs[], unsigned int strs_size)

static and rename to aer_print_bits since this isn't a generic 'dev'
function.

	David
Tyler Baicar Oct. 17, 2017, 5:13 p.m. UTC | #2
On 10/17/2017 12:00 PM, David Laight wrote:
> From: Tyler Baicar
>> Sent: 17 October 2017 16:42
>> Currently the AER driver uses cper_print_bits() to print the AER status
>> string. This causes the status string to not include the proper PCI device
>> name prefix that the other AER prints include. Also, it has a different
>> print level than all the other AER prints.
>>
>> Update the AER driver to print the AER status string with the proper string
>> prefix and proper print level.
>>
>> Previous log example:
>>
>> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
>> Receiver Error, Bad TLP
> ...
>> New log:
>>
>> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
>> e1000e 0003:01:00.1: Receiver Error
>> e1000e 0003:01:00.1: Bad TLP
> Wouldn't it be better to manage to print the above all on 1 line?
Hello David,

I broke them up into separate lines to simplify the code. If you look at
cper_print_bits(), it is not a clean solution and involves some hard coded
values to try to limit the lines to 80 characters.

http://elixir.free-electrons.com/linux/v4.14-rc5/source/drivers/firmware/efi/cper.c#L85

I think printing one error per line in this case is a better solution since
the code is much cleaner. If you would like me to add this code to print them
in a list and limit the lines to 80 characters I can add that in though.
>
> ...
>> index 54c4b69..b718daa 100644
>> --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
>> +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
>> @@ -206,6 +206,19 @@ void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
>>   }
>>
>>   #ifdef CONFIG_ACPI_APEI_PCIEAER
>> +void dev_print_bits(struct pci_dev *dev, unsigned int bits,
>> +		    const char * const strs[], unsigned int strs_size)
> static and rename to aer_print_bits since this isn't a generic 'dev'
> function.
Will do.

Thanks,
Tyler

--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
David Laight Oct. 18, 2017, 10:14 a.m. UTC | #3
From: Tyler Baicar [mailto:tbaicar@codeaurora.org]
> Sent: 17 October 2017 18:14
> On 10/17/2017 12:00 PM, David Laight wrote:
> > From: Tyler Baicar
> >> Sent: 17 October 2017 16:42
> >> Currently the AER driver uses cper_print_bits() to print the AER status
> >> string. This causes the status string to not include the proper PCI device
> >> name prefix that the other AER prints include. Also, it has a different
> >> print level than all the other AER prints.
> >>
> >> Update the AER driver to print the AER status string with the proper string
> >> prefix and proper print level.
> >>
> >> Previous log example:
> >>
> >> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
> >> Receiver Error, Bad TLP
> > ...
> >> New log:
> >>
> >> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
> >> e1000e 0003:01:00.1: Receiver Error
> >> e1000e 0003:01:00.1: Bad TLP

> > Wouldn't it be better to manage to print the above all on 1 line?
 
> I broke them up into separate lines to simplify the code. If you look at
> cper_print_bits(),
> it is not a clean solution and involves some hard coded values to try to limit
> the lines to 80 characters.

I'm not sure the 80 char limit is needed.


How about:
#define MAX_STR 32
void pr_bits(unsigned int val, const char *strs[], unsigned int num_str)
{
        const char *str[MAX_STR] = {};
        unsigned int i, num;

        if (num_str > MAX_STR)
                num_str = MAX_STR;
        for (i = 0, num = 0; i < num_str; i++) {
                if (!(val & (1U << i)))
                        continue;
                str[num++] = strs[i];
        }
        printf(" %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s\n" + (MAX_STR - num) * 3,
                str[0], str[1], str[2], str[3],
                str[4], str[5], str[6], str[7],
                str[8], str[9], str[10], str[11],
                str[12], str[13], str[14], str[15],
                str[16], str[17], str[18], str[19],
                str[20], str[21], str[22], str[23],
                str[24], str[25], str[26], str[27],
                str[28], str[29], str[30], str[31]);
}

For kernel use you'd probably want to pass in 'dev' and a printf list
and use %pV to put the fixed text on the front of the line.

All rather begging for a new %p? feature that is passed the value, strings
and separator.

	David
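
The format-string offset used in pr_bits() above can be tried outside the
kernel. The following is only a minimal user-space sketch of that idea, not
code from the thread: the name fmt_bits(), the reduced MAX_STR of 4, and the
use of snprintf() into a caller-supplied buffer are assumptions made to keep
the example small. Each " %s" is three characters long, so advancing the
format pointer by (MAX_STR - num) * 3 leaves exactly num conversions in
front of the newline.

```c
#include <stdio.h>

#define MAX_STR 4	/* assumption: reduced from 32 to keep the sketch short */

/* Format the names of the bits set in val into buf, all on one line. */
static int fmt_bits(char *buf, size_t len, unsigned int val,
		    const char *const strs[], unsigned int num_str)
{
	/* One " %s" (3 characters) per possible string, then a newline. */
	static const char fmt[] = " %s %s %s %s\n";
	const char *str[MAX_STR] = { 0 };
	unsigned int i, num = 0;

	if (num_str > MAX_STR)
		num_str = MAX_STR;
	for (i = 0; i < num_str; i++) {
		/* 1U keeps the shift well defined for the top bit. */
		if ((val & (1U << i)) && strs[i])
			str[num++] = strs[i];
	}
	/* Skip the unused " %s" slots; the surplus arguments are ignored. */
	return snprintf(buf, len, fmt + (MAX_STR - num) * 3,
			str[0], str[1], str[2], str[3]);
}
```

Calling fmt_bits() with val = 0x5 and the names "A", "B", "C", "D" selects
bits 0 and 2, so only two conversions are left in the format. The leading
space before the first name is a side effect of the trick and would need
trimming in real output.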
Tyler Baicar Oct. 18, 2017, 6:23 p.m. UTC | #4
On 10/18/2017 6:14 AM, David Laight wrote:
> From: Tyler Baicar [mailto:tbaicar@codeaurora.org]
>> Sent: 17 October 2017 18:14
>> On 10/17/2017 12:00 PM, David Laight wrote:
>>> From: Tyler Baicar
>>>> Sent: 17 October 2017 16:42
>>>> Currently the AER driver uses cper_print_bits() to print the AER status
>>>> string. This causes the status string to not include the proper PCI device
>>>> name prefix that the other AER prints include. Also, it has a different
>>>> print level than all the other AER prints.
>>>>
>>>> Update the AER driver to print the AER status string with the proper string
>>>> prefix and proper print level.
>>>>
>>>> Previous log example:
>>>>
>>>> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
>>>> Receiver Error, Bad TLP
>>> ...
>>>> New log:
>>>>
>>>> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
>>>> e1000e 0003:01:00.1: Receiver Error
>>>> e1000e 0003:01:00.1: Bad TLP
>>> Wouldn't it be better to manage to print the above all on 1 line?
>   
>> I broke them up into separate lines to simplify the code. If you look at
>> cper_print_bits(),
>> it is not a clean solution and involves some hard coded values to try to limit
>> the lines to 80 characters.
> I'm not sure the 80 char limit is needed.
>
>
> How about:
> #define MAX_STR 32
> void pr_bits(unsigned int val, const char *strs[], unsigned int num_str)
> {
>          const char *str[MAX_STR] = {};
>          unsigned int i, num;
>
>          if (num_str > MAX_STR)
>                  num_str = MAX_STR;
>          for (i = 0, num = 0; i < num_str; i++) {
>                  if (!(val & (1U << i)))
>                          continue;
>                  str[num++] = strs[i];
>          }
>          printf(" %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s\n" + (MAX_STR - num) * 3,
>                  str[0], str[1], str[2], str[3],
>                  str[4], str[5], str[6], str[7],
>                  str[8], str[9], str[10], str[11],
>                  str[12], str[13], str[14], str[15],
>                  str[16], str[17], str[18], str[19],
>                  str[20], str[21], str[22], str[23],
>                  str[24], str[25], str[26], str[27],
>                  str[28], str[29], str[30], str[31]);
> }
>
> For kernel use you'd probably want to pass in 'dev' and a printf list
> and use %pV to put the fixed text on the front of the line.
>
> All rather begging for a new %p? feature that is passed the value, strings
> and separator.
Hi David,

This seems like a bad approach. This can make the print in the kernel logs
and the code both look pretty awful. I would prefer to have each error that
occurred have its own print line in the logs rather than introduce this code
for the sole purpose of keeping the list on a single print line. I don't see
any real downside to having a few additional print lines in error scenarios.

Thanks,
Tyler
Bjorn Helgaas Oct. 20, 2017, 11:55 p.m. UTC | #5
On Tue, Oct 17, 2017 at 09:42:02AM -0600, Tyler Baicar wrote:
> Currently the AER driver uses cper_print_bits() to print the AER status
> string. This causes the status string to not include the proper PCI device
> name prefix that the other AER prints include. Also, it has a different
> print level than all the other AER prints.
> 
> Update the AER driver to print the AER status string with the proper string
> prefix and proper print level.
> 
> Previous log example:
> 
> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
> Receiver Error, Bad TLP
> e1000e 0003:01:00.1: aer_layer=Physical Layer, aer_agent=Receiver ID
> pcieport 0003:00:00.0: aer_status: 0x00001000, aer_mask: 0x0000e000
> Replay Timer Timeout
> pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID
> 
> New log:
> 
> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
> e1000e 0003:01:00.1: Receiver Error
> e1000e 0003:01:00.1: Bad TLP
> e1000e 0003:01:00.1: aer_layer=Physical Layer, aer_agent=Receiver ID
> pcieport 0003:00:00.0: aer_status: 0x00001000, aer_mask: 0x0000e000
> pcieport 0003:00:00.0: Replay Timer Timeout
> pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID

I definitely think it's MUCH better to use dev_err() as you do.

I don't like the cper_print_bits() strategy of inserting line breaks
to fit in 80 columns.  That leads to atomicity issues, e.g., other
printk output getting inserted in the middle of a single AER log, and
suggests an ordering ("Receiver Error" occurred before "Bad TLP") that
isn't real.  It'd be ideal if everything fit on one line per event,
but that might not be practical.

I'm not necessarily attached to the actual strings.  These messages
are for sophisticated users and maybe could be abbreviated as in lspci
output.  It might actually be kind of neat if the output here matched
up with the output of "lspci -vv" (lspci prints all the bits; here you
probably want only the set bits).  Or maybe not.

But even what you have here is a huge improvement.  I *hate*
unattached things in dmesg like we currently get.  There's no reliable
way to connect that "Receiver Error, Bad TLP" with the device.

> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
> ---
>  drivers/pci/pcie/aer/aerdrv_errprint.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
> index 54c4b69..b718daa 100644
> --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
> +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
> @@ -206,6 +206,19 @@ void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
>  }
>  
>  #ifdef CONFIG_ACPI_APEI_PCIEAER
> +void dev_print_bits(struct pci_dev *dev, unsigned int bits,
> +		    const char * const strs[], unsigned int strs_size)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < strs_size; i++) {
> +		if (!(bits & (1U << i)))
> +			continue;
> +		if (strs[i])
> +			dev_err(&dev->dev, "%s\n", strs[i]);
> +	}
> +}
> +
>  int cper_severity_to_aer(int cper_severity)
>  {
>  	switch (cper_severity) {
> @@ -243,7 +256,7 @@ void cper_print_aer(struct pci_dev *dev, int aer_severity,
>  	agent = AER_GET_AGENT(aer_severity, status);
>  
>  	dev_err(&dev->dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask);
> -	cper_print_bits("", status, status_strs, status_strs_size);
> +	dev_print_bits(dev, status, status_strs, status_strs_size);
>  	dev_err(&dev->dev, "aer_layer=%s, aer_agent=%s\n",
>  		aer_error_layer[layer], aer_agent_string[agent]);
>  
> -- 
> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project.
>
Tyler Baicar Nov. 7, 2017, 11:18 p.m. UTC | #6
On 10/20/2017 7:55 PM, Bjorn Helgaas wrote:
> On Tue, Oct 17, 2017 at 09:42:02AM -0600, Tyler Baicar wrote:
>> Currently the AER driver uses cper_print_bits() to print the AER status
>> string. This causes the status string to not include the proper PCI device
>> name prefix that the other AER prints include. Also, it has a different
>> print level than all the other AER prints.
>>
>> Update the AER driver to print the AER status string with the proper string
>> prefix and proper print level.
>>
>> Previous log example:
>>
>> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
>> Receiver Error, Bad TLP
>> e1000e 0003:01:00.1: aer_layer=Physical Layer, aer_agent=Receiver ID
>> pcieport 0003:00:00.0: aer_status: 0x00001000, aer_mask: 0x0000e000
>> Replay Timer Timeout
>> pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID
>>
>> New log:
>>
>> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
>> e1000e 0003:01:00.1: Receiver Error
>> e1000e 0003:01:00.1: Bad TLP
>> e1000e 0003:01:00.1: aer_layer=Physical Layer, aer_agent=Receiver ID
>> pcieport 0003:00:00.0: aer_status: 0x00001000, aer_mask: 0x0000e000
>> pcieport 0003:00:00.0: Replay Timer Timeout
>> pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID
> I definitely think it's MUCH better to use dev_err() as you do.
>
> I don't like the cper_print_bits() strategy of inserting line breaks
> to fit in 80 columns.  That leads to atomicity issues, e.g., other
> printk output getting inserted in the middle of a single AER log, and
> suggests an ordering ("Receiver Error" occurred before "Bad TLP") that
> isn't real.  It'd be ideal if everything fit on one line per event,
> but that might not be practical.
>
> I'm not necessarily attached to the actual strings.  These messages
> are for sophisticated users and maybe could be abbreviated as in lspci
> output.  It might actually be kind of neat if the output here matched
> up with the output of "lspci -vv" (lspci prints all the bits; here you
> probably want only the set bits).  Or maybe not.
>
> But even what you have here is a huge improvement.  I *hate*
> unattached things in dmesg like we currently get.  There's no reliable
> way to connect that "Receiver Error, Bad TLP" with the device.
Hello Bjorn,

Thanks for the feedback. Do you think this can get into 4.15?

Thanks,
Tyler
>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>> ---
>>   drivers/pci/pcie/aer/aerdrv_errprint.c | 15 ++++++++++++++-
>>   1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
>> index 54c4b69..b718daa 100644
>> --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
>> +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
>> @@ -206,6 +206,19 @@ void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
>>   }
>>   
>>   #ifdef CONFIG_ACPI_APEI_PCIEAER
>> +void dev_print_bits(struct pci_dev *dev, unsigned int bits,
>> +		    const char * const strs[], unsigned int strs_size)
>> +{
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < strs_size; i++) {
>> +		if (!(bits & (1U << i)))
>> +			continue;
>> +		if (strs[i])
>> +			dev_err(&dev->dev, "%s\n", strs[i]);
>> +	}
>> +}
>> +
>>   int cper_severity_to_aer(int cper_severity)
>>   {
>>   	switch (cper_severity) {
>> @@ -243,7 +256,7 @@ void cper_print_aer(struct pci_dev *dev, int aer_severity,
>>   	agent = AER_GET_AGENT(aer_severity, status);
>>   
>>   	dev_err(&dev->dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask);
>> -	cper_print_bits("", status, status_strs, status_strs_size);
>> +	dev_print_bits(dev, status, status_strs, status_strs_size);
>>   	dev_err(&dev->dev, "aer_layer=%s, aer_agent=%s\n",
>>   		aer_error_layer[layer], aer_agent_string[agent]);
>>   
>> -- 
>> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
>> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
>> a Linux Foundation Collaborative Project.
>>
Tyler Baicar Nov. 15, 2017, 2:47 p.m. UTC | #7
On 10/17/2017 11:42 AM, Tyler Baicar wrote:
> Currently the AER driver uses cper_print_bits() to print the AER status
> string. This causes the status string to not include the proper PCI device
> name prefix that the other AER prints include. Also, it has a different
> print level than all the other AER prints.
>
> Update the AER driver to print the AER status string with the proper string
> prefix and proper print level.
Hello,

Will this patch be pulled into 4.15?

Thanks,
Tyler
Bjorn Helgaas Nov. 15, 2017, 5:56 p.m. UTC | #8
Hi Tyler,

On Wed, Nov 15, 2017 at 09:47:41AM -0500, Tyler Baicar wrote:
> On 10/17/2017 11:42 AM, Tyler Baicar wrote:
> >Currently the AER driver uses cper_print_bits() to print the AER status
> >string. This causes the status string to not include the proper PCI device
> >name prefix that the other AER prints include. Also, it has a different
> >print level than all the other AER prints.
> >
> >Update the AER driver to print the AER status string with the proper string
> >prefix and proper print level.
> Hello,
> 
> Will this patch be pulled into 4.15?

Sorry, I am preparing the 4.15 pull request right now, and it doesn't
include this change.

I do like the dev_err() change, but would prefer fewer lines of
output.  I could have applied just the dev_err() change, but to
minimize pain for people who parse the logs, I'd rather make one
change in the output instead of making one change now and another
later.

Bjorn
Tyler Baicar Dec. 13, 2017, 4:50 p.m. UTC | #9
On 11/15/2017 12:56 PM, Bjorn Helgaas wrote:
> Hi Tyler,
>
> On Wed, Nov 15, 2017 at 09:47:41AM -0500, Tyler Baicar wrote:
>> On 10/17/2017 11:42 AM, Tyler Baicar wrote:
>>> Currently the AER driver uses cper_print_bits() to print the AER status
>>> string. This causes the status string to not include the proper PCI device
>>> name prefix that the other AER prints include. Also, it has a different
>>> print level than all the other AER prints.
>>>
>>> Update the AER driver to print the AER status string with the proper string
>>> prefix and proper print level.
>> Hello,
>>
>> Will this patch be pulled into 4.15?
> Sorry, I am preparing the 4.15 pull request right now, and it doesn't
> include this change.
>
> I do like the dev_err() change, but would prefer fewer lines of
> output.  I could have applied just the dev_err() change, but to
> minimize pain for people who parse the logs, I'd rather make one
> change in the output instead of making one change now and another
> later.
Hello Bjorn,

Are there existing abbreviations for these AER status strings that I cannot
find? Or do you want me to abbreviate them similar to the style used with
prints in lspci -vv?

Once they are abbreviated, you'd prefer to have all errors that have occurred
printed on the same line, correct?

Thanks,
Tyler
Bjorn Helgaas Dec. 13, 2017, 7:24 p.m. UTC | #10
On Wed, Dec 13, 2017 at 11:50:56AM -0500, Tyler Baicar wrote:
> On 11/15/2017 12:56 PM, Bjorn Helgaas wrote:
> >Hi Tyler,
> >
> >On Wed, Nov 15, 2017 at 09:47:41AM -0500, Tyler Baicar wrote:
> >>On 10/17/2017 11:42 AM, Tyler Baicar wrote:
> >>>Currently the AER driver uses cper_print_bits() to print the AER status
> >>>string. This causes the status string to not include the proper PCI device
> >>>name prefix that the other AER prints include. Also, it has a different
> >>>print level than all the other AER prints.
> >>>
> >>>Update the AER driver to print the AER status string with the proper string
> >>>prefix and proper print level.
> >>Hello,
> >>
> >>Will this patch be pulled into 4.15?
> >Sorry, I am preparing the 4.15 pull request right now, and it doesn't
> >include this change.
> >
> >I do like the dev_err() change, but would prefer fewer lines of
> >output.  I could have applied just the dev_err() change, but to
> >minimize pain for people who parse the logs, I'd rather make one
> >change in the output instead of making one change now and another
> >later.
> Hello Bjorn,
> 
> Are there existing abbreviations for these AER status strings that I
> cannot find? Or do you want me to abbreviate them similar to the style
> used with prints in lspci -vv?

I think the terms used by lspci -vv would be a good start.

> Once they are abbreviated, you'd prefer to have all errors that have
> occurred to be printed on the same line, correct?

Yes.  Multiple lines suggests an ordering that really isn't there, so
if we can print them all at once, it both improves atomicity and
removes the erroneous suggestion that "this error occurred before this
other one".

Bjorn
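
The direction agreed above (short names, everything for one event on one
line) could look roughly like the following. This is a user-space sketch
under stated assumptions, not the follow-up patch: aer_format_status() and
the aer_cor_abbrev[] table are hypothetical, the short names are only
modelled on the lspci -vv style, and the table covers just the three bits
that appear in the example logs (0x41 and 0x1000). In the kernel the
finished buffer would presumably be handed to a single dev_err() call, which
supplies the device prefix itself.

```c
#include <stdio.h>

/*
 * Hypothetical short names for correctable-error status bits, modelled on
 * the "lspci -vv" style. Illustrative subset only; not taken from the
 * kernel or from the patch.
 */
static const char *const aer_cor_abbrev[] = {
	[0]  = "RxErr",		/* Receiver Error */
	[6]  = "BadTLP",	/* Bad TLP */
	[12] = "Timeout",	/* Replay Timer Timeout */
};

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Build "<prefix>: aer_status: 0x........ NAME NAME ..." in buf so the
 * caller can emit the whole event with one print call.
 */
static int aer_format_status(char *buf, size_t len, const char *prefix,
			     unsigned int status)
{
	size_t i;
	int n;

	n = snprintf(buf, len, "%s: aer_status: 0x%08x", prefix, status);
	for (i = 0; i < ARRAY_LEN(aer_cor_abbrev); i++) {
		if (n < 0 || (size_t)n >= len)
			break;	/* output error or buffer full */
		if (!(status & (1U << i)) || !aer_cor_abbrev[i])
			continue;	/* bit clear or no name for it */
		n += snprintf(buf + n, len - n, " %s", aer_cor_abbrev[i]);
	}
	return n;
}
```

Because the names are collected into one buffer before anything is printed,
no other printk output can land between them, and the single line no longer
implies that one error preceded another.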

Patch

diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
index 54c4b69..b718daa 100644
--- a/drivers/pci/pcie/aer/aerdrv_errprint.c
+++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
@@ -206,6 +206,19 @@  void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
 }
 
 #ifdef CONFIG_ACPI_APEI_PCIEAER
+void dev_print_bits(struct pci_dev *dev, unsigned int bits,
+		    const char * const strs[], unsigned int strs_size)
+{
+	unsigned int i;
+
+	for (i = 0; i < strs_size; i++) {
+		if (!(bits & (1U << i)))
+			continue;
+		if (strs[i])
+			dev_err(&dev->dev, "%s\n", strs[i]);
+	}
+}
+
 int cper_severity_to_aer(int cper_severity)
 {
 	switch (cper_severity) {
@@ -243,7 +256,7 @@  void cper_print_aer(struct pci_dev *dev, int aer_severity,
 	agent = AER_GET_AGENT(aer_severity, status);
 
 	dev_err(&dev->dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask);
-	cper_print_bits("", status, status_strs, status_strs_size);
+	dev_print_bits(dev, status, status_strs, status_strs_size);
 	dev_err(&dev->dev, "aer_layer=%s, aer_agent=%s\n",
 		aer_error_layer[layer], aer_agent_string[agent]);