diff mbox series

[v4,1/8] PCI/AER: Remove ID from aer_agent_string[]

Message ID 22b2dae2a6ac340d9d45c28481d746ec1064cd6c.1633453452.git.naveennaidu479@gmail.com (mailing list archive)
State Superseded
Delegated to: Bjorn Helgaas
Headers show
Series Fix long standing AER Error Handling Issues | expand

Commit Message

Naveen Naidu Oct. 5, 2021, 5:18 p.m. UTC
Currently, we do not print the "id" field in the AER error logs. Yet the
aer_agent_string[] has the word "id" in it. The AER error log looks
like:

  pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)

Without the "id" field in the error log, The aer_agent_string[]
(eg: "Receiver ID") does not make sense. A user reading the
aer_agent_string[] in the log, might inadvertently look for an "id"
field and not finding it might lead to confusion.

Remove the "ID" from the aer_agent_string[].

The following are sample dummy errors inject via aer-inject.

Before
=======

In 010caed4ccb6 ("PCI/AER: Decode Error Source Requester ID"),
the "id" field was removed from the AER error logs, so currently AER
logs look like:

  pcieport 0000:00:03.0: AER: Corrected error received: 0000:00:03:0
  pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID) <--- no id field
  pcieport 0000:00:03.0:   device [1b36:000c] error status/mask=00000040/0000e000
  pcieport 0000:00:03.0:    [ 6] BadTLP

After
======

  pcieport 0000:00:03.0: AER: Corrected error received: 0000:00:03.0
  pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver)
  pcieport 0000:00:03.0:   device [1b36:000c] error status/mask=00000040/0000e000
  pcieport 0000:00:03.0:    [ 6] BadTLP

Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
---
 drivers/pci/pcie/aer.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Comments

Bjorn Helgaas Oct. 21, 2021, 1:28 a.m. UTC | #1
On Tue, Oct 05, 2021 at 10:48:08PM +0530, Naveen Naidu wrote:
> Currently, we do not print the "id" field in the AER error logs. Yet the
> aer_agent_string[] has the word "id" in it. The AER error log looks
> like:
> 
>   pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
> 
> Without the "id" field in the error log, The aer_agent_string[]
> (eg: "Receiver ID") does not make sense. A user reading the
> aer_agent_string[] in the log, might inadvertently look for an "id"
> field and not finding it might lead to confusion.
> 
> Remove the "ID" from the aer_agent_string[].
> 
> The following are sample dummy errors inject via aer-inject.

I like this, and the problem it fixes was my fault because
these "ID" strings should have been removed by 010caed4ccb6.

If it's straightforward enough, it would be nice to have the
aer-inject command line here in the commit log to make it easier
for people to play with this.

> Before
> =======
> 
> In 010caed4ccb6 ("PCI/AER: Decode Error Source Requester ID"),
> the "id" field was removed from the AER error logs, so currently AER
> logs look like:
> 
>   pcieport 0000:00:03.0: AER: Corrected error received: 0000:00:03:0
>   pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID) <--- no id field
>   pcieport 0000:00:03.0:   device [1b36:000c] error status/mask=00000040/0000e000
>   pcieport 0000:00:03.0:    [ 6] BadTLP
> 
> After
> ======
> 
>   pcieport 0000:00:03.0: AER: Corrected error received: 0000:00:03.0
>   pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver)
>   pcieport 0000:00:03.0:   device [1b36:000c] error status/mask=00000040/0000e000
>   pcieport 0000:00:03.0:    [ 6] BadTLP
> 
> Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
> ---
>  drivers/pci/pcie/aer.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 9784fdcf3006..241ff361b43c 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -516,10 +516,10 @@ static const char *aer_uncorrectable_error_string[] = {
>  };
>  
>  static const char *aer_agent_string[] = {
> -	"Receiver ID",
> -	"Requester ID",
> -	"Completer ID",
> -	"Transmitter ID"
> +	"Receiver",
> +	"Requester",
> +	"Completer",
> +	"Transmitter"
>  };
>  
>  #define aer_stats_dev_attr(name, stats_array, strings_array,		\
> @@ -703,7 +703,7 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
>  	const char *level;
>  
>  	if (!info->status) {
> -		pci_err(dev, "PCIe Bus Error: severity=%s, type=Inaccessible, (Unregistered Agent ID)\n",
> +		pci_err(dev, "PCIe Bus Error: severity=%s, type=Inaccessible, (Unregistered Agent)\n",
>  			aer_error_severity_string[info->severity]);
>  		goto out;
>  	}
> -- 
> 2.25.1
> 
> _______________________________________________
> Linux-kernel-mentees mailing list
> Linux-kernel-mentees@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees
Naveen Naidu Oct. 21, 2021, 4:30 p.m. UTC | #2
On 20/10, Bjorn Helgaas wrote:
> On Tue, Oct 05, 2021 at 10:48:08PM +0530, Naveen Naidu wrote:
> > Currently, we do not print the "id" field in the AER error logs. Yet the
> > aer_agent_string[] has the word "id" in it. The AER error log looks
> > like:
> > 
> >   pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
> > 
> > Without the "id" field in the error log, The aer_agent_string[]
> > (eg: "Receiver ID") does not make sense. A user reading the
> > aer_agent_string[] in the log, might inadvertently look for an "id"
> > field and not finding it might lead to confusion.
> > 
> > Remove the "ID" from the aer_agent_string[].
> > 
> > The following are sample dummy errors inject via aer-inject.
> 
> I like this, and the problem it fixes was my fault because
> these "ID" strings should have been removed by 010caed4ccb6.
> 
> If it's straightforward enough, it would be nice to have the
> aer-inject command line here in the commit log to make it easier
> for people to play with this.
>

Thank you for the review. Do you mean something like:

The following sample dummy errors are injected via aer-inject via the
following steps:

  1. The steps to compile the aer-inject tool is mentioned in (Section
     4. Software error inject) of the document [1]

     [1]: https://www.kernel.org/doc/Documentation/PCI/pcieaer-howto.txt

     Make sure to place the aer-inject executable at the home directory
     of the qemu system or at any other place.

  2. Emulate a PCIE architecture using qemu, A sample looks like
     following:
     
		qemu-system-x86_64 -kernel ../linux/arch/x86_64/boot/bzImage \
        -initrd  buildroot-build/images/rootfs.cpio.gz \
        -append "console=ttyS0"  \
        -enable-kvm -nographic \
        -M q35 \
        -device pcie-root-port,bus=pcie.0,id=rp1,slot=1 \
        -device pcie-pci-bridge,id=br1,bus=rp1 \
        -device e1000,bus=br1,addr=8
       
    Note that the PCIe features are available only when using the 
    'q35' Machine [2]
    [2]: https://github.com/qemu/qemu/blob/master/docs/pcie.txt

  3. Once the qemu system starts up, create a sample aer-file or use any
     example aer file from [3]

     [3]:
     https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git/tree/examples

  4. Inject any aer-error using
      
      ./aer-inject aer-file

This does look a tad bit longer for a commit log so I am unsure if you
would like to have it there. If you are okay with it, I would be happy
to add it to that :)

> > Before
> > =======
> > 
> > In 010caed4ccb6 ("PCI/AER: Decode Error Source Requester ID"),
> > the "id" field was removed from the AER error logs, so currently AER
> > logs look like:
> > 
> >   pcieport 0000:00:03.0: AER: Corrected error received: 0000:00:03:0
> >   pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID) <--- no id field
> >   pcieport 0000:00:03.0:   device [1b36:000c] error status/mask=00000040/0000e000
> >   pcieport 0000:00:03.0:    [ 6] BadTLP
> > 
> > After
> > ======
> > 
> >   pcieport 0000:00:03.0: AER: Corrected error received: 0000:00:03.0
> >   pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver)
> >   pcieport 0000:00:03.0:   device [1b36:000c] error status/mask=00000040/0000e000
> >   pcieport 0000:00:03.0:    [ 6] BadTLP
> > 
> > Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
> > ---
> >  drivers/pci/pcie/aer.c | 10 +++++-----
> >  1 file changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index 9784fdcf3006..241ff361b43c 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -516,10 +516,10 @@ static const char *aer_uncorrectable_error_string[] = {
> >  };
> >  
> >  static const char *aer_agent_string[] = {
> > -	"Receiver ID",
> > -	"Requester ID",
> > -	"Completer ID",
> > -	"Transmitter ID"
> > +	"Receiver",
> > +	"Requester",
> > +	"Completer",
> > +	"Transmitter"
> >  };
> >  
> >  #define aer_stats_dev_attr(name, stats_array, strings_array,		\
> > @@ -703,7 +703,7 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
> >  	const char *level;
> >  
> >  	if (!info->status) {
> > -		pci_err(dev, "PCIe Bus Error: severity=%s, type=Inaccessible, (Unregistered Agent ID)\n",
> > +		pci_err(dev, "PCIe Bus Error: severity=%s, type=Inaccessible, (Unregistered Agent)\n",
> >  			aer_error_severity_string[info->severity]);
> >  		goto out;
> >  	}
> > -- 
> > 2.25.1
> > 
> > _______________________________________________
> > Linux-kernel-mentees mailing list
> > Linux-kernel-mentees@lists.linuxfoundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees
Bjorn Helgaas Oct. 21, 2021, 5:03 p.m. UTC | #3
On Thu, Oct 21, 2021 at 10:00:21PM +0530, Naveen Naidu wrote:
> On 20/10, Bjorn Helgaas wrote:
> > On Tue, Oct 05, 2021 at 10:48:08PM +0530, Naveen Naidu wrote:
> > > Currently, we do not print the "id" field in the AER error logs. Yet the
> > > aer_agent_string[] has the word "id" in it. The AER error log looks
> > > like:
> > > 
> > >   pcieport 0000:00:03.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
> > > 
> > > Without the "id" field in the error log, The aer_agent_string[]
> > > (eg: "Receiver ID") does not make sense. A user reading the
> > > aer_agent_string[] in the log, might inadvertently look for an "id"
> > > field and not finding it might lead to confusion.
> > > 
> > > Remove the "ID" from the aer_agent_string[].
> > > 
> > > The following are sample dummy errors inject via aer-inject.
> > 
> > I like this, and the problem it fixes was my fault because
> > these "ID" strings should have been removed by 010caed4ccb6.
> > 
> > If it's straightforward enough, it would be nice to have the
> > aer-inject command line here in the commit log to make it easier
> > for people to play with this.
> >
> 
> Thank you for the review. Do you mean something like:
> 
> The following sample dummy errors are injected via aer-inject via the
> following steps:
> 
>   1. The steps to compile the aer-inject tool is mentioned in (Section
>      4. Software error inject) of the document [1]
> 
>      [1]: https://www.kernel.org/doc/Documentation/PCI/pcieaer-howto.txt
> 
>      Make sure to place the aer-inject executable at the home directory
>      of the qemu system or at any other place.
> 
>   2. Emulate a PCIE architecture using qemu, A sample looks like
>      following:
>      
> 		qemu-system-x86_64 -kernel ../linux/arch/x86_64/boot/bzImage \
>         -initrd  buildroot-build/images/rootfs.cpio.gz \
>         -append "console=ttyS0"  \
>         -enable-kvm -nographic \
>         -M q35 \
>         -device pcie-root-port,bus=pcie.0,id=rp1,slot=1 \
>         -device pcie-pci-bridge,id=br1,bus=rp1 \
>         -device e1000,bus=br1,addr=8
>        
>     Note that the PCIe features are available only when using the 
>     'q35' Machine [2]
>     [2]: https://github.com/qemu/qemu/blob/master/docs/pcie.txt
> 
>   3. Once the qemu system starts up, create a sample aer-file or use any
>      example aer file from [3]
> 
>      [3]:
>      https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git/tree/examples
> 
>   4. Inject any aer-error using
>       
>       ./aer-inject aer-file
> 
> This does look a tad bit longer for a commit log so I am unsure if you
> would like to have it there. If you are okay with it, I would be happy
> to add it to that :)

Yes, that's kind of long.  Something like this
https://git.kernel.org/linus/d95f20c4f070 would be enough for the
commit log, especially since you've now provided all the details in
the email thread, where we can find them via the Link: tag.

Bjorn
diff mbox series

Patch

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 9784fdcf3006..241ff361b43c 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -516,10 +516,10 @@  static const char *aer_uncorrectable_error_string[] = {
 };
 
 static const char *aer_agent_string[] = {
-	"Receiver ID",
-	"Requester ID",
-	"Completer ID",
-	"Transmitter ID"
+	"Receiver",
+	"Requester",
+	"Completer",
+	"Transmitter"
 };
 
 #define aer_stats_dev_attr(name, stats_array, strings_array,		\
@@ -703,7 +703,7 @@  void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
 	const char *level;
 
 	if (!info->status) {
-		pci_err(dev, "PCIe Bus Error: severity=%s, type=Inaccessible, (Unregistered Agent ID)\n",
+		pci_err(dev, "PCIe Bus Error: severity=%s, type=Inaccessible, (Unregistered Agent)\n",
 			aer_error_severity_string[info->severity]);
 		goto out;
 	}