[2/3] PCI: iproc: Stop using generic config read/write functions

Message ID 20200730033747.18931-2-mark.tomlinson@alliedtelesis.co.nz (mailing list archive)
State Superseded, archived
Delegated to: Lorenzo Pieralisi
Series [1/3] PCI: iproc: Add bus number parameter to read/write functions

Commit Message

Mark Tomlinson July 30, 2020, 3:37 a.m. UTC
The pci_generic_config_write32() function will give warning messages
whenever writing less than 4 bytes at a time. As there is nothing we can
do about this without changing the hardware, the message is just a
nuisance. So instead of using the generic functions, use the functions
that have already been written for reading/writing the config registers.

Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
---
 drivers/pci/controller/pcie-iproc.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Bjorn Helgaas July 30, 2020, 4:09 p.m. UTC | #1
[+cc Lorenzo, Rob]

On Thu, Jul 30, 2020 at 03:37:46PM +1200, Mark Tomlinson wrote:
> The pci_generic_config_write32() function will give warning messages
> whenever writing less than 4 bytes at a time. As there is nothing we can
> do about this without changing the hardware, the message is just a
> nuisance. So instead of using the generic functions, use the functions
> that have already been written for reading/writing the config registers.

The reason that pci_generic_config_write32() message is there is
because, as the message says, a read/modify/write may corrupt bits in
adjacent registers.  

It makes me a little queasy to do these read/modify/write sequences
silently.  A generic driver doing an 8- or 16-bit config write has no
idea that the write may corrupt an adjacent register.  That leads to
bugs that are very difficult to debug and only reproducible on iProc.
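
To make the concern concrete, here is a standalone sketch (not kernel code). The `fake_dword` type, `RW1C_MASK`, and `hw_write32()` are hypothetical stand-ins for a config dword whose upper 16 bits are RW1C, in the spirit of the Command/Status pair at offset 0x04. Only the mask arithmetic in `rmw_write()` follows what pci_generic_config_write32() does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: low 16 bits are plain read/write, high 16 are RW1C. */
#define RW1C_MASK 0xffff0000u

struct fake_dword {
	uint32_t value;
};

/* Model of how the device applies a full 32-bit write. */
static void hw_write32(struct fake_dword *reg, uint32_t val)
{
	/* Plain read/write bits simply take the written value. */
	uint32_t rw = val & ~RW1C_MASK;
	/* An RW1C bit stays set only if it was set and a 0 was written. */
	uint32_t rw1c = reg->value & RW1C_MASK & ~val;

	reg->value = rw | rw1c;
}

/* 8- or 16-bit write emulated by a 32-bit read/modify/write. */
static void rmw_write(struct fake_dword *reg, int where, int size,
		      uint32_t val)
{
	uint32_t mask, tmp;

	/* A 4-byte write would be a direct 32-bit write, not an R/M/W. */
	assert(size == 1 || size == 2);

	mask = ~(((1u << (size * 8)) - 1) << ((where & 0x3) * 8));
	tmp = reg->value & mask;		/* reads back set RW1C bits... */
	tmp |= val << ((where & 0x3) * 8);
	hw_write32(reg, tmp);			/* ...and writes them as 1s */
}
```

With a status bit set in the upper half, a 16-bit write to the lower half reads that bit back as 1 and writes it straight back, so the device clears it even though the caller never touched that register.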

The ratelimiting on the current pci_generic_config_write32() message
is based on the call site, not on the device.  That's not ideal: we
may emit several messages for device A, trigger ratelimiting, then do
a write for device B that doesn't generate a message.

I think it would be better to have a warning once per device, so if
XYZ device has a problem and we look at the dmesg log, we would find a
single message for device XYZ as a hint.  Would that reduce the
nuisance level enough?

So I think I did it wrong in fb2659230120 ("PCI: Warn on possible RW1C
corruption for sub-32 bit config writes").  Ratelimiting is the wrong
concept because what we want is a single warning per device, not a
limit on the similar messages for *all* devices, maybe something like
this:

diff --git a/drivers/pci/access.c b/drivers/pci/access.c
index 79c4a2ef269a..e5f956b7e3b7 100644
--- a/drivers/pci/access.c
+++ b/drivers/pci/access.c
@@ -160,9 +160,12 @@ int pci_generic_config_write32(struct pci_bus *bus, unsigned int devfn,
 	 * write happen to have any RW1C (write-one-to-clear) bits set, we
 	 * just inadvertently cleared something we shouldn't have.
 	 */
-	dev_warn_ratelimited(&bus->dev, "%d-byte config write to %04x:%02x:%02x.%d offset %#x may corrupt adjacent RW1C bits\n",
+	if (!(bus->unsafe_warn & (1 << devfn))) {
+		dev_warn(&bus->dev, "%d-byte config write to %04x:%02x:%02x.%d offset %#x may corrupt adjacent RW1C bits\n",
 			     size, pci_domain_nr(bus), bus->number,
 			     PCI_SLOT(devfn), PCI_FUNC(devfn), where);
+		bus->unsafe_warn |= 1 << devfn;
+	}
 
 	mask = ~(((1 << (size * 8)) - 1) << ((where & 0x3) * 8));
 	tmp = readl(addr) & mask;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c79d83304e52..264b009fa4a6 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -613,6 +613,7 @@ struct pci_bus {
 	unsigned char	primary;	/* Number of primary bridge */
 	unsigned char	max_bus_speed;	/* enum pci_bus_speed */
 	unsigned char	cur_bus_speed;	/* enum pci_bus_speed */
+	u8		unsafe_warn;	/* warned about R/M/W config write */
 #ifdef CONFIG_PCI_DOMAINS_GENERIC
 	int		domain_nr;
 #endif

> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
> ---
>  drivers/pci/controller/pcie-iproc.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/controller/pcie-iproc.c b/drivers/pci/controller/pcie-iproc.c
> index 2c836eede42c..68ecd3050529 100644
> --- a/drivers/pci/controller/pcie-iproc.c
> +++ b/drivers/pci/controller/pcie-iproc.c
> @@ -709,12 +709,13 @@ static int iproc_pcie_config_read32(struct pci_bus *bus, unsigned int devfn,
>  {
>  	int ret;
>  	struct iproc_pcie *pcie = iproc_data(bus);
> +	int busno = bus->number;
>  
>  	iproc_pcie_apb_err_disable(bus, true);
>  	if (pcie->iproc_cfg_read)
>  		ret = iproc_pcie_config_read(bus, devfn, where, size, val);
>  	else
> -		ret = pci_generic_config_read32(bus, devfn, where, size, val);
> +		ret = iproc_pci_raw_config_read32(pcie, busno, devfn, where, size, val);
>  	iproc_pcie_apb_err_disable(bus, false);
>  
>  	return ret;
> @@ -724,9 +725,11 @@ static int iproc_pcie_config_write32(struct pci_bus *bus, unsigned int devfn,
>  				     int where, int size, u32 val)
>  {
>  	int ret;
> +	struct iproc_pcie *pcie = iproc_data(bus);
> +	int busno = bus->number;
>  
>  	iproc_pcie_apb_err_disable(bus, true);
> -	ret = pci_generic_config_write32(bus, devfn, where, size, val);
> +	ret = iproc_pci_raw_config_write32(pcie, busno, devfn, where, size, val);
>  	iproc_pcie_apb_err_disable(bus, false);
>  
>  	return ret;
> -- 
> 2.28.0
>
Ray Jui July 30, 2020, 4:36 p.m. UTC | #2
On 7/30/2020 9:09 AM, Bjorn Helgaas wrote:
> [+cc Lorenzo, Rob]
> 
> On Thu, Jul 30, 2020 at 03:37:46PM +1200, Mark Tomlinson wrote:
>> The pci_generic_config_write32() function will give warning messages
>> whenever writing less than 4 bytes at a time. As there is nothing we can
>> do about this without changing the hardware, the message is just a
>> nuisance. So instead of using the generic functions, use the functions
>> that have already been written for reading/writing the config registers.
> 
> The reason that pci_generic_config_write32() message is there is
> because, as the message says, a read/modify/write may corrupt bits in
> adjacent registers.  
> 
> It makes me a little queasy to do these read/modify/write sequences
> silently.  A generic driver doing an 8- or 16-bit config write has no
> idea that the write may corrupt an adjacent register.  That leads to
> bugs that are very difficult to debug and only reproducible on iProc.
> 
> The ratelimiting on the current pci_generic_config_write32() message
> is based on the call site, not on the device.  That's not ideal: we
> may emit several messages for device A, trigger ratelimiting, then do
> a write for device B that doesn't generate a message.
> 
> I think it would be better to have a warning once per device, so if
> XYZ device has a problem and we look at the dmesg log, we would find a
> single message for device XYZ as a hint.  Would that reduce the
> nuisance level enough?
> 

I'm in favor of this. I agree with you that we do need the warnings
because some PCIe config registers are read/write-1-to-clear (RW1C).

But the current volume of warning messages generated by these config
register accesses is quite massive and often worries users who are
less familiar with the reason for the warnings. We were asked
about these warnings by multiple customers. People freak out when they
see "corrupt" in the warning messages :)

Limiting the warning to once per device seems to be a reasonable
compromise to me.

Thanks,

Ray

> So I think I did it wrong in fb2659230120 ("PCI: Warn on possible RW1C
> corruption for sub-32 bit config writes").  Ratelimiting is the wrong
> concept because what we want is a single warning per device, not a
> limit on the similar messages for *all* devices, maybe something like
> this:
> 
> diff --git a/drivers/pci/access.c b/drivers/pci/access.c
> index 79c4a2ef269a..e5f956b7e3b7 100644
> --- a/drivers/pci/access.c
> +++ b/drivers/pci/access.c
> @@ -160,9 +160,12 @@ int pci_generic_config_write32(struct pci_bus *bus, unsigned int devfn,
>  	 * write happen to have any RW1C (write-one-to-clear) bits set, we
>  	 * just inadvertently cleared something we shouldn't have.
>  	 */
> -	dev_warn_ratelimited(&bus->dev, "%d-byte config write to %04x:%02x:%02x.%d offset %#x may corrupt adjacent RW1C bits\n",
> +	if (!(bus->unsafe_warn & (1 << devfn))) {
> +		dev_warn(&bus->dev, "%d-byte config write to %04x:%02x:%02x.%d offset %#x may corrupt adjacent RW1C bits\n",
>  			     size, pci_domain_nr(bus), bus->number,
>  			     PCI_SLOT(devfn), PCI_FUNC(devfn), where);
> +		bus->unsafe_warn |= 1 << devfn;
> +	}
>  
>  	mask = ~(((1 << (size * 8)) - 1) << ((where & 0x3) * 8));
>  	tmp = readl(addr) & mask;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index c79d83304e52..264b009fa4a6 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -613,6 +613,7 @@ struct pci_bus {
>  	unsigned char	primary;	/* Number of primary bridge */
>  	unsigned char	max_bus_speed;	/* enum pci_bus_speed */
>  	unsigned char	cur_bus_speed;	/* enum pci_bus_speed */
> +	u8		unsafe_warn;	/* warned about R/M/W config write */
>  #ifdef CONFIG_PCI_DOMAINS_GENERIC
>  	int		domain_nr;
>  #endif
> 
>> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
>> ---
>>  drivers/pci/controller/pcie-iproc.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/controller/pcie-iproc.c b/drivers/pci/controller/pcie-iproc.c
>> index 2c836eede42c..68ecd3050529 100644
>> --- a/drivers/pci/controller/pcie-iproc.c
>> +++ b/drivers/pci/controller/pcie-iproc.c
>> @@ -709,12 +709,13 @@ static int iproc_pcie_config_read32(struct pci_bus *bus, unsigned int devfn,
>>  {
>>  	int ret;
>>  	struct iproc_pcie *pcie = iproc_data(bus);
>> +	int busno = bus->number;
>>  
>>  	iproc_pcie_apb_err_disable(bus, true);
>>  	if (pcie->iproc_cfg_read)
>>  		ret = iproc_pcie_config_read(bus, devfn, where, size, val);
>>  	else
>> -		ret = pci_generic_config_read32(bus, devfn, where, size, val);
>> +		ret = iproc_pci_raw_config_read32(pcie, busno, devfn, where, size, val);
>>  	iproc_pcie_apb_err_disable(bus, false);
>>  
>>  	return ret;
>> @@ -724,9 +725,11 @@ static int iproc_pcie_config_write32(struct pci_bus *bus, unsigned int devfn,
>>  				     int where, int size, u32 val)
>>  {
>>  	int ret;
>> +	struct iproc_pcie *pcie = iproc_data(bus);
>> +	int busno = bus->number;
>>  
>>  	iproc_pcie_apb_err_disable(bus, true);
>> -	ret = pci_generic_config_write32(bus, devfn, where, size, val);
>> +	ret = iproc_pci_raw_config_write32(pcie, busno, devfn, where, size, val);
>>  	iproc_pcie_apb_err_disable(bus, false);
>>  
>>  	return ret;
>> -- 
>> 2.28.0
>>
Bjorn Helgaas July 30, 2020, 4:45 p.m. UTC | #3
On Thu, Jul 30, 2020 at 09:36:14AM -0700, Ray Jui wrote:
> On 7/30/2020 9:09 AM, Bjorn Helgaas wrote:
> > On Thu, Jul 30, 2020 at 03:37:46PM +1200, Mark Tomlinson wrote:
> >> The pci_generic_config_write32() function will give warning messages
> >> whenever writing less than 4 bytes at a time. As there is nothing we can
> >> do about this without changing the hardware, the message is just a
> >> nuisance. So instead of using the generic functions, use the functions
> >> that have already been written for reading/writing the config registers.
> > 
> > The reason that pci_generic_config_write32() message is there is
> > because, as the message says, a read/modify/write may corrupt bits in
> > adjacent registers.  
> > 
> > It makes me a little queasy to do these read/modify/write sequences
> > silently.  A generic driver doing an 8- or 16-bit config write has no
> > idea that the write may corrupt an adjacent register.  That leads to
> > bugs that are very difficult to debug and only reproducible on iProc.
> > 
> > The ratelimiting on the current pci_generic_config_write32() message
> > is based on the call site, not on the device.  That's not ideal: we
> > may emit several messages for device A, trigger ratelimiting, then do
> > a write for device B that doesn't generate a message.
> > 
> > I think it would be better to have a warning once per device, so if
> > XYZ device has a problem and we look at the dmesg log, we would find a
> > single message for device XYZ as a hint.  Would that reduce the
> > nuisance level enough?
> 
> I'm in favor of this. I agree with you that we do need the warnings
> because some PCIe config registers are read/write-1-to-clear (RW1C).
> 
> But the current volume of warning messages generated by these config
> register accesses is quite massive and often worries users who are
> less familiar with the reason for the warnings. We were asked
> about these warnings by multiple customers. People freak out when they
> see "corrupt" in the warning messages :)

Yeah, I'm sure they would.  Hopefully the message makes it all the way
back to the hardware designers ;)

> Limiting the warning to once per device seems to be a reasonable
> compromise to me.

We (you, I mean :)) could also look at the particular warnings.  If
they're triggered by PCI core writes that are 8 or 16 bits wide when they
*could* be 32 bits wide, it might make sense to widen them.  I know there
are places that do 8-bit writes to 16-bit registers; maybe there are
similar ones to 32-bit registers.

Bjorn
Mark Tomlinson July 30, 2020, 10:58 p.m. UTC | #4
On Thu, 2020-07-30 at 11:09 -0500, Bjorn Helgaas wrote:
> I think it would be better to have a warning once per device, so if
> XYZ device has a problem and we look at the dmesg log, we would find a
> single message for device XYZ as a hint.  Would that reduce the
> nuisance level enough?

We would be OK with that.

> So I think I did it wrong in fb2659230120 ("PCI: Warn on possible RW1C
> corruption for sub-32 bit config writes").  Ratelimiting is the wrong
> concept because what we want is a single warning per device, not a
> limit on the similar messages for *all* devices, maybe something like
> this:
> 
> diff --git a/drivers/pci/access.c b/drivers/pci/access.c
> index 79c4a2ef269a..e5f956b7e3b7 100644
> --- a/drivers/pci/access.c
> +++ b/drivers/pci/access.c
> @@ -160,9 +160,12 @@ int pci_generic_config_write32(struct pci_bus *bus, unsigned int devfn,
>  	 * write happen to have any RW1C (write-one-to-clear) bits set, we
>  	 * just inadvertently cleared something we shouldn't have.
>  	 */
> -	dev_warn_ratelimited(&bus->dev, "%d-byte config write to %04x:%02x:%02x.%d offset %#x may corrupt adjacent RW1C bits\n",
> +	if (!(bus->unsafe_warn & (1 << devfn))) {
> +		dev_warn(&bus->dev, "%d-byte config write to %04x:%02x:%02x.%d offset %#x may corrupt adjacent RW1C bits\n",
>  			     size, pci_domain_nr(bus), bus->number,
>  			     PCI_SLOT(devfn), PCI_FUNC(devfn), where);
> +		bus->unsafe_warn |= 1 << devfn;
> +	}

As I understand it, devfn is an 8-bit value with five bits of device
and three bits of function. So (1 << devfn) is not going to fit in an
8-bit mask. Am I missing something here? (I do admit that my PCI
knowledge is not great).
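
To spell out the size mismatch: devfn ranges over 0..255, so a per-function "already warned" flag needs 256 bits, while a u8 holds 8. A minimal standalone sketch of a bitmap that wide follows. `fake_bus`, `DEVFN_COUNT`, `WORD_BITS`, and `should_warn()` are hypothetical names for illustration, not anything in the kernel, which has its own bitmap helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define DEVFN_COUNT 256	/* 5 bits of device + 3 bits of function */
#define WORD_BITS   (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the per-bus state; one bit per devfn. */
struct fake_bus {
	unsigned long warned[DEVFN_COUNT / WORD_BITS];
};

/* Returns true the first time it is called for a devfn, false afterwards. */
static bool should_warn(struct fake_bus *bus, unsigned int devfn)
{
	size_t idx;
	unsigned long bit;

	assert(devfn < DEVFN_COUNT);

	idx = devfn / WORD_BITS;		/* which word holds the flag */
	bit = 1UL << (devfn % WORD_BITS);	/* which bit within that word */

	if (bus->warned[idx] & bit)
		return false;

	bus->warned[idx] |= bit;
	return true;
}
```

This costs 32 bytes per bus, compared with the single byte in the proposed diff.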
Bjorn Helgaas July 30, 2020, 11:06 p.m. UTC | #5
On Thu, Jul 30, 2020 at 10:58:03PM +0000, Mark Tomlinson wrote:
> On Thu, 2020-07-30 at 11:09 -0500, Bjorn Helgaas wrote:
> > I think it would be better to have a warning once per device, so if
> > XYZ device has a problem and we look at the dmesg log, we would find a
> > single message for device XYZ as a hint.  Would that reduce the
> > nuisance level enough?
> 
> We would be OK with that.
> 
> > So I think I did it wrong in fb2659230120 ("PCI: Warn on possible RW1C
> > corruption for sub-32 bit config writes").  Ratelimiting is the wrong
> > concept because what we want is a single warning per device, not a
> > limit on the similar messages for *all* devices, maybe something like
> > this:
> > 
> > diff --git a/drivers/pci/access.c b/drivers/pci/access.c
> > index 79c4a2ef269a..e5f956b7e3b7 100644
> > --- a/drivers/pci/access.c
> > +++ b/drivers/pci/access.c
> > @@ -160,9 +160,12 @@ int pci_generic_config_write32(struct pci_bus *bus, unsigned int devfn,
> >  	 * write happen to have any RW1C (write-one-to-clear) bits set, we
> >  	 * just inadvertently cleared something we shouldn't have.
> >  	 */
> > -	dev_warn_ratelimited(&bus->dev, "%d-byte config write to %04x:%02x:%02x.%d offset %#x may corrupt adjacent RW1C bits\n",
> > +	if (!(bus->unsafe_warn & (1 << devfn))) {
> > +		dev_warn(&bus->dev, "%d-byte config write to %04x:%02x:%02x.%d offset %#x may corrupt adjacent RW1C bits\n",
> >  			     size, pci_domain_nr(bus), bus->number,
> >  			     PCI_SLOT(devfn), PCI_FUNC(devfn), where);
> > +		bus->unsafe_warn |= 1 << devfn;
> > +	}
> 
> As I understand it, devfn is an 8-bit value with five bits of device
> and three bits of function. So (1 << devfn) is not going to fit in an
> 8-bit mask. Am I missing something here? (I do admit that my PCI
> knowledge is not great).

You're not missing anything, I just screwed up.  What I was really
*hoping* to do was just put a bit in the pci_dev, but of course, these
functions don't have a pci_dev.  256 bits in the bus seems like a
little overkill though.  Maybe we just give up on the exact device and
warn only once per *bus* instead of once per device.

Bjorn
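
The once-per-bus fallback can be sketched as below. This is a standalone illustration under stated assumptions, not the eventual kernel change: `fake_bus` is a hypothetical stand-in for struct pci_bus, `warn_rmw_once()` is an invented helper, and `fprintf()` stands in for dev_warn().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct pci_bus; only the fields used here. */
struct fake_bus {
	int number;
	bool unsafe_warn;	/* already warned about R/M/W config write */
};

/* Returns true if a warning was printed, false if this bus already warned. */
static bool warn_rmw_once(struct fake_bus *bus, unsigned int devfn,
			  int where, int size)
{
	/* 4-byte writes need no read/modify/write, so they never warn. */
	assert(size == 1 || size == 2);

	if (bus->unsafe_warn)
		return false;

	fprintf(stderr,
		"%d-byte config write to %02x:%02x.%u offset %#x may corrupt adjacent RW1C bits\n",
		size, (unsigned int)bus->number, devfn >> 3, devfn & 0x7,
		(unsigned int)where);
	bus->unsafe_warn = true;
	return true;
}
```

A single flag per bus keeps the state tiny, but the log then names only the first function that triggered the warning. Later sub-32-bit writes to other functions on the same bus stay silent.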
Patch

diff --git a/drivers/pci/controller/pcie-iproc.c b/drivers/pci/controller/pcie-iproc.c
index 2c836eede42c..68ecd3050529 100644
--- a/drivers/pci/controller/pcie-iproc.c
+++ b/drivers/pci/controller/pcie-iproc.c
@@ -709,12 +709,13 @@ static int iproc_pcie_config_read32(struct pci_bus *bus, unsigned int devfn,
 {
 	int ret;
 	struct iproc_pcie *pcie = iproc_data(bus);
+	int busno = bus->number;
 
 	iproc_pcie_apb_err_disable(bus, true);
 	if (pcie->iproc_cfg_read)
 		ret = iproc_pcie_config_read(bus, devfn, where, size, val);
 	else
-		ret = pci_generic_config_read32(bus, devfn, where, size, val);
+		ret = iproc_pci_raw_config_read32(pcie, busno, devfn, where, size, val);
 	iproc_pcie_apb_err_disable(bus, false);
 
 	return ret;
@@ -724,9 +725,11 @@ static int iproc_pcie_config_write32(struct pci_bus *bus, unsigned int devfn,
 				     int where, int size, u32 val)
 {
 	int ret;
+	struct iproc_pcie *pcie = iproc_data(bus);
+	int busno = bus->number;
 
 	iproc_pcie_apb_err_disable(bus, true);
-	ret = pci_generic_config_write32(bus, devfn, where, size, val);
+	ret = iproc_pci_raw_config_write32(pcie, busno, devfn, where, size, val);
 	iproc_pcie_apb_err_disable(bus, false);
 
 	return ret;