diff mbox series

PCI: Align MPS to upstream bridge for SAFE and PERFORMANCE mode

Message ID 20220610150131.6256-1-Zhiqiang.Hou@nxp.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show
Series PCI: Align MPS to upstream bridge for SAFE and PERFORMANCE mode | expand

Commit Message

Z.Q. Hou June 10, 2022, 3:01 p.m. UTC
From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>

The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge")
made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT
mode, so that it's more likely that a hot-added device will work in
a system with an optimized MPS configuration.

Obviously, the Linux itself optimizes the MPS settings in the
PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also
for these modes.

Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
---
 drivers/pci/probe.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Z.Q. Hou July 13, 2022, 10:21 a.m. UTC | #1
Hi Bjorn and Lorenzo,

Any comments on this change?

-----Original Message-----
From: Z.Q. Hou <zhiqiang.hou@nxp.com> 
Sent: 2022年6月10日 23:02
To: linux-pci@vger.kernel.org; bhelgaas@google.com
Cc: Z.Q. Hou <zhiqiang.hou@nxp.com>
Subject: [PATCH] PCI: Align MPS to upstream bridge for SAFE and PERFORMANCE mode

From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>

The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge") made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT mode, so that it's more likely that a hot-added device will work in a system with an optimized MPS configuration.

Obviously, the Linux itself optimizes the MPS settings in the PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also for these modes.

Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
---
 drivers/pci/probe.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 17a969942d37..2c5a1aefd9cb 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2034,7 +2034,9 @@ static void pci_configure_mps(struct pci_dev *dev)
 	 * Fancier MPS configuration is done later by
 	 * pcie_bus_configure_settings()
 	 */
-	if (pcie_bus_config != PCIE_BUS_DEFAULT)
+	if (pcie_bus_config != PCIE_BUS_DEFAULT &&
+	    pcie_bus_config != PCIE_BUS_SAFE &&
+	    pcie_bus_config != PCIE_BUS_PERFORMANCE)
 		return;
 
 	mpss = 128 << dev->pcie_mpss;
@@ -2047,7 +2049,7 @@ static void pci_configure_mps(struct pci_dev *dev)
 
 	rc = pcie_set_mps(dev, p_mps);
 	if (rc) {
-		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n",
+		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use 
+\"pci=pcie_bus_peer2peer\" and report a bug\n",
 			 p_mps);
 		return;
 	}
--
2.17.1

Thanks,
Zhiqiang
Tyler Hicks Aug. 26, 2022, 3:49 p.m. UTC | #2
On 2022-06-10 23:01:31, Zhiqiang Hou wrote:
> From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> 
> The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge")
> made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT
> mode, so that it's more likely that a hot-added device will work in
> a system with an optimized MPS configuration.
> 
> Obviously, the Linux itself optimizes the MPS settings in the
> PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also
> for these modes.
> 
> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> ---

Hi Bjorn - We have some interest in this patch and I am hoping it can be
considered in preparation for v6.1. I took a look at it and it makes
sense to me but I'm not an expert in this area. Thanks!

Tyler

>  drivers/pci/probe.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 17a969942d37..2c5a1aefd9cb 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2034,7 +2034,9 @@ static void pci_configure_mps(struct pci_dev *dev)
>  	 * Fancier MPS configuration is done later by
>  	 * pcie_bus_configure_settings()
>  	 */
> -	if (pcie_bus_config != PCIE_BUS_DEFAULT)
> +	if (pcie_bus_config != PCIE_BUS_DEFAULT &&
> +	    pcie_bus_config != PCIE_BUS_SAFE &&
> +	    pcie_bus_config != PCIE_BUS_PERFORMANCE)
>  		return;
>  
>  	mpss = 128 << dev->pcie_mpss;
> @@ -2047,7 +2049,7 @@ static void pci_configure_mps(struct pci_dev *dev)
>  
>  	rc = pcie_set_mps(dev, p_mps);
>  	if (rc) {
> -		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n",
> +		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_peer2peer\" and report a bug\n",
>  			 p_mps);
>  		return;
>  	}
> -- 
> 2.17.1
>
Tyler Hicks Sept. 14, 2022, 2:41 p.m. UTC | #3
On 2022-08-26 10:49:39, Tyler Hicks wrote:
> On 2022-06-10 23:01:31, Zhiqiang Hou wrote:
> > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> > 
> > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge")
> > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT
> > mode, so that it's more likely that a hot-added device will work in
> > a system with an optimized MPS configuration.
> > 
> > Obviously, the Linux itself optimizes the MPS settings in the
> > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also
> > for these modes.
> > 
> > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> > ---
> 
> Hi Bjorn - We have some interest in this patch and I am hoping it can be
> considered in preparation for v6.1. I took a look at it and it makes
> sense to me but I'm not an expert in this area. Thanks!

Ping. Do you expect to get any free cycles to review this in prep for
v6.1?

Tyler

> 
> Tyler
> 
> >  drivers/pci/probe.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index 17a969942d37..2c5a1aefd9cb 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -2034,7 +2034,9 @@ static void pci_configure_mps(struct pci_dev *dev)
> >  	 * Fancier MPS configuration is done later by
> >  	 * pcie_bus_configure_settings()
> >  	 */
> > -	if (pcie_bus_config != PCIE_BUS_DEFAULT)
> > +	if (pcie_bus_config != PCIE_BUS_DEFAULT &&
> > +	    pcie_bus_config != PCIE_BUS_SAFE &&
> > +	    pcie_bus_config != PCIE_BUS_PERFORMANCE)
> >  		return;
> >  
> >  	mpss = 128 << dev->pcie_mpss;
> > @@ -2047,7 +2049,7 @@ static void pci_configure_mps(struct pci_dev *dev)
> >  
> >  	rc = pcie_set_mps(dev, p_mps);
> >  	if (rc) {
> > -		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n",
> > +		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_peer2peer\" and report a bug\n",
> >  			 p_mps);
> >  		return;
> >  	}
> > -- 
> > 2.17.1
> >
Tyler Hicks Oct. 13, 2022, 3:48 a.m. UTC | #4
On 2022-09-14 09:41:05, Tyler Hicks wrote:
> On 2022-08-26 10:49:39, Tyler Hicks wrote:
> > On 2022-06-10 23:01:31, Zhiqiang Hou wrote:
> > > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> > > 
> > > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge")
> > > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT
> > > mode, so that it's more likely that a hot-added device will work in
> > > a system with an optimized MPS configuration.
> > > 
> > > Obviously, the Linux itself optimizes the MPS settings in the
> > > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also
> > > for these modes.
> > > 
> > > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> > > ---
> > 
> > Hi Bjorn - We have some interest in this patch and I am hoping it can be
> > considered in preparation for v6.1. I took a look at it and it makes
> > sense to me but I'm not an expert in this area. Thanks!
> 
> Ping. Do you expect to get any free cycles to review this in prep for
> v6.1?

Ping now that you've got the v6.1 PR out for the PCI subsystem. Thanks. 

> 
> Tyler
> 
> > 
> > Tyler
> > 
> > >  drivers/pci/probe.c | 6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > > index 17a969942d37..2c5a1aefd9cb 100644
> > > --- a/drivers/pci/probe.c
> > > +++ b/drivers/pci/probe.c
> > > @@ -2034,7 +2034,9 @@ static void pci_configure_mps(struct pci_dev *dev)
> > >  	 * Fancier MPS configuration is done later by
> > >  	 * pcie_bus_configure_settings()
> > >  	 */
> > > -	if (pcie_bus_config != PCIE_BUS_DEFAULT)
> > > +	if (pcie_bus_config != PCIE_BUS_DEFAULT &&
> > > +	    pcie_bus_config != PCIE_BUS_SAFE &&
> > > +	    pcie_bus_config != PCIE_BUS_PERFORMANCE)
> > >  		return;
> > >  
> > >  	mpss = 128 << dev->pcie_mpss;
> > > @@ -2047,7 +2049,7 @@ static void pci_configure_mps(struct pci_dev *dev)
> > >  
> > >  	rc = pcie_set_mps(dev, p_mps);
> > >  	if (rc) {
> > > -		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n",
> > > +		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_peer2peer\" and report a bug\n",
> > >  			 p_mps);
> > >  		return;
> > >  	}
> > > -- 
> > > 2.17.1
> > >
Tyler Hicks Oct. 19, 2022, 6:25 p.m. UTC | #5
On 2022-06-10 23:01:31, Zhiqiang Hou wrote:
> From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> 
> The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge")

Adding Keith, as the author of that commit.

> made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT
> mode, so that it's more likely that a hot-added device will work in
> a system with an optimized MPS configuration.
> 
> Obviously, the Linux itself optimizes the MPS settings in the
> PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also
> for these modes.
> 
> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> ---

I wanted to give a little more information about the issue we're seeing.
We're having trouble retaining the optimized Max Payload Size (MPS)
value of a PCIe endpoint after hotplug/rescan. In this case,
`pcie_ports=native` and `pci=pcie_bus_safe` are set on the cmdline and
we expect the Linux kernel to retain the MPS value. Commit 27d868b5e6cf
preserved the MPS value when using the default PCIe bus mode
(PCIE_BUS_DEFAULT) and we're hopeful that this can be extended to other
modes, as well.

Tyler

>  drivers/pci/probe.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 17a969942d37..2c5a1aefd9cb 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2034,7 +2034,9 @@ static void pci_configure_mps(struct pci_dev *dev)
>  	 * Fancier MPS configuration is done later by
>  	 * pcie_bus_configure_settings()
>  	 */
> -	if (pcie_bus_config != PCIE_BUS_DEFAULT)
> +	if (pcie_bus_config != PCIE_BUS_DEFAULT &&
> +	    pcie_bus_config != PCIE_BUS_SAFE &&
> +	    pcie_bus_config != PCIE_BUS_PERFORMANCE)
>  		return;
>  
>  	mpss = 128 << dev->pcie_mpss;
> @@ -2047,7 +2049,7 @@ static void pci_configure_mps(struct pci_dev *dev)
>  
>  	rc = pcie_set_mps(dev, p_mps);
>  	if (rc) {
> -		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n",
> +		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_peer2peer\" and report a bug\n",
>  			 p_mps);
>  		return;
>  	}
> -- 
> 2.17.1
>
Keith Busch Oct. 19, 2022, 6:30 p.m. UTC | #6
On Wed, Oct 19, 2022 at 01:25:59PM -0500, Tyler Hicks wrote:
> On 2022-06-10 23:01:31, Zhiqiang Hou wrote:
> > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> > 
> > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge")
> 
> Adding Keith, as the author of that commit.
> 
> > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT
> > mode, so that it's more likely that a hot-added device will work in
> > a system with an optimized MPS configuration.
> > 
> > Obviously, the Linux itself optimizes the MPS settings in the
> > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also
> > for these modes.
> > 
> > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> > ---
> 
> I wanted to give a little more information about the issue we're seeing.
> We're having trouble retaining the optimized Max Payload Size (MPS)
> value of a PCIe endpoint after hotplug/rescan. In this case,
> `pcie_ports=native` and `pci=pcie_bus_safe` are set on the cmdline and
> we expect the Linux kernel to retain the MPS value. Commit 27d868b5e6cf
> preserved the MPS value when using the default PCIe bus mode
> (PCIE_BUS_DEFAULT) and we're hopeful that this can be extended to other
> modes, as well.

As I recall, the PCIE_BUS_DEFAULT mode was created specifically because
we didn't want to change the behavior of PCIE_BUS_SAFE. What reason do
you have for using that mode, anyway? That's specifically saying not to
retune bridges after the initial scan.
Bjorn Helgaas Oct. 20, 2022, 8:24 p.m. UTC | #7
On Wed, Oct 19, 2022 at 01:25:59PM -0500, Tyler Hicks wrote:
> On 2022-06-10 23:01:31, Zhiqiang Hou wrote:
> > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> > 
> > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge")
> > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT
> > mode, so that it's more likely that a hot-added device will work in
> > a system with an optimized MPS configuration.
> > 
> > Obviously, the Linux itself optimizes the MPS settings in the
> > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also
> > for these modes.
> > 
> > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> 
> I wanted to give a little more information about the issue we're seeing.
> We're having trouble retaining the optimized Max Payload Size (MPS)
> value of a PCIe endpoint after hotplug/rescan. In this case,
> `pcie_ports=native` and `pci=pcie_bus_safe` are set on the cmdline and
> we expect the Linux kernel to retain the MPS value. Commit 27d868b5e6cf
> preserved the MPS value when using the default PCIe bus mode
> (PCIE_BUS_DEFAULT) and we're hopeful that this can be extended to other
> modes, as well.

Thanks, Tyler.  I need help understanding what's going on here.  An
example of the topology and what happens and what *should* happen
might help.

Some MPS configuration is done per-device in pci_configure_mps(), and
some considers a hierarchy in pcie_bus_configure_settings().  In the
current tree, in the PCIE_BUS_SAFE case:

  - pci_configure_mps() does nothing (except for RCiEPs).

  - pcie_bus_configure_settings(bus) looks at the hierarchy rooted at
    the bridge leading to "bus".  If the hierarchy contains a hotplug
    Switch Downstream Port, it sets MPS and MRRS to 128 for
    everything.

    If it does not contain such a bridge, it finds the smallest
    MPS_Supported ("smpss") of any device in the hierarchy and sets
    MPS and MRRS to that for everything.

If you boot with a hotplug Root Port leading to an empty slot, I think
the RP MPS will end up being whatever BIOS put there.

A subsequent hot-add will do nothing in pci_configure_mps(), and
pcie_bus_configure_settings() looks like it would set the RP and EP
MPS to the minimum of the RP and EP MPS_Supported.

Is that what you see?  What would you like to see instead?

Bjorn
Tyler Hicks Oct. 25, 2022, 5:07 a.m. UTC | #8
On 2022-10-20 15:24:37, Bjorn Helgaas wrote:
> On Wed, Oct 19, 2022 at 01:25:59PM -0500, Tyler Hicks wrote:
> > On 2022-06-10 23:01:31, Zhiqiang Hou wrote:
> > > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> > > 
> > > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge")
> > > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT
> > > mode, so that it's more likely that a hot-added device will work in
> > > a system with an optimized MPS configuration.
> > > 
> > > Obviously, the Linux itself optimizes the MPS settings in the
> > > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also
> > > for these modes.
> > > 
> > > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> > 
> > I wanted to give a little more information about the issue we're seeing.
> > We're having trouble retaining the optimized Max Payload Size (MPS)
> > value of a PCIe endpoint after hotplug/rescan. In this case,
> > `pcie_ports=native` and `pci=pcie_bus_safe` are set on the cmdline and
> > we expect the Linux kernel to retain the MPS value. Commit 27d868b5e6cf
> > preserved the MPS value when using the default PCIe bus mode
> > (PCIE_BUS_DEFAULT) and we're hopeful that this can be extended to other
> > modes, as well.
> 
> Thanks, Tyler.  I need help understanding what's going on here.  An
> example of the topology and what happens and what *should* happen
> might help.

Hey Bjorn and Keith! Thanks for both of your responses and your
patience. They spurred good investigations on my side and I'm learning a
lot as I go.

> Some MPS configuration is done per-device in pci_configure_mps(), and
> some considers a hierarchy in pcie_bus_configure_settings().  In the
> current tree, in the PCIE_BUS_SAFE case:
> 
>   - pci_configure_mps() does nothing (except for RCiEPs).
> 
>   - pcie_bus_configure_settings(bus) looks at the hierarchy rooted at
>     the bridge leading to "bus".  If the hierarchy contains a hotplug
>     Switch Downstream Port, it sets MPS and MRRS to 128 for
>     everything.
> 
>     If it does not contain such a bridge, it finds the smallest
>     MPS_Supported ("smpss") of any device in the hierarchy and sets
>     MPS and MRRS to that for everything.
> 
> If you boot with a hotplug Root Port leading to an empty slot, I think
> the RP MPS will end up being whatever BIOS put there.

I've been talking to the firmware folks on why SAFE mode was selected,
based on Keith's question from Wednesday. From what I've been told,
U-Boot doesn't seed the RP MPS with a value so the kernel sees a value
of 128 for p_mps in pci_configure_mps() when using the DEFAULT mode.
Apparently UEFI does seed the RP MPS but we don't get that with U-Boot.
Therefore, SAFE mode was selected to get an initial, tuned RP MPS value
set to 256.

> A subsequent hot-add will do nothing in pci_configure_mps(), and
> pcie_bus_configure_settings() looks like it would set the RP and EP
> MPS to the minimum of the RP and EP MPS_Supported.
> 
> Is that what you see?  What would you like to see instead?

No, not quite. After hot-add, we see the EP MPS set to 128 with SAFE
mode even though the RP MPS is 256.

So, the question is if SAFE/PERFORMANCE modes are expected to tune the
EP when it is added? We would have thought that EP's would be tuned
based on the algorithm selected as they're hot-added.

Tyler

> 
> Bjorn
Bjorn Helgaas Oct. 27, 2022, 10:51 p.m. UTC | #9
On Tue, Oct 25, 2022 at 12:07:47AM -0500, Tyler Hicks wrote:
> On 2022-10-20 15:24:37, Bjorn Helgaas wrote:
> > On Wed, Oct 19, 2022 at 01:25:59PM -0500, Tyler Hicks wrote:
> > > On 2022-06-10 23:01:31, Zhiqiang Hou wrote:
> > > > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> > > > 
> > > > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge")
> > > > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT
> > > > mode, so that it's more likely that a hot-added device will work in
> > > > a system with an optimized MPS configuration.
> > > > 
> > > > Obviously, the Linux itself optimizes the MPS settings in the
> > > > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also
> > > > for these modes.
> > > > 
> > > > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> > > 
> > > I wanted to give a little more information about the issue we're seeing.
> > > We're having trouble retaining the optimized Max Payload Size (MPS)
> > > value of a PCIe endpoint after hotplug/rescan. In this case,
> > > `pcie_ports=native` and `pci=pcie_bus_safe` are set on the cmdline and
> > > we expect the Linux kernel to retain the MPS value. Commit 27d868b5e6cf
> > > preserved the MPS value when using the default PCIe bus mode
> > > (PCIE_BUS_DEFAULT) and we're hopeful that this can be extended to other
> > > modes, as well.
> > 
> > Thanks, Tyler.  I need help understanding what's going on here.  An
> > example of the topology and what happens and what *should* happen
> > might help.
> 
> Hey Bjorn and Keith! Thanks for both of your responses and your
> patience. They spurred good investigations on my side and I'm learning a
> lot as I go.
> 
> > Some MPS configuration is done per-device in pci_configure_mps(), and
> > some considers a hierarchy in pcie_bus_configure_settings().  In the
> > current tree, in the PCIE_BUS_SAFE case:
> > 
> >   - pci_configure_mps() does nothing (except for RCiEPs).
> > 
> >   - pcie_bus_configure_settings(bus) looks at the hierarchy rooted at
> >     the bridge leading to "bus".  If the hierarchy contains a hotplug
> >     Switch Downstream Port, it sets MPS and MRRS to 128 for
> >     everything.
> > 
> >     If it does not contain such a bridge, it finds the smallest
> >     MPS_Supported ("smpss") of any device in the hierarchy and sets
> >     MPS and MRRS to that for everything.
> > 
> > If you boot with a hotplug Root Port leading to an empty slot, I think
> > the RP MPS will end up being whatever BIOS put there.
> 
> I've been talking to the firmware folks on why SAFE mode was selected,
> based on Keith's question from Wednesday. From what I've been told,
> U-Boot doesn't seed the RP MPS with a value so the kernel sees a value
> of 128 for p_mps in pci_configure_mps() when using the DEFAULT mode.
> Apparently UEFI does seed the RP MPS but we don't get that with U-Boot.
> Therefore, SAFE mode was selected to get an initial, tuned RP MPS value
> set to 256.

Are there any devices below the RP at the time we set MPS=256?

> > A subsequent hot-add will do nothing in pci_configure_mps(), and
> > pcie_bus_configure_settings() looks like it would set the RP and EP
> > MPS to the minimum of the RP and EP MPS_Supported.
> > 
> > Is that what you see?  What would you like to see instead?
> 
> No, not quite. After hot-add, we see the EP MPS set to 128 with SAFE
> mode even though the RP MPS is 256.

Can you add the relevant topology here so we can work through the
concrete details?  Is the endpoint hot-added directly below a Root
Port?  Or is there a switch involved?  What are the MPS_Supported
values for all the devices?  If there's a switch in the picture, it
looks like we currently restrict to 128, I think because it's possible
an endpoint that can only do 128 may be added.

Bjorn
Bjorn Helgaas Nov. 3, 2022, 10:14 p.m. UTC | #10
On Thu, Oct 27, 2022 at 05:51:49PM -0500, Bjorn Helgaas wrote:
> On Tue, Oct 25, 2022 at 12:07:47AM -0500, Tyler Hicks wrote:
> > On 2022-10-20 15:24:37, Bjorn Helgaas wrote:

> > I've been talking to the firmware folks on why SAFE mode was selected,
> > based on Keith's question from Wednesday. From what I've been told,
> > U-Boot doesn't seed the RP MPS with a value so the kernel sees a value
> > of 128 for p_mps in pci_configure_mps() when using the DEFAULT mode.
> > Apparently UEFI does seed the RP MPS but we don't get that with U-Boot.
> > Therefore, SAFE mode was selected to get an initial, tuned RP MPS value
> > set to 256.
> 
> Are there any devices below the RP at the time we set MPS=256?
> 
> > > A subsequent hot-add will do nothing in pci_configure_mps(), and
> > > pcie_bus_configure_settings() looks like it would set the RP and EP
> > > MPS to the minimum of the RP and EP MPS_Supported.
> > > 
> > > Is that what you see?  What would you like to see instead?
> > 
> > No, not quite. After hot-add, we see the EP MPS set to 128 with SAFE
> > mode even though the RP MPS is 256.
> 
> Can you add the relevant topology here so we can work through the
> concrete details?  Is the endpoint hot-added directly below a Root
> Port?  Or is there a switch involved?  What are the MPS_Supported
> values for all the devices?  If there's a switch in the picture, it
> looks like we currently restrict to 128, I think because it's possible
> an endpoint that can only do 128 may be added.

Ping?  I'd like to talk about a concrete scenario.  It's too hard for
me to imagine all the possible things that could go wrong.

I guess part of what's interesting here is that things work better
when firmware has already configured MPS?  It doesn't seem like we
should *depend* on firmware having done that.

Bjorn
Tyler Hicks Nov. 9, 2022, 11:41 p.m. UTC | #11
On 2022-11-03 17:14:29, Bjorn Helgaas wrote:
> On Thu, Oct 27, 2022 at 05:51:49PM -0500, Bjorn Helgaas wrote:
> > On Tue, Oct 25, 2022 at 12:07:47AM -0500, Tyler Hicks wrote:
> > > On 2022-10-20 15:24:37, Bjorn Helgaas wrote:
> 
> > > I've been talking to the firmware folks on why SAFE mode was selected,
> > > based on Keith's question from Wednesday. From what I've been told,
> > > U-Boot doesn't seed the RP MPS with a value so the kernel sees a value
> > > of 128 for p_mps in pci_configure_mps() when using the DEFAULT mode.
> > > Apparently UEFI does seed the RP MPS but we don't get that with U-Boot.
> > > Therefore, SAFE mode was selected to get an initial, tuned RP MPS value
> > > set to 256.
> > 
> > Are there any devices below the RP at the time we set MPS=256?
> > 
> > > > A subsequent hot-add will do nothing in pci_configure_mps(), and
> > > > pcie_bus_configure_settings() looks like it would set the RP and EP
> > > > MPS to the minimum of the RP and EP MPS_Supported.
> > > > 
> > > > Is that what you see?  What would you like to see instead?
> > > 
> > > No, not quite. After hot-add, we see the EP MPS set to 128 with SAFE
> > > mode even though the RP MPS is 256.
> > 
> > Can you add the relevant topology here so we can work through the
> > concrete details?

# lspci -t
-[0000:00]---00.0-[01-ff]--+-00.0
                           +-00.1
                           +-00.2
                           +-00.3
                           \-00.4


> > Is the endpoint hot-added directly below a Root Port?  Or is there a
> > switch involved?

There's not a switch involved. The multifunction endpoint is hot-added directly
below the root port.

> > What are the MPS_Supported values for all the devices?  If there's a switch
> > in the picture, it looks like we currently restrict to 128, I think because
> > it's possible an endpoint that can only do 128 may be added.

0000:00's MPS_Supported value is 256.

The multifunction endpoint's MPS_Supported is 512.

> Ping?  I'd like to talk about a concrete scenario.  It's too hard for
> me to imagine all the possible things that could go wrong.

Sorry for the slow reply here. Thanks for your interest in getting more
details.

> I guess part of what's interesting here is that things work better
> when firmware has already configured MPS?  It doesn't seem like we
> should *depend* on firmware having done that.

Our firmware folks felt the same way.

Tyler

> 
> Bjorn
Z.Q. Hou Nov. 13, 2022, 6:39 p.m. UTC | #12
> -----Original Message-----
> From: Bjorn Helgaas <helgaas@kernel.org>
> Sent: Friday, October 21, 2022 4:25 AM
> To: Tyler Hicks <code@tyhicks.com>
> Cc: bhelgaas@google.com; Keith Busch <kbusch@kernel.org>;
> linux-pci@vger.kernel.org; Z.Q. Hou <zhiqiang.hou@nxp.com>; Lorenzo
> Pieralisi <lpieralisi@kernel.org>
> Subject: Re: [PATCH] PCI: Align MPS to upstream bridge for SAFE and
> PERFORMANCE mode
> 
> On Wed, Oct 19, 2022 at 01:25:59PM -0500, Tyler Hicks wrote:
> > On 2022-06-10 23:01:31, Zhiqiang Hou wrote:
> > > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> > >
> > > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge")
> > > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT
> > > mode, so that it's more likely that a hot-added device will work in
> > > a system with an optimized MPS configuration.
> > >
> > > Obviously, the Linux itself optimizes the MPS settings in the
> > > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also
> > > for these modes.
> > >
> > > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
> >
> > I wanted to give a little more information about the issue we're seeing.
> > We're having trouble retaining the optimized Max Payload Size (MPS)
> > value of a PCIe endpoint after hotplug/rescan. In this case,
> > `pcie_ports=native` and `pci=pcie_bus_safe` are set on the cmdline and
> > we expect the Linux kernel to retain the MPS value. Commit
> > 27d868b5e6cf preserved the MPS value when using the default PCIe bus
> > mode
> > (PCIE_BUS_DEFAULT) and we're hopeful that this can be extended to
> > other modes, as well.
> 
> Thanks, Tyler.  I need help understanding what's going on here.  An example
> of the topology and what happens and what *should* happen might help.
> 
> Some MPS configuration is done per-device in pci_configure_mps(), and some
> considers a hierarchy in pcie_bus_configure_settings().  In the current tree,
> in the PCIE_BUS_SAFE case:
> 
>   - pci_configure_mps() does nothing (except for RCiEPs).
> 
>   - pcie_bus_configure_settings(bus) looks at the hierarchy rooted at
>     the bridge leading to "bus".  If the hierarchy contains a hotplug
>     Switch Downstream Port, it sets MPS and MRRS to 128 for
>     everything.
> 
>     If it does not contain such a bridge, it finds the smallest
>     MPS_Supported ("smpss") of any device in the hierarchy and sets
>     MPS and MRRS to that for everything.
> 
> If you boot with a hotplug Root Port leading to an empty slot, I think the RP
> MPS will end up being whatever BIOS put there.
> 
> A subsequent hot-add will do nothing in pci_configure_mps(), and
> pcie_bus_configure_settings() looks like it would set the RP and EP MPS to the
> minimum of the RP and EP MPS_Supported.
> 
> Is that what you see?  What would you like to see instead?
> 

Hi Bjorn, Thanks for your comments!
This patch is for the case that kernel boot with 'pci=pcie_buf_perf', the MPS is tuned during the enumeration, but if the EP is removed and then rescan via the sysfs, the MPS won't be tuned in the rescan process. And the MPS is also tuned in the 'pcie_bus_safe' mode (see the Documentation/admin-guide/kernel-parameters.txt, I pasted the 2 options below for your reference). 
Is this expected behavior? If yes, can you help understand the reason.
                pcie_bus_safe   Set every device's MPS to the largest value
                                supported by all devices below the root complex.
                pcie_bus_perf   Set device MPS to the largest allowable MPS
                                based on its parent bus. Also set MRRS (Max 
                                Read Request Size) to the largest supported
                                value (no larger than the MPS that the device
                                or bus can support) for best performance.
Thanks,
Zhiqiang

> Bjorn
Tyler Hicks Jan. 4, 2023, 12:14 a.m. UTC | #13
On 2022-11-09 17:42:10, Tyler Hicks (Microsoft) wrote:
> On 2022-11-03 17:14:29, Bjorn Helgaas wrote:
> > On Thu, Oct 27, 2022 at 05:51:49PM -0500, Bjorn Helgaas wrote:
> > > On Tue, Oct 25, 2022 at 12:07:47AM -0500, Tyler Hicks wrote:
> > > > On 2022-10-20 15:24:37, Bjorn Helgaas wrote:
> > 
> > > > I've been talking to the firmware folks on why SAFE mode was selected,
> > > > based on Keith's question from Wednesday. From what I've been told,
> > > > U-Boot doesn't seed the RP MPS with a value so the kernel sees a value
> > > > of 128 for p_mps in pci_configure_mps() when using the DEFAULT mode.
> > > > Apparently UEFI does seed the RP MPS but we don't get that with U-Boot.
> > > > Therefore, SAFE mode was selected to get an initial, tuned RP MPS value
> > > > set to 256.
> > > 
> > > Are there any devices below the RP at the time we set MPS=256?
> > > 
> > > > > A subsequent hot-add will do nothing in pci_configure_mps(), and
> > > > > pcie_bus_configure_settings() looks like it would set the RP and EP
> > > > > MPS to the minimum of the RP and EP MPS_Supported.
> > > > > 
> > > > > Is that what you see?  What would you like to see instead?
> > > > 
> > > > No, not quite. After hot-add, we see the EP MPS set to 128 with SAFE
> > > > mode even though the RP MPS is 256.
> > > 
> > > Can you add the relevant topology here so we can work through the
> > > concrete details?
> 
> # lspci -t
> -[0000:00]---00.0-[01-ff]--+-00.0
>                            +-00.1
>                            +-00.2
>                            +-00.3
>                            \-00.4
> 
> 
> > > Is the endpoint hot-added directly below a Root Port?  Or is there a
> > > switch involved?
> 
> There's not a switch involved. The multifunction endpoint is hot-added directly
> below the root port.
> 
> > > What are the MPS_Supported values for all the devices?  If there's a switch
> > > in the picture, it looks like we currently restrict to 128, I think because
> > > it's possible an endpoint that can only do 128 may be added.
> 
> 0000:00's MPS_Supported value is 256.
> 
> The multifunction endpoint's MPS_Supported is 512.
> 
> > Ping?  I'd like to talk about a concrete scenario.  It's too hard for
> > me to imagine all the possible things that could go wrong.
> 
> Sorry for the slow reply here. Thanks for your interest in getting more
> details.

Hey Bjorn - I wanted to re-ping you on this discussion since we're on
the other side of the merge window now. Let me know if you need anymore
details. Thanks!

Tyler

> 
> > I guess part of what's interesting here is that things work better
> > when firmware has already configured MPS?  It doesn't seem like we
> > should *depend* on firmware having done that.
> 
> Our firmware folks felt the same way.
> 
> Tyler
> 
> > 
> > Bjorn
diff mbox series

Patch

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 17a969942d37..2c5a1aefd9cb 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2034,7 +2034,9 @@  static void pci_configure_mps(struct pci_dev *dev)
 	 * Fancier MPS configuration is done later by
 	 * pcie_bus_configure_settings()
 	 */
-	if (pcie_bus_config != PCIE_BUS_DEFAULT)
+	if (pcie_bus_config != PCIE_BUS_DEFAULT &&
+	    pcie_bus_config != PCIE_BUS_SAFE &&
+	    pcie_bus_config != PCIE_BUS_PERFORMANCE)
 		return;
 
 	mpss = 128 << dev->pcie_mpss;
@@ -2047,7 +2049,7 @@  static void pci_configure_mps(struct pci_dev *dev)
 
 	rc = pcie_set_mps(dev, p_mps);
 	if (rc) {
-		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n",
+		pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_peer2peer\" and report a bug\n",
 			 p_mps);
 		return;
 	}