Revert "PCI: dwc: Wait for link up only if link is started"

Message ID 20230706082610.26584-1-johan+linaro@kernel.org (mailing list archive)
State Accepted
Commit 98777c45aaf6bf4ec1e78e960ed01b93ce3b10eb
Delegated to: Krzysztof Wilczyński

Commit Message

Johan Hovold July 6, 2023, 8:26 a.m. UTC
This reverts commit da56a1bfbab55189595e588f1d984bdfb5cf5924.

A recent commit broke controller probe by returning an error in case the
link does not come up during host initialisation.

As explained in commit 886a9c134755 ("PCI: dwc: Move link handling into
common code") and as indicated by the comment "Ignore errors, the link
may come up later" in the code, waiting for link up and ignoring errors
is the intended behaviour:

	 Let's standardize this to succeed as there are usecases where
	 devices (and the link) appear later even without hotplug. For
	 example, a reconfigured FPGA device.

Reverting the offending commit specifically fixes a regression on
Qualcomm platforms like the Lenovo ThinkPad X13s which no longer reach
the interconnect sync state if a slot does not have a device populated
(e.g. an optional modem).

Note that enabling asynchronous probing by default as was done for
Qualcomm platforms by commit c0e1eb441b1d ("PCI: qcom: Enable async
probe by default"), should take care of any related boot time concerns.

Finally, note that the intel-gw driver is the only driver currently not
providing a start_link callback and instead starts the link in its
host_init callback, and which may avoid an additional one-second timeout
during probe by making the link-up wait conditional. If anyone cares,
that can be done in a follow-up patch with a proper motivation.

Fixes: da56a1bfbab5 ("PCI: dwc: Wait for link up only if link is started")
Reported-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Cc: Sajid Dalvi <sdalvi@google.com>
Cc: Ajay Agarwal <ajayagarwal@google.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
---
 .../pci/controller/dwc/pcie-designware-host.c | 13 ++++--------
 drivers/pci/controller/dwc/pcie-designware.c  | 20 +++++++------------
 drivers/pci/controller/dwc/pcie-designware.h  |  1 -
 3 files changed, 11 insertions(+), 23 deletions(-)
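
The control flow that the revert restores can be illustrated with a small userspace model. This is only a sketch: every `model_*` name, the `struct model_pcie` type, and the retry count are hypothetical stand-ins rather than the kernel's `dw_pcie_*` API, and the sleep between polls is left out. It contrasts ignoring the link-up timeout (the restored behaviour) with propagating it (the behaviour of the reverted commit for drivers that provide start_link) when the slot is empty.

```c
/* Userspace model only: none of these types or functions are the kernel's. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_LINK_WAIT_RETRIES 10	/* hypothetical retry budget */

struct model_pcie {
	bool device_present;	/* false models an empty slot */
	bool link_started;
	int polls;		/* how many times the link state was polled */
};

static bool model_link_up(struct model_pcie *pci)
{
	return pci->link_started && pci->device_present;
}

static int model_start_link(struct model_pcie *pci)
{
	pci->link_started = true;
	return 0;
}

static int model_wait_for_link(struct model_pcie *pci)
{
	for (int i = 0; i < MODEL_LINK_WAIT_RETRIES; i++) {
		pci->polls++;
		if (model_link_up(pci))
			return 0;
		/* the kernel helper sleeps between polls; omitted in this model */
	}
	return -ETIMEDOUT;
}

/* Restored behaviour: a link-up timeout does not fail host initialisation. */
static int model_host_init_ignore_timeout(struct model_pcie *pci)
{
	int ret;

	assert(pci != NULL);

	if (!model_link_up(pci)) {
		ret = model_start_link(pci);
		if (ret)
			return ret;
	}

	/* Ignore errors, the link may come up later */
	(void)model_wait_for_link(pci);

	return 0;
}

/* Reverted behaviour (start_link provided): the timeout is propagated. */
static int model_host_init_fail_on_timeout(struct model_pcie *pci)
{
	int ret;

	assert(pci != NULL);

	if (!model_link_up(pci)) {
		ret = model_start_link(pci);
		if (ret)
			return ret;

		ret = model_wait_for_link(pci);
		if (ret)
			return ret;
	}

	return 0;
}
```

In this model, an empty slot makes the first variant return 0 once its polls are exhausted, so initialisation carries on, while the second variant returns -ETIMEDOUT, which corresponds to the probe failure described in the commit message.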

Comments

Manivannan Sadhasivam July 6, 2023, 12:58 p.m. UTC | #1
On Thu, Jul 06, 2023 at 10:26:10AM +0200, Johan Hovold wrote:
> This reverts commit da56a1bfbab55189595e588f1d984bdfb5cf5924.
> 
> A recent commit broke controller probe by returning an error in case the
> link does not come up during host initialisation.
> 
> As explained in commit 886a9c134755 ("PCI: dwc: Move link handling into
> common code") and as indicated by the comment "Ignore errors, the link
> may come up later" in the code, waiting for link up and ignoring errors
> is the intended behaviour:
> 
> 	 Let's standardize this to succeed as there are usecases where
> 	 devices (and the link) appear later even without hotplug. For
> 	 example, a reconfigured FPGA device.
> 
> Reverting the offending commit specifically fixes a regression on
> Qualcomm platforms like the Lenovo ThinkPad X13s which no longer reach
> the interconnect sync state if a slot does not have a device populated
> (e.g. an optional modem).
> 
> Note that enabling asynchronous probing by default as was done for
> Qualcomm platforms by commit c0e1eb441b1d ("PCI: qcom: Enable async
> probe by default"), should take care of any related boot time concerns.
> 
> Finally, note that the intel-gw driver is the only driver currently not
> providing a start_link callback and instead starts the link in its
> host_init callback, and which may avoid an additional one-second timeout
> during probe by making the link-up wait conditional. If anyone cares,
> that can be done in a follow-up patch with a proper motivation.
> 

The offending commit is bogus since it makes the intel-gw _special_ w.r.t
waiting for the link up. Most of the drivers call dw_pcie_host_init() during the
probe time and they all have to wait for 1 sec if the slot is empty.

As Johan noted, intel-gw should make use of the async probe to avoid the boot
delay instead of adding a special case.

> Fixes: da56a1bfbab5 ("PCI: dwc: Wait for link up only if link is started")
> Reported-by: Bjorn Andersson <quic_bjorande@quicinc.com>
> Cc: Sajid Dalvi <sdalvi@google.com>
> Cc: Ajay Agarwal <ajayagarwal@google.com>
> Signed-off-by: Johan Hovold <johan+linaro@kernel.org>

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>

- Mani

> ---
>  .../pci/controller/dwc/pcie-designware-host.c | 13 ++++--------
>  drivers/pci/controller/dwc/pcie-designware.c  | 20 +++++++------------
>  drivers/pci/controller/dwc/pcie-designware.h  |  1 -
>  3 files changed, 11 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index cf61733bf78d..9952057c8819 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -485,20 +485,15 @@ int dw_pcie_host_init(struct dw_pcie_rp *pp)
>  	if (ret)
>  		goto err_remove_edma;
>  
> -	if (dw_pcie_link_up(pci)) {
> -		dw_pcie_print_link_status(pci);
> -	} else {
> +	if (!dw_pcie_link_up(pci)) {
>  		ret = dw_pcie_start_link(pci);
>  		if (ret)
>  			goto err_remove_edma;
> -
> -		if (pci->ops && pci->ops->start_link) {
> -			ret = dw_pcie_wait_for_link(pci);
> -			if (ret)
> -				goto err_stop_link;
> -		}
>  	}
>  
> +	/* Ignore errors, the link may come up later */
> +	dw_pcie_wait_for_link(pci);
> +
>  	bridge->sysdata = pp;
>  
>  	ret = pci_host_probe(bridge);
> diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> index df092229e97d..8e33e6e59e68 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.c
> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> @@ -644,20 +644,9 @@ void dw_pcie_disable_atu(struct dw_pcie *pci, u32 dir, int index)
>  	dw_pcie_writel_atu(pci, dir, index, PCIE_ATU_REGION_CTRL2, 0);
>  }
>  
> -void dw_pcie_print_link_status(struct dw_pcie *pci)
> -{
> -	u32 offset, val;
> -
> -	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> -	val = dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKSTA);
> -
> -	dev_info(pci->dev, "PCIe Gen.%u x%u link up\n",
> -		 FIELD_GET(PCI_EXP_LNKSTA_CLS, val),
> -		 FIELD_GET(PCI_EXP_LNKSTA_NLW, val));
> -}
> -
>  int dw_pcie_wait_for_link(struct dw_pcie *pci)
>  {
> +	u32 offset, val;
>  	int retries;
>  
>  	/* Check if the link is up or not */
> @@ -673,7 +662,12 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
>  		return -ETIMEDOUT;
>  	}
>  
> -	dw_pcie_print_link_status(pci);
> +	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> +	val = dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKSTA);
> +
> +	dev_info(pci->dev, "PCIe Gen.%u x%u link up\n",
> +		 FIELD_GET(PCI_EXP_LNKSTA_CLS, val),
> +		 FIELD_GET(PCI_EXP_LNKSTA_NLW, val));
>  
>  	return 0;
>  }
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> index 615660640801..79713ce075cc 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -429,7 +429,6 @@ void dw_pcie_setup(struct dw_pcie *pci);
>  void dw_pcie_iatu_detect(struct dw_pcie *pci);
>  int dw_pcie_edma_detect(struct dw_pcie *pci);
>  void dw_pcie_edma_remove(struct dw_pcie *pci);
> -void dw_pcie_print_link_status(struct dw_pcie *pci);
>  
>  static inline void dw_pcie_writel_dbi(struct dw_pcie *pci, u32 reg, u32 val)
>  {
> -- 
> 2.39.3
>

Johan Hovold July 7, 2023, 12:47 p.m. UTC | #2
On Thu, Jul 06, 2023 at 06:28:11PM +0530, Manivannan Sadhasivam wrote:
> On Thu, Jul 06, 2023 at 10:26:10AM +0200, Johan Hovold wrote:

> > Finally, note that the intel-gw driver is the only driver currently not
> > providing a start_link callback and instead starts the link in its
> > host_init callback, and which may avoid an additional one-second timeout
> > during probe by making the link-up wait conditional. If anyone cares,
> > that can be done in a follow-up patch with a proper motivation.

> The offending commit is bogus since it makes the intel-gw _special_ w.r.t
> waiting for the link up. Most of the drivers call dw_pcie_host_init() during the
> probe time and they all have to wait for 1 sec if the slot is empty.

Just to clarify, the intel-gw driver starts the link and waits for link
up in its host_init() callback, which is called during probe. That wait
could possibly just be dropped in favour of the one in
dw_pcie_host_init() and/or the driver could be reworked to implement
start_link().

Either way, the call in dw_pcie_host_init() will only add an additional
1 second delay in cases where the link did *not* come up.

> As Johan noted, intel-gw should make use of the async probe to avoid the boot
> delay instead of adding a special case.

Indeed.

Johan

Ajay Agarwal July 10, 2023, 4:21 p.m. UTC | #3
On Fri, Jul 07, 2023 at 02:47:56PM +0200, Johan Hovold wrote:
> On Thu, Jul 06, 2023 at 06:28:11PM +0530, Manivannan Sadhasivam wrote:
> > On Thu, Jul 06, 2023 at 10:26:10AM +0200, Johan Hovold wrote:
> 
> > > Finally, note that the intel-gw driver is the only driver currently not
> > > providing a start_link callback and instead starts the link in its
> > > host_init callback, and which may avoid an additional one-second timeout
> > > during probe by making the link-up wait conditional. If anyone cares,
> > > that can be done in a follow-up patch with a proper motivation.
> 
> > The offending commit is bogus since it makes the intel-gw _special_ w.r.t
> > waiting for the link up. Most of the drivers call dw_pcie_host_init() during the
> > > probe time and they all have to wait for 1 sec if the slot is empty.

Mani, can you please explain how my commit made the intel-gw driver
special? The intel driver actually fails the dw_pcie_host_init if the
link does not come up. That was my motivation behind adding the fail
logic in the core driver as well.

> 
> Just to clarify, the intel-gw driver starts the link and waits for link
> up in its host_init() callback, which is called during probe. That wait
> could possibly just be dropped in favour of the one in
> dw_pcie_host_init() and/or the driver could be reworked to implement
> start_link().
> 
> Either way, the call in dw_pcie_host_init() will only add an additional
> 1 second delay in cases where the link did *not* come up.
> 
> > As Johan noted, intel-gw should make use of the async probe to avoid the boot
> > delay instead of adding a special case.
> 
> Indeed.
> 
> Johan

Johan, Mani
My apologies for adding this regression in some of the SOCs.
May I suggest to keep my patch and make the following change instead?
This shall keep the existing behavior as is, and save the boot time
for drivers that do not define the start_link()?

```
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index cf61733bf78d..af6a7cd060b1 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -492,11 +492,8 @@ int dw_pcie_host_init(struct dw_pcie_rp *pp)
                if (ret)
                        goto err_remove_edma;

-               if (pci->ops && pci->ops->start_link) {
-                       ret = dw_pcie_wait_for_link(pci);
-                       if (ret)
-                               goto err_stop_link;
-               }
+               if (pci->ops && pci->ops->start_link)
+                       dw_pcie_wait_for_link(pci);
        }

        bridge->sysdata = pp;
```

Ajay Agarwal July 10, 2023, 4:42 p.m. UTC | #4
On Mon, Jul 10, 2023 at 09:51:22PM +0530, Ajay Agarwal wrote:
> On Fri, Jul 07, 2023 at 02:47:56PM +0200, Johan Hovold wrote:
> > On Thu, Jul 06, 2023 at 06:28:11PM +0530, Manivannan Sadhasivam wrote:
> > > On Thu, Jul 06, 2023 at 10:26:10AM +0200, Johan Hovold wrote:
> > 
> > > > Finally, note that the intel-gw driver is the only driver currently not
> > > > providing a start_link callback and instead starts the link in its
> > > > host_init callback, and which may avoid an additional one-second timeout
> > > > during probe by making the link-up wait conditional. If anyone cares,
> > > > that can be done in a follow-up patch with a proper motivation.
> > 
> > > The offending commit is bogus since it makes the intel-gw _special_ w.r.t
> > > waiting for the link up. Most of the drivers call dw_pcie_host_init() during the
> > > probe time and they all have to wait for 1 sec if the slot is empty.
> Mani, can you please explain how my commit made the intel-gw driver
> special? The intel driver actually fails the dw_pcie_host_init if the
> link does not come up. That was my motivation behind adding the fail
> logic in the core driver as well.
> > 
> > Just to clarify, the intel-gw driver starts the link and waits for link
> > up in its host_init() callback, which is called during probe. That wait
> > could possibly just be dropped in favour of the one in
> > dw_pcie_host_init() and/or the driver could be reworked to implement
> > start_link().
> > 
> > Either way, the call in dw_pcie_host_init() will only add an additional
> > 1 second delay in cases where the link did *not* come up.
> > 
> > > As Johan noted, intel-gw should make use of the async probe to avoid the boot
> > > delay instead of adding a special case.
> > 
> > Indeed.
> > 
> > Johan
> Johan, Mani
> My apologies for adding this regression in some of the SOCs.
> May I suggest to keep my patch and make the following change instead?
> This shall keep the existing behavior as is, and save the boot time
> for drivers that do not define the start_link()?
> 
> ```
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index cf61733bf78d..af6a7cd060b1 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -492,11 +492,8 @@ int dw_pcie_host_init(struct dw_pcie_rp *pp)
>                 if (ret)
>                         goto err_remove_edma;
> 
> -               if (pci->ops && pci->ops->start_link) {
> -                       ret = dw_pcie_wait_for_link(pci);
> -                       if (ret)
> -                               goto err_stop_link;
> -               }
> +               if (pci->ops && pci->ops->start_link)
> +                       dw_pcie_wait_for_link(pci);
>         }
> 
>         bridge->sysdata = pp;
> ```

I just realized that Fabio pushed exactly the same patch as I suggested
here:
https://lore.kernel.org/all/20230704122635.1362156-1-festevam@gmail.com/.
I think it is better we take it instead of reverting my commit.

Krzysztof Wilczyński July 10, 2023, 5:06 p.m. UTC | #5
Hello,

> > > > > Finally, note that the intel-gw driver is the only driver currently not
> > > > > providing a start_link callback and instead starts the link in its
> > > > > host_init callback, and which may avoid an additional one-second timeout
> > > > > during probe by making the link-up wait conditional. If anyone cares,
> > > > > that can be done in a follow-up patch with a proper motivation.
> > > 
> > > > The offending commit is bogus since it makes the intel-gw _special_ w.r.t
> > > > waiting for the link up. Most of the drivers call dw_pcie_host_init() during the
> > > > probe time and they all have to wait for 1 sec if the slot is empty.
> >
> > Mani, can you please explain how my commit made the intel-gw driver
> > special? The intel driver actually fails the dw_pcie_host_init if the
> > link does not come up. That was my motivation behind adding the fail
> > logic in the core driver as well.
> > > 
> > > Just to clarify, the intel-gw driver starts the link and waits for link
> > > up in its host_init() callback, which is called during probe. That wait
> > > could possibly just be dropped in favour of the one in
> > > dw_pcie_host_init() and/or the driver could be reworked to implement
> > > start_link().
> > > 
> > > Either way, the call in dw_pcie_host_init() will only add an additional
> > > 1 second delay in cases where the link did *not* come up.
> > > 
> > > > As Johan noted, intel-gw should make use of the async probe to avoid the boot
> > > > delay instead of adding a special case.
> > > 
> > > Indeed.

The whole conversation above about the intel-gw driver: would something
need to be addressed here?  Or can I pick the suggested fix?

> > My apologies for adding this regression in some of the SOCs.
> > May I suggest to keep my patch and make the following change instead?
> > This shall keep the existing behavior as is, and save the boot time
> > for drivers that do not define the start_link()?
[...]

> I just realized that Fabio pushed exactly the same patch as I suggested
> here:
> https://lore.kernel.org/all/20230704122635.1362156-1-festevam@gmail.com/.
> I think it is better we take it instead of reverting my commit.

Will do.  I will also make sure that we have correct attributions in place.

	Krzysztof

Johan Hovold July 11, 2023, 6:52 a.m. UTC | #6
On Tue, Jul 11, 2023 at 02:06:08AM +0900, Krzysztof Wilczyński wrote:

> > > > > > Finally, note that the intel-gw driver is the only driver currently not
> > > > > > providing a start_link callback and instead starts the link in its
> > > > > > host_init callback, and which may avoid an additional one-second timeout
> > > > > > during probe by making the link-up wait conditional. If anyone cares,
> > > > > > that can be done in a follow-up patch with a proper motivation.

> The whole conversation above about the intel-gw driver: would something
> need to be addressed here?  Or can I pick the suggested fix?

No, it's just another indication that the offending commit was confused.

All mainline drivers already start the link before that
wait-for-link-up, so the commit in question makes very little sense.
That's why I prefer reverting it, so as to not pollute the git logs
(e.g. for git blame) with misleading justifications.

> > > My apologies for adding this regression in some of the SOCs.
> > > May I suggest to keep my patch and make the following change instead?
> > > This shall keep the existing behavior as is, and save the boot time
> > > for drivers that do not define the start_link()?
> [...]
> 
> > I just realized that Fabio pushed exactly the same patch as I suggested
> > here:
> > https://lore.kernel.org/all/20230704122635.1362156-1-festevam@gmail.com/.
> > I think it is better we take it instead of reverting my commit.
> 
> Will do.  I will also make sure that we have correct attributions in place.

As I mentioned in the commit message, I think the commit should just be
reverted and if there's a valid argument to be made for a similar type
of change (without the breakage), that can be done as a follow-up with a
proper motivation.

Johan

Manivannan Sadhasivam July 11, 2023, 7:37 a.m. UTC | #7
On Mon, Jul 10, 2023 at 09:51:22PM +0530, Ajay Agarwal wrote:
> On Fri, Jul 07, 2023 at 02:47:56PM +0200, Johan Hovold wrote:
> > On Thu, Jul 06, 2023 at 06:28:11PM +0530, Manivannan Sadhasivam wrote:
> > > On Thu, Jul 06, 2023 at 10:26:10AM +0200, Johan Hovold wrote:
> > 
> > > > Finally, note that the intel-gw driver is the only driver currently not
> > > > providing a start_link callback and instead starts the link in its
> > > > host_init callback, and which may avoid an additional one-second timeout
> > > > during probe by making the link-up wait conditional. If anyone cares,
> > > > that can be done in a follow-up patch with a proper motivation.
> > 
> > > The offending commit is bogus since it makes the intel-gw _special_ w.r.t
> > > waiting for the link up. Most of the drivers call dw_pcie_host_init() during the
> > > probe time and they all have to wait for 1 sec if the slot is empty.
> Mani, can you please explain how my commit made the intel-gw driver
> special? The intel driver actually fails the dw_pcie_host_init if the
> link does not come up. That was my motivation behind adding the fail
> logic in the core driver as well.

Your commit ends up failing the probe if dw_pcie_wait_for_link() fails on
SoCs that define the start_link() callback, which is the case for all drivers
except intel-gw. I take back my _special_ argument: intel-gw was already special
before your commit, and your commit made its behavior apply to all SoCs.

> > 
> > Just to clarify, the intel-gw driver starts the link and waits for link
> > up in its host_init() callback, which is called during probe. That wait
> > could possibly just be dropped in favour of the one in
> > dw_pcie_host_init() and/or the driver could be reworked to implement
> > start_link().
> > 
> > Either way, the call in dw_pcie_host_init() will only add an additional
> > 1 second delay in cases where the link did *not* come up.
> > 
> > > As Johan noted, intel-gw should make use of the async probe to avoid the boot
> > > delay instead of adding a special case.
> > 
> > Indeed.
> > 
> > Johan
> Johan, Mani
> My apologies for adding this regression in some of the SOCs.
> May I suggest to keep my patch and make the following change instead?
> This shall keep the existing behavior as is, and save the boot time
> for drivers that do not define the start_link()?
> 

No, IMO the offending commit did not serve its purpose, so a revert makes
sense. If the intention was to reduce the boot delay, it did not achieve that,
because dw_pcie_wait_for_link() is still called from intel-gw's host_init()
callback. You only skipped the other instance, the one in
dw_pcie_host_init().

So to fix this issue properly intel-gw needs to do 2 things:

1. Move the ltssm_enable to start_link() callback and get rid of
dw_pcie_wait_for_link() from its host_init() callback. If there is any special
reason to not do this way, please explain.

2. Enable async probe so that other drivers can continue probing while this
driver waits for the link to be up. This will make the delay almost negligible.

The above 2 should be done in separate patches.

- Mani

> ```
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index cf61733bf78d..af6a7cd060b1 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -492,11 +492,8 @@ int dw_pcie_host_init(struct dw_pcie_rp *pp)
>                 if (ret)
>                         goto err_remove_edma;
> 
> -               if (pci->ops && pci->ops->start_link) {
> -                       ret = dw_pcie_wait_for_link(pci);
> -                       if (ret)
> -                               goto err_stop_link;
> -               }
> +               if (pci->ops && pci->ops->start_link)
> +                       dw_pcie_wait_for_link(pci);
>         }
> 
>         bridge->sysdata = pp;
> ```
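
The first of the two steps suggested above (start the link from a start_link() callback and leave the waiting to the core) can be sketched in a userspace model. This is an illustration under stated assumptions, not the intel-gw driver or the DesignWare core: all `sketch_*` names and the `struct sketch_pcie` layout are hypothetical, and the wait is reduced to a single check. Step 2 (async probe) is a driver-core setting and is not modelled here.

```c
/* Userspace model only: hypothetical types, not the kernel's dw_pcie API. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_pcie;

struct sketch_ops {
	int (*host_init)(struct sketch_pcie *pci);
	int (*start_link)(struct sketch_pcie *pci);
};

struct sketch_pcie {
	const struct sketch_ops *ops;
	bool device_present;	/* false models an empty slot */
	bool ltssm_enabled;
	int waits;		/* number of wait-for-link rounds performed */
};

/* One wait round; a real implementation would poll with a timeout. */
static int sketch_wait_for_link(struct sketch_pcie *pci)
{
	pci->waits++;
	return (pci->ltssm_enabled && pci->device_present) ? 0 : -ETIMEDOUT;
}

/* Driver side: host_init only prepares the controller and does not wait. */
static int sketch_driver_host_init(struct sketch_pcie *pci)
{
	(void)pci;	/* clock, reset and PHY setup would go here */
	return 0;
}

/* Driver side: link training is started from the dedicated callback. */
static int sketch_driver_start_link(struct sketch_pcie *pci)
{
	pci->ltssm_enabled = true;
	return 0;
}

static const struct sketch_ops sketch_driver_ops = {
	.host_init = sketch_driver_host_init,
	.start_link = sketch_driver_start_link,
};

/* Core side: init, start the link, wait once, ignore a timeout. */
static int sketch_core_host_init(struct sketch_pcie *pci)
{
	int ret;

	assert(pci != NULL && pci->ops != NULL);

	if (pci->ops->host_init) {
		ret = pci->ops->host_init(pci);
		if (ret)
			return ret;
	}

	if (pci->ops->start_link) {
		ret = pci->ops->start_link(pci);
		if (ret)
			return ret;
	}

	/* Ignore errors, the link may come up later */
	(void)sketch_wait_for_link(pci);

	return 0;
}
```

With this split, the model waits for the link exactly once per initialisation, whether or not a device is present, and an empty slot does not fail initialisation.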

Ajay Agarwal July 12, 2023, 5:45 p.m. UTC | #8
On Tue, Jul 11, 2023 at 08:52:23AM +0200, Johan Hovold wrote:
> On Tue, Jul 11, 2023 at 02:06:08AM +0900, Krzysztof Wilczyński wrote:
> 
> > > > > > > Finally, note that the intel-gw driver is the only driver currently not
> > > > > > > providing a start_link callback and instead starts the link in its
> > > > > > > host_init callback, and which may avoid an additional one-second timeout
> > > > > > > during probe by making the link-up wait conditional. If anyone cares,
> > > > > > > that can be done in a follow-up patch with a proper motivation.
> 
> > The whole conversation above about the intel-gw driver: would something
> > need to be addressed here?  Or can I pick the suggested fix?
> 
> No, it's just another indication that the offending commit was confused.
> 
> All mainline drivers already start the link before that
> wait-for-link-up, so the commit in question makes very little sense.
> That's why I prefer reverting it, so as to not pollute the git logs
> (e.g. for git blame) with misleading justifications.

Johan, Mani,
I am developing a PCIe driver which will not have the start_link
callback defined. Instead, the link will be coming up much later based
on some other trigger. So my driver will not attempt the LTSSM training
on probe. So even if the probe is made asynchronous, it will still end
up wasting 1 second of time.

> 
> > > > My apologies for adding this regression in some of the SOCs.
> > > > May I suggest to keep my patch and make the following change instead?
> > > > This shall keep the existing behavior as is, and save the boot time
> > > > for drivers that do not define the start_link()?
> > [...]
> > 
> > > I just realized that Fabio pushed exactly the same patch as I suggested
> > > here:
> > > https://lore.kernel.org/all/20230704122635.1362156-1-festevam@gmail.com/.
> > > I think it is better we take it instead of reverting my commit.
> > 
> > Will do.  I will also make sure that we have correct attributions in place.
> 
> As I mentioned in the commit message, I think the commit should just be
> reverted and if there's a valid argument to be made for a similar type
> of change (without the breakage), that can be done as a follow-up with a
> proper motivation.
> 
> Johan

I agree that my commit created a regression in some of the existing SoCs.
I should not have taken the liberty to return an error if the
wait-for-link-up call fails in the probe.
But my commit's message body clearly mentions the motivation behind
calling dw_pcie_wait_for_link() only if start_link is defined. Can
you please re-evaluate the decision to revert my patch and pick up the
suggested fix instead?

Thanks
Ajay

Johan Hovold July 14, 2023, 8:55 a.m. UTC | #9
On Wed, Jul 12, 2023 at 11:15:54PM +0530, Ajay Agarwal wrote:
> On Tue, Jul 11, 2023 at 08:52:23AM +0200, Johan Hovold wrote:

> > All mainline drivers already start the link before that
> > wait-for-link-up, so the commit in question makes very little sense.
> > That's why I prefer reverting it, so as to not pollute the git logs
> > (e.g. for git blame) with misleading justifications.

> I am developing a PCIe driver which will not have the start_link
> callback defined. Instead, the link will be coming up much later based
> on some other trigger. So my driver will not attempt the LTSSM training
> on probe. So even if the probe is made asynchronous, it will still end
> up wasting 1 second of time.

Yeah, I had the suspicion that this was really motivated by some
out-of-tree driver, which as I'm sure you know, is not a concern for
mainline.

Vendor drivers do all sorts of crazy stuff and we don't carry code in
mainline for the sole benefit of such drivers that have not been
upstreamed (and likely never will be).

So again, I think this patch should just be reverted.

If you want to get something like this in, you can send a follow-on
patch describing your actual motivation and use case. But as it appears
to boil down to "I need this for my out-of-tree driver", I suspect such
a patch would still be rejected.

Johan

Bjorn Helgaas July 25, 2023, 8:05 p.m. UTC | #10
[+cc Fabio, Xiaolei, Jon]

On Thu, Jul 06, 2023 at 10:26:10AM +0200, Johan Hovold wrote:
> This reverts commit da56a1bfbab55189595e588f1d984bdfb5cf5924.
> 
> A recent commit broke controller probe by returning an error in case the
> link does not come up during host initialisation.
> 
> As explained in commit 886a9c134755 ("PCI: dwc: Move link handling into
> common code") and as indicated by the comment "Ignore errors, the link
> may come up later" in the code, waiting for link up and ignoring errors
> is the intended behaviour:
> 
> 	 Let's standardize this to succeed as there are usecases where
> 	 devices (and the link) appear later even without hotplug. For
> 	 example, a reconfigured FPGA device.
> 
> Reverting the offending commit specifically fixes a regression on
> Qualcomm platforms like the Lenovo ThinkPad X13s which no longer reach
> the interconnect sync state if a slot does not have a device populated
> (e.g. an optional modem).
> 
> Note that enabling asynchronous probing by default as was done for
> Qualcomm platforms by commit c0e1eb441b1d ("PCI: qcom: Enable async
> probe by default"), should take care of any related boot time concerns.
> 
> Finally, note that the intel-gw driver is the only driver currently not
> providing a start_link callback and instead starts the link in its
> host_init callback, and which may avoid an additional one-second timeout
> during probe by making the link-up wait conditional. If anyone cares,
> that can be done in a follow-up patch with a proper motivation.
> 
> Fixes: da56a1bfbab5 ("PCI: dwc: Wait for link up only if link is started")
> Reported-by: Bjorn Andersson <quic_bjorande@quicinc.com>
> Cc: Sajid Dalvi <sdalvi@google.com>
> Cc: Ajay Agarwal <ajayagarwal@google.com>
> Signed-off-by: Johan Hovold <johan+linaro@kernel.org>

da56a1bfbab5 appeared in v6.5-rc1, so we should definitely fix this
before v6.5.

Based on the conversation here, I applied this to for-linus for v6.5.

I looked for Bjorn A's report but couldn't find it; I'd like to
include the URL if there is one.  I did add the reports from Fabio
Estevam, Xiaolei Wang, and Jon Hunter (Fabio and Xiaolei even included
patches).

Current commit log, corrections/additions welcome:

  This reverts commit da56a1bfbab55189595e588f1d984bdfb5cf5924.

  Bjorn Andersson, Fabio Estevam, Xiaolei Wang, and Jon Hunter reported that
  da56a1bfbab5 ("PCI: dwc: Wait for link up only if link is started") broke
  controller probing by returning an error in case the link does not come up
  during host initialisation, e.g., when the slot is empty.

  As explained in commit 886a9c134755 ("PCI: dwc: Move link handling into
  common code") and as indicated by the comment "Ignore errors, the link may
  come up later" in the code, waiting for link up and ignoring errors is the
  intended behaviour:

    Let's standardize this to succeed as there are usecases where devices
    (and the link) appear later even without hotplug. For example, a
    reconfigured FPGA device.

  Reverting the offending commit specifically fixes a regression on Qualcomm
  platforms like the Lenovo ThinkPad X13s which no longer reach the
  interconnect sync state if a slot does not have a device populated (e.g. an
  optional modem).

  Note that enabling asynchronous probing by default as was done for Qualcomm
  platforms by commit c0e1eb441b1d ("PCI: qcom: Enable async probe by
  default"), should take care of any related boot time concerns.

  Finally, note that the intel-gw driver is the only driver currently not
  providing a .start_link() callback and instead starts the link in its
  .host_init() callback, which may avoid an additional one-second timeout
  during probe by making the link-up wait conditional. If anyone cares, that
  can be done in a follow-up patch with a proper motivation.

  [bhelgaas: add Fabio Estevam, Xiaolei Wang, Jon Hunter reports]
  Fixes: da56a1bfbab5 ("PCI: dwc: Wait for link up only if link is started")
  Link: https://lore.kernel.org/r/20230704122635.1362156-1-festevam@gmail.com/
  Link: https://lore.kernel.org/r/20230705010624.3912934-1-xiaolei.wang@windriver.com/
  Link: https://lore.kernel.org/r/6ca287a1-6c7c-7b90-9022-9e73fb82b564@nvidia.com
  Link: https://lore.kernel.org/r/20230706082610.26584-1-johan+linaro@kernel.org
  Reported-by: Bjorn Andersson <quic_bjorande@quicinc.com>
  Reported-by: Fabio Estevam <festevam@gmail.com>
  Reported-by: Xiaolei Wang <xiaolei.wang@windriver.com>
  Reported-by: Jon Hunter <jonathanh@nvidia.com>
  Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
  Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
  Cc: Sajid Dalvi <sdalvi@google.com>
  Cc: Ajay Agarwal <ajayagarwal@google.com>

> ---
>  .../pci/controller/dwc/pcie-designware-host.c | 13 ++++--------
>  drivers/pci/controller/dwc/pcie-designware.c  | 20 +++++++------------
>  drivers/pci/controller/dwc/pcie-designware.h  |  1 -
>  3 files changed, 11 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index cf61733bf78d..9952057c8819 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -485,20 +485,15 @@ int dw_pcie_host_init(struct dw_pcie_rp *pp)
>  	if (ret)
>  		goto err_remove_edma;
>  
> -	if (dw_pcie_link_up(pci)) {
> -		dw_pcie_print_link_status(pci);
> -	} else {
> +	if (!dw_pcie_link_up(pci)) {
>  		ret = dw_pcie_start_link(pci);
>  		if (ret)
>  			goto err_remove_edma;
> -
> -		if (pci->ops && pci->ops->start_link) {
> -			ret = dw_pcie_wait_for_link(pci);
> -			if (ret)
> -				goto err_stop_link;
> -		}
>  	}
>  
> +	/* Ignore errors, the link may come up later */
> +	dw_pcie_wait_for_link(pci);
> +
>  	bridge->sysdata = pp;
>  
>  	ret = pci_host_probe(bridge);
> diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> index df092229e97d..8e33e6e59e68 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.c
> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> @@ -644,20 +644,9 @@ void dw_pcie_disable_atu(struct dw_pcie *pci, u32 dir, int index)
>  	dw_pcie_writel_atu(pci, dir, index, PCIE_ATU_REGION_CTRL2, 0);
>  }
>  
> -void dw_pcie_print_link_status(struct dw_pcie *pci)
> -{
> -	u32 offset, val;
> -
> -	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> -	val = dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKSTA);
> -
> -	dev_info(pci->dev, "PCIe Gen.%u x%u link up\n",
> -		 FIELD_GET(PCI_EXP_LNKSTA_CLS, val),
> -		 FIELD_GET(PCI_EXP_LNKSTA_NLW, val));
> -}
> -
>  int dw_pcie_wait_for_link(struct dw_pcie *pci)
>  {
> +	u32 offset, val;
>  	int retries;
>  
>  	/* Check if the link is up or not */
> @@ -673,7 +662,12 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
>  		return -ETIMEDOUT;
>  	}
>  
> -	dw_pcie_print_link_status(pci);
> +	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> +	val = dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKSTA);
> +
> +	dev_info(pci->dev, "PCIe Gen.%u x%u link up\n",
> +		 FIELD_GET(PCI_EXP_LNKSTA_CLS, val),
> +		 FIELD_GET(PCI_EXP_LNKSTA_NLW, val));
>  
>  	return 0;
>  }
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> index 615660640801..79713ce075cc 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -429,7 +429,6 @@ void dw_pcie_setup(struct dw_pcie *pci);
>  void dw_pcie_iatu_detect(struct dw_pcie *pci);
>  int dw_pcie_edma_detect(struct dw_pcie *pci);
>  void dw_pcie_edma_remove(struct dw_pcie *pci);
> -void dw_pcie_print_link_status(struct dw_pcie *pci);
>  
>  static inline void dw_pcie_writel_dbi(struct dw_pcie *pci, u32 reg, u32 val)
>  {
> -- 
> 2.39.3
>

Johan Hovold July 26, 2023, 8:30 a.m. UTC | #11
On Tue, Jul 25, 2023 at 03:05:15PM -0500, Bjorn Helgaas wrote:
> [+cc Fabio, Xiaolei, Jon]
> 
> On Thu, Jul 06, 2023 at 10:26:10AM +0200, Johan Hovold wrote:
> > This reverts commit da56a1bfbab55189595e588f1d984bdfb5cf5924.
> > 
> > A recent commit broke controller probe by returning an error in case the
> > link does not come up during host initialisation.
> > 
> > As explained in commit 886a9c134755 ("PCI: dwc: Move link handling into
> > common code") and as indicated by the comment "Ignore errors, the link
> > may come up later" in the code, waiting for link up and ignoring errors
> > is the intended behaviour:
> > 
> > 	 Let's standardize this to succeed as there are usecases where
> > 	 devices (and the link) appear later even without hotplug. For
> > 	 example, a reconfigured FPGA device.
> > 
> > Reverting the offending commit specifically fixes a regression on
> > Qualcomm platforms like the Lenovo ThinkPad X13s which no longer reach
> > the interconnect sync state if a slot does not have a device populated
> > (e.g. an optional modem).
> > 
> > Note that enabling asynchronous probing by default as was done for
> > Qualcomm platforms by commit c0e1eb441b1d ("PCI: qcom: Enable async
> > probe by default"), should take care of any related boot time concerns.
> > 
> > Finally, note that the intel-gw driver is the only driver currently not
> > providing a start_link callback and instead starts the link in its
> > host_init callback, and which may avoid an additional one-second timeout
> > during probe by making the link-up wait conditional. If anyone cares,
> > that can be done in a follow-up patch with a proper motivation.
> > 
> > Fixes: da56a1bfbab5 ("PCI: dwc: Wait for link up only if link is started")
> > Reported-by: Bjorn Andersson <quic_bjorande@quicinc.com>
> > Cc: Sajid Dalvi <sdalvi@google.com>
> > Cc: Ajay Agarwal <ajayagarwal@google.com>
> > Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
> 
> da56a1bfbab5 appeared in v6.5-rc1, so we should definitely fix this
> before v6.5.
> 
> Based on the conversation here, I applied this to for-linus for v6.5.
> 
> I looked for Bjorn A's report but couldn't find it; I'd like to
> include the URL if there is one.  I did add the reports from Fabio
> Estevam, Xiaolei Wang, and Jon Hunter (Fabio and Xiaolei even included
> patches).

Bjorn only mentioned to me off-list that something was wrong with
PCI in linux-next and that it prevented us from reaching the sync state.
So there is no URL to a public report to link to.

> Current commit log, corrections/additions welcome:
> 
>   This reverts commit da56a1bfbab55189595e588f1d984bdfb5cf5924.
> 
>   Bjorn Andersson, Fabio Estevam, Xiaolei Wang, and Jon Hunter reported that
>   da56a1bfbab5 ("PCI: dwc: Wait for link up only if link is started") broke
>   controller probing by returning an error in case the link does not come up
>   during host initialisation, e.g., when the slot is empty.

I think that adding the corresponding Reported-by tags and Links after
my SoB below should be enough to credit reports that I was not aware of
when investigating this.

But if you decide to rewrite this paragraph, then please spell out "for
example" as I would not use "e.g." outside of parentheses.

>   As explained in commit 886a9c134755 ("PCI: dwc: Move link handling into
>   common code") and as indicated by the comment "Ignore errors, the link may
>   come up later" in the code, waiting for link up and ignoring errors is the
>   intended behaviour:
> 
>     Let's standardize this to succeed as there are usecases where devices
>     (and the link) appear later even without hotplug. For example, a
>     reconfigured FPGA device.
> 
>   Reverting the offending commit specifically fixes a regression on Qualcomm
>   platforms like the Lenovo ThinkPad X13s which no longer reach the
>   interconnect sync state if a slot does not have a device populated (e.g. an
>   optional modem).
> 
>   Note that enabling asynchronous probing by default as was done for Qualcomm
>   platforms by commit c0e1eb441b1d ("PCI: qcom: Enable async probe by
>   default"), should take care of any related boot time concerns.
> 
>   Finally, note that the intel-gw driver is the only driver currently not
>   providing a .start_link() callback and instead starts the link in its
>   .host_init() callback, which may avoid an additional one-second timeout
>   during probe by making the link-up wait conditional. If anyone cares, that
>   can be done in a follow-up patch with a proper motivation.
> 
>   [bhelgaas: add Fabio Estevam, Xiaolei Wang, Jon Hunter reports]
>   Fixes: da56a1bfbab5 ("PCI: dwc: Wait for link up only if link is started")
>   Link: https://lore.kernel.org/r/20230704122635.1362156-1-festevam@gmail.com/
>   Link: https://lore.kernel.org/r/20230705010624.3912934-1-xiaolei.wang@windriver.com/
>   Link: https://lore.kernel.org/r/6ca287a1-6c7c-7b90-9022-9e73fb82b564@nvidia.com
>   Link: https://lore.kernel.org/r/20230706082610.26584-1-johan+linaro@kernel.org
>   Reported-by: Bjorn Andersson <quic_bjorande@quicinc.com>
>   Reported-by: Fabio Estevam <festevam@gmail.com>
>   Reported-by: Xiaolei Wang <xiaolei.wang@windriver.com>
>   Reported-by: Jon Hunter <jonathanh@nvidia.com>
>   Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
>   Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>   Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
>   Cc: Sajid Dalvi <sdalvi@google.com>
>   Cc: Ajay Agarwal <ajayagarwal@google.com>

Looks like you've "sorted" the trailers here instead of keeping the
temporal order (which would make it more clear what you added after I
posted the patch) and adding each Link after its corresponding
Reported-by tag (e.g. as suggested by checkpatch these days).

Johan

Bjorn Helgaas July 26, 2023, 3:58 p.m. UTC | #12
On Wed, Jul 26, 2023 at 10:30:24AM +0200, Johan Hovold wrote:
> On Tue, Jul 25, 2023 at 03:05:15PM -0500, Bjorn Helgaas wrote:
> ...

> I think that adding the corresponding Reported-by tags and Links after
> my SoB below should be enough to credit reports that I was not aware of
> when investigating this.
> 
> But if you decide to rewrite this paragraph, then please spell out "for
> example" as I would not use "e.g." outside of parentheses.
> ...

> Looks like you've "sorted" the trailers here instead of keeping the
> temporal order (which would make it more clear what you added after I
> posted the patch) and adding each Link after its corresponding
> Reported-by tag (e.g. as suggested by checkpatch these days).

Updated as you suggested, thanks.

Bjorn

Ajay Agarwal Jan. 11, 2024, 3:43 p.m. UTC | #13
On Fri, Jul 14, 2023 at 10:55:29AM +0200, Johan Hovold wrote:
> On Wed, Jul 12, 2023 at 11:15:54PM +0530, Ajay Agarwal wrote:
> > On Tue, Jul 11, 2023 at 08:52:23AM +0200, Johan Hovold wrote:
> 
> > > All mainline drivers already start the link before that
> > > wait-for-link-up, so the commit in question makes very little sense.
> > > That's why I prefer reverting it, so as to not pollute the git logs
> > > (e.g. for git blame) with misleading justifications.
> 
> > I am developing a PCIe driver which will not have the start_link
> > callback defined. Instead, the link will be coming up much later based
> > on some other trigger. So my driver will not attempt the LTSSM training
> > on probe. So even if the probe is made asynchronous, it will still end
> > up wasting 1 second of time.
> 
> Yeah, I had the suspicion that this was really motivated by some
> out-of-tree driver, which as I'm sure you know, is not a concern for
> mainline.
> 
> Vendor drivers do all sorts of crazy stuff and we don't carry code in
> mainline for the sole benefit of such drivers that have not been
> upstreamed (and likely never will be).
> 
> So again, I think this patch should just be reverted.
> 
> If you want to get something like this in, you can send a follow-on
> patch describing your actual motivation and use case. But as it appears
> to boil down to "I need this for my out-of-tree driver", I suspect such
> a patch would still be rejected.
> 
> Johan

Johan, Mani,
I submitted https://lore.kernel.org/all/20240111152517.1881382-1-ajayagarwal@google.com/
which does not return an error value if the dw_pcie_wait_for_link fails
in probe. Can you please check and provide your comments?

Thanks
Ajay

Ajay Agarwal Jan. 12, 2024, 10 a.m. UTC | #14
On Tue, Jul 11, 2023 at 01:07:19PM +0530, Manivannan Sadhasivam wrote:
> On Mon, Jul 10, 2023 at 09:51:22PM +0530, Ajay Agarwal wrote:
> > On Fri, Jul 07, 2023 at 02:47:56PM +0200, Johan Hovold wrote:
> > > On Thu, Jul 06, 2023 at 06:28:11PM +0530, Manivannan Sadhasivam wrote:
> > > > On Thu, Jul 06, 2023 at 10:26:10AM +0200, Johan Hovold wrote:
> > > 
> > > > > Finally, note that the intel-gw driver is the only driver currently not
> > > > > providing a start_link callback and instead starts the link in its
> > > > > host_init callback, and which may avoid an additional one-second timeout
> > > > > during probe by making the link-up wait conditional. If anyone cares,
> > > > > that can be done in a follow-up patch with a proper motivation.
> > > 
> > > > The offending commit is bogus since it makes the intel-gw _special_ w.r.t
> > > > waiting for the link up. Most of the drivers call dw_pcie_host_init() during the
> > > > probe time and they all have to wait for 1 sec if the slot is empty.
> > Mani, can you please explain how my commit made the intel-gw driver
> > special? The intel driver actually fails the dw_pcie_host_init if the
> > link does not come up. That was my motivation behind adding the fail
> > logic in the core driver as well.
> 
> Your commit ended up failing the probe, if dw_pcie_wait_for_link() fails for
> SoCs defining start_link() callback, which is the case for all the drivers
> except intel-gw. I take back my _special_ argument since it was special before
> your commit and now you just made its behavior applicable to all SoCs.
>
You are right. I should not have returned an error from the
dw_pcie_wait_for_link check. Raised v5 with the error return removed:
https://lore.kernel.org/all/20240112093006.2832105-1-ajayagarwal@google.com/

> > > 
> > > Just to clarify, the intel-gw driver starts the link and waits for link
> > > up in its host_init() callback, which is called during probe. That wait
> > > could possibly just be dropped in favour of the one in
> > > dw_pcie_host_init() and/or the driver could be reworked to implement
> > > start_link().
> > > 
> > > Either way, the call in dw_pcie_host_init() will only add an additional
> > > 1 second delay in cases where the link did *not* come up.
> > > 
> > > > As Johan noted, intel-gw should make use of the async probe to avoid the boot
> > > > delay instead of adding a special case.
> > > 
> > > Indeed.
> > > 
> > > Johan
> > Johan, Mani
> > My apologies for adding this regression in some of the SOCs.
> > May I suggest to keep my patch and make the following change instead?
> > This shall keep the existing behavior as is, and save the boot time
> > for drivers that do not define the start_link()?
> > 
> 
> No, IMO the offending commit was wrong in serving its purpose so a revert makes
> sense. Because, if the intention was to reduce the boot delay then it did not
> fix that because dw_pcie_wait_for_link() is still called from intel-gw's
> host_init() callback. You just skipped another instance which is there in
> dw_pcie_host_init().
> 
> So to fix this issue properly intel-gw needs to do 2 things:
> 
> 1. Move the ltssm_enable to start_link() callback and get rid of
> dw_pcie_wait_for_link() from its host_init() callback. If there is any special
> reason to not do this way, please explain.
> 
> 2. Enable async probe so that other drivers can continue probing while this
> driver waits for the link to be up. This will almost make the delay negligible.
> 
> The above 2 should be done in separate patches.
> 
> - Mani
>
Mani, the intention is not to fix the intel-gw driver in any manner. It
calls dw_pcie_wait_for_link explicitly in the probe path and checks for
the error as well. So it has to live with the delay and the probe
failure if the link does not come up.

My intention is just to get rid of the 1 sec delay for the drivers that
do not define the start_link callback, and hence do not expect that the
link will come up during probe anyway.

> > ```
> > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > index cf61733bf78d..af6a7cd060b1 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > @@ -492,11 +492,8 @@ int dw_pcie_host_init(struct dw_pcie_rp *pp)
> >                 if (ret)
> >                         goto err_remove_edma;
> > 
> > -               if (pci->ops && pci->ops->start_link) {
> > -                       ret = dw_pcie_wait_for_link(pci);
> > -                       if (ret)
> > -                               goto err_stop_link;
> > -               }
> > +               if (pci->ops && pci->ops->start_link)
> > +                       dw_pcie_wait_for_link(pci);
> >         }
> > 
> >         bridge->sysdata = pp;
> > ```
> 
> -- 
> மணிவண்ணன் சதாசிவம்

Manivannan Sadhasivam Jan. 19, 2024, 7:40 a.m. UTC | #15
On Fri, Jan 12, 2024 at 03:30:15PM +0530, Ajay Agarwal wrote:

[...]

> > No, IMO the offending commit was wrong in serving its purpose so a revert makes
> > sense. Because, if the intention was to reduce the boot delay then it did not
> > fix that because dw_pcie_wait_for_link() is still called from intel-gw's
> > host_init() callback. You just skipped another instance which is there in
> > dw_pcie_host_init().
> > 
> > So to fix this issue properly intel-gw needs to do 2 things:
> > 
> > 1. Move the ltssm_enable to start_link() callback and get rid of
> > dw_pcie_wait_for_link() from its host_init() callback. If there is any special
> > reason to not do this way, please explain.
> > 
> > 2. Enable async probe so that other drivers can continue probing while this
> > driver waits for the link to be up. This will almost make the delay negligible.
> > 
> > The above 2 should be done in separate patches.
> > 
> > - Mani
> >
> Mani, the intention is not to fix the intel-gw driver in any manner. It
> calls dw_pcie_wait_for_link explicitly in the probe path and checks for
> the error as well. So it has to live with the delay and the probe
> failure if the link does not come up.
> 
> My intention is just to get rid of the 1 sec delay for the drivers that
> do not define the start_link callback, and hence do not expect that the
> link will come up during probe anyway.
> 

Ok, this clarifies, thanks.

- Mani
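
For readers wondering where the "1 sec delay" discussed above comes from,
here is a rough user-space sketch of a bounded link-up poll. It is not the
kernel code: struct fake_pcie, fake_link_up() and fake_wait_for_link() are
hypothetical stand-ins, and the retry count and sleep interval are assumed
values chosen only so that they add up to about one second.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed values: 10 polls x 100 ms gives the ~1 s budget from the thread. */
#define LINK_WAIT_MAX_RETRIES	10
#define LINK_WAIT_SLEEP_MS	100

/* Hypothetical stand-in for a controller whose link comes up after N polls. */
struct fake_pcie {
	int polls_until_up;	/* negative: the link never comes up (empty slot) */
	int polls_done;
	int slept_ms;		/* accumulated simulated sleep time */
};

static bool fake_link_up(struct fake_pcie *pci)
{
	pci->polls_done++;
	return pci->polls_until_up >= 0 &&
	       pci->polls_done > pci->polls_until_up;
}

/* Poll the link state, "sleeping" between attempts, up to the retry limit. */
static int fake_wait_for_link(struct fake_pcie *pci)
{
	int retries;

	for (retries = 0; retries < LINK_WAIT_MAX_RETRIES; retries++) {
		if (fake_link_up(pci))
			return 0;

		/* Stands in for a real sleep; only the elapsed time is recorded. */
		pci->slept_ms += LINK_WAIT_SLEEP_MS;
	}

	return -ETIMEDOUT;
}
```

With polls_until_up set to a negative value (an empty slot), the full budget
is spent before -ETIMEDOUT is returned, which is the cost a driver pays
when it never expects the link to come up during probe.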

Patch

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index cf61733bf78d..9952057c8819 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -485,20 +485,15 @@  int dw_pcie_host_init(struct dw_pcie_rp *pp)
 	if (ret)
 		goto err_remove_edma;
 
-	if (dw_pcie_link_up(pci)) {
-		dw_pcie_print_link_status(pci);
-	} else {
+	if (!dw_pcie_link_up(pci)) {
 		ret = dw_pcie_start_link(pci);
 		if (ret)
 			goto err_remove_edma;
-
-		if (pci->ops && pci->ops->start_link) {
-			ret = dw_pcie_wait_for_link(pci);
-			if (ret)
-				goto err_stop_link;
-		}
 	}
 
+	/* Ignore errors, the link may come up later */
+	dw_pcie_wait_for_link(pci);
+
 	bridge->sysdata = pp;
 
 	ret = pci_host_probe(bridge);
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index df092229e97d..8e33e6e59e68 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -644,20 +644,9 @@  void dw_pcie_disable_atu(struct dw_pcie *pci, u32 dir, int index)
 	dw_pcie_writel_atu(pci, dir, index, PCIE_ATU_REGION_CTRL2, 0);
 }
 
-void dw_pcie_print_link_status(struct dw_pcie *pci)
-{
-	u32 offset, val;
-
-	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
-	val = dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKSTA);
-
-	dev_info(pci->dev, "PCIe Gen.%u x%u link up\n",
-		 FIELD_GET(PCI_EXP_LNKSTA_CLS, val),
-		 FIELD_GET(PCI_EXP_LNKSTA_NLW, val));
-}
-
 int dw_pcie_wait_for_link(struct dw_pcie *pci)
 {
+	u32 offset, val;
 	int retries;
 
 	/* Check if the link is up or not */
@@ -673,7 +662,12 @@  int dw_pcie_wait_for_link(struct dw_pcie *pci)
 		return -ETIMEDOUT;
 	}
 
-	dw_pcie_print_link_status(pci);
+	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
+	val = dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKSTA);
+
+	dev_info(pci->dev, "PCIe Gen.%u x%u link up\n",
+		 FIELD_GET(PCI_EXP_LNKSTA_CLS, val),
+		 FIELD_GET(PCI_EXP_LNKSTA_NLW, val));
 
 	return 0;
 }
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 615660640801..79713ce075cc 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -429,7 +429,6 @@  void dw_pcie_setup(struct dw_pcie *pci);
 void dw_pcie_iatu_detect(struct dw_pcie *pci);
 int dw_pcie_edma_detect(struct dw_pcie *pci);
 void dw_pcie_edma_remove(struct dw_pcie *pci);
-void dw_pcie_print_link_status(struct dw_pcie *pci);
 
 static inline void dw_pcie_writel_dbi(struct dw_pcie *pci, u32 reg, u32 val)
 {
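
The behavioural difference made by the revert can be modelled in user space
roughly as follows. This is only a sketch under simplifying assumptions: the
fake_* types and helpers are hypothetical, the wait returns immediately
instead of polling, and the two host_init_* functions reproduce only the
link-handling branch of dw_pcie_host_init() as shown in the hunks above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_pcie;

struct fake_ops {
	int (*start_link)(struct fake_pcie *pci);
};

struct fake_pcie {
	const struct fake_ops *ops;
	bool link_up;		/* false models an empty slot */
};

/* Call the optional callback; a missing callback counts as success. */
static int fake_start_link(struct fake_pcie *pci)
{
	if (pci->ops && pci->ops->start_link)
		return pci->ops->start_link(pci);
	return 0;
}

/* Simplified: no polling, the link state never changes in this model. */
static int fake_wait_for_link(struct fake_pcie *pci)
{
	return pci->link_up ? 0 : -ETIMEDOUT;
}

/* Reverted behaviour: a link timeout is propagated and fails the probe. */
static int host_init_before_revert(struct fake_pcie *pci)
{
	int ret;

	if (!pci->link_up) {
		ret = fake_start_link(pci);
		if (ret)
			return ret;

		if (pci->ops && pci->ops->start_link) {
			ret = fake_wait_for_link(pci);
			if (ret)
				return ret;
		}
	}

	return 0;	/* the real code would go on to pci_host_probe() */
}

/* Restored behaviour: always wait, but ignore the result. */
static int host_init_after_revert(struct fake_pcie *pci)
{
	int ret;

	if (!pci->link_up) {
		ret = fake_start_link(pci);
		if (ret)
			return ret;
	}

	/* Ignore errors, the link may come up later */
	(void)fake_wait_for_link(pci);

	return 0;
}

static int noop_start_link(struct fake_pcie *pci)
{
	(void)pci;
	return 0;
}

static const struct fake_ops ops_with_start_link = {
	.start_link = noop_start_link,
};
```

In this model, a controller that provides start_link but has an empty slot
gets -ETIMEDOUT from host_init_before_revert() and 0 from
host_init_after_revert(), which matches the probe failure reported in the
thread and its fix.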