diff mbox series

[v2] PCI: dra7xx: Fix reset behaviour

Message ID 20210531090540.2663171-1-luca@lucaceresoli.net (mailing list archive)
State New, archived
Headers show
Series [v2] PCI: dra7xx: Fix reset behaviour | expand

Commit Message

Luca Ceresoli May 31, 2021, 9:05 a.m. UTC
The PCIe PERSTn reset pin is active low and should be asserted, then
deasserted.

The current implementation only drives the pin once in "HIGH" position,
thus presumably it was intended to deassert the pin. This has two problems:

  1) it assumes the pin was asserted by other means before loading the
     driver
  2) it has the wrong polarity, since "HIGH" means "active", and the pin is
     presumably configured as active low coherently with the PCIe
     convention, thus it is driven physically to 0, keeping the device
     under reset unless the pin is configured as active high.

Fix both problems by:

  1) keeping devm_gpiod_get_optional(dev, NULL, GPIOD_OUT_HIGH) as is, but
     assuming the pin is correctly configured as "active low" this now
     becomes a reset assertion
  2) adding gpiod_set_value(reset, 0) after a delay to deassert reset

Fixes: 78bdcad05ea1 ("PCI: dra7xx: Add support to make GPIO drive PERST# line")
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>

---

Changes v1 -> v2:
 - No changes to the patch
 - Reword commit message according to suggestions from Bjorn Helgaas (from
   another patchset)
 - Add Fixes: tag
---
 drivers/pci/controller/dwc/pci-dra7xx.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Pali Rohár May 31, 2021, 1:32 p.m. UTC | #1
On Monday 31 May 2021 11:05:40 Luca Ceresoli wrote:
> The PCIe PERSTn reset pin is active low and should be asserted, then
> deasserted.
> 
> The current implementation only drives the pin once in "HIGH" position,
> thus presumably it was intended to deassert the pin. This has two problems:
> 
>   1) it assumes the pin was asserted by other means before loading the
>      driver
>   2) it has the wrong polarity, since "HIGH" means "active", and the pin is
>      presumably configured as active low coherently with the PCIe
>      convention, thus it is driven physically to 0, keeping the device
>      under reset unless the pin is configured as active high.
> 
> Fix both problems by:
> 
>   1) keeping devm_gpiod_get_optional(dev, NULL, GPIOD_OUT_HIGH) as is, but
>      assuming the pin is correctly configured as "active low" this now
>      becomes a reset assertion
>   2) adding gpiod_set_value(reset, 0) after a delay to deassert reset
> 
> Fixes: 78bdcad05ea1 ("PCI: dra7xx: Add support to make GPIO drive PERST# line")
> Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
> 
> ---
> 
> Changes v1 -> v2:
>  - No changes to the patch
>  - Reword commit message according to suggestions from Bjorn Helgaas (from
>    another patchset)
>  - Add Fixes: tag
> ---
>  drivers/pci/controller/dwc/pci-dra7xx.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/pci/controller/dwc/pci-dra7xx.c b/drivers/pci/controller/dwc/pci-dra7xx.c
> index cb5d4c245ff6..11f392b7a9a2 100644
> --- a/drivers/pci/controller/dwc/pci-dra7xx.c
> +++ b/drivers/pci/controller/dwc/pci-dra7xx.c
> @@ -801,6 +801,8 @@ static int dra7xx_pcie_probe(struct platform_device *pdev)
>  		dev_err(&pdev->dev, "gpio request failed, ret %d\n", ret);
>  		goto err_gpio;
>  	}
> +	usleep_range(1000, 2000);

Hello! Just a note that this is again a new code pattern in another
driver for different wait value of PCIe Warm Reset timeout. I sent email
about these issues:
https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/

Luca, how did you choose value 1000-2000 us? Do you have some reference
or specification which says that this value needs to be used?

> +	gpiod_set_value(reset, 0);
>  
>  	reg = dra7xx_pcie_readl(dra7xx, PCIECTRL_DRA7XX_CONF_DEVICE_CMD);
>  	reg &= ~LTSSM_EN;
> -- 
> 2.25.1
>
Luca Ceresoli May 31, 2021, 1:54 p.m. UTC | #2
Hi Pali,

On 31/05/21 15:32, Pali Rohár wrote:
> On Monday 31 May 2021 11:05:40 Luca Ceresoli wrote:
>> The PCIe PERSTn reset pin is active low and should be asserted, then
>> deasserted.
>>
>> The current implementation only drives the pin once in "HIGH" position,
>> thus presumably it was intended to deassert the pin. This has two problems:
>>
>>   1) it assumes the pin was asserted by other means before loading the
>>      driver
>>   2) it has the wrong polarity, since "HIGH" means "active", and the pin is
>>      presumably configured as active low coherently with the PCIe
>>      convention, thus it is driven physically to 0, keeping the device
>>      under reset unless the pin is configured as active high.
>>
>> Fix both problems by:
>>
>>   1) keeping devm_gpiod_get_optional(dev, NULL, GPIOD_OUT_HIGH) as is, but
>>      assuming the pin is correctly configured as "active low" this now
>>      becomes a reset assertion
>>   2) adding gpiod_set_value(reset, 0) after a delay to deassert reset
>>
>> Fixes: 78bdcad05ea1 ("PCI: dra7xx: Add support to make GPIO drive PERST# line")
>> Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
>>
>> ---
>>
>> Changes v1 -> v2:
>>  - No changes to the patch
>>  - Reword commit message according to suggestions from Bjorn Helgaas (from
>>    another patchset)
>>  - Add Fixes: tag
>> ---
>>  drivers/pci/controller/dwc/pci-dra7xx.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/pci/controller/dwc/pci-dra7xx.c b/drivers/pci/controller/dwc/pci-dra7xx.c
>> index cb5d4c245ff6..11f392b7a9a2 100644
>> --- a/drivers/pci/controller/dwc/pci-dra7xx.c
>> +++ b/drivers/pci/controller/dwc/pci-dra7xx.c
>> @@ -801,6 +801,8 @@ static int dra7xx_pcie_probe(struct platform_device *pdev)
>>  		dev_err(&pdev->dev, "gpio request failed, ret %d\n", ret);
>>  		goto err_gpio;
>>  	}
>> +	usleep_range(1000, 2000);
> 
> Hello! Just a note that this is again a new code pattern in another
> driver for different wait value of PCIe Warm Reset timeout. I sent email
> about these issues:
> https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
> 
> Luca, how did you choose value 1000-2000 us? Do you have some reference
> or specification which says that this value needs to be used?

Sadly I haven't access to the PCIe specification.

I'd be very happy to know what a correct value should be and update my
patch.
Kishon Vijay Abraham I May 31, 2021, 4 p.m. UTC | #3
Hi,

On 31/05/21 7:24 pm, Luca Ceresoli wrote:
> Hi Pali,
> 
> On 31/05/21 15:32, Pali Rohár wrote:
>> On Monday 31 May 2021 11:05:40 Luca Ceresoli wrote:
>>> The PCIe PERSTn reset pin is active low and should be asserted, then
>>> deasserted.
>>>
>>> The current implementation only drives the pin once in "HIGH" position,
>>> thus presumably it was intended to deassert the pin. This has two problems:
>>>
>>>   1) it assumes the pin was asserted by other means before loading the
>>>      driver
>>>   2) it has the wrong polarity, since "HIGH" means "active", and the pin is
>>>      presumably configured as active low coherently with the PCIe
>>>      convention, thus it is driven physically to 0, keeping the device
>>>      under reset unless the pin is configured as active high.
>>>
>>> Fix both problems by:
>>>
>>>   1) keeping devm_gpiod_get_optional(dev, NULL, GPIOD_OUT_HIGH) as is, but
>>>      assuming the pin is correctly configured as "active low" this now
>>>      becomes a reset assertion
>>>   2) adding gpiod_set_value(reset, 0) after a delay to deassert reset
>>>
>>> Fixes: 78bdcad05ea1 ("PCI: dra7xx: Add support to make GPIO drive PERST# line")
>>> Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
>>>
>>> ---
>>>
>>> Changes v1 -> v2:
>>>  - No changes to the patch
>>>  - Reword commit message according to suggestions from Bjorn Helgaas (from
>>>    another patchset)
>>>  - Add Fixes: tag
>>> ---
>>>  drivers/pci/controller/dwc/pci-dra7xx.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/pci/controller/dwc/pci-dra7xx.c b/drivers/pci/controller/dwc/pci-dra7xx.c
>>> index cb5d4c245ff6..11f392b7a9a2 100644
>>> --- a/drivers/pci/controller/dwc/pci-dra7xx.c
>>> +++ b/drivers/pci/controller/dwc/pci-dra7xx.c
>>> @@ -801,6 +801,8 @@ static int dra7xx_pcie_probe(struct platform_device *pdev)
>>>  		dev_err(&pdev->dev, "gpio request failed, ret %d\n", ret);
>>>  		goto err_gpio;
>>>  	}
>>> +	usleep_range(1000, 2000);
>>
>> Hello! Just a note that this is again a new code pattern in another
>> driver for different wait value of PCIe Warm Reset timeout. I sent email
>> about these issues:
>> https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
>>
>> Luca, how did you choose value 1000-2000 us? Do you have some reference
>> or specification which says that this value needs to be used?
> 
> Sadly I haven't access to the PCIe specification.
> 
> I'd be very happy to know what a correct value should be and update my
> patch.

I had given the timing mentioned in the specification here
https://lore.kernel.org/r/023c9b59-70bb-ed8d-a4c0-76eae726b574@ti.com

The PCI EXPRESS CARD ELECTROMECHANICAL SPECIFICATION defines the Power
Sequencing and Reset Signal Timings in Table 2-4. Please also refer Figure
2-10: Power Up of the CEM.

╔═════════════╤══════════════════════════════════════╤═════╤═════╤═══════╗
║ Symbol      │ Parameter                            │ Min │ Max │ Units ║
╠═════════════╪══════════════════════════════════════╪═════╪═════╪═══════╣
║ T PVPERL    │ Power stable to PERST# inactive      │ 100 │     │ ms    ║
╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
║ T PERST-CLK │ REFCLK stable before PERST# inactive │ 100 │     │ μs    ║
╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
║ T PERST     │ PERST# active time                   │ 100 │     │ μs    ║
╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
║ T FAIL      │ Power level invalid to PERST# active │     │ 500 │ ns    ║
╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
║ T WKRF      │ WAKE# rise – fall time               │     │ 100 │ ns    ║
╚═════════════╧══════════════════════════════════════╧═════╧═════╧═══════╝

The de-assertion of #PERST is w.r.t both power stable and refclk stable.

I'm yet to validate this patch, but IIRC devm_gpiod_get_optional(dev,
NULL, GPIOD_OUT_HIGH) will already de-assert the PERST line. Please note
the board here can have various combinations of NOT gate before the gpio
line is actually connected to the connector.

Thanks
Kishon
Pali Rohár May 31, 2021, 4:22 p.m. UTC | #4
Hello Kishon!

On Monday 31 May 2021 21:30:30 Kishon Vijay Abraham I wrote:
> I had given the timing mentioned in the specification here
> https://lore.kernel.org/r/023c9b59-70bb-ed8d-a4c0-76eae726b574@ti.com
> 
> The PCI EXPRESS CARD ELECTROMECHANICAL SPECIFICATION defines the Power
> Sequencing and Reset Signal Timings in Table 2-4. Please also refer Figure
> 2-10: Power Up of the CEM.
> 
> ╔═════════════╤══════════════════════════════════════╤═════╤═════╤═══════╗
> ║ Symbol      │ Parameter                            │ Min │ Max │ Units ║
> ╠═════════════╪══════════════════════════════════════╪═════╪═════╪═══════╣
> ║ T PVPERL    │ Power stable to PERST# inactive      │ 100 │     │ ms    ║
> ╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
> ║ T PERST-CLK │ REFCLK stable before PERST# inactive │ 100 │     │ μs    ║
> ╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
> ║ T PERST     │ PERST# active time                   │ 100 │     │ μs    ║
> ╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
> ║ T FAIL      │ Power level invalid to PERST# active │     │ 500 │ ns    ║
> ╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
> ║ T WKRF      │ WAKE# rise – fall time               │     │ 100 │ ns    ║
> ╚═════════════╧══════════════════════════════════════╧═════╧═════╧═══════╝
> 
> The de-assertion of #PERST is w.r.t both power stable and refclk stable.

I think this does not fully answer this problematic question. One thing
is initial power on and second thing is warm reset (when both power and
clock is stable).

On more ARM boards, power is not SW controllable and is automatically
enabled when powering board on. So Tₚᵥₚₑᵣₗ is calculated since
bootloader and therefore not needed to take into account in kernel.

Tₚₑᵣₛₜ₋cₗₖ is only 100 µs and experiments proved that 100 µs not enough
for toggling PERST# GPIO. At least one 1 ms is needed and for some cards
at least 10 ms. Otherwise cards are not detected.

So when you have both power and clock stable and you want to reset card
via PERST# signal, above table does not say how long it is needed to
have PERST# in reset state.

> I'm yet to validate this patch, but IIRC devm_gpiod_get_optional(dev,
> NULL, GPIOD_OUT_HIGH) will already de-assert the PERST line. Please note
> the board here can have various combinations of NOT gate before the gpio
> line is actually connected to the connector.
> 
> Thanks
> Kishon
Luca Ceresoli June 1, 2021, 9:03 a.m. UTC | #5
Hi Kishon,

On 31/05/21 18:00, Kishon Vijay Abraham I wrote:
> Hi,
> 
> On 31/05/21 7:24 pm, Luca Ceresoli wrote:
>> Hi Pali,
>>
>> On 31/05/21 15:32, Pali Rohár wrote:
>>> On Monday 31 May 2021 11:05:40 Luca Ceresoli wrote:
>>>> The PCIe PERSTn reset pin is active low and should be asserted, then
>>>> deasserted.
>>>>
>>>> The current implementation only drives the pin once in "HIGH" position,
>>>> thus presumably it was intended to deassert the pin. This has two problems:
>>>>
>>>>   1) it assumes the pin was asserted by other means before loading the
>>>>      driver
>>>>   2) it has the wrong polarity, since "HIGH" means "active", and the pin is
>>>>      presumably configured as active low coherently with the PCIe
>>>>      convention, thus it is driven physically to 0, keeping the device
>>>>      under reset unless the pin is configured as active high.
>>>>
>>>> Fix both problems by:
>>>>
>>>>   1) keeping devm_gpiod_get_optional(dev, NULL, GPIOD_OUT_HIGH) as is, but
>>>>      assuming the pin is correctly configured as "active low" this now
>>>>      becomes a reset assertion
>>>>   2) adding gpiod_set_value(reset, 0) after a delay to deassert reset
>>>>
>>>> Fixes: 78bdcad05ea1 ("PCI: dra7xx: Add support to make GPIO drive PERST# line")
>>>> Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
>>>>
>>>> ---
>>>>
>>>> Changes v1 -> v2:
>>>>  - No changes to the patch
>>>>  - Reword commit message according to suggestions from Bjorn Helgaas (from
>>>>    another patchset)
>>>>  - Add Fixes: tag
>>>> ---
>>>>  drivers/pci/controller/dwc/pci-dra7xx.c | 2 ++
>>>>  1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/drivers/pci/controller/dwc/pci-dra7xx.c b/drivers/pci/controller/dwc/pci-dra7xx.c
>>>> index cb5d4c245ff6..11f392b7a9a2 100644
>>>> --- a/drivers/pci/controller/dwc/pci-dra7xx.c
>>>> +++ b/drivers/pci/controller/dwc/pci-dra7xx.c
>>>> @@ -801,6 +801,8 @@ static int dra7xx_pcie_probe(struct platform_device *pdev)
>>>>  		dev_err(&pdev->dev, "gpio request failed, ret %d\n", ret);
>>>>  		goto err_gpio;
>>>>  	}
>>>> +	usleep_range(1000, 2000);
>>>
>>> Hello! Just a note that this is again a new code pattern in another
>>> driver for different wait value of PCIe Warm Reset timeout. I sent email
>>> about these issues:
>>> https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
>>>
>>> Luca, how did you choose value 1000-2000 us? Do you have some reference
>>> or specification which says that this value needs to be used?
>>
>> Sadly I haven't access to the PCIe specification.
>>
>> I'd be very happy to know what a correct value should be and update my
>> patch.
> 
> I had given the timing mentioned in the specification here
> https://lore.kernel.org/r/023c9b59-70bb-ed8d-a4c0-76eae726b574@ti.com
> 
> The PCI EXPRESS CARD ELECTROMECHANICAL SPECIFICATION defines the Power
> Sequencing and Reset Signal Timings in Table 2-4. Please also refer Figure
> 2-10: Power Up of the CEM.
> 
> ╔═════════════╤══════════════════════════════════════╤═════╤═════╤═══════╗
> ║ Symbol      │ Parameter                            │ Min │ Max │ Units ║
> ╠═════════════╪══════════════════════════════════════╪═════╪═════╪═══════╣
> ║ T PVPERL    │ Power stable to PERST# inactive      │ 100 │     │ ms    ║
> ╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
> ║ T PERST-CLK │ REFCLK stable before PERST# inactive │ 100 │     │ μs    ║
> ╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
> ║ T PERST     │ PERST# active time                   │ 100 │     │ μs    ║
> ╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
> ║ T FAIL      │ Power level invalid to PERST# active │     │ 500 │ ns    ║
> ╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
> ║ T WKRF      │ WAKE# rise – fall time               │     │ 100 │ ns    ║
> ╚═════════════╧══════════════════════════════════════╧═════╧═════╧═══════╝
> 
> The de-assertion of #PERST is w.r.t both power stable and refclk stable.
> 
> I'm yet to validate this patch, but IIRC devm_gpiod_get_optional(dev,
> NULL, GPIOD_OUT_HIGH) will already de-assert the PERST line. 

Perhaps in all the cases you faced, but GPIOD_OUT_HIGH [0] really means
"active", not "electrically high", and here we want reset to be
deasserted (=deactivated), not asserted (=activated).

I guess it works when the GPIO drives PERSTn without inversion (no NOT
gates or an even number of NOT gates) _and_ device tree does specify the
GPIO as active high (which is incorrect: PERSTn is active low).

> Please note
> the board here can have various combinations of NOT gate before the
> gpio
> line is actually connected to the connector.

Exactly for this reason a portable driver must never drive the signal
"electrically low" or "electrically high". That's why with my patch I
propose to give the proper interpretation [1] to GPIOD_OUT_HIGH, i.e.
"active", i.e. "reset asserted". Device tree will describe if active
means electrically low (no NOT gates between GPIO pin and device PERSTn
pin) or high (odd number of NOT gates).

Additionally, as per patch description, even in the cases where the
driver deasserts the reset, it does not assert it. Should the signal be
asserted before dra7xx_pcie_probe(), devm_gpiod_get_optional(dev, NULL,
GPIOD_OUT_HIGH) would not move the line and thus would not reset the device.

The only problem I can imagine with my patch is with existing code. If
you have a board with the reset GPIO described as active high in DT
while it is active low (no/even NOR gates on board), then you should
apply this patch _and_ fix the board device tree.

I hope the intent of the patch is clearer now.

[0]
https://www.kernel.org/doc/html/latest/driver-api/gpio/consumer.html#obtaining-and-disposing-gpios
[1]
https://www.kernel.org/doc/html/latest/driver-api/gpio/consumer.html#the-active-low-and-open-drain-semantics
Luca Ceresoli June 22, 2021, 10:57 a.m. UTC | #6
Hi,

On 31/05/21 18:22, Pali Rohár wrote:
> Hello Kishon!
> 
> On Monday 31 May 2021 21:30:30 Kishon Vijay Abraham I wrote:
>> I had given the timing mentioned in the specification here
>> https://lore.kernel.org/r/023c9b59-70bb-ed8d-a4c0-76eae726b574@ti.com
>>
>> The PCI EXPRESS CARD ELECTROMECHANICAL SPECIFICATION defines the Power
>> Sequencing and Reset Signal Timings in Table 2-4. Please also refer Figure
>> 2-10: Power Up of the CEM.
>>
>> ╔═════════════╤══════════════════════════════════════╤═════╤═════╤═══════╗
>> ║ Symbol      │ Parameter                            │ Min │ Max │ Units ║
>> ╠═════════════╪══════════════════════════════════════╪═════╪═════╪═══════╣
>> ║ T PVPERL    │ Power stable to PERST# inactive      │ 100 │     │ ms    ║
>> ╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
>> ║ T PERST-CLK │ REFCLK stable before PERST# inactive │ 100 │     │ μs    ║
>> ╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
>> ║ T PERST     │ PERST# active time                   │ 100 │     │ μs    ║
>> ╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
>> ║ T FAIL      │ Power level invalid to PERST# active │     │ 500 │ ns    ║
>> ╟─────────────┼──────────────────────────────────────┼─────┼─────┼───────╢
>> ║ T WKRF      │ WAKE# rise – fall time               │     │ 100 │ ns    ║
>> ╚═════════════╧══════════════════════════════════════╧═════╧═════╧═══════╝
>>
>> The de-assertion of #PERST is w.r.t both power stable and refclk stable.
> 
> I think this does not fully answer this problematic question. One thing
> is initial power on and second thing is warm reset (when both power and
> clock is stable).
> 
> On more ARM boards, power is not SW controllable and is automatically
> enabled when powering board on. So Tₚᵥₚₑᵣₗ is calculated since
> bootloader and therefore not needed to take into account in kernel.
> 
> Tₚₑᵣₛₜ₋cₗₖ is only 100 µs and experiments proved that 100 µs not enough
> for toggling PERST# GPIO. At least one 1 ms is needed and for some cards
> at least 10 ms. Otherwise cards are not detected.
> 
> So when you have both power and clock stable and you want to reset card
> via PERST# signal, above table does not say how long it is needed to
> have PERST# in reset state.

Nothing happened after a few weeks... I understand that knowing the
correct reset timings is relevant, but unfortunately I cannot help much
in finding out the correct values.

However I'm wondering what should happen to this patch. It *does* fix a
real bug, but potentially with an incorrect or non-optimal usleep range.
Do we really want to ignore a bugfix because we are not sure about how
long this delay should be?
Pali Rohár June 22, 2021, 11:06 a.m. UTC | #7
Hello!

On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
> Nothing happened after a few weeks... I understand that knowing the
> correct reset timings is relevant, but unfortunately I cannot help much
> in finding out the correct values.
> 
> However I'm wondering what should happen to this patch. It *does* fix a
> real bug, but potentially with an incorrect or non-optimal usleep range.
> Do we really want to ignore a bugfix because we are not sure about how
> long this delay should be?

As there is no better solution right now, I'm fine with your patch. But
patch needs to be approved by Lorenzo, so please wait for his final
answer.

I would suggest to add a comment for call "usleep_range(1000, 2000);"
that you have chosen some "random" values which worked fine on your
setup and that they fix mentioned bug. Comment just to mark this sleep
code that is suboptimal / not-so-correct and to prevent other people to
copy+paste this code into other (new) drivers...
Lorenzo Pieralisi June 22, 2021, 11:56 a.m. UTC | #8
[Adding Linus for GPIO discussion, thread:
https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]

On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
> Hello!
> 
> On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
> > Nothing happened after a few weeks... I understand that knowing the
> > correct reset timings is relevant, but unfortunately I cannot help much
> > in finding out the correct values.
> > 
> > However I'm wondering what should happen to this patch. It *does* fix a
> > real bug, but potentially with an incorrect or non-optimal usleep range.
> > Do we really want to ignore a bugfix because we are not sure about how
> > long this delay should be?
> 
> As there is no better solution right now, I'm fine with your patch. But
> patch needs to be approved by Lorenzo, so please wait for his final
> answer.

I am not a GPIO expert and I have a feeling this is platform specific
beyond what the PCI specification can actually define architecturally.

There are two things I'd like to see:

1) If Linus can have a look at the GPIO bits in this thread that would
   definitely help clarify any pending controversy
2) Kishon to test on *existing* platforms and confirm there are no
   regressions triggered

> I would suggest to add a comment for call "usleep_range(1000, 2000);"
> that you have chosen some "random" values which worked fine on your
> setup and that they fix mentioned bug. Comment just to mark this sleep
> code that is suboptimal / not-so-correct and to prevent other people to
> copy+paste this code into other (new) drivers...

Yes a comment would help but as I say above I am afraid this is
a platform specific set-up, ie that delay is somewhat tied to
a platform, not sure there is anything we can do.

If Linus and Kishon are happy with the approach we can merge this
patch.

Lorenzo
Pali Rohár June 22, 2021, 12:16 p.m. UTC | #9
On Tuesday 22 June 2021 12:56:04 Lorenzo Pieralisi wrote:
> [Adding Linus for GPIO discussion, thread:
> https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]
> 
> On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
> > Hello!
> > 
> > On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
> > > Nothing happened after a few weeks... I understand that knowing the
> > > correct reset timings is relevant, but unfortunately I cannot help much
> > > in finding out the correct values.
> > > 
> > > However I'm wondering what should happen to this patch. It *does* fix a
> > > real bug, but potentially with an incorrect or non-optimal usleep range.
> > > Do we really want to ignore a bugfix because we are not sure about how
> > > long this delay should be?
> > 
> > As there is no better solution right now, I'm fine with your patch. But
> > patch needs to be approved by Lorenzo, so please wait for his final
> > answer.
> 
> I am not a GPIO expert and I have a feeling this is platform specific
> beyond what the PCI specification can actually define architecturally.

In my opinion timeout is not platform specific as I wrote in email:
https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/

My experiments already proved that some PCIe cards needs to be in reset
state for some minimal time otherwise they cannot be enumerated. And it
does not matter to which platform you connect those (endpoint) cards.

I do not think that timeout itself is platform specific. GPIO controls
PERST# pin and therefore specified sleep value directly drives how long
is card on the other end of PCIe slot in Warm Reset state. PCIe CEM spec
directly says that PERST# signal controls PCIe Warm Reset.

What is here platform specific thing is that PERST# signal is controlled
by GPIO. But value of signal (high / low) and how long is in signal in
which state for me sounds like not an platform specific thing, but as
PCIe / CEM related.

> There are two things I'd like to see:
> 
> 1) If Linus can have a look at the GPIO bits in this thread that would
>    definitely help clarify any pending controversy
> 2) Kishon to test on *existing* platforms and confirm there are no
>    regressions triggered
> 
> > I would suggest to add a comment for call "usleep_range(1000, 2000);"
> > that you have chosen some "random" values which worked fine on your
> > setup and that they fix mentioned bug. Comment just to mark this sleep
> > code that is suboptimal / not-so-correct and to prevent other people to
> > copy+paste this code into other (new) drivers...
> 
> Yes a comment would help but as I say above I am afraid this is
> a platform specific set-up, ie that delay is somewhat tied to
> a platform, not sure there is anything we can do.
> 
> If Linus and Kishon are happy with the approach we can merge this
> patch.
> 
> Lorenzo
Luca Ceresoli June 22, 2021, 1:31 p.m. UTC | #10
Hi,

On 22/06/21 14:16, Pali Rohár wrote:
> On Tuesday 22 June 2021 12:56:04 Lorenzo Pieralisi wrote:
>> [Adding Linus for GPIO discussion, thread:
>> https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]
>>
>> On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
>>> Hello!
>>>
>>> On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
>>>> Nothing happened after a few weeks... I understand that knowing the
>>>> correct reset timings is relevant, but unfortunately I cannot help much
>>>> in finding out the correct values.
>>>>
>>>> However I'm wondering what should happen to this patch. It *does* fix a
>>>> real bug, but potentially with an incorrect or non-optimal usleep range.
>>>> Do we really want to ignore a bugfix because we are not sure about how
>>>> long this delay should be?
>>>
>>> As there is no better solution right now, I'm fine with your patch. But
>>> patch needs to be approved by Lorenzo, so please wait for his final
>>> answer.
>>
>> I am not a GPIO expert and I have a feeling this is platform specific
>> beyond what the PCI specification can actually define architecturally.
> 
> In my opinion timeout is not platform specific as I wrote in email:
> https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
> 
> My experiments already proved that some PCIe cards needs to be in reset
> state for some minimal time otherwise they cannot be enumerated. And it
> does not matter to which platform you connect those (endpoint) cards.
> 
> I do not think that timeout itself is platform specific. GPIO controls
> PERST# pin and therefore specified sleep value directly drives how long
> is card on the other end of PCIe slot in Warm Reset state. PCIe CEM spec
> directly says that PERST# signal controls PCIe Warm Reset.
> 
> What is here platform specific thing is that PERST# signal is controlled
> by GPIO. But value of signal (high / low) and how long is in signal in
> which state for me sounds like not an platform specific thing, but as
> PCIe / CEM related.

That's exactly my understanding of this matter. At least for the dra7xx
controller it works exactly like this, PERSTn# is nothing but a GPIO
output from the SoC that drives the PERSTn# input of the external chip
without affecting the controller directly.
Kishon Vijay Abraham I June 22, 2021, 1:57 p.m. UTC | #11
Hi Luca, Pali,

On 22/06/21 7:01 pm, Luca Ceresoli wrote:
> Hi,
> 
> On 22/06/21 14:16, Pali Rohár wrote:
>> On Tuesday 22 June 2021 12:56:04 Lorenzo Pieralisi wrote:
>>> [Adding Linus for GPIO discussion, thread:
>>> https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]
>>>
>>> On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
>>>> Hello!
>>>>
>>>> On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
>>>>> Nothing happened after a few weeks... I understand that knowing the
>>>>> correct reset timings is relevant, but unfortunately I cannot help much
>>>>> in finding out the correct values.
>>>>>
>>>>> However I'm wondering what should happen to this patch. It *does* fix a
>>>>> real bug, but potentially with an incorrect or non-optimal usleep range.
>>>>> Do we really want to ignore a bugfix because we are not sure about how
>>>>> long this delay should be?
>>>>
>>>> As there is no better solution right now, I'm fine with your patch. But
>>>> patch needs to be approved by Lorenzo, so please wait for his final
>>>> answer.
>>>
>>> I am not a GPIO expert and I have a feeling this is platform specific
>>> beyond what the PCI specification can actually define architecturally.
>>
>> In my opinion timeout is not platform specific as I wrote in email:
>> https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
>>
>> My experiments already proved that some PCIe cards needs to be in reset
>> state for some minimal time otherwise they cannot be enumerated. And it
>> does not matter to which platform you connect those (endpoint) cards.
>>
>> I do not think that timeout itself is platform specific. GPIO controls
>> PERST# pin and therefore specified sleep value directly drives how long
>> is card on the other end of PCIe slot in Warm Reset state. PCIe CEM spec
>> directly says that PERST# signal controls PCIe Warm Reset.
>>
>> What is here platform specific thing is that PERST# signal is controlled
>> by GPIO. But value of signal (high / low) and how long is in signal in
>> which state for me sounds like not an platform specific thing, but as
>> PCIe / CEM related.
> 
> That's exactly my understanding of this matter. At least for the dra7xx
> controller it works exactly like this, PERSTn# is nothing but a GPIO
> output from the SoC that drives the PERSTn# input of the external chip
> without affecting the controller directly.
> 

While the patch itself is correct, this kind-of changes the behavior on
already upstreamed platforms. Previously the driver expected #PERST to
be asserted be external means (or default power-up state) and only takes
care of de-asserting the #PERST line.

There are 2 platforms that will be impacted due to this change
1) arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi (has an inverter on
GPIO line)
2) arch/arm/boot/dts/am571x-idk.dts (directly connected to #PERST)

For 1), gpiod_set_value(reset, 0) will assert the PERST line due to the
inverter (and GPIO_ACTIVE_LOW)
For 2), gpiod_set_value(reset, 0) will assert the PERST line because we
have GPIO_ACTIVE_HIGH

So this patch should have to be accompanied with DT changes (and this
patch also breaks old DT compatibility).

Thanks
Kishon
Lorenzo Pieralisi June 22, 2021, 2:23 p.m. UTC | #12
On Tue, Jun 22, 2021 at 02:16:49PM +0200, Pali Rohár wrote:
> On Tuesday 22 June 2021 12:56:04 Lorenzo Pieralisi wrote:
> > [Adding Linus for GPIO discussion, thread:
> > https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]
> > 
> > On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
> > > Hello!
> > > 
> > > On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
> > > > Nothing happened after a few weeks... I understand that knowing the
> > > > correct reset timings is relevant, but unfortunately I cannot help much
> > > > in finding out the correct values.
> > > > 
> > > > However I'm wondering what should happen to this patch. It *does* fix a
> > > > real bug, but potentially with an incorrect or non-optimal usleep range.
> > > > Do we really want to ignore a bugfix because we are not sure about how
> > > > long this delay should be?
> > > 
> > > As there is no better solution right now, I'm fine with your patch. But
> > > patch needs to be approved by Lorenzo, so please wait for his final
> > > answer.
> > 
> > I am not a GPIO expert and I have a feeling this is platform specific
> > beyond what the PCI specification can actually define architecturally.
> 
> In my opinion timeout is not platform specific as I wrote in email:
> https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
> 
> My experiments already proved that some PCIe cards needs to be in reset
> state for some minimal time otherwise they cannot be enumerated. And it
> does not matter to which platform you connect those (endpoint) cards.
> 
> I do not think that timeout itself is platform specific. GPIO controls
> PERST# pin and therefore specified sleep value directly drives how long
> is card on the other end of PCIe slot in Warm Reset state. PCIe CEM spec
> directly says that PERST# signal controls PCIe Warm Reset.

Point taken but regardless this deviates from the PCI electromechanical
specifications (ie T-PERST-CLK), does not it ? I misused "platform" to
define something that apparently is not contemplated by the PCI
specifications (and I would like to understand why).
 
I guess on ACPI systems (ie where the PERST# handling is implemented in
FW) this is handled in BIOS/UEFI - need to peruse the code to check how
PERST# is handled and whether the delay is per host controller driver.

> 
> What is here platform specific thing is that PERST# signal is controlled
> by GPIO. But value of signal (high / low) and how long is in signal in
> which state for me sounds like not an platform specific thing, but as
> PCIe / CEM related.

There are two different things to agree on this patch 1) how GPIO drives
PERST# 2) the PERST# de-assertion delay.

I appreciate they are related and that Luca had to handle them together
but logically they are separated "issues", it'd be great if we manage
to nail down how they should be handled before we merge this code.

Lorenzo

> 
> > There are two things I'd like to see:
> > 
> > 1) If Linus can have a look at the GPIO bits in this thread that would
> >    definitely help clarify any pending controversy
> > 2) Kishon to test on *existing* platforms and confirm there are no
> >    regressions triggered
> > 
> > > I would suggest to add a comment for call "usleep_range(1000, 2000);"
> > > that you have chosen some "random" values which worked fine on your
> > > setup and that they fix mentioned bug. Comment just to mark this sleep
> > > code that is suboptimal / not-so-correct and to prevent other people to
> > > copy+paste this code into other (new) drivers...
> > 
> > Yes a comment would help but as I say above I am afraid this is
> > a platform specific set-up, ie that delay is somewhat tied to
> > a platform, not sure there is anything we can do.
> > 
> > If Linus and Kishon are happy with the approach we can merge this
> > patch.
> > 
> > Lorenzo
Pali Rohár June 22, 2021, 8:48 p.m. UTC | #13
On Tuesday 22 June 2021 15:23:25 Lorenzo Pieralisi wrote:
> On Tue, Jun 22, 2021 at 02:16:49PM +0200, Pali Rohár wrote:
> > On Tuesday 22 June 2021 12:56:04 Lorenzo Pieralisi wrote:
> > > [Adding Linus for GPIO discussion, thread:
> > > https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]
> > > 
> > > On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
> > > > Hello!
> > > > 
> > > > On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
> > > > > Nothing happened after a few weeks... I understand that knowing the
> > > > > correct reset timings is relevant, but unfortunately I cannot help much
> > > > > in finding out the correct values.
> > > > > 
> > > > > However I'm wondering what should happen to this patch. It *does* fix a
> > > > > real bug, but potentially with an incorrect or non-optimal usleep range.
> > > > > Do we really want to ignore a bugfix because we are not sure about how
> > > > > long this delay should be?
> > > > 
> > > > As there is no better solution right now, I'm fine with your patch. But
> > > > patch needs to be approved by Lorenzo, so please wait for his final
> > > > answer.
> > > 
> > > I am not a GPIO expert and I have a feeling this is platform specific
> > > beyond what the PCI specification can actually define architecturally.
> > 
> > In my opinion timeout is not platform specific as I wrote in email:
> > https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
> > 
> > My experiments already proved that some PCIe cards needs to be in reset
> > state for some minimal time otherwise they cannot be enumerated. And it
> > does not matter to which platform you connect those (endpoint) cards.
> > 
> > I do not think that timeout itself is platform specific. GPIO controls
> > PERST# pin and therefore specified sleep value directly drives how long
> > is card on the other end of PCIe slot in Warm Reset state. PCIe CEM spec
> > directly says that PERST# signal controls PCIe Warm Reset.
> 
> Point taken but regardless this deviates from the PCI electromechanical
> specifications (ie T-PERST-CLK), does not it ?

Well, I was not able to understand and decode PCIe base and PCIe CEM
specs to figure out which timeout value should be used. You wrote about
T-PERST-CLK but I'm really not sure if it is this one... Therefore I
cannot say if something deviates from spec or not.

> I misused "platform" to
> define something that apparently is not contemplated by the PCI
> specifications (and I would like to understand why).
>  
> I guess on ACPI systems (ie where the PERST# handling is implemented in
> FW) this is handled in BIOS/UEFI

PCIe base spec does not define any standard interface for controlling
PCIe Warm Reset and PCIe CEM spec does not define any SW interface for
PERST# pin. So every board / computer with PCIe slot may connect PERST#
pin in different way to CPU. Some ARM boards connect all PERST# pins to
just one GPIO, and so via SW you can reset all PCIe cards at the same
time. No granularity to reset just one card. Some other connects all
PERST# pin to CPU reset output pin, so when CPU / board resets it cause
also reset of all PCIe cards.

I read that some server machines have some dedicated device connected to
CPU via i2c/smbus, which controls PERST# pins for each PCIe slot
individually. And on these machines people use userspace i2cset
application to control PERST# and therefore can reset cards manually.

If ACPI / BIOS / UEFI system has some kind of PCIe support && PERST# is
controller by software then for sure it needs to reset PCIe card (at
least putting it from reset state to normal) prior trying to read PCI
device/vendor ID from config space.

> need to peruse the code to check how
> PERST# is handled and whether the delay is per host controller driver.

Are there any open source implementations? Or we are just limited to
dump ACPI bytecode or BIOS / UEFI firmware and start reverse engineering
it? Because this would not be simple.

And major problems with PCIe Warm Reset / PERST# signal I saw only on
boards where there is no BIOS / UEFI / ACPI; just native PCIe controller
drivers which talks directly to HW.

I was not able to find any way how to control PERST# on any my x86
laptop (standard setup with UEFI and ACPI). So I'm even not sure if on
x86 laptops is PERST# controllable by SW. I can imagine that this PIN
may be connected to some reset circuit from Embedded Controller which
may take full control of resetting card when it is needed at correct
time.

So it is possible that code which controls PERST# on x86 does not have
to run on CPU and may be "burned" as part of other hardware...

> > 
> > What is here platform specific thing is that PERST# signal is controlled
> > by GPIO. But value of signal (high / low) and how long is in signal in
> > which state for me sounds like not an platform specific thing, but as
> > PCIe / CEM related.
> 
> There are two different things to agree on this patch
> 1) how GPIO drives PERST#

I'm not sure what do you mean by this 1). GPIO is set to output
direction and can be either in low or high state. One of this states
represents RESET state on PERST# pin and which it is (low or high) is
defined by DTS (reset-gpio).

So setting GPIO with output direction to value 1 (active) always puts
card into reset state and setting GPIO to value 0 (inactive) puts card
into normal state.

> 2) the PERST# de-assertion delay.

This is open question.

> I appreciate they are related and that Luca had to handle them together
> but logically they are separated "issues", it'd be great if we manage
> to nail down how they should be handled before we merge this code.
> 
> Lorenzo
> 
> > 
> > > There are two things I'd like to see:
> > > 
> > > 1) If Linus can have a look at the GPIO bits in this thread that would
> > >    definitely help clarify any pending controversy
> > > 2) Kishon to test on *existing* platforms and confirm there are no
> > >    regressions triggered
> > > 
> > > > I would suggest to add a comment for call "usleep_range(1000, 2000);"
> > > > that you have chosen some "random" values which worked fine on your
> > > > setup and that they fix mentioned bug. Comment just to mark this sleep
> > > > code that is suboptimal / not-so-correct and to prevent other people to
> > > > copy+paste this code into other (new) drivers...
> > > 
> > > Yes a comment would help but as I say above I am afraid this is
> > > a platform specific set-up, ie that delay is somewhat tied to
> > > a platform, not sure there is anything we can do.
> > > 
> > > If Linus and Kishon are happy with the approach we can merge this
> > > patch.
> > > 
> > > Lorenzo
Pali Rohár June 22, 2021, 8:52 p.m. UTC | #14
On Tuesday 22 June 2021 19:27:37 Kishon Vijay Abraham I wrote:
> Hi Luca, Pali,
> 
> On 22/06/21 7:01 pm, Luca Ceresoli wrote:
> > Hi,
> > 
> > On 22/06/21 14:16, Pali Rohár wrote:
> >> On Tuesday 22 June 2021 12:56:04 Lorenzo Pieralisi wrote:
> >>> [Adding Linus for GPIO discussion, thread:
> >>> https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]
> >>>
> >>> On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
> >>>> Hello!
> >>>>
> >>>> On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
> >>>>> Nothing happened after a few weeks... I understand that knowing the
> >>>>> correct reset timings is relevant, but unfortunately I cannot help much
> >>>>> in finding out the correct values.
> >>>>>
> >>>>> However I'm wondering what should happen to this patch. It *does* fix a
> >>>>> real bug, but potentially with an incorrect or non-optimal usleep range.
> >>>>> Do we really want to ignore a bugfix because we are not sure about how
> >>>>> long this delay should be?
> >>>>
> >>>> As there is no better solution right now, I'm fine with your patch. But
> >>>> patch needs to be approved by Lorenzo, so please wait for his final
> >>>> answer.
> >>>
> >>> I am not a GPIO expert and I have a feeling this is platform specific
> >>> beyond what the PCI specification can actually define architecturally.
> >>
> >> In my opinion timeout is not platform specific as I wrote in email:
> >> https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
> >>
> >> My experiments already proved that some PCIe cards needs to be in reset
> >> state for some minimal time otherwise they cannot be enumerated. And it
> >> does not matter to which platform you connect those (endpoint) cards.
> >>
> >> I do not think that timeout itself is platform specific. GPIO controls
> >> PERST# pin and therefore specified sleep value directly drives how long
> >> is card on the other end of PCIe slot in Warm Reset state. PCIe CEM spec
> >> directly says that PERST# signal controls PCIe Warm Reset.
> >>
> >> What is here platform specific thing is that PERST# signal is controlled
> >> by GPIO. But value of signal (high / low) and how long is in signal in
> >> which state for me sounds like not an platform specific thing, but as
> >> PCIe / CEM related.
> > 
> > That's exactly my understanding of this matter. At least for the dra7xx
> > controller it works exactly like this, PERSTn# is nothing but a GPIO
> > output from the SoC that drives the PERSTn# input of the external chip
> > without affecting the controller directly.
> > 
> 
> While the patch itself is correct, this kind-of changes the behavior on
> already upstreamed platforms. Previously the driver expected #PERST to
> be asserted be external means (or default power-up state) and only takes
> care of de-asserting the #PERST line.
> 
> There are 2 platforms that will be impacted due to this change
> 1) arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi (has an inverter on
> GPIO line)
> 2) arch/arm/boot/dts/am571x-idk.dts (directly connected to #PERST)
> 
> For 1), gpiod_set_value(reset, 0) will assert the PERST line due to the
> inverter (and GPIO_ACTIVE_LOW)
> For 2), gpiod_set_value(reset, 0) will assert the PERST line because we
> have GPIO_ACTIVE_HIGH

Ou! This is a problem in DT. It needs to be defined in a way that state
is same for every DTS device which uses this driver.

> So this patch should have to be accompanied with DT changes (and this
> patch also breaks old DT compatibility).

This of course needs to be fixed somehow prior accepting this patch. It
is blocker as in current state it breaks some platforms.

> 
> Thanks
> Kishon
Pali Rohár June 22, 2021, 8:55 p.m. UTC | #15
On Tuesday 22 June 2021 22:48:46 Pali Rohár wrote:
> On Tuesday 22 June 2021 15:23:25 Lorenzo Pieralisi wrote:
> > On Tue, Jun 22, 2021 at 02:16:49PM +0200, Pali Rohár wrote:
> > > On Tuesday 22 June 2021 12:56:04 Lorenzo Pieralisi wrote:
> > > > [Adding Linus for GPIO discussion, thread:
> > > > https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]
> > > > 
> > > > On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
> > > > > Hello!
> > > > > 
> > > > > On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
> > > > > > Nothing happened after a few weeks... I understand that knowing the
> > > > > > correct reset timings is relevant, but unfortunately I cannot help much
> > > > > > in finding out the correct values.
> > > > > > 
> > > > > > However I'm wondering what should happen to this patch. It *does* fix a
> > > > > > real bug, but potentially with an incorrect or non-optimal usleep range.
> > > > > > Do we really want to ignore a bugfix because we are not sure about how
> > > > > > long this delay should be?
> > > > > 
> > > > > As there is no better solution right now, I'm fine with your patch. But
> > > > > patch needs to be approved by Lorenzo, so please wait for his final
> > > > > answer.
> > > > 
> > > > I am not a GPIO expert and I have a feeling this is platform specific
> > > > beyond what the PCI specification can actually define architecturally.
> > > 
> > > In my opinion timeout is not platform specific as I wrote in email:
> > > https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
> > > 
> > > My experiments already proved that some PCIe cards needs to be in reset
> > > state for some minimal time otherwise they cannot be enumerated. And it
> > > does not matter to which platform you connect those (endpoint) cards.
> > > 
> > > I do not think that timeout itself is platform specific. GPIO controls
> > > PERST# pin and therefore specified sleep value directly drives how long
> > > is card on the other end of PCIe slot in Warm Reset state. PCIe CEM spec
> > > directly says that PERST# signal controls PCIe Warm Reset.
> > 
> > Point taken but regardless this deviates from the PCI electromechanical
> > specifications (ie T-PERST-CLK), does not it ?
> 
> Well, I was not able to understand and decode PCIe base and PCIe CEM
> specs to figure out which timeout value should be used. You wrote about
> T-PERST-CLK but I'm really not sure if it is this one... Therefore I
> cannot say if something deviates from spec or not.
> 
> > I misused "platform" to
> > define something that apparently is not contemplated by the PCI
> > specifications (and I would like to understand why).
> >  
> > I guess on ACPI systems (ie where the PERST# handling is implemented in
> > FW) this is handled in BIOS/UEFI
> 
> PCIe base spec does not define any standard interface for controlling
> PCIe Warm Reset and PCIe CEM spec does not define any SW interface for
> PERST# pin. So every board / computer with PCIe slot may connect PERST#
> pin in different way to CPU. Some ARM boards connect all PERST# pins to
> just one GPIO, and so via SW you can reset all PCIe cards at the same
> time. No granularity to reset just one card. Some other connects all
> PERST# pin to CPU reset output pin, so when CPU / board resets it cause
> also reset of all PCIe cards.
> 
> I read that some server machines have some dedicated device connected to
> CPU via i2c/smbus, which controls PERST# pins for each PCIe slot
> individually. And on these machines people use userspace i2cset
> application to control PERST# and therefore can reset cards manually.
> 
> If ACPI / BIOS / UEFI system has some kind of PCIe support && PERST# is
> controller by software then for sure it needs to reset PCIe card (at
> least putting it from reset state to normal) prior trying to read PCI
> device/vendor ID from config space.
> 
> > need to peruse the code to check how
> > PERST# is handled and whether the delay is per host controller driver.
> 
> Are there any open source implementations? Or we are just limited to
> dump ACPI bytecode or BIOS / UEFI firmware and start reverse engineering
> it? Because this would not be simple.
> 
> And major problems with PCIe Warm Reset / PERST# signal I saw only on
> boards where there is no BIOS / UEFI / ACPI; just native PCIe controller
> drivers which talks directly to HW.
> 
> I was not able to find any way how to control PERST# on any my x86
> laptop (standard setup with UEFI and ACPI). So I'm even not sure if on
> x86 laptops is PERST# controllable by SW. I can imagine that this PIN
> may be connected to some reset circuit from Embedded Controller which
> may take full control of resetting card when it is needed at correct
> time.
> 
> So it is possible that code which controls PERST# on x86 does not have
> to run on CPU and may be "burned" as part of other hardware...
> 
> > > 
> > > What is here platform specific thing is that PERST# signal is controlled
> > > by GPIO. But value of signal (high / low) and how long is in signal in
> > > which state for me sounds like not an platform specific thing, but as
> > > PCIe / CEM related.
> > 
> > There are two different things to agree on this patch
> > 1) how GPIO drives PERST#
> 
> I'm not sure what do you mean by this 1). GPIO is set to output
> direction and can be either in low or high state. One of this states
> represents RESET state on PERST# pin and which it is (low or high) is
> defined by DTS (reset-gpio).
> 
> So setting GPIO with output direction to value 1 (active) always puts
> card into reset state and setting GPIO to value 0 (inactive) puts card
> into normal state.

Now I see what you mean. Some boards define in DTS that reset-gpio in
inactive state puts card into reset state. Which contradicts my lines...

> > 2) the PERST# de-assertion delay.
> 
> This is open question.
> 
> > I appreciate they are related and that Luca had to handle them together
> > but logically they are separated "issues", it'd be great if we manage
> > to nail down how they should be handled before we merge this code.
> > 
> > Lorenzo
> > 
> > > 
> > > > There are two things I'd like to see:
> > > > 
> > > > 1) If Linus can have a look at the GPIO bits in this thread that would
> > > >    definitely help clarify any pending controversy
> > > > 2) Kishon to test on *existing* platforms and confirm there are no
> > > >    regressions triggered
> > > > 
> > > > > I would suggest to add a comment for call "usleep_range(1000, 2000);"
> > > > > that you have chosen some "random" values which worked fine on your
> > > > > setup and that they fix mentioned bug. Comment just to mark this sleep
> > > > > code that is suboptimal / not-so-correct and to prevent other people to
> > > > > copy+paste this code into other (new) drivers...
> > > > 
> > > > Yes a comment would help but as I say above I am afraid this is
> > > > a platform specific set-up, ie that delay is somewhat tied to
> > > > a platform, not sure there is anything we can do.
> > > > 
> > > > If Linus and Kishon are happy with the approach we can merge this
> > > > patch.
> > > > 
> > > > Lorenzo
Luca Ceresoli June 22, 2021, 9:04 p.m. UTC | #16
Hi Kishon,

On 22/06/21 15:57, Kishon Vijay Abraham I wrote:
> Hi Luca, Pali,
> 
> On 22/06/21 7:01 pm, Luca Ceresoli wrote:
>> Hi,
>>
>> On 22/06/21 14:16, Pali Rohár wrote:
>>> On Tuesday 22 June 2021 12:56:04 Lorenzo Pieralisi wrote:
>>>> [Adding Linus for GPIO discussion, thread:
>>>> https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]
>>>>
>>>> On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
>>>>> Hello!
>>>>>
>>>>> On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
>>>>>> Nothing happened after a few weeks... I understand that knowing the
>>>>>> correct reset timings is relevant, but unfortunately I cannot help much
>>>>>> in finding out the correct values.
>>>>>>
>>>>>> However I'm wondering what should happen to this patch. It *does* fix a
>>>>>> real bug, but potentially with an incorrect or non-optimal usleep range.
>>>>>> Do we really want to ignore a bugfix because we are not sure about how
>>>>>> long this delay should be?
>>>>>
>>>>> As there is no better solution right now, I'm fine with your patch. But
>>>>> patch needs to be approved by Lorenzo, so please wait for his final
>>>>> answer.
>>>>
>>>> I am not a GPIO expert and I have a feeling this is platform specific
>>>> beyond what the PCI specification can actually define architecturally.
>>>
>>> In my opinion timeout is not platform specific as I wrote in email:
>>> https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
>>>
>>> My experiments already proved that some PCIe cards needs to be in reset
>>> state for some minimal time otherwise they cannot be enumerated. And it
>>> does not matter to which platform you connect those (endpoint) cards.
>>>
>>> I do not think that timeout itself is platform specific. GPIO controls
>>> PERST# pin and therefore specified sleep value directly drives how long
>>> is card on the other end of PCIe slot in Warm Reset state. PCIe CEM spec
>>> directly says that PERST# signal controls PCIe Warm Reset.
>>>
>>> What is here platform specific thing is that PERST# signal is controlled
>>> by GPIO. But value of signal (high / low) and how long is in signal in
>>> which state for me sounds like not an platform specific thing, but as
>>> PCIe / CEM related.
>>
>> That's exactly my understanding of this matter. At least for the dra7xx
>> controller it works exactly like this, PERSTn# is nothing but a GPIO
>> output from the SoC that drives the PERSTn# input of the external chip
>> without affecting the controller directly.
>>
> 
> While the patch itself is correct, this kind-of changes the behavior on
> already upstreamed platforms. Previously the driver expected #PERST to
> be asserted be external means (or default power-up state) and only takes
> care of de-asserting the #PERST line.
> 
> There are 2 platforms that will be impacted due to this change
> 1) arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi (has an inverter on
> GPIO line)
> 2) arch/arm/boot/dts/am571x-idk.dts (directly connected to #PERST)
> For 1), gpiod_set_value(reset, 0) will assert the PERST line due to the
> inverter (and GPIO_ACTIVE_LOW)
> For 2), gpiod_set_value(reset, 0) will assert the PERST line because we
> have GPIO_ACTIVE_HIGH
> 
> So this patch should have to be accompanied with DT changes (and this
> patch also breaks old DT compatibility).

Thanks for researching and reporting which platforms are affected. I can
certainly take care of changing these two DTs in the next patch
iteration but I have no way to test them: I have access to an X15 but
without any expansion boards, and no IDK.
Luca Ceresoli June 22, 2021, 9:08 p.m. UTC | #17
On 22/06/21 22:52, Pali Rohár wrote:
> On Tuesday 22 June 2021 19:27:37 Kishon Vijay Abraham I wrote:
>> Hi Luca, Pali,
>>
>> On 22/06/21 7:01 pm, Luca Ceresoli wrote:
>>> Hi,
>>>
>>> On 22/06/21 14:16, Pali Rohár wrote:
>>>> On Tuesday 22 June 2021 12:56:04 Lorenzo Pieralisi wrote:
>>>>> [Adding Linus for GPIO discussion, thread:
>>>>> https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]
>>>>>
>>>>> On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
>>>>>> Hello!
>>>>>>
>>>>>> On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
>>>>>>> Nothing happened after a few weeks... I understand that knowing the
>>>>>>> correct reset timings is relevant, but unfortunately I cannot help much
>>>>>>> in finding out the correct values.
>>>>>>>
>>>>>>> However I'm wondering what should happen to this patch. It *does* fix a
>>>>>>> real bug, but potentially with an incorrect or non-optimal usleep range.
>>>>>>> Do we really want to ignore a bugfix because we are not sure about how
>>>>>>> long this delay should be?
>>>>>>
>>>>>> As there is no better solution right now, I'm fine with your patch. But
>>>>>> patch needs to be approved by Lorenzo, so please wait for his final
>>>>>> answer.
>>>>>
>>>>> I am not a GPIO expert and I have a feeling this is platform specific
>>>>> beyond what the PCI specification can actually define architecturally.
>>>>
>>>> In my opinion timeout is not platform specific as I wrote in email:
>>>> https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
>>>>
>>>> My experiments already proved that some PCIe cards needs to be in reset
>>>> state for some minimal time otherwise they cannot be enumerated. And it
>>>> does not matter to which platform you connect those (endpoint) cards.
>>>>
>>>> I do not think that timeout itself is platform specific. GPIO controls
>>>> PERST# pin and therefore specified sleep value directly drives how long
>>>> is card on the other end of PCIe slot in Warm Reset state. PCIe CEM spec
>>>> directly says that PERST# signal controls PCIe Warm Reset.
>>>>
>>>> What is here platform specific thing is that PERST# signal is controlled
>>>> by GPIO. But value of signal (high / low) and how long is in signal in
>>>> which state for me sounds like not an platform specific thing, but as
>>>> PCIe / CEM related.
>>>
>>> That's exactly my understanding of this matter. At least for the dra7xx
>>> controller it works exactly like this, PERSTn# is nothing but a GPIO
>>> output from the SoC that drives the PERSTn# input of the external chip
>>> without affecting the controller directly.
>>>
>>
>> While the patch itself is correct, this kind-of changes the behavior on
>> already upstreamed platforms. Previously the driver expected #PERST to
>> be asserted be external means (or default power-up state) and only takes
>> care of de-asserting the #PERST line.
>>
>> There are 2 platforms that will be impacted due to this change
>> 1) arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi (has an inverter on
>> GPIO line)
>> 2) arch/arm/boot/dts/am571x-idk.dts (directly connected to #PERST)
>>
>> For 1), gpiod_set_value(reset, 0) will assert the PERST line due to the
>> inverter (and GPIO_ACTIVE_LOW)
>> For 2), gpiod_set_value(reset, 0) will assert the PERST line because we
>> have GPIO_ACTIVE_HIGH
> 
> Ou! This is a problem in DT. It needs to be defined in a way that state
> is same for every DTS device which uses this driver.

Why? These are different boards and each specifies its own polarity.
They are already opposite to each other right now:

https://elixir.bootlin.com/linux/v5.13-rc7/source/arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi#L602

https://elixir.bootlin.com/linux/v5.13-rc7/source/arch/arm/boot/dts/am571x-idk.dts#L196
Luca Ceresoli June 22, 2021, 9:13 p.m. UTC | #18
Hi Pali,

On 22/06/21 13:06, Pali Rohár wrote:
> Hello!
> 
> On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
>> Nothing happened after a few weeks... I understand that knowing the
>> correct reset timings is relevant, but unfortunately I cannot help much
>> in finding out the correct values.
>>
>> However I'm wondering what should happen to this patch. It *does* fix a
>> real bug, but potentially with an incorrect or non-optimal usleep range.
>> Do we really want to ignore a bugfix because we are not sure about how
>> long this delay should be?
> 
> As there is no better solution right now, I'm fine with your patch. But
> patch needs to be approved by Lorenzo, so please wait for his final
> answer.
> 
> I would suggest to add a comment for call "usleep_range(1000, 2000);"
> that you have chosen some "random" values which worked fine on your
> setup and that they fix mentioned bug. Comment just to mark this sleep
> code that is suboptimal / not-so-correct and to prevent other people to
> copy+paste this code into other (new) drivers...

Sure, good idea.

I'm following this thread and will send v3 after a direction is agreed upon.
Pali Rohár June 22, 2021, 9:19 p.m. UTC | #19
On Tuesday 22 June 2021 23:08:07 Luca Ceresoli wrote:
> On 22/06/21 22:52, Pali Rohár wrote:
> > On Tuesday 22 June 2021 19:27:37 Kishon Vijay Abraham I wrote:
> >> Hi Luca, Pali,
> >>
> >> On 22/06/21 7:01 pm, Luca Ceresoli wrote:
> >>> Hi,
> >>>
> >>> On 22/06/21 14:16, Pali Rohár wrote:
> >>>> On Tuesday 22 June 2021 12:56:04 Lorenzo Pieralisi wrote:
> >>>>> [Adding Linus for GPIO discussion, thread:
> >>>>> https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]
> >>>>>
> >>>>> On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
> >>>>>> Hello!
> >>>>>>
> >>>>>> On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
> >>>>>>> Nothing happened after a few weeks... I understand that knowing the
> >>>>>>> correct reset timings is relevant, but unfortunately I cannot help much
> >>>>>>> in finding out the correct values.
> >>>>>>>
> >>>>>>> However I'm wondering what should happen to this patch. It *does* fix a
> >>>>>>> real bug, but potentially with an incorrect or non-optimal usleep range.
> >>>>>>> Do we really want to ignore a bugfix because we are not sure about how
> >>>>>>> long this delay should be?
> >>>>>>
> >>>>>> As there is no better solution right now, I'm fine with your patch. But
> >>>>>> patch needs to be approved by Lorenzo, so please wait for his final
> >>>>>> answer.
> >>>>>
> >>>>> I am not a GPIO expert and I have a feeling this is platform specific
> >>>>> beyond what the PCI specification can actually define architecturally.
> >>>>
> >>>> In my opinion timeout is not platform specific as I wrote in email:
> >>>> https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
> >>>>
> >>>> My experiments already proved that some PCIe cards needs to be in reset
> >>>> state for some minimal time otherwise they cannot be enumerated. And it
> >>>> does not matter to which platform you connect those (endpoint) cards.
> >>>>
> >>>> I do not think that timeout itself is platform specific. GPIO controls
> >>>> PERST# pin and therefore specified sleep value directly drives how long
> >>>> is card on the other end of PCIe slot in Warm Reset state. PCIe CEM spec
> >>>> directly says that PERST# signal controls PCIe Warm Reset.
> >>>>
> >>>> What is here platform specific thing is that PERST# signal is controlled
> >>>> by GPIO. But value of signal (high / low) and how long is in signal in
> >>>> which state for me sounds like not an platform specific thing, but as
> >>>> PCIe / CEM related.
> >>>
> >>> That's exactly my understanding of this matter. At least for the dra7xx
> >>> controller it works exactly like this, PERSTn# is nothing but a GPIO
> >>> output from the SoC that drives the PERSTn# input of the external chip
> >>> without affecting the controller directly.
> >>>
> >>
> >> While the patch itself is correct, this kind-of changes the behavior on
> >> already upstreamed platforms. Previously the driver expected #PERST to
> >> be asserted be external means (or default power-up state) and only takes
> >> care of de-asserting the #PERST line.
> >>
> >> There are 2 platforms that will be impacted due to this change
> >> 1) arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi (has an inverter on
> >> GPIO line)
> >> 2) arch/arm/boot/dts/am571x-idk.dts (directly connected to #PERST)
> >>
> >> For 1), gpiod_set_value(reset, 0) will assert the PERST line due to the
> >> inverter (and GPIO_ACTIVE_LOW)
> >> For 2), gpiod_set_value(reset, 0) will assert the PERST line because we
> >> have GPIO_ACTIVE_HIGH
> > 
> > Ou! This is a problem in DT. It needs to be defined in a way that state
> > is same for every DTS device which uses this driver.
> 
> Why?

I'm starting to be confused by triple or more negations (asserting,
signal inverter, active low)...

In your patch is GPIO set value to 0 and Kishon wrote that GPIO set
value to 0 for those two boards assert PERST# line. Asserting PERST#
line cause endpoint PCIe card to be in reset state. And in pci-dra7xx.c
driver there is no other code which de-asserts PERST# line.

So based on all this information I deduced that your patch will cause
putting PCIe cards into reset state (forever) and therefore they would
not work.

Or do I have here some mistake?

> These are different boards and each specifies its own polarity.
> They are already opposite to each other right now:
> 
> https://elixir.bootlin.com/linux/v5.13-rc7/source/arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi#L602
> 
> https://elixir.bootlin.com/linux/v5.13-rc7/source/arch/arm/boot/dts/am571x-idk.dts#L196
> 
> -- 
> Luca
>
Luca Ceresoli June 22, 2021, 9:36 p.m. UTC | #20
Hi Pali,

On 22/06/21 23:19, Pali Rohár wrote:
> On Tuesday 22 June 2021 23:08:07 Luca Ceresoli wrote:
>> On 22/06/21 22:52, Pali Rohár wrote:
>>> On Tuesday 22 June 2021 19:27:37 Kishon Vijay Abraham I wrote:
>>>> Hi Luca, Pali,
>>>>
>>>> On 22/06/21 7:01 pm, Luca Ceresoli wrote:
>>>>> Hi,
>>>>>
>>>>> On 22/06/21 14:16, Pali Rohár wrote:
>>>>>> On Tuesday 22 June 2021 12:56:04 Lorenzo Pieralisi wrote:
>>>>>>> [Adding Linus for GPIO discussion, thread:
>>>>>>> https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]
>>>>>>>
>>>>>>> On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
>>>>>>>> Hello!
>>>>>>>>
>>>>>>>> On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
>>>>>>>>> Nothing happened after a few weeks... I understand that knowing the
>>>>>>>>> correct reset timings is relevant, but unfortunately I cannot help much
>>>>>>>>> in finding out the correct values.
>>>>>>>>>
>>>>>>>>> However I'm wondering what should happen to this patch. It *does* fix a
>>>>>>>>> real bug, but potentially with an incorrect or non-optimal usleep range.
>>>>>>>>> Do we really want to ignore a bugfix because we are not sure about how
>>>>>>>>> long this delay should be?
>>>>>>>>
>>>>>>>> As there is no better solution right now, I'm fine with your patch. But
>>>>>>>> patch needs to be approved by Lorenzo, so please wait for his final
>>>>>>>> answer.
>>>>>>>
>>>>>>> I am not a GPIO expert and I have a feeling this is platform specific
>>>>>>> beyond what the PCI specification can actually define architecturally.
>>>>>>
>>>>>> In my opinion timeout is not platform specific as I wrote in email:
>>>>>> https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
>>>>>>
>>>>>> My experiments already proved that some PCIe cards needs to be in reset
>>>>>> state for some minimal time otherwise they cannot be enumerated. And it
>>>>>> does not matter to which platform you connect those (endpoint) cards.
>>>>>>
>>>>>> I do not think that timeout itself is platform specific. GPIO controls
>>>>>> PERST# pin and therefore specified sleep value directly drives how long
>>>>>> is card on the other end of PCIe slot in Warm Reset state. PCIe CEM spec
>>>>>> directly says that PERST# signal controls PCIe Warm Reset.
>>>>>>
>>>>>> What is here platform specific thing is that PERST# signal is controlled
>>>>>> by GPIO. But value of signal (high / low) and how long is in signal in
>>>>>> which state for me sounds like not an platform specific thing, but as
>>>>>> PCIe / CEM related.
>>>>>
>>>>> That's exactly my understanding of this matter. At least for the dra7xx
>>>>> controller it works exactly like this, PERSTn# is nothing but a GPIO
>>>>> output from the SoC that drives the PERSTn# input of the external chip
>>>>> without affecting the controller directly.
>>>>>
>>>>
>>>> While the patch itself is correct, this kind-of changes the behavior on
>>>> already upstreamed platforms. Previously the driver expected #PERST to
>>>> be asserted be external means (or default power-up state) and only takes
>>>> care of de-asserting the #PERST line.
>>>>
>>>> There are 2 platforms that will be impacted due to this change
>>>> 1) arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi (has an inverter on
>>>> GPIO line)
>>>> 2) arch/arm/boot/dts/am571x-idk.dts (directly connected to #PERST)
>>>>
>>>> For 1), gpiod_set_value(reset, 0) will assert the PERST line due to the
>>>> inverter (and GPIO_ACTIVE_LOW)
>>>> For 2), gpiod_set_value(reset, 0) will assert the PERST line because we
>>>> have GPIO_ACTIVE_HIGH
>>>
>>> Ou! This is a problem in DT. It needs to be defined in a way that state
>>> is same for every DTS device which uses this driver.
>>
>> Why?
> 
> I'm starting to be confused by triple or more negations (asserting,
> signal inverter, active low)...
> 
> In your patch is GPIO set value to 0 and Kishon wrote that GPIO set
> value to 0 for those two boards assert PERST# line. Asserting PERST#
> line cause endpoint PCIe card to be in reset state. And in pci-dra7xx.c
> driver there is no other code which de-asserts PERST# line.
> 
> So based on all this information I deduced that your patch will cause
> putting PCIe cards into reset state (forever) and therefore they would
> not work.
> 
> Or do I have here some mistake?

Uhm, at time time in the night I'm not sure I can do much more than
adding a few notes on top of the commit message. I hope it helps anyway.

The PCIe PERSTn reset pin is active low and should be asserted, then
deasserted.

The current implementation only drives the pin once in "HIGH" position,
thus presumably it was intended to deassert the pin. This has two problems:

  1) it assumes the pin was asserted by other means before loading the
     driver [Note: Kishon confirmed so far]
  2) it has the wrong polarity, since "HIGH" means "active", and the pin is
     presumably configured as active low coherently with the PCIe
     convention, thus it is driven physically to 0, keeping the device
     under reset unless the pin is configured as active high.
     [Note: the curren 2 DTS files pointed to by Kishon have different
      polarities]

Fix both problems by:

  1) keeping devm_gpiod_get_optional(dev, NULL, GPIOD_OUT_HIGH) as is, but
     assuming the pin is correctly configured as "active low" this now
     becomes a reset assertion
  2) adding gpiod_set_value(reset, 0) after a delay to deassert reset
[Note: this is exactly the current idea, but with the additional need to
fix (=invert) the current polarities in DT]
Pali Rohár June 22, 2021, 10:23 p.m. UTC | #21
On Tuesday 22 June 2021 23:36:35 Luca Ceresoli wrote:
> Hi Pali,
> 
> On 22/06/21 23:19, Pali Rohár wrote:
> > On Tuesday 22 June 2021 23:08:07 Luca Ceresoli wrote:
> >> On 22/06/21 22:52, Pali Rohár wrote:
> >>> On Tuesday 22 June 2021 19:27:37 Kishon Vijay Abraham I wrote:
> >>>> Hi Luca, Pali,
> >>>>
> >>>> On 22/06/21 7:01 pm, Luca Ceresoli wrote:
> >>>>> Hi,
> >>>>>
> >>>>> On 22/06/21 14:16, Pali Rohár wrote:
> >>>>>> On Tuesday 22 June 2021 12:56:04 Lorenzo Pieralisi wrote:
> >>>>>>> [Adding Linus for GPIO discussion, thread:
> >>>>>>> https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]
> >>>>>>>
> >>>>>>> On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
> >>>>>>>> Hello!
> >>>>>>>>
> >>>>>>>> On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
> >>>>>>>>> Nothing happened after a few weeks... I understand that knowing the
> >>>>>>>>> correct reset timings is relevant, but unfortunately I cannot help much
> >>>>>>>>> in finding out the correct values.
> >>>>>>>>>
> >>>>>>>>> However I'm wondering what should happen to this patch. It *does* fix a
> >>>>>>>>> real bug, but potentially with an incorrect or non-optimal usleep range.
> >>>>>>>>> Do we really want to ignore a bugfix because we are not sure about how
> >>>>>>>>> long this delay should be?
> >>>>>>>>
> >>>>>>>> As there is no better solution right now, I'm fine with your patch. But
> >>>>>>>> patch needs to be approved by Lorenzo, so please wait for his final
> >>>>>>>> answer.
> >>>>>>>
> >>>>>>> I am not a GPIO expert and I have a feeling this is platform specific
> >>>>>>> beyond what the PCI specification can actually define architecturally.
> >>>>>>
> >>>>>> In my opinion timeout is not platform specific as I wrote in email:
> >>>>>> https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
> >>>>>>
> >>>>>> My experiments already proved that some PCIe cards needs to be in reset
> >>>>>> state for some minimal time otherwise they cannot be enumerated. And it
> >>>>>> does not matter to which platform you connect those (endpoint) cards.
> >>>>>>
> >>>>>> I do not think that timeout itself is platform specific. GPIO controls
> >>>>>> PERST# pin and therefore specified sleep value directly drives how long
> >>>>>> is card on the other end of PCIe slot in Warm Reset state. PCIe CEM spec
> >>>>>> directly says that PERST# signal controls PCIe Warm Reset.
> >>>>>>
> >>>>>> What is here platform specific thing is that PERST# signal is controlled
> >>>>>> by GPIO. But value of signal (high / low) and how long is in signal in
> >>>>>> which state for me sounds like not an platform specific thing, but as
> >>>>>> PCIe / CEM related.
> >>>>>
> >>>>> That's exactly my understanding of this matter. At least for the dra7xx
> >>>>> controller it works exactly like this, PERSTn# is nothing but a GPIO
> >>>>> output from the SoC that drives the PERSTn# input of the external chip
> >>>>> without affecting the controller directly.
> >>>>>
> >>>>
> >>>> While the patch itself is correct, this kind-of changes the behavior on
> >>>> already upstreamed platforms. Previously the driver expected #PERST to
> >>>> be asserted be external means (or default power-up state) and only takes
> >>>> care of de-asserting the #PERST line.
> >>>>
> >>>> There are 2 platforms that will be impacted due to this change
> >>>> 1) arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi (has an inverter on
> >>>> GPIO line)
> >>>> 2) arch/arm/boot/dts/am571x-idk.dts (directly connected to #PERST)
> >>>>
> >>>> For 1), gpiod_set_value(reset, 0) will assert the PERST line due to the
> >>>> inverter (and GPIO_ACTIVE_LOW)
> >>>> For 2), gpiod_set_value(reset, 0) will assert the PERST line because we
> >>>> have GPIO_ACTIVE_HIGH
> >>>
> >>> Ou! This is a problem in DT. It needs to be defined in a way that state
> >>> is same for every DTS device which uses this driver.
> >>
> >> Why?
> > 
> > I'm starting to be confused by triple or more negations (asserting,
> > signal inverter, active low)...
> > 
> > In your patch is GPIO set value to 0 and Kishon wrote that GPIO set
> > value to 0 for those two boards assert PERST# line. Asserting PERST#
> > line cause endpoint PCIe card to be in reset state. And in pci-dra7xx.c
> > driver there is no other code which de-asserts PERST# line.
> > 
> > So based on all this information I deduced that your patch will cause
> > putting PCIe cards into reset state (forever) and therefore they would
> > not work.
> > 
> > Or do I have here some mistake?
> 
> Uhm, at time time in the night I'm not sure I can do much more than
> adding a few notes on top of the commit message. I hope it helps anyway.
> 
> The PCIe PERSTn reset pin is active low and should be asserted, then
> deasserted.
> 
> The current implementation only drives the pin once in "HIGH" position,
> thus presumably it was intended to deassert the pin. This has two problems:
> 
>   1) it assumes the pin was asserted by other means before loading the
>      driver [Note: Kishon confirmed so far]

This is easily solvable. Just assert PERST# pin explicitly via
gpiod_set_value() call prior calling that sleep function. And it would
work whatever state that pin has at init time. This has advantage that
reader of that code does not need to do too much investigation to check
at which state is GPIO at probe time and what implication it has...

Some other driver are doing it too, e.g. pci-aardvark.c.

Due to fact that also bootloader may use PCIe bus (maybe not now, but in
future; like it happened with pci-aardvark after introducing boot
support from NVMe disks), initial state may change.

>   2) it has the wrong polarity, since "HIGH" means "active", and the pin is
>      presumably configured as active low coherently with the PCIe
>      convention, thus it is driven physically to 0, keeping the device
>      under reset unless the pin is configured as active high.
>      [Note: the curren 2 DTS files pointed to by Kishon have different
>       polarities]
> 
> Fix both problems by:
> 
>   1) keeping devm_gpiod_get_optional(dev, NULL, GPIOD_OUT_HIGH) as is, but
>      assuming the pin is correctly configured as "active low" this now
>      becomes a reset assertion
>   2) adding gpiod_set_value(reset, 0) after a delay to deassert reset
> [Note: this is exactly the current idea, but with the additional need to
> fix (=invert) the current polarities in DT]

Lorenzo asked a good question how GPIO drives PERST#. And maybe it would
be a good idea to unify all pci controller drivers to use same GPIO
value for asserting PERST# pin. If it is possible. As we can see it is a
big mess.

Personally I would like to a see two helper functions like

  void pcie_assert_perst(struct gpio_desc *gpio);
  void pcie_deassert_perst(struct gpio_desc *gpio);

which pci controller driver will use and we will not more handle active
high / low state or polarity inversion and meditate if gpio set to zero
means assert or de-assert.

> 
> -- 
> Luca
>
Luca Ceresoli June 24, 2021, 9:31 p.m. UTC | #22
Hi Pali,

On 23/06/21 00:23, Pali Rohár wrote:
> On Tuesday 22 June 2021 23:36:35 Luca Ceresoli wrote:
>> Hi Pali,
>>
>> On 22/06/21 23:19, Pali Rohár wrote:
>>> On Tuesday 22 June 2021 23:08:07 Luca Ceresoli wrote:
>>>> On 22/06/21 22:52, Pali Rohár wrote:
>>>>> On Tuesday 22 June 2021 19:27:37 Kishon Vijay Abraham I wrote:
>>>>>> Hi Luca, Pali,
>>>>>>
>>>>>> On 22/06/21 7:01 pm, Luca Ceresoli wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 22/06/21 14:16, Pali Rohár wrote:
>>>>>>>> On Tuesday 22 June 2021 12:56:04 Lorenzo Pieralisi wrote:
>>>>>>>>> [Adding Linus for GPIO discussion, thread:
>>>>>>>>> https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]
>>>>>>>>>
>>>>>>>>> On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
>>>>>>>>>> Hello!
>>>>>>>>>>
>>>>>>>>>> On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
>>>>>>>>>>> Nothing happened after a few weeks... I understand that knowing the
>>>>>>>>>>> correct reset timings is relevant, but unfortunately I cannot help much
>>>>>>>>>>> in finding out the correct values.
>>>>>>>>>>>
>>>>>>>>>>> However I'm wondering what should happen to this patch. It *does* fix a
>>>>>>>>>>> real bug, but potentially with an incorrect or non-optimal usleep range.
>>>>>>>>>>> Do we really want to ignore a bugfix because we are not sure about how
>>>>>>>>>>> long this delay should be?
>>>>>>>>>>
>>>>>>>>>> As there is no better solution right now, I'm fine with your patch. But
>>>>>>>>>> patch needs to be approved by Lorenzo, so please wait for his final
>>>>>>>>>> answer.
>>>>>>>>>
>>>>>>>>> I am not a GPIO expert and I have a feeling this is platform specific
>>>>>>>>> beyond what the PCI specification can actually define architecturally.
>>>>>>>>
>>>>>>>> In my opinion timeout is not platform specific as I wrote in email:
>>>>>>>> https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
>>>>>>>>
>>>>>>>> My experiments already proved that some PCIe cards needs to be in reset
>>>>>>>> state for some minimal time otherwise they cannot be enumerated. And it
>>>>>>>> does not matter to which platform you connect those (endpoint) cards.
>>>>>>>>
>>>>>>>> I do not think that timeout itself is platform specific. GPIO controls
>>>>>>>> PERST# pin and therefore specified sleep value directly drives how long
>>>>>>>> is card on the other end of PCIe slot in Warm Reset state. PCIe CEM spec
>>>>>>>> directly says that PERST# signal controls PCIe Warm Reset.
>>>>>>>>
>>>>>>>> What is here platform specific thing is that PERST# signal is controlled
>>>>>>>> by GPIO. But value of signal (high / low) and how long is in signal in
>>>>>>>> which state for me sounds like not an platform specific thing, but as
>>>>>>>> PCIe / CEM related.
>>>>>>>
>>>>>>> That's exactly my understanding of this matter. At least for the dra7xx
>>>>>>> controller it works exactly like this, PERSTn# is nothing but a GPIO
>>>>>>> output from the SoC that drives the PERSTn# input of the external chip
>>>>>>> without affecting the controller directly.
>>>>>>>
>>>>>>
>>>>>> While the patch itself is correct, this kind-of changes the behavior on
>>>>>> already upstreamed platforms. Previously the driver expected #PERST to
>>>>>> be asserted be external means (or default power-up state) and only takes
>>>>>> care of de-asserting the #PERST line.
>>>>>>
>>>>>> There are 2 platforms that will be impacted due to this change
>>>>>> 1) arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi (has an inverter on
>>>>>> GPIO line)
>>>>>> 2) arch/arm/boot/dts/am571x-idk.dts (directly connected to #PERST)
>>>>>>
>>>>>> For 1), gpiod_set_value(reset, 0) will assert the PERST line due to the
>>>>>> inverter (and GPIO_ACTIVE_LOW)
>>>>>> For 2), gpiod_set_value(reset, 0) will assert the PERST line because we
>>>>>> have GPIO_ACTIVE_HIGH
>>>>>
>>>>> Ou! This is a problem in DT. It needs to be defined in a way that state
>>>>> is same for every DTS device which uses this driver.
>>>>
>>>> Why?
>>>
>>> I'm starting to be confused by triple or more negations (asserting,
>>> signal inverter, active low)...
>>>
>>> In your patch is GPIO set value to 0 and Kishon wrote that GPIO set
>>> value to 0 for those two boards assert PERST# line. Asserting PERST#
>>> line cause endpoint PCIe card to be in reset state. And in pci-dra7xx.c
>>> driver there is no other code which de-asserts PERST# line.
>>>
>>> So based on all this information I deduced that your patch will cause
>>> putting PCIe cards into reset state (forever) and therefore they would
>>> not work.
>>>
>>> Or do I have here some mistake?
>>
>> Uhm, at time time in the night I'm not sure I can do much more than
>> adding a few notes on top of the commit message. I hope it helps anyway.
>>
>> The PCIe PERSTn reset pin is active low and should be asserted, then
>> deasserted.
>>
>> The current implementation only drives the pin once in "HIGH" position,
>> thus presumably it was intended to deassert the pin. This has two problems:
>>
>>   1) it assumes the pin was asserted by other means before loading the
>>      driver [Note: Kishon confirmed so far]
> 
> This is easily solvable. Just assert PERST# pin explicitly via
> gpiod_set_value() call prior calling that sleep function. And it would
> work whatever state that pin has at init time. This has advantage that
> reader of that code does not need to do too much investigation to check
> at which state is GPIO at probe time and what implication it has...

I agree, it's what my patch does.

> Some other driver are doing it too, e.g. pci-aardvark.c.
> 
> Due to fact that also bootloader may use PCIe bus (maybe not now, but in
> future; like it happened with pci-aardvark after introducing boot
> support from NVMe disks), initial state may change.
> 
>>   2) it has the wrong polarity, since "HIGH" means "active", and the pin is
>>      presumably configured as active low coherently with the PCIe
>>      convention, thus it is driven physically to 0, keeping the device
>>      under reset unless the pin is configured as active high.
>>      [Note: the curren 2 DTS files pointed to by Kishon have different
>>       polarities]
>>
>> Fix both problems by:
>>
>>   1) keeping devm_gpiod_get_optional(dev, NULL, GPIOD_OUT_HIGH) as is, but
>>      assuming the pin is correctly configured as "active low" this now
>>      becomes a reset assertion
>>   2) adding gpiod_set_value(reset, 0) after a delay to deassert reset
>> [Note: this is exactly the current idea, but with the additional need to
>> fix (=invert) the current polarities in DT]
> 
> Lorenzo asked a good question how GPIO drives PERST#. And maybe it would
> be a good idea to unify all pci controller drivers to use same GPIO
> value for asserting PERST# pin. If it is possible. As we can see it is a
> big mess.

I might be short-righted, but I can think of only one way the code
should look like in controller drivers. Which is, unsurprisingly, what
my patch does:

  /* 1 == assert reset == put device under reset */
  gpiod_set_value(reset, 1);
  /* or: devm_gpiod_get_optional(..., GPIOD_OUT_HIGH); */

  usleep_range(/* values under discussion */);

  /* 0 == deassert reset == release device from reset */
  gpiod_set_value(reset, 0);

The PCI controller driver should and can't care about any line
inversion. It's board-dependent, and as such it should be marked in
device tree (or ACPI or whatever -- I'm assuming ACPI can describe this it).

Am I overlooking anything?
Pali Rohár June 24, 2021, 9:42 p.m. UTC | #23
On Thursday 24 June 2021 23:31:10 Luca Ceresoli wrote:
> Hi Pali,
> 
> On 23/06/21 00:23, Pali Rohár wrote:
> > On Tuesday 22 June 2021 23:36:35 Luca Ceresoli wrote:
> >> Hi Pali,
> >>
> >> On 22/06/21 23:19, Pali Rohár wrote:
> >>> On Tuesday 22 June 2021 23:08:07 Luca Ceresoli wrote:
> >>>> On 22/06/21 22:52, Pali Rohár wrote:
> >>>>> On Tuesday 22 June 2021 19:27:37 Kishon Vijay Abraham I wrote:
> >>>>>> Hi Luca, Pali,
> >>>>>>
> >>>>>> On 22/06/21 7:01 pm, Luca Ceresoli wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> On 22/06/21 14:16, Pali Rohár wrote:
> >>>>>>>> On Tuesday 22 June 2021 12:56:04 Lorenzo Pieralisi wrote:
> >>>>>>>>> [Adding Linus for GPIO discussion, thread:
> >>>>>>>>> https://lore.kernel.org/linux-pci/20210531090540.2663171-1-luca@lucaceresoli.net]
> >>>>>>>>>
> >>>>>>>>> On Tue, Jun 22, 2021 at 01:06:27PM +0200, Pali Rohár wrote:
> >>>>>>>>>> Hello!
> >>>>>>>>>>
> >>>>>>>>>> On Tuesday 22 June 2021 12:57:22 Luca Ceresoli wrote:
> >>>>>>>>>>> Nothing happened after a few weeks... I understand that knowing the
> >>>>>>>>>>> correct reset timings is relevant, but unfortunately I cannot help much
> >>>>>>>>>>> in finding out the correct values.
> >>>>>>>>>>>
> >>>>>>>>>>> However I'm wondering what should happen to this patch. It *does* fix a
> >>>>>>>>>>> real bug, but potentially with an incorrect or non-optimal usleep range.
> >>>>>>>>>>> Do we really want to ignore a bugfix because we are not sure about how
> >>>>>>>>>>> long this delay should be?
> >>>>>>>>>>
> >>>>>>>>>> As there is no better solution right now, I'm fine with your patch. But
> >>>>>>>>>> patch needs to be approved by Lorenzo, so please wait for his final
> >>>>>>>>>> answer.
> >>>>>>>>>
> >>>>>>>>> I am not a GPIO expert and I have a feeling this is platform specific
> >>>>>>>>> beyond what the PCI specification can actually define architecturally.
> >>>>>>>>
> >>>>>>>> In my opinion timeout is not platform specific as I wrote in email:
> >>>>>>>> https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
> >>>>>>>>
> >>>>>>>> My experiments already proved that some PCIe cards needs to be in reset
> >>>>>>>> state for some minimal time otherwise they cannot be enumerated. And it
> >>>>>>>> does not matter to which platform you connect those (endpoint) cards.
> >>>>>>>>
> >>>>>>>> I do not think that timeout itself is platform specific. GPIO controls
> >>>>>>>> PERST# pin and therefore specified sleep value directly drives how long
> >>>>>>>> is card on the other end of PCIe slot in Warm Reset state. PCIe CEM spec
> >>>>>>>> directly says that PERST# signal controls PCIe Warm Reset.
> >>>>>>>>
> >>>>>>>> What is here platform specific thing is that PERST# signal is controlled
> >>>>>>>> by GPIO. But value of signal (high / low) and how long is in signal in
> >>>>>>>> which state for me sounds like not an platform specific thing, but as
> >>>>>>>> PCIe / CEM related.
> >>>>>>>
> >>>>>>> That's exactly my understanding of this matter. At least for the dra7xx
> >>>>>>> controller it works exactly like this, PERSTn# is nothing but a GPIO
> >>>>>>> output from the SoC that drives the PERSTn# input of the external chip
> >>>>>>> without affecting the controller directly.
> >>>>>>>
> >>>>>>
> >>>>>> While the patch itself is correct, this kind-of changes the behavior on
> >>>>>> already upstreamed platforms. Previously the driver expected #PERST to
> >>>>>> be asserted be external means (or default power-up state) and only takes
> >>>>>> care of de-asserting the #PERST line.
> >>>>>>
> >>>>>> There are 2 platforms that will be impacted due to this change
> >>>>>> 1) arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi (has an inverter on
> >>>>>> GPIO line)
> >>>>>> 2) arch/arm/boot/dts/am571x-idk.dts (directly connected to #PERST)
> >>>>>>
> >>>>>> For 1), gpiod_set_value(reset, 0) will assert the PERST line due to the
> >>>>>> inverter (and GPIO_ACTIVE_LOW)
> >>>>>> For 2), gpiod_set_value(reset, 0) will assert the PERST line because we
> >>>>>> have GPIO_ACTIVE_HIGH
> >>>>>
> >>>>> Ou! This is a problem in DT. It needs to be defined in a way that state
> >>>>> is same for every DTS device which uses this driver.
> >>>>
> >>>> Why?
> >>>
> >>> I'm starting to be confused by triple or more negations (asserting,
> >>> signal inverter, active low)...
> >>>
> >>> In your patch is GPIO set value to 0 and Kishon wrote that GPIO set
> >>> value to 0 for those two boards assert PERST# line. Asserting PERST#
> >>> line cause endpoint PCIe card to be in reset state. And in pci-dra7xx.c
> >>> driver there is no other code which de-asserts PERST# line.
> >>>
> >>> So based on all this information I deduced that your patch will cause
> >>> putting PCIe cards into reset state (forever) and therefore they would
> >>> not work.
> >>>
> >>> Or do I have here some mistake?
> >>
> >> Uhm, at time time in the night I'm not sure I can do much more than
> >> adding a few notes on top of the commit message. I hope it helps anyway.
> >>
> >> The PCIe PERSTn reset pin is active low and should be asserted, then
> >> deasserted.
> >>
> >> The current implementation only drives the pin once in "HIGH" position,
> >> thus presumably it was intended to deassert the pin. This has two problems:
> >>
> >>   1) it assumes the pin was asserted by other means before loading the
> >>      driver [Note: Kishon confirmed so far]
> > 
> > This is easily solvable. Just assert PERST# pin explicitly via
> > gpiod_set_value() call prior calling that sleep function. And it would
> > work whatever state that pin has at init time. This has advantage that
> > reader of that code does not need to do too much investigation to check
> > at which state is GPIO at probe time and what implication it has...
> 
> I agree, it's what my patch does.
> 
> > Some other driver are doing it too, e.g. pci-aardvark.c.
> > 
> > Due to fact that also bootloader may use PCIe bus (maybe not now, but in
> > future; like it happened with pci-aardvark after introducing boot
> > support from NVMe disks), initial state may change.
> > 
> >>   2) it has the wrong polarity, since "HIGH" means "active", and the pin is
> >>      presumably configured as active low coherently with the PCIe
> >>      convention, thus it is driven physically to 0, keeping the device
> >>      under reset unless the pin is configured as active high.
> >>      [Note: the curren 2 DTS files pointed to by Kishon have different
> >>       polarities]
> >>
> >> Fix both problems by:
> >>
> >>   1) keeping devm_gpiod_get_optional(dev, NULL, GPIOD_OUT_HIGH) as is, but
> >>      assuming the pin is correctly configured as "active low" this now
> >>      becomes a reset assertion
> >>   2) adding gpiod_set_value(reset, 0) after a delay to deassert reset
> >> [Note: this is exactly the current idea, but with the additional need to
> >> fix (=invert) the current polarities in DT]
> > 
> > Lorenzo asked a good question how GPIO drives PERST#. And maybe it would
> > be a good idea to unify all pci controller drivers to use same GPIO
> > value for asserting PERST# pin. If it is possible. As we can see it is a
> > big mess.
> 
> I might be short-righted, but I can think of only one way the code
> should look like in controller drivers. Which is, unsurprisingly, what
> my patch does:
> 
>   /* 1 == assert reset == put device under reset */
>   gpiod_set_value(reset, 1);
>   /* or: devm_gpiod_get_optional(..., GPIOD_OUT_HIGH); */
> 
>   usleep_range(/* values under discussion */);
> 
>   /* 0 == deassert reset == release device from reset */
>   gpiod_set_value(reset, 0);

This logic looks to be correct.

> The PCI controller driver should and can't care about any line
> inversion. It's board-dependent, and as such it should be marked in
> device tree (or ACPI or whatever -- I'm assuming ACPI can describe this it).

Exactly! I agree with you. And it has to be specified in DTS correctly
(or ACPI or whatever is used).

> Am I overlooking anything?

I'm still confused by sentence "gpiod_set_value(reset, 0) will assert
the PERST line" written by Kishon as you wrote that: 0 == deassert...

So I will wait for a new patch with comments and hopefully Kishon either
review that new patch is correct or find out if there is an issue
(either in some DTS file or driver code).

> -- 
> Luca
>
Linus Walleij June 24, 2021, 11:11 p.m. UTC | #24
On Tue, Jun 22, 2021 at 3:57 PM Kishon Vijay Abraham I <kishon@ti.com> wrote:

> While the patch itself is correct, this kind-of changes the behavior on
> already upstreamed platforms. Previously the driver expected #PERST to
> be asserted be external means (or default power-up state) and only takes
> care of de-asserting the #PERST line.
>
> There are 2 platforms that will be impacted due to this change
> 1) arch/arm/boot/dts/am57xx-beagle-x15-common.dist (has an inverter on
> GPIO line)
> 2) arch/arm/boot/dts/am571x-idk.dts (directly connected to #PERST)
>
> For 1), gpiod_set_value(reset, 0) will assert the PERST line due to the
> inverter (and GPIO_ACTIVE_LOW)
> For 2), gpiod_set_value(reset, 0) will assert the PERST line because we
> have GPIO_ACTIVE_HIGH

The presence of an inverter makes it necessary to model this the right
way to get out of the situation.

> So this patch should have to be accompanied with DT changes (and this
> patch also breaks old DT compatibility).

There are ways to deal with this perfectly. It may or may not be worth
the extra work. But I can show how it is done.

Make the patch to the driver that assumes driving the gpio descriptor
to 1 (asserted) has the desired effect.

In this patch, I would include a hunk that fixes the above device trees,
so they are correct from this point. This is one of the few cases where
I think it warrants to fix the driver and the DTS file at the same time,
but the DTS can also be patched separately because of the described
solution below:

To avoid regressions with old device trees, add code to
drivers/gpio/gpiolib-of.c in function of_gpio_flags_quirks()
to react to the old incorrect device trees. This is where we
stockpile OF errors and bug fixes.

This needs to be pretty elaborate. It begins like this:

if (IS_ENABLED(CONFIG_PCI) &&
      (of_machine_is_compatible("ti,am572x-beagle-x15") ||
      of_machine_is_compatible("ti,am5718-idk")) &&
      of_node_name_eq(np, "pcie")) {
      /* ... add code to check and enforce the flags ... */
}

You see the idea here. Include this in the patch to make the
Perfect(TM) solution to this problem both fixing all device trees
in place and dealing with the old erroneous ones using some
elaborate code.

There are plenty of examples on how to detect warn and
modify flags in of_gpio_flags_quirks() make it clear and add
some warning prints and comments. Keep me and Bartosz
in the loop. It'll look fine in the end.

Yours,
Linus Walleij
Linus Walleij June 24, 2021, 11:18 p.m. UTC | #25
On Wed, Jun 23, 2021 at 12:23 AM Pali Rohár <pali@kernel.org> wrote:

> Lorenzo asked a good question how GPIO drives PERST#. And maybe it would
> be a good idea to unify all pci controller drivers to use same GPIO
> value for asserting PERST# pin. If it is possible. As we can see it is a
> big mess.
>
> Personally I would like to a see two helper functions like
>
>   void pcie_assert_perst(struct gpio_desc *gpio);
>   void pcie_deassert_perst(struct gpio_desc *gpio);
>
> which pci controller driver will use and we will not more handle active
> high / low state or polarity inversion and meditate if gpio set to zero
> means assert or de-assert.

GPIO descriptors (as are used in this driver) are supposed to hide
and encapsulate polarity inversion so:

gpiod_set_value(gpiod, 1) == assert the line
gpiod_set_value(gpiod, 0) == de-assert the line

Whether the line is asserted by physically driving the line low or
high should not be a concern, that is handled in the machine
description, we support OF, ACPI and even board files to
define this.

I would use gpiod_set_value() directly as above and maybe
add some comments explaining what is going on and that
the resulting polarity inversion is handled inside gpiolib.

Because of common misunderstandings we have pondered to just
search/replace the last argument of gpiod_set_value() from
an (int value) to a (bool asserted) to make things clear.
I just never get around to do that.

Yours,
Linus Walleij
Pali Rohár June 24, 2021, 11:34 p.m. UTC | #26
On Friday 25 June 2021 01:18:43 Linus Walleij wrote:
> On Wed, Jun 23, 2021 at 12:23 AM Pali Rohár <pali@kernel.org> wrote:
> 
> > Lorenzo asked a good question how GPIO drives PERST#. And maybe it would
> > be a good idea to unify all pci controller drivers to use same GPIO
> > value for asserting PERST# pin. If it is possible. As we can see it is a
> > big mess.
> >
> > Personally I would like to a see two helper functions like
> >
> >   void pcie_assert_perst(struct gpio_desc *gpio);
> >   void pcie_deassert_perst(struct gpio_desc *gpio);
> >
> > which pci controller driver will use and we will not more handle active
> > high / low state or polarity inversion and meditate if gpio set to zero
> > means assert or de-assert.
> 
> GPIO descriptors (as are used in this driver) are supposed to hide
> and encapsulate polarity inversion so:
> 
> gpiod_set_value(gpiod, 1) == assert the line
> gpiod_set_value(gpiod, 0) == de-assert the line

Problem is that some pci controller drivers (e.g. pci-j721e.c or
pcie-rockchip-host.c) expects that gpiod_set_value_cansleep(gpiod, 1)
de-asserts the line and it is already used in this way.

Which is opposite of the behavior which you wrote above.

> Whether the line is asserted by physically driving the line low or
> high should not be a concern, that is handled in the machine
> description, we support OF, ACPI and even board files to
> define this.
> 
> I would use gpiod_set_value() directly as above and maybe
> add some comments explaining what is going on and that
> the resulting polarity inversion is handled inside gpiolib.
> 
> Because of common misunderstandings we have pondered to just
> search/replace the last argument of gpiod_set_value() from
> an (int value) to a (bool asserted) to make things clear.
> I just never get around to do that.

I would suggest to define enum/macro with word ASSERT and DEASSERT in
its name instead of just true/false boolean or 0/1 int.

In case of this PERST# misunderstanding, having assert/deassert in name
should really help.

> 
> Yours,
> Linus Walleij
Linus Walleij June 25, 2021, 12:09 a.m. UTC | #27
On Fri, Jun 25, 2021 at 1:34 AM Pali Rohár <pali@kernel.org> wrote:

> > gpiod_set_value(gpiod, 1) == assert the line
> > gpiod_set_value(gpiod, 0) == de-assert the line
>
> Problem is that some pci controller drivers (e.g. pci-j721e.c or
> pcie-rockchip-host.c) expects that gpiod_set_value_cansleep(gpiod, 1)
> de-asserts the line and it is already used in this way.
>
> Which is opposite of the behavior which you wrote above.

I sketched a way out of the problem using a quirk in
gpiolib in another response. We have a few of these
cases where we have to code our way out of mistakes,
such things happen.

The problem is common, and due to the fact that device tree
authors ignores the flag GPIO_ACTIVE_HIGH (which has
been around since the early days of device tree on PowerPC)
instead they opt to do the inversion in code. Which violates the
contract that the DT should describe the hardware.

The ambition of the DT specifications/schemas are to be operating
system independent, and this kind of stuff creates a situation
where other operating systems can't use the specification without
also going and looking at how Linux has implemented stuff.
Which is against the ambition of the device tree work.

> I would suggest to define enum/macro with word ASSERT and DEASSERT in
> its name instead of just true/false boolean or 0/1 int.
>
> In case of this PERST# misunderstanding, having assert/deassert in name
> should really help.

Hm that looks useful, Bart &co what do you think?

#define GPIOD_ASSERTED 1
#define GPIOD_DEASSERTED 0

in consumer.h, would that be helpful for users?

Yours,
Linus Walleij
Luca Ceresoli June 25, 2021, 8:05 a.m. UTC | #28
Hi,

On 25/06/21 02:09, Linus Walleij wrote:
> On Fri, Jun 25, 2021 at 1:34 AM Pali Rohár <pali@kernel.org> wrote:
> 
>>> gpiod_set_value(gpiod, 1) == assert the line
>>> gpiod_set_value(gpiod, 0) == de-assert the line
>>
>> Problem is that some pci controller drivers (e.g. pci-j721e.c or
>> pcie-rockchip-host.c) expects that gpiod_set_value_cansleep(gpiod, 1)
>> de-asserts the line and it is already used in this way.
>>
>> Which is opposite of the behavior which you wrote above.
> 
> I sketched a way out of the problem using a quirk in
> gpiolib in another response. We have a few of these
> cases where we have to code our way out of mistakes,
> such things happen.
> 
> The problem is common, and due to the fact that device tree
> authors ignores the flag GPIO_ACTIVE_HIGH (which has
> been around since the early days of device tree on PowerPC)
> instead they opt to do the inversion in code. Which violates the
> contract that the DT should describe the hardware.
> 
> The ambition of the DT specifications/schemas are to be operating
> system independent, and this kind of stuff creates a situation
> where other operating systems can't use the specification without
> also going and looking at how Linux has implemented stuff.
> Which is against the ambition of the device tree work.
> 
>> I would suggest to define enum/macro with word ASSERT and DEASSERT in
>> its name instead of just true/false boolean or 0/1 int.
>>
>> In case of this PERST# misunderstanding, having assert/deassert in name
>> should really help.
> 
> Hm that looks useful, Bart &co what do you think?
> 
> #define GPIOD_ASSERTED 1
> #define GPIOD_DEASSERTED 0
> 
> in consumer.h, would that be helpful for users?

Looks like a good idea to me. It would help making people aware that
gpiod_set_value() & co do _not_ take a physical line value. It's done
that way since ages, it's documented, yet many developers are still
unaware of that...
Luca Ceresoli June 25, 2021, 8:10 a.m. UTC | #29
Hi Linus,

On 25/06/21 01:11, Linus Walleij wrote:
> On Tue, Jun 22, 2021 at 3:57 PM Kishon Vijay Abraham I <kishon@ti.com> wrote:
> 
>> While the patch itself is correct, this kind-of changes the behavior on
>> already upstreamed platforms. Previously the driver expected #PERST to
>> be asserted be external means (or default power-up state) and only takes
>> care of de-asserting the #PERST line.
>>
>> There are 2 platforms that will be impacted due to this change
>> 1) arch/arm/boot/dts/am57xx-beagle-x15-common.dist (has an inverter on
>> GPIO line)
>> 2) arch/arm/boot/dts/am571x-idk.dts (directly connected to #PERST)
>>
>> For 1), gpiod_set_value(reset, 0) will assert the PERST line due to the
>> inverter (and GPIO_ACTIVE_LOW)
>> For 2), gpiod_set_value(reset, 0) will assert the PERST line because we
>> have GPIO_ACTIVE_HIGH
> 
> The presence of an inverter makes it necessary to model this the right
> way to get out of the situation.
> 
>> So this patch should have to be accompanied with DT changes (and this
>> patch also breaks old DT compatibility).
> 
> There are ways to deal with this perfectly. It may or may not be worth
> the extra work. But I can show how it is done.
> 
> Make the patch to the driver that assumes driving the gpio descriptor
> to 1 (asserted) has the desired effect.
> 
> In this patch, I would include a hunk that fixes the above device trees,
> so they are correct from this point. This is one of the few cases where
> I think it warrants to fix the driver and the DTS file at the same time,
> but the DTS can also be patched separately because of the described
> solution below:
> 
> To avoid regressions with old device trees, add code to
> drivers/gpio/gpiolib-of.c in function of_gpio_flags_quirks()
> to react to the old incorrect device trees. This is where we
> stockpile OF errors and bug fixes.
> 
> This needs to be pretty elaborate. It begins like this:
> 
> if (IS_ENABLED(CONFIG_PCI) &&
>       (of_machine_is_compatible("ti,am572x-beagle-x15") ||
>       of_machine_is_compatible("ti,am5718-idk")) &&
>       of_node_name_eq(np, "pcie")) {
>       /* ... add code to check and enforce the flags ... */
> }
> 
> You see the idea here. Include this in the patch to make the
> Perfect(TM) solution to this problem both fixing all device trees
> in place and dealing with the old erroneous ones using some
> elaborate code.
> 
> There are plenty of examples on how to detect warn and
> modify flags in of_gpio_flags_quirks() make it clear and add
> some warning prints and comments. Keep me and Bartosz
> in the loop. It'll look fine in the end.

Thanks for thaking the time to explain this in detail!

As I volounteered to write the patch, and since I'm lazy, I was going to
vote for the former solution. But to be honest the latter has some good
reason to exist as it handles the case of kernel upgrade without DT
upgrade...

My bad, this two-liner patch is growing so big. :-)
diff mbox series

Patch

diff --git a/drivers/pci/controller/dwc/pci-dra7xx.c b/drivers/pci/controller/dwc/pci-dra7xx.c
index cb5d4c245ff6..11f392b7a9a2 100644
--- a/drivers/pci/controller/dwc/pci-dra7xx.c
+++ b/drivers/pci/controller/dwc/pci-dra7xx.c
@@ -801,6 +801,8 @@  static int dra7xx_pcie_probe(struct platform_device *pdev)
 		dev_err(&pdev->dev, "gpio request failed, ret %d\n", ret);
 		goto err_gpio;
 	}
+	usleep_range(1000, 2000);
+	gpiod_set_value(reset, 0);
 
 	reg = dra7xx_pcie_readl(dra7xx, PCIECTRL_DRA7XX_CONF_DEVICE_CMD);
 	reg &= ~LTSSM_EN;