diff mbox series

ARM: dts: bcm283x: increase dwc2's RX FIFO size

Message ID e9e7d070-593c-122f-3a5c-2435bb147ab2@ivitera.com (mailing list archive)
State Accepted
Commit 278407a53c3b33fb820332c4d39eb39316c3879a
Headers show
Series ARM: dts: bcm283x: increase dwc2's RX FIFO size | expand

Commit Message

Pavel Hofman Nov. 13, 2020, 7:18 a.m. UTC
The previous version of the dwc2 overlay set the RX FIFO size to
256 4-byte words. This is not enough for 1024 bytes of the largest
isochronous high speed packet allowed, because it doesn't take into
account extra space needed by dwc2.

RX FIFO's size is calculated based on the following (in 4byte words):
- 13 locations for SETUP packets
  5*n + 8 for Slave and Buffer DMA mode where n is number of control
  endpoints which is 1 on the bcm283x core

- 1 location for Global OUT NAK

- 2 * 257 locations for status information and the received packet.
  Typically two spaces are recommended so that when the previous packet
  is being transferred to AHB, the USB can receive the subsequent
  packet.

- 10 * 1 location for transfer complete status for last packet of each
  endpoint. The bcm283x core has 5 IN and 5 OUT EPs

- 10 * 1 additional location for EPDisable status for each endpoint

- 5 * 2 additional locations are recommended for each OUT endpoint

Total is 558 locations.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
---
 arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi        | 2 +-
 arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Comments

Nicolas Saenz Julienne Nov. 20, 2020, 4:44 p.m. UTC | #1
On Fri, 2020-11-13 at 08:18 +0100, Pavel Hofman wrote:
> The previous version of the dwc2 overlay set the RX FIFO size to
> 256 4-byte words. This is not enough for 1024 bytes of the largest
> isochronous high speed packet allowed, because it doesn't take into
> account extra space needed by dwc2.
> 
> RX FIFO's size is calculated based on the following (in 4byte words):
> - 13 locations for SETUP packets
>   5*n + 8 for Slave and Buffer DMA mode where n is number of control
>   endpoints which is 1 on the bcm283x core
> 
> - 1 location for Global OUT NAK
> 
> - 2 * 257 locations for status information and the received packet.
>   Typically two spaces are recommended so that when the previous packet
>   is being transferred to AHB, the USB can receive the subsequent
>   packet.
> 
> - 10 * 1 location for transfer complete status for last packet of each
>   endpoint. The bcm283x core has 5 IN and 5 OUT EPs
> 
> - 10 * 1 additional location for EPDisable status for each endpoint
> 
> - 5 * 2 additional locations are recommended for each OUT endpoint
> 
> Total is 558 locations.
> 
> Signed-off-by: Phil Elwell <phil@raspberrypi.com>
> Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
> ---

Applied for next. Thanks!

Regards,
Nicolas
Stefan Wahren May 26, 2021, 5:12 p.m. UTC | #2
Hi Pavel,

Am 13.11.20 um 08:18 schrieb Pavel Hofman:
> The previous version of the dwc2 overlay set the RX FIFO size to
> 256 4-byte words. This is not enough for 1024 bytes of the largest
> isochronous high speed packet allowed, because it doesn't take into
> account extra space needed by dwc2.
>
> RX FIFO's size is calculated based on the following (in 4byte words):
> - 13 locations for SETUP packets
>   5*n + 8 for Slave and Buffer DMA mode where n is number of control
>   endpoints which is 1 on the bcm283x core
>
> - 1 location for Global OUT NAK
>
> - 2 * 257 locations for status information and the received packet.
>   Typically two spaces are recommended so that when the previous packet
>   is being transferred to AHB, the USB can receive the subsequent
>   packet.
>
> - 10 * 1 location for transfer complete status for last packet of each
>   endpoint. The bcm283x core has 5 IN and 5 OUT EPs
>
> - 10 * 1 additional location for EPDisable status for each endpoint
>
> - 5 * 2 additional locations are recommended for each OUT endpoint
>
> Total is 558 locations.
>
> Signed-off-by: Phil Elwell <phil@raspberrypi.com>
> Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
> ---
>  arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi        | 2 +-
>  arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
> b/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
> index e2fd961..20322de 100644
> --- a/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
> +++ b/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
> @@ -1,7 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0
>  &usb {
>  	dr_mode = "otg";
> -	g-rx-fifo-size = <256>;
> +	g-rx-fifo-size = <558>;
>  	g-np-tx-fifo-size = <32>;
>  	/*
>  	 * According to dwc2 the sum of all device EP
> diff --git a/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
> b/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
> index 0ff0e9e..1409d1b 100644
> --- a/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
> +++ b/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
> @@ -1,7 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0
>  &usb {
>  	dr_mode = "peripheral";
> -	g-rx-fifo-size = <256>;
> +	g-rx-fifo-size = <558>;

sorry for being late at the party, but this change introduce a
regression on Raspberry Pi 4 B:

dwc2 fe980000.usb: dwc2_check_param_tx_fifo_sizes: Invalid parameter
g-tx-fifo-size, setting to default average

I known you didn't change the tx fifo size, but there are complex
constrains regarding the total fifo size.

Are you able to test this with a mainline kernel (not Raspberry Pi
kernel) and send a fix for this?

>  	g-np-tx-fifo-size = <32>;
>  	g-tx-fifo-size = <256 256 512 512 512 768 768>;
>  };
Pavel Hofman May 27, 2021, 1:17 p.m. UTC | #3
Hi Stefan,

Dne 26. 05. 21 v 19:12 Stefan Wahren napsal(a):
> 
> Am 13.11.20 um 08:18 schrieb Pavel Hofman:
>> The previous version of the dwc2 overlay set the RX FIFO size to
>> 256 4-byte words. This is not enough for 1024 bytes of the largest
>> isochronous high speed packet allowed, because it doesn't take into
>> account extra space needed by dwc2.
>>
>> RX FIFO's size is calculated based on the following (in 4byte words):
>> - 13 locations for SETUP packets
>>   5*n + 8 for Slave and Buffer DMA mode where n is number of control
>>   endpoints which is 1 on the bcm283x core
>>
>> - 1 location for Global OUT NAK
>>
>> - 2 * 257 locations for status information and the received packet.
>>   Typically two spaces are recommended so that when the previous packet
>>   is being transferred to AHB, the USB can receive the subsequent
>>   packet.
>>
>> - 10 * 1 location for transfer complete status for last packet of each
>>   endpoint. The bcm283x core has 5 IN and 5 OUT EPs
>>
>> - 10 * 1 additional location for EPDisable status for each endpoint
>>
>> - 5 * 2 additional locations are recommended for each OUT endpoint
>>
>> Total is 558 locations.
>>
>> Signed-off-by: Phil Elwell <phil@raspberrypi.com>
>> Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
>> ---
>>  arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi        | 2 +-
>>  arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi | 2 +-
>>  2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
>> b/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
>> index e2fd961..20322de 100644
>> --- a/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
>> +++ b/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
>> @@ -1,7 +1,7 @@
>>  // SPDX-License-Identifier: GPL-2.0
>>  &usb {
>>  	dr_mode = "otg";
>> -	g-rx-fifo-size = <256>;
>> +	g-rx-fifo-size = <558>;
>>  	g-np-tx-fifo-size = <32>;
>>  	/*
>>  	 * According to dwc2 the sum of all device EP
>> diff --git a/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
>> b/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
>> index 0ff0e9e..1409d1b 100644
>> --- a/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
>> +++ b/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
>> @@ -1,7 +1,7 @@
>>  // SPDX-License-Identifier: GPL-2.0
>>  &usb {
>>  	dr_mode = "peripheral";
>> -	g-rx-fifo-size = <256>;
>> +	g-rx-fifo-size = <558>;
> 
> sorry for being late at the party, but this change introduce a
> regression on Raspberry Pi 4 B:
> 
> dwc2 fe980000.usb: dwc2_check_param_tx_fifo_sizes: Invalid parameter
> g-tx-fifo-size, setting to default average
> 
> I known you didn't change the tx fifo size, but there are complex
> constrains regarding the total fifo size.
> 
> Are you able to test this with a mainline kernel (not Raspberry Pi
> kernel) and send a fix for this?
> 
>>  	g-np-tx-fifo-size = <32>;
>>  	g-tx-fifo-size = <256 256 512 512 512 768 768>;
>>  };
> 

I think I see the problem.

IIUC the calculations and checks, all g-tx-fifo-size values +
g-rx-fifo-size + g-np-tx-fifo-size must not exceed total_fifo_size. My
RPi4 reports the total_fifo_size as 4080 (in
/sys/kernel/debug/usb/fe980000.usb/hw_params).

Linux mainline
https://github.com/torvalds/linux/search?p=3&q=g-tx-fifo-size :

The increase in value of g-rx-fifo-size exceeds the limit for the DTSI
files we patched:

Both bcm283x-rpi-usb-peripheral.dtsi and bcm283x-rpi-usb-otg.dtsi:
558 + 32 + 256 + 256 + 512 + 512 + 512 + 768 + 768 = 4174 > 4080

while the sum with the previous value of 256 reached just 3872 < 4080.


The raspberrypi repo
https://github.com/raspberrypi/linux/search?q=g-tx-fifo-size :

It has a different mix of the DTSI files
dwc2-overlay.dts
upstream-overlay.dts
upstream-pi4-overlay.dts

all of which define
g-tx-fifo-size = <512 512 512 512 512 256 256>;

Here the calculation holds:
558 + 32 + 512 + 512 + 512 + 512 + 512 + 256 + 256 = 3662 < 4080

My RPi4 uses one of these DTSIs, because my
/sys/kernel/debug/usb/fe980000.usb/params says:

g_rx_fifo_size                : 558
g_np_tx_fifo_size             : 32
g_tx_fifo_size[0]             : 0
g_tx_fifo_size[1]             : 512
g_tx_fifo_size[2]             : 512
g_tx_fifo_size[3]             : 512
g_tx_fifo_size[4]             : 512
g_tx_fifo_size[5]             : 512
g_tx_fifo_size[6]             : 256
g_tx_fifo_size[7]             : 256


IIUC the tx_fifo values in bcm283x-rpi-usb-peripheral.dtsi and
bcm283x-rpi-usb-otg.dtsi files can be lowered to the values used and
tested (at least by me) in the RPi repo. But this is outside of my
knowledge, honestly I do not know what is the most appropriate
distribution of the remaining fifo space among the g_tx_fifo buffers.
Please can the RPi developers (Phil?) suggest a fix?

Thanks a lot to Stefan and to everyone involved.

Pavel.
Stefan Wahren May 27, 2021, 1:47 p.m. UTC | #4
Hi Pavel,

fix Nicolas address. Sorry about this.

Am 27.05.21 um 15:17 schrieb Pavel Hofman:
> Hi Stefan,
>
> Dne 26. 05. 21 v 19:12 Stefan Wahren napsal(a):
>> Am 13.11.20 um 08:18 schrieb Pavel Hofman:
>>> The previous version of the dwc2 overlay set the RX FIFO size to
>>> 256 4-byte words. This is not enough for 1024 bytes of the largest
>>> isochronous high speed packet allowed, because it doesn't take into
>>> account extra space needed by dwc2.
>>>
>>> RX FIFO's size is calculated based on the following (in 4byte words):
>>> - 13 locations for SETUP packets
>>>   5*n + 8 for Slave and Buffer DMA mode where n is number of control
>>>   endpoints which is 1 on the bcm283x core
>>>
>>> - 1 location for Global OUT NAK
>>>
>>> - 2 * 257 locations for status information and the received packet.
>>>   Typically two spaces are recommended so that when the previous packet
>>>   is being transferred to AHB, the USB can receive the subsequent
>>>   packet.
>>>
>>> - 10 * 1 location for transfer complete status for last packet of each
>>>   endpoint. The bcm283x core has 5 IN and 5 OUT EPs
>>>
>>> - 10 * 1 additional location for EPDisable status for each endpoint
>>>
>>> - 5 * 2 additional locations are recommended for each OUT endpoint
>>>
>>> Total is 558 locations.
>>>
>>> Signed-off-by: Phil Elwell <phil@raspberrypi.com>
>>> Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
>>> ---
>>>  arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi        | 2 +-
>>>  arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi | 2 +-
>>>  2 files changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
>>> b/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
>>> index e2fd961..20322de 100644
>>> --- a/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
>>> +++ b/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
>>> @@ -1,7 +1,7 @@
>>>  // SPDX-License-Identifier: GPL-2.0
>>>  &usb {
>>>  	dr_mode = "otg";
>>> -	g-rx-fifo-size = <256>;
>>> +	g-rx-fifo-size = <558>;
>>>  	g-np-tx-fifo-size = <32>;
>>>  	/*
>>>  	 * According to dwc2 the sum of all device EP
>>> diff --git a/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
>>> b/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
>>> index 0ff0e9e..1409d1b 100644
>>> --- a/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
>>> +++ b/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
>>> @@ -1,7 +1,7 @@
>>>  // SPDX-License-Identifier: GPL-2.0
>>>  &usb {
>>>  	dr_mode = "peripheral";
>>> -	g-rx-fifo-size = <256>;
>>> +	g-rx-fifo-size = <558>;
>> sorry for being late at the party, but this change introduce a
>> regression on Raspberry Pi 4 B:
>>
>> dwc2 fe980000.usb: dwc2_check_param_tx_fifo_sizes: Invalid parameter
>> g-tx-fifo-size, setting to default average
>>
>> I known you didn't change the tx fifo size, but there are complex
>> constrains regarding the total fifo size.
>>
>> Are you able to test this with a mainline kernel (not Raspberry Pi
>> kernel) and send a fix for this?
>>
>>>  	g-np-tx-fifo-size = <32>;
>>>  	g-tx-fifo-size = <256 256 512 512 512 768 768>;
>>>  };
> I think I see the problem.
>
> IIUC the calculations and checks, all g-tx-fifo-size values +
> g-rx-fifo-size + g-np-tx-fifo-size must not exceed total_fifo_size. My
> RPi4 reports the total_fifo_size as 4080 (in
> /sys/kernel/debug/usb/fe980000.usb/hw_params).
>
> Linux mainline
> https://github.com/torvalds/linux/search?p=3&q=g-tx-fifo-size :
>
> The increase in value of g-rx-fifo-size exceeds the limit for the DTSI
> files we patched:
>
> Both bcm283x-rpi-usb-peripheral.dtsi and bcm283x-rpi-usb-otg.dtsi:
> 558 + 32 + 256 + 256 + 512 + 512 + 512 + 768 + 768 = 4174 > 4080
>
> while the sum with the previous value of 256 reached just 3872 < 4080.
>
>
> The raspberrypi repo
> https://github.com/raspberrypi/linux/search?q=g-tx-fifo-size :
>
> It has a different mix of the DTSI files
> dwc2-overlay.dts
> upstream-overlay.dts
> upstream-pi4-overlay.dts
yes these overlay files are vendor specific and doesn't exist in
mainline. The upstream*dts were intended to "simulate" mainline
behavior, but unfortunately differ in this case.
>
> all of which define
> g-tx-fifo-size = <512 512 512 512 512 256 256>;
>
> Here the calculation holds:
> 558 + 32 + 512 + 512 + 512 + 512 + 512 + 256 + 256 = 3662 < 4080
>
> My RPi4 uses one of these DTSIs, because my
> /sys/kernel/debug/usb/fe980000.usb/params says:
>
> g_rx_fifo_size                : 558
> g_np_tx_fifo_size             : 32
> g_tx_fifo_size[0]             : 0
> g_tx_fifo_size[1]             : 512
> g_tx_fifo_size[2]             : 512
> g_tx_fifo_size[3]             : 512
> g_tx_fifo_size[4]             : 512
> g_tx_fifo_size[5]             : 512
> g_tx_fifo_size[6]             : 256
> g_tx_fifo_size[7]             : 256
>
>
> IIUC the tx_fifo values in bcm283x-rpi-usb-peripheral.dtsi and
> bcm283x-rpi-usb-otg.dtsi files can be lowered to the values used and
> tested (at least by me) in the RPi repo. But this is outside of my
> knowledge, honestly I do not know what is the most appropriate
> distribution of the remaining fifo space among the g_tx_fifo buffers.
> Please can the RPi developers (Phil?) suggest a fix?

As author of the mainline bcm283x-rpi-usb-otg.dtsi i was trying to
optimize the fifo sizes for EP 6 and 7. But i don't remember why. So my
suggestion for a fix would be:

g-tx-fifo-size = <256 256 256 512 512 768 768>;

But i'm also unsure about the values.

>
> Thanks a lot to Stefan and to everyone involved.
>
> Pavel.
Pavel Hofman May 28, 2021, 8:59 a.m. UTC | #5
Dne 27. 05. 21 v 15:47 Stefan Wahren napsal(a):

>> I think I see the problem.
>>
>> IIUC the calculations and checks, all g-tx-fifo-size values +
>> g-rx-fifo-size + g-np-tx-fifo-size must not exceed total_fifo_size. My
>> RPi4 reports the total_fifo_size as 4080 (in
>> /sys/kernel/debug/usb/fe980000.usb/hw_params).
>>
>> Linux mainline
>> https://github.com/torvalds/linux/search?p=3&q=g-tx-fifo-size :
>>
>> The increase in value of g-rx-fifo-size exceeds the limit for the DTSI
>> files we patched:
>>
>> Both bcm283x-rpi-usb-peripheral.dtsi and bcm283x-rpi-usb-otg.dtsi:
>> 558 + 32 + 256 + 256 + 512 + 512 + 512 + 768 + 768 = 4174 > 4080
>>
>> while the sum with the previous value of 256 reached just 3872 < 4080.
>>
>>
>> The raspberrypi repo
>> https://github.com/raspberrypi/linux/search?q=g-tx-fifo-size :
>>
>> It has a different mix of the DTSI files
>> dwc2-overlay.dts
>> upstream-overlay.dts
>> upstream-pi4-overlay.dts
> yes these overlay files are vendor specific and doesn't exist in
> mainline. The upstream*dts were intended to "simulate" mainline
> behavior, but unfortunately differ in this case.
>>
>> all of which define
>> g-tx-fifo-size = <512 512 512 512 512 256 256>;
>>
>> Here the calculation holds:
>> 558 + 32 + 512 + 512 + 512 + 512 + 512 + 256 + 256 = 3662 < 4080
>>
>> My RPi4 uses one of these DTSIs, because my
>> /sys/kernel/debug/usb/fe980000.usb/params says:
>>
>> g_rx_fifo_size                : 558
>> g_np_tx_fifo_size             : 32
>> g_tx_fifo_size[0]             : 0
>> g_tx_fifo_size[1]             : 512
>> g_tx_fifo_size[2]             : 512
>> g_tx_fifo_size[3]             : 512
>> g_tx_fifo_size[4]             : 512
>> g_tx_fifo_size[5]             : 512
>> g_tx_fifo_size[6]             : 256
>> g_tx_fifo_size[7]             : 256
>>
>>
>> IIUC the tx_fifo values in bcm283x-rpi-usb-peripheral.dtsi and
>> bcm283x-rpi-usb-otg.dtsi files can be lowered to the values used and
>> tested (at least by me) in the RPi repo. But this is outside of my
>> knowledge, honestly I do not know what is the most appropriate
>> distribution of the remaining fifo space among the g_tx_fifo buffers.
>> Please can the RPi developers (Phil?) suggest a fix?
> 
> As author of the mainline bcm283x-rpi-usb-otg.dtsi i was trying to
> optimize the fifo sizes for EP 6 and 7. But i don't remember why. So my
> suggestion for a fix would be:
> 
> g-tx-fifo-size = <256 256 256 512 512 768 768>;
> 
> But i'm also unsure about the values.
> 

IIUC this code
https://github.com/torvalds/linux/blob/master/drivers/usb/dwc2/gadget.c#L4091
optimizes the FIFO assignment to endpoints. From that I would conclude
that correct values are specific for each use-case configuration of
endpoints. Maybe a varied selection (256, 512, 768) is more convenient
than just 256 and 512. I really do not know what use cases need what TX
fifo values.


BTW perhaps this comment
https://github.com/torvalds/linux/blob/master/drivers/usb/dwc2/gadget.c#L327
is a bit misleading since that code does not do the assignment, just
stores the size distribution to the DPTXFSIZN register, IIUC.

Best regards,

Pavel.
Pavel Hofman Aug. 6, 2021, 1:03 p.m. UTC | #6
Dne 28. 05. 21 v 10:59 Pavel Hofman napsal(a):
> Dne 27. 05. 21 v 15:47 Stefan Wahren napsal(a):
> 
>>> I think I see the problem.
>>>
>>> IIUC the calculations and checks, all g-tx-fifo-size values +
>>> g-rx-fifo-size + g-np-tx-fifo-size must not exceed total_fifo_size. My
>>> RPi4 reports the total_fifo_size as 4080 (in
>>> /sys/kernel/debug/usb/fe980000.usb/hw_params).
>>>
>>> Linux mainline
>>> https://github.com/torvalds/linux/search?p=3&q=g-tx-fifo-size :
>>>
>>> The increase in value of g-rx-fifo-size exceeds the limit for the DTSI
>>> files we patched:
>>>
>>> Both bcm283x-rpi-usb-peripheral.dtsi and bcm283x-rpi-usb-otg.dtsi:
>>> 558 + 32 + 256 + 256 + 512 + 512 + 512 + 768 + 768 = 4174 > 4080
>>>
>>> while the sum with the previous value of 256 reached just 3872 < 4080.
>>>
>>>
>>> The raspberrypi repo
>>> https://github.com/raspberrypi/linux/search?q=g-tx-fifo-size :
>>>
>>> It has a different mix of the DTSI files
>>> dwc2-overlay.dts
>>> upstream-overlay.dts
>>> upstream-pi4-overlay.dts
>> yes these overlay files are vendor specific and doesn't exist in
>> mainline. The upstream*dts were intended to "simulate" mainline
>> behavior, but unfortunately differ in this case.
>>>
>>> all of which define
>>> g-tx-fifo-size = <512 512 512 512 512 256 256>;
>>>
>>> Here the calculation holds:
>>> 558 + 32 + 512 + 512 + 512 + 512 + 512 + 256 + 256 = 3662 < 4080
>>>
>>> My RPi4 uses one of these DTSIs, because my
>>> /sys/kernel/debug/usb/fe980000.usb/params says:
>>>
>>> g_rx_fifo_size                : 558
>>> g_np_tx_fifo_size             : 32
>>> g_tx_fifo_size[0]             : 0
>>> g_tx_fifo_size[1]             : 512
>>> g_tx_fifo_size[2]             : 512
>>> g_tx_fifo_size[3]             : 512
>>> g_tx_fifo_size[4]             : 512
>>> g_tx_fifo_size[5]             : 512
>>> g_tx_fifo_size[6]             : 256
>>> g_tx_fifo_size[7]             : 256
>>>
>>>
>>> IIUC the tx_fifo values in bcm283x-rpi-usb-peripheral.dtsi and
>>> bcm283x-rpi-usb-otg.dtsi files can be lowered to the values used and
>>> tested (at least by me) in the RPi repo. But this is outside of my
>>> knowledge, honestly I do not know what is the most appropriate
>>> distribution of the remaining fifo space among the g_tx_fifo buffers.
>>> Please can the RPi developers (Phil?) suggest a fix?
>>
>> As author of the mainline bcm283x-rpi-usb-otg.dtsi i was trying to
>> optimize the fifo sizes for EP 6 and 7. But i don't remember why. So my
>> suggestion for a fix would be:
>>
>> g-tx-fifo-size = <256 256 256 512 512 768 768>;
>>
>> But i'm also unsure about the values.
>>
> 
> IIUC this code
> https://github.com/torvalds/linux/blob/master/drivers/usb/dwc2/gadget.c#L4091
> optimizes the FIFO assignment to endpoints. From that I would conclude
> that correct values are specific for each use-case configuration of
> endpoints. Maybe a varied selection (256, 512, 768) is more convenient
> than just 256 and 512. I really do not know what use cases need what TX
> fifo values.
> 

My patch raising  g-rx-fifo-size = 558 has been reverted back to 
g-rx-fifo-size = 256 in upstream. 256 is clearly a wrong value. 558 is 
enough for 2 packets per microframe. How about raising the value in the 
mainline DTS files to 301 instead which will correctly work with 1 
packet per microframe (the most common scenario) and comply with the 
4080 limit of the RX + all TXs sum of the TX configs in the mainline?

Thanks for considering.

Best regards,

Pavel.
Stefan Wahren Aug. 6, 2021, 2:08 p.m. UTC | #7
Hi Pavel,

Am 06.08.21 um 15:03 schrieb Pavel Hofman:
> Dne 28. 05. 21 v 10:59 Pavel Hofman napsal(a):
>> Dne 27. 05. 21 v 15:47 Stefan Wahren napsal(a):
>>
>>>> I think I see the problem.
>>>>
>>>> IIUC the calculations and checks, all g-tx-fifo-size values +
>>>> g-rx-fifo-size + g-np-tx-fifo-size must not exceed total_fifo_size. My
>>>> RPi4 reports the total_fifo_size as 4080 (in
>>>> /sys/kernel/debug/usb/fe980000.usb/hw_params).
>>>>
>>>> Linux mainline
>>>> https://github.com/torvalds/linux/search?p=3&q=g-tx-fifo-size :
>>>>
>>>> The increase in value of g-rx-fifo-size exceeds the limit for the DTSI
>>>> files we patched:
>>>>
>>>> Both bcm283x-rpi-usb-peripheral.dtsi and bcm283x-rpi-usb-otg.dtsi:
>>>> 558 + 32 + 256 + 256 + 512 + 512 + 512 + 768 + 768 = 4174 > 4080
>>>>
>>>> while the sum with the previous value of 256 reached just 3872 < 4080.
>>>>
>>>>
>>>> The raspberrypi repo
>>>> https://github.com/raspberrypi/linux/search?q=g-tx-fifo-size :
>>>>
>>>> It has a different mix of the DTSI files
>>>> dwc2-overlay.dts
>>>> upstream-overlay.dts
>>>> upstream-pi4-overlay.dts
>>> yes these overlay files are vendor specific and doesn't exist in
>>> mainline. The upstream*dts were intended to "simulate" mainline
>>> behavior, but unfortunately differ in this case.
>>>>
>>>> all of which define
>>>> g-tx-fifo-size = <512 512 512 512 512 256 256>;
>>>>
>>>> Here the calculation holds:
>>>> 558 + 32 + 512 + 512 + 512 + 512 + 512 + 256 + 256 = 3662 < 4080
>>>>
>>>> My RPi4 uses one of these DTSIs, because my
>>>> /sys/kernel/debug/usb/fe980000.usb/params says:
>>>>
>>>> g_rx_fifo_size                : 558
>>>> g_np_tx_fifo_size             : 32
>>>> g_tx_fifo_size[0]             : 0
>>>> g_tx_fifo_size[1]             : 512
>>>> g_tx_fifo_size[2]             : 512
>>>> g_tx_fifo_size[3]             : 512
>>>> g_tx_fifo_size[4]             : 512
>>>> g_tx_fifo_size[5]             : 512
>>>> g_tx_fifo_size[6]             : 256
>>>> g_tx_fifo_size[7]             : 256
>>>>
>>>>
>>>> IIUC the tx_fifo values in bcm283x-rpi-usb-peripheral.dtsi and
>>>> bcm283x-rpi-usb-otg.dtsi files can be lowered to the values used and
>>>> tested (at least by me) in the RPi repo. But this is outside of my
>>>> knowledge, honestly I do not know what is the most appropriate
>>>> distribution of the remaining fifo space among the g_tx_fifo buffers.
>>>> Please can the RPi developers (Phil?) suggest a fix?
>>>
>>> As author of the mainline bcm283x-rpi-usb-otg.dtsi i was trying to
>>> optimize the fifo sizes for EP 6 and 7. But i don't remember why. So my
>>> suggestion for a fix would be:
>>>
>>> g-tx-fifo-size = <256 256 256 512 512 768 768>;
>>>
>>> But i'm also unsure about the values.
>>>
>>
>> IIUC this code
>> https://github.com/torvalds/linux/blob/master/drivers/usb/dwc2/gadget.c#L4091
>>
>> optimizes the FIFO assignment to endpoints. From that I would conclude
>> that correct values are specific for each use-case configuration of
>> endpoints. Maybe a varied selection (256, 512, 768) is more convenient
>> than just 256 and 512. I really do not know what use cases need what TX
>> fifo values.
>>
>
> My patch raising  g-rx-fifo-size = 558 has been reverted back to
> g-rx-fifo-size = 256 in upstream. 256 is clearly a wrong value. 558 is
> enough for 2 packets per microframe. How about raising the value in
> the mainline DTS files to 301 instead which will correctly work with 1
> packet per microframe (the most common scenario) and comply with the
> 4080 limit of the RX + all TXs sum of the TX configs in the mainline?

thank for your feedback. It has been reverted because the last patch
broke USB completely on Raspberry Pi Zero and the only explanation for
me is it has never been tested. The workflow is that the submitter is
responsibly for testing. In case this is not possible the patch must be
tagged with RFT or at least it must be mentioned in the commit message.

In case you want to have a different value, please feel free to send a
patch, but please test it against a mainline kernel before. In case you
have problems with it, feel free to ask.

Sorry, in case this sounds grumpy but it's very annoying to hunt down
especially avoidable regressions with every kernel release. This wastes
other developers time to get their patches upstream.

Best regards
Stefan

>
> Thanks for considering.
>
> Best regards,
>
> Pavel.
>
>
>
Pavel Hofman Aug. 6, 2021, 2:46 p.m. UTC | #8
Hi Stefan,

Dne 06. 08. 21 v 16:08 Stefan Wahren napsal(a):
> Hi Pavel,
> 
> Am 06.08.21 um 15:03 schrieb Pavel Hofman:
>> Dne 28. 05. 21 v 10:59 Pavel Hofman napsal(a):
>>> Dne 27. 05. 21 v 15:47 Stefan Wahren napsal(a):
>>>
>>>>> I think I see the problem.
>>>>>
>>>>> IIUC the calculations and checks, all g-tx-fifo-size values +
>>>>> g-rx-fifo-size + g-np-tx-fifo-size must not exceed total_fifo_size. My
>>>>> RPi4 reports the total_fifo_size as 4080 (in
>>>>> /sys/kernel/debug/usb/fe980000.usb/hw_params).
>>>>>
>>>>> Linux mainline
>>>>> https://github.com/torvalds/linux/search?p=3&q=g-tx-fifo-size :
>>>>>
>>>>> The increase in value of g-rx-fifo-size exceeds the limit for the DTSI
>>>>> files we patched:
>>>>>
>>>>> Both bcm283x-rpi-usb-peripheral.dtsi and bcm283x-rpi-usb-otg.dtsi:
>>>>> 558 + 32 + 256 + 256 + 512 + 512 + 512 + 768 + 768 = 4174 > 4080
>>>>>
>>>>> while the sum with the previous value of 256 reached just 3872 < 4080.
>>>>>
>>>>>
>>>>> The raspberrypi repo
>>>>> https://github.com/raspberrypi/linux/search?q=g-tx-fifo-size :
>>>>>
>>>>> It has a different mix of the DTSI files
>>>>> dwc2-overlay.dts
>>>>> upstream-overlay.dts
>>>>> upstream-pi4-overlay.dts
>>>> yes these overlay files are vendor specific and doesn't exist in
>>>> mainline. The upstream*dts were intended to "simulate" mainline
>>>> behavior, but unfortunately differ in this case.
>>>>>
>>>>> all of which define
>>>>> g-tx-fifo-size = <512 512 512 512 512 256 256>;
>>>>>
>>>>> Here the calculation holds:
>>>>> 558 + 32 + 512 + 512 + 512 + 512 + 512 + 256 + 256 = 3662 < 4080
>>>>>
>>>>> My RPi4 uses one of these DTSIs, because my
>>>>> /sys/kernel/debug/usb/fe980000.usb/params says:
>>>>>
>>>>> g_rx_fifo_size                : 558
>>>>> g_np_tx_fifo_size             : 32
>>>>> g_tx_fifo_size[0]             : 0
>>>>> g_tx_fifo_size[1]             : 512
>>>>> g_tx_fifo_size[2]             : 512
>>>>> g_tx_fifo_size[3]             : 512
>>>>> g_tx_fifo_size[4]             : 512
>>>>> g_tx_fifo_size[5]             : 512
>>>>> g_tx_fifo_size[6]             : 256
>>>>> g_tx_fifo_size[7]             : 256
>>>>>
>>>>>
>>>>> IIUC the tx_fifo values in bcm283x-rpi-usb-peripheral.dtsi and
>>>>> bcm283x-rpi-usb-otg.dtsi files can be lowered to the values used and
>>>>> tested (at least by me) in the RPi repo. But this is outside of my
>>>>> knowledge, honestly I do not know what is the most appropriate
>>>>> distribution of the remaining fifo space among the g_tx_fifo buffers.
>>>>> Please can the RPi developers (Phil?) suggest a fix?
>>>>
>>>> As author of the mainline bcm283x-rpi-usb-otg.dtsi i was trying to
>>>> optimize the fifo sizes for EP 6 and 7. But i don't remember why. So my
>>>> suggestion for a fix would be:
>>>>
>>>> g-tx-fifo-size = <256 256 256 512 512 768 768>;
>>>>
>>>> But i'm also unsure about the values.
>>>>
>>>
>>> IIUC this code
>>> https://github.com/torvalds/linux/blob/master/drivers/usb/dwc2/gadget.c#L4091
>>>
>>> optimizes the FIFO assignment to endpoints. From that I would conclude
>>> that correct values are specific for each use-case configuration of
>>> endpoints. Maybe a varied selection (256, 512, 768) is more convenient
>>> than just 256 and 512. I really do not know what use cases need what TX
>>> fifo values.
>>>
>>
>> My patch raising  g-rx-fifo-size = 558 has been reverted back to
>> g-rx-fifo-size = 256 in upstream. 256 is clearly a wrong value. 558 is
>> enough for 2 packets per microframe. How about raising the value in
>> the mainline DTS files to 301 instead which will correctly work with 1
>> packet per microframe (the most common scenario) and comply with the
>> 4080 limit of the RX + all TXs sum of the TX configs in the mainline?
> 
> thank for your feedback. It has been reverted because the last patch
> broke USB completely on Raspberry Pi Zero and the only explanation for
> me is it has never been tested. The workflow is that the submitter is
> responsibly for testing. In case this is not possible the patch must be
> tagged with RFT or at least it must be mentioned in the commit message.
> 
> In case you want to have a different value, please feel free to send a
> patch, but please test it against a mainline kernel before. In case you
> have problems with it, feel free to ask.
> 
> Sorry, in case this sounds grumpy but it's very annoying to hunt down
> especially avoidable regressions with every kernel release. This wastes
> other developers time to get their patches upstream.
> 

I understand your points. I really did not test the patch with mainline 
combination of the TX values, sorry for that. I have no problem with the 
revert at all, just pointing out that the value of 256 is incorrect. It 
took a number of hours with patient help of Minas to find the culprit of 
the dwc2 gadget dropping audio samples with packet sizes over 980 bytes.

However, even if I did test and changed the TX values on my RPi4 
accordingly, I would not have been able to test on RPi Zero and the 
other RPi models. The questions are:

* Why did your TX values, changed to comply with the 4080 limit, break 
RPi Zero, what are the rules apart of the max sum of 4080?

* Why does mainline config have different RX and TX sizes than the 
RPi-vendor specific config (which I happen/ed to use)?

Maybe these questions should be resolved, allowing for safer developing 
the gadget on the RPi hardware.

Best regards,

Pavel.
Stefan Wahren Aug. 6, 2021, 3:57 p.m. UTC | #9
Hi Pavel,

Am 06.08.21 um 16:46 schrieb Pavel Hofman:
> Hi Stefan,
>
> Dne 06. 08. 21 v 16:08 Stefan Wahren napsal(a):
>> Hi Pavel,
>>
>> Am 06.08.21 um 15:03 schrieb Pavel Hofman:
>>> Dne 28. 05. 21 v 10:59 Pavel Hofman napsal(a):
>>>> Dne 27. 05. 21 v 15:47 Stefan Wahren napsal(a):
>>>>
>>>>>> I think I see the problem.
>>>>>>
>>>>>> IIUC the calculations and checks, all g-tx-fifo-size values +
>>>>>> g-rx-fifo-size + g-np-tx-fifo-size must not exceed
>>>>>> total_fifo_size. My
>>>>>> RPi4 reports the total_fifo_size as 4080 (in
>>>>>> /sys/kernel/debug/usb/fe980000.usb/hw_params).
>>>>>>
>>>>>> Linux mainline
>>>>>> https://github.com/torvalds/linux/search?p=3&q=g-tx-fifo-size :
>>>>>>
>>>>>> The increase in value of g-rx-fifo-size exceeds the limit for the
>>>>>> DTSI
>>>>>> files we patched:
>>>>>>
>>>>>> Both bcm283x-rpi-usb-peripheral.dtsi and bcm283x-rpi-usb-otg.dtsi:
>>>>>> 558 + 32 + 256 + 256 + 512 + 512 + 512 + 768 + 768 = 4174 > 4080
>>>>>>
>>>>>> while the sum with the previous value of 256 reached just 3872 <
>>>>>> 4080.
>>>>>>
>>>>>>
>>>>>> The raspberrypi repo
>>>>>> https://github.com/raspberrypi/linux/search?q=g-tx-fifo-size :
>>>>>>
>>>>>> It has a different mix of the DTSI files
>>>>>> dwc2-overlay.dts
>>>>>> upstream-overlay.dts
>>>>>> upstream-pi4-overlay.dts
>>>>> yes these overlay files are vendor specific and doesn't exist in
>>>>> mainline. The upstream*dts were intended to "simulate" mainline
>>>>> behavior, but unfortunately differ in this case.
>>>>>>
>>>>>> all of which define
>>>>>> g-tx-fifo-size = <512 512 512 512 512 256 256>;
>>>>>>
>>>>>> Here the calculation holds:
>>>>>> 558 + 32 + 512 + 512 + 512 + 512 + 512 + 256 + 256 = 3662 < 4080
>>>>>>
>>>>>> My RPi4 uses one of these DTSIs, because my
>>>>>> /sys/kernel/debug/usb/fe980000.usb/params says:
>>>>>>
>>>>>> g_rx_fifo_size                : 558
>>>>>> g_np_tx_fifo_size             : 32
>>>>>> g_tx_fifo_size[0]             : 0
>>>>>> g_tx_fifo_size[1]             : 512
>>>>>> g_tx_fifo_size[2]             : 512
>>>>>> g_tx_fifo_size[3]             : 512
>>>>>> g_tx_fifo_size[4]             : 512
>>>>>> g_tx_fifo_size[5]             : 512
>>>>>> g_tx_fifo_size[6]             : 256
>>>>>> g_tx_fifo_size[7]             : 256
>>>>>>
>>>>>>
>>>>>> IIUC the tx_fifo values in bcm283x-rpi-usb-peripheral.dtsi and
>>>>>> bcm283x-rpi-usb-otg.dtsi files can be lowered to the values used and
>>>>>> tested (at least by me) in the RPi repo. But this is outside of my
>>>>>> knowledge, honestly I do not know what is the most appropriate
>>>>>> distribution of the remaining fifo space among the g_tx_fifo
>>>>>> buffers.
>>>>>> Please can the RPi developers (Phil?) suggest a fix?
>>>>>
>>>>> As author of the mainline bcm283x-rpi-usb-otg.dtsi i was trying to
>>>>> optimize the fifo sizes for EP 6 and 7. But i don't remember why.
>>>>> So my
>>>>> suggestion for a fix would be:
>>>>>
>>>>> g-tx-fifo-size = <256 256 256 512 512 768 768>;
>>>>>
>>>>> But i'm also unsure about the values.
>>>>>
>>>>
>>>> IIUC this code
>>>> https://github.com/torvalds/linux/blob/master/drivers/usb/dwc2/gadget.c#L4091
>>>>
>>>>
>>>> optimizes the FIFO assignment to endpoints. From that I would conclude
>>>> that correct values are specific for each use-case configuration of
>>>> endpoints. Maybe a varied selection (256, 512, 768) is more convenient
>>>> than just 256 and 512. I really do not know what use cases need
>>>> what TX
>>>> fifo values.
>>>>
>>>
>>> My patch raising  g-rx-fifo-size = 558 has been reverted back to
>>> g-rx-fifo-size = 256 in upstream. 256 is clearly a wrong value. 558 is
>>> enough for 2 packets per microframe. How about raising the value in
>>> the mainline DTS files to 301 instead which will correctly work with 1
>>> packet per microframe (the most common scenario) and comply with the
>>> 4080 limit of the RX + all TXs sum of the TX configs in the mainline?
>>
>> thank for your feedback. It has been reverted because the last patch
>> broke USB completely on Raspberry Pi Zero and the only explanation for
>> me is it has never been tested. The workflow is that the submitter is
>> responsibly for testing. In case this is not possible the patch must be
>> tagged with RFT or at least it must be mentioned in the commit message.
>>
>> In case you want to have a different value, please feel free to send a
>> patch, but please test it against a mainline kernel before. In case you
>> have problems with it, feel free to ask.
>>
>> Sorry, in case this sounds grumpy but it's very annoying to hunt down
>> especially avoidable regressions with every kernel release. This wastes
>> other developers time to get their patches upstream.
>>
>
> I understand your points. I really did not test the patch with
> mainline combination of the TX values, sorry for that. I have no
> problem with the revert at all, just pointing out that the value of
> 256 is incorrect. It took a number of hours with patient help of Minas
> to find the culprit of the dwc2 gadget dropping audio samples with
> packet sizes over 980 bytes.
believe me, i understand this absolutely as the author of the mainline
Raspberry Pi Zero DTS (back in 2017). In the old days there were a lot
of issues in the DWC2. It took until ~ 4.14 to get a proper working USB
host mode.
>
> However, even if I did test and changed the TX values on my RPi4
> accordingly, I would not have been able to test on RPi Zero and the
> other RPi models. 
This doesn't matter. The USB IP is always the same. The mentioned issue
was also on the Raspberry Pi 4, but nobody notices this (using Raspberry
Pi 4 as USB gadget is very special). But for the RPI Zero this issue was
very critical.
> The questions are:
>
> * Why did your TX values, changed to comply with the 4080 limit, break
> RPi Zero, what are the rules apart of the max sum of 4080?
Unfortunately i don't have access to the DWC2 reference manual and the
time to figure out all these constrains.
>
> * Why does mainline config have different RX and TX sizes than the
> RPi-vendor specific config (which I happen/ed to use)?

For my initial version of the DTS i took some working values, i don't
remember exactly. During time the values in the vendor tree changed.
Nobody upstreamed the later changes.

I'm fine with changing all to RPi-vendor specific settings, as long as
it works with OTG, Gadget mode, with and without USB hub, ...

I don't have a strong opinion for these magic numbers. Currently i'm
busy in my spare time with CM4 upstreaming, so not much time for this topic.

I hope this helps.

Best regards
Stefan

>
> Maybe these questions should be resolved, allowing for safer
> developing the gadget on the RPi hardware.
>
> Best regards,
>
> Pavel.
diff mbox series

Patch

diff --git a/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
b/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
index e2fd961..20322de 100644
--- a/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
+++ b/arch/arm/boot/dts/bcm283x-rpi-usb-otg.dtsi
@@ -1,7 +1,7 @@ 
 // SPDX-License-Identifier: GPL-2.0
 &usb {
 	dr_mode = "otg";
-	g-rx-fifo-size = <256>;
+	g-rx-fifo-size = <558>;
 	g-np-tx-fifo-size = <32>;
 	/*
 	 * According to dwc2 the sum of all device EP
diff --git a/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
b/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
index 0ff0e9e..1409d1b 100644
--- a/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
+++ b/arch/arm/boot/dts/bcm283x-rpi-usb-peripheral.dtsi
@@ -1,7 +1,7 @@ 
 // SPDX-License-Identifier: GPL-2.0
 &usb {
 	dr_mode = "peripheral";
-	g-rx-fifo-size = <256>;
+	g-rx-fifo-size = <558>;
 	g-np-tx-fifo-size = <32>;
 	g-tx-fifo-size = <256 256 512 512 512 768 768>;
 };