diff mbox series

[v3] usb: dwc3: core: Do core softreset when switch mode

Message ID 374440f8dcd4f06c02c2caf4b1efde86774e02d9.1618521663.git.Thinh.Nguyen@synopsys.com (mailing list archive)
State Accepted
Commit f88359e1588b85cf0e8209ab7d6620085f3441d9
Headers show
Series [v3] usb: dwc3: core: Do core softreset when switch mode | expand

Commit Message

Thinh Nguyen April 15, 2021, 10:20 p.m. UTC
From: Yu Chen <chenyu56@huawei.com>
From: John Stultz <john.stultz@linaro.org>

According to the programming guide, to switch mode for DRD controller,
the driver needs to do the following.

To switch from device to host:
1. Reset controller with GCTL.CoreSoftReset
2. Set GCTL.PrtCapDir(host mode)
3. Reset the host with USBCMD.HCRESET
4. Then follow up with the initializing host registers sequence

To switch from host to device:
1. Reset controller with GCTL.CoreSoftReset
2. Set GCTL.PrtCapDir(device mode)
3. Reset the device with DCTL.CSftRst
4. Then follow up with the initializing registers sequence

Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
switching from host to device. John Stult reported a lockup issue seen
with HiKey960 platform without these steps[1]. Similar issue is observed
with Ferry's testing platform[2].

So, apply the required steps along with some fixes to Yu Chen's and John
Stultz's version. The main fixes to their versions are the missing wait
for clocks synchronization before clearing GCTL.CoreSoftReset and only
apply DCTL.CSftRst when switching from host to device.

[1] https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/
[2] https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/

Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Ferry Toth <fntoth@gmail.com>
Cc: Wesley Cheng <wcheng@codeaurora.org>
Cc: <stable@vger.kernel.org>
Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
Signed-off-by: Yu Chen <chenyu56@huawei.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
---
Changes in v3:
- Check if the desired mode is OTG, then keep the old flow
- Remove condition for OTG support only since the device can still be
  configured DRD host/device mode only
- Remove redundant hw_mode check since __dwc3_set_mode() only applies when
  hw_mode is DRD
Changes in v2:
- Initialize mutex per device and not as global mutex.
- Add additional checks for DRD only mode

 drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
 drivers/usb/dwc3/core.h |  5 +++++
 2 files changed, 32 insertions(+)


base-commit: 4b853c236c7b5161a2e444bd8b3c76fe5aa5ddcb

Comments

Thinh Nguyen April 15, 2021, 10:23 p.m. UTC | #1
Thinh Nguyen wrote:
> From: Yu Chen <chenyu56@huawei.com>
> From: John Stultz <john.stultz@linaro.org>
> 
> According to the programming guide, to switch mode for DRD controller,
> the driver needs to do the following.
> 
> To switch from device to host:
> 1. Reset controller with GCTL.CoreSoftReset
> 2. Set GCTL.PrtCapDir(host mode)
> 3. Reset the host with USBCMD.HCRESET
> 4. Then follow up with the initializing host registers sequence
> 
> To switch from host to device:
> 1. Reset controller with GCTL.CoreSoftReset
> 2. Set GCTL.PrtCapDir(device mode)
> 3. Reset the device with DCTL.CSftRst
> 4. Then follow up with the initializing registers sequence
> 
> Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
> switching from host to device. John Stult reported a lockup issue seen
> with HiKey960 platform without these steps[1]. Similar issue is observed
> with Ferry's testing platform[2].
> 
> So, apply the required steps along with some fixes to Yu Chen's and John
> Stultz's version. The main fixes to their versions are the missing wait
> for clocks synchronization before clearing GCTL.CoreSoftReset and only
> apply DCTL.CSftRst when switching from host to device.
> 
> [1] https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$ 
> [2] https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$ 
> 
> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
> Cc: Ferry Toth <fntoth@gmail.com>
> Cc: Wesley Cheng <wcheng@codeaurora.org>
> Cc: <stable@vger.kernel.org>
> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
> Signed-off-by: Yu Chen <chenyu56@huawei.com>
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
> ---
> Changes in v3:
> - Check if the desired mode is OTG, then keep the old flow
> - Remove condition for OTG support only since the device can still be
>   configured DRD host/device mode only
> - Remove redundant hw_mode check since __dwc3_set_mode() only applies when
>   hw_mode is DRD
> Changes in v2:
> - Initialize mutex per device and not as global mutex.
> - Add additional checks for DRD only mode
> 
>  drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>  drivers/usb/dwc3/core.h |  5 +++++
>  2 files changed, 32 insertions(+)
> 

Hi John,

If possible, can you run a test with this version on your platform?

Thanks,
Thinh
John Stultz April 16, 2021, 12:12 a.m. UTC | #2
On Thu, Apr 15, 2021 at 3:20 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
>
> From: Yu Chen <chenyu56@huawei.com>
> From: John Stultz <john.stultz@linaro.org>
>
> According to the programming guide, to switch mode for DRD controller,
> the driver needs to do the following.
>
> To switch from device to host:
> 1. Reset controller with GCTL.CoreSoftReset
> 2. Set GCTL.PrtCapDir(host mode)
> 3. Reset the host with USBCMD.HCRESET
> 4. Then follow up with the initializing host registers sequence
>
> To switch from host to device:
> 1. Reset controller with GCTL.CoreSoftReset
> 2. Set GCTL.PrtCapDir(device mode)
> 3. Reset the device with DCTL.CSftRst
> 4. Then follow up with the initializing registers sequence
>
> Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
> switching from host to device. John Stult reported a lockup issue seen
> with HiKey960 platform without these steps[1]. Similar issue is observed
> with Ferry's testing platform[2].
>
> So, apply the required steps along with some fixes to Yu Chen's and John
> Stultz's version. The main fixes to their versions are the missing wait
> for clocks synchronization before clearing GCTL.CoreSoftReset and only
> apply DCTL.CSftRst when switching from host to device.
>
> [1] https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/
> [2] https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/
>
> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
> Cc: Ferry Toth <fntoth@gmail.com>
> Cc: Wesley Cheng <wcheng@codeaurora.org>
> Cc: <stable@vger.kernel.org>
> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
> Signed-off-by: Yu Chen <chenyu56@huawei.com>
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
> ---
> Changes in v3:
> - Check if the desired mode is OTG, then keep the old flow
> - Remove condition for OTG support only since the device can still be
>   configured DRD host/device mode only
> - Remove redundant hw_mode check since __dwc3_set_mode() only applies when
>   hw_mode is DRD
> Changes in v2:
> - Initialize mutex per device and not as global mutex.
> - Add additional checks for DRD only mode
>

I've not been able to test all the different modes on HiKey960 yet,
but with this patch we avoid the !COREIDLE hangs that we see
frequently on bootup, so it looks pretty good to me.  I'll get back to
you tonight when I can put hands on the board to test the gadget to
host switching to make sure all is well (I really don't expect any
issues, but just want to be sure).

thanks
-john
John Stultz April 16, 2021, 3:28 a.m. UTC | #3
On Thu, Apr 15, 2021 at 5:12 PM John Stultz <john.stultz@linaro.org> wrote:
>
> On Thu, Apr 15, 2021 at 3:20 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
> >
> > From: Yu Chen <chenyu56@huawei.com>
> > From: John Stultz <john.stultz@linaro.org>
> >
> > According to the programming guide, to switch mode for DRD controller,
> > the driver needs to do the following.
> >
> > To switch from device to host:
> > 1. Reset controller with GCTL.CoreSoftReset
> > 2. Set GCTL.PrtCapDir(host mode)
> > 3. Reset the host with USBCMD.HCRESET
> > 4. Then follow up with the initializing host registers sequence
> >
> > To switch from host to device:
> > 1. Reset controller with GCTL.CoreSoftReset
> > 2. Set GCTL.PrtCapDir(device mode)
> > 3. Reset the device with DCTL.CSftRst
> > 4. Then follow up with the initializing registers sequence
> >
> > Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
> > switching from host to device. John Stult reported a lockup issue seen
> > with HiKey960 platform without these steps[1]. Similar issue is observed
> > with Ferry's testing platform[2].
> >
> > So, apply the required steps along with some fixes to Yu Chen's and John
> > Stultz's version. The main fixes to their versions are the missing wait
> > for clocks synchronization before clearing GCTL.CoreSoftReset and only
> > apply DCTL.CSftRst when switching from host to device.
> >
> > [1] https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/
> > [2] https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/
> >
> > Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
> > Cc: Ferry Toth <fntoth@gmail.com>
> > Cc: Wesley Cheng <wcheng@codeaurora.org>
> > Cc: <stable@vger.kernel.org>
> > Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
> > Signed-off-by: Yu Chen <chenyu56@huawei.com>
> > Signed-off-by: John Stultz <john.stultz@linaro.org>
> > Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
> > ---
> > Changes in v3:
> > - Check if the desired mode is OTG, then keep the old flow
> > - Remove condition for OTG support only since the device can still be
> >   configured DRD host/device mode only
> > - Remove redundant hw_mode check since __dwc3_set_mode() only applies when
> >   hw_mode is DRD
> > Changes in v2:
> > - Initialize mutex per device and not as global mutex.
> > - Add additional checks for DRD only mode
> >
>
> I've not been able to test all the different modes on HiKey960 yet,
> but with this patch we avoid the !COREIDLE hangs that we see
> frequently on bootup, so it looks pretty good to me.  I'll get back to
> you tonight when I can put hands on the board to test the gadget to
> host switching to make sure all is well (I really don't expect any
> issues, but just want to be sure).

Ok, got a chance to test the mode switching and everything is looking good.

Tested-by: John Stultz <john.stultz@linaro.org>

Thanks again for continuing to push this!
-john
Ferry Toth April 16, 2021, 9:10 a.m. UTC | #4
Hi

Op 16-04-2021 om 05:28 schreef John Stultz:
> On Thu, Apr 15, 2021 at 5:12 PM John Stultz<john.stultz@linaro.org>  wrote:
>> On Thu, Apr 15, 2021 at 3:20 PM Thinh Nguyen<Thinh.Nguyen@synopsys.com>  wrote:
>>> From: Yu Chen<chenyu56@huawei.com>
>>> From: John Stultz<john.stultz@linaro.org>
>>>
>>> According to the programming guide, to switch mode for DRD controller,
>>> the driver needs to do the following.
>>>
>>> To switch from device to host:
>>> 1. Reset controller with GCTL.CoreSoftReset
>>> 2. Set GCTL.PrtCapDir(host mode)
>>> 3. Reset the host with USBCMD.HCRESET
>>> 4. Then follow up with the initializing host registers sequence
>>>
>>> To switch from host to device:
>>> 1. Reset controller with GCTL.CoreSoftReset
>>> 2. Set GCTL.PrtCapDir(device mode)
>>> 3. Reset the device with DCTL.CSftRst
>>> 4. Then follow up with the initializing registers sequence
>>>
>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
>>> switching from host to device. John Stult reported a lockup issue seen
>>> with HiKey960 platform without these steps[1]. Similar issue is observed
>>> with Ferry's testing platform[2].
>>>
>>> So, apply the required steps along with some fixes to Yu Chen's and John
>>> Stultz's version. The main fixes to their versions are the missing wait
>>> for clocks synchronization before clearing GCTL.CoreSoftReset and only
>>> apply DCTL.CSftRst when switching from host to device.
>>>
>>> [1]https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/
>>> [2]https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/
>>>
>>> Cc: Andy Shevchenko<andy.shevchenko@gmail.com>
>>> Cc: Ferry Toth<fntoth@gmail.com>
>>> Cc: Wesley Cheng<wcheng@codeaurora.org>
>>> Cc:<stable@vger.kernel.org>
>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
>>> Signed-off-by: Yu Chen<chenyu56@huawei.com>
>>> Signed-off-by: John Stultz<john.stultz@linaro.org>
>>> Signed-off-by: Thinh Nguyen<Thinh.Nguyen@synopsys.com>
>>> ---
>>> Changes in v3:
>>> - Check if the desired mode is OTG, then keep the old flow
>>> - Remove condition for OTG support only since the device can still be
>>>    configured DRD host/device mode only
>>> - Remove redundant hw_mode check since __dwc3_set_mode() only applies when
>>>    hw_mode is DRD
>>> Changes in v2:
>>> - Initialize mutex per device and not as global mutex.
>>> - Add additional checks for DRD only mode
>>>
>> I've not been able to test all the different modes on HiKey960 yet,
>> but with this patch we avoid the !COREIDLE hangs that we see
>> frequently on bootup, so it looks pretty good to me.  I'll get back to
>> you tonight when I can put hands on the board to test the gadget to
>> host switching to make sure all is well (I really don't expect any
>> issues, but just want to be sure).
> Ok, got a chance to test the mode switching and everything is looking good.
I expect to be able to test this weekend on my platform.
> Tested-by: John Stultz<john.stultz@linaro.org>
>
> Thanks again for continuing to push this!
> -john
Felipe Balbi April 16, 2021, 10:47 a.m. UTC | #5
Hi,

Thinh Nguyen <Thinh.Nguyen@synopsys.com> writes:

> From: Yu Chen <chenyu56@huawei.com>
> From: John Stultz <john.stultz@linaro.org>
>
> According to the programming guide, to switch mode for DRD controller,
> the driver needs to do the following.
>
> To switch from device to host:
> 1. Reset controller with GCTL.CoreSoftReset
> 2. Set GCTL.PrtCapDir(host mode)
> 3. Reset the host with USBCMD.HCRESET
> 4. Then follow up with the initializing host registers sequence
>
> To switch from host to device:
> 1. Reset controller with GCTL.CoreSoftReset
> 2. Set GCTL.PrtCapDir(device mode)
> 3. Reset the device with DCTL.CSftRst
> 4. Then follow up with the initializing registers sequence
>
> Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
> switching from host to device. John Stult reported a lockup issue seen
> with HiKey960 platform without these steps[1]. Similar issue is observed
> with Ferry's testing platform[2].
>
> So, apply the required steps along with some fixes to Yu Chen's and John
> Stultz's version. The main fixes to their versions are the missing wait
> for clocks synchronization before clearing GCTL.CoreSoftReset and only
> apply DCTL.CSftRst when switching from host to device.
>
> [1] https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/
> [2] https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/
>
> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
> Cc: Ferry Toth <fntoth@gmail.com>
> Cc: Wesley Cheng <wcheng@codeaurora.org>
> Cc: <stable@vger.kernel.org>
> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
> Signed-off-by: Yu Chen <chenyu56@huawei.com>
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>

I still have concerns about the soft reset, but I won't block you guys
from fixing Hikey's problem :-)

The only thing I would like to confirm is that this has been verified
with hundreds of swaps happening as quickly as possible. DWC3 should
still be functional after several hundred swaps.

Can someone confirm this is the case? (I'm assuming this can be
scripted)
Wesley Cheng April 16, 2021, 7:05 p.m. UTC | #6
On 4/16/2021 3:47 AM, Felipe Balbi wrote:
> 
> Hi,
> 
> Thinh Nguyen <Thinh.Nguyen@synopsys.com> writes:
> 
>> From: Yu Chen <chenyu56@huawei.com>
>> From: John Stultz <john.stultz@linaro.org>
>>
>> According to the programming guide, to switch mode for DRD controller,
>> the driver needs to do the following.
>>
>> To switch from device to host:
>> 1. Reset controller with GCTL.CoreSoftReset
>> 2. Set GCTL.PrtCapDir(host mode)
>> 3. Reset the host with USBCMD.HCRESET
>> 4. Then follow up with the initializing host registers sequence
>>
>> To switch from host to device:
>> 1. Reset controller with GCTL.CoreSoftReset
>> 2. Set GCTL.PrtCapDir(device mode)
>> 3. Reset the device with DCTL.CSftRst
>> 4. Then follow up with the initializing registers sequence
>>
>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
>> switching from host to device. John Stult reported a lockup issue seen
>> with HiKey960 platform without these steps[1]. Similar issue is observed
>> with Ferry's testing platform[2].
>>
>> So, apply the required steps along with some fixes to Yu Chen's and John
>> Stultz's version. The main fixes to their versions are the missing wait
>> for clocks synchronization before clearing GCTL.CoreSoftReset and only
>> apply DCTL.CSftRst when switching from host to device.
>>
>> [1] https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/
>> [2] https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/
>>
>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>> Cc: Ferry Toth <fntoth@gmail.com>
>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>> Cc: <stable@vger.kernel.org>
>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
> 
> I still have concerns about the soft reset, but I won't block you guys
> from fixing Hikey's problem :-)
> 
> The only thing I would like to confirm is that this has been verified
> with hundreds of swaps happening as quickly as possible. DWC3 should
> still be functional after several hundred swaps.
> 
> Can someone confirm this is the case? (I'm assuming this can be
> scripted)
> 
Hi Thinh/Felipe,

Thanks Thinh for this change.  Will verify this on our platform as well
with a mode switch loop over the weekend.

Thanks
Wesley Cheng
John Stultz April 16, 2021, 7:38 p.m. UTC | #7
On Fri, Apr 16, 2021 at 3:47 AM Felipe Balbi <balbi@kernel.org> wrote:
>
>
> Hi,
>
> Thinh Nguyen <Thinh.Nguyen@synopsys.com> writes:
>
> > From: Yu Chen <chenyu56@huawei.com>
> > From: John Stultz <john.stultz@linaro.org>
> >
> > According to the programming guide, to switch mode for DRD controller,
> > the driver needs to do the following.
> >
> > To switch from device to host:
> > 1. Reset controller with GCTL.CoreSoftReset
> > 2. Set GCTL.PrtCapDir(host mode)
> > 3. Reset the host with USBCMD.HCRESET
> > 4. Then follow up with the initializing host registers sequence
> >
> > To switch from host to device:
> > 1. Reset controller with GCTL.CoreSoftReset
> > 2. Set GCTL.PrtCapDir(device mode)
> > 3. Reset the device with DCTL.CSftRst
> > 4. Then follow up with the initializing registers sequence
> >
> > Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
> > switching from host to device. John Stult reported a lockup issue seen
> > with HiKey960 platform without these steps[1]. Similar issue is observed
> > with Ferry's testing platform[2].
> >
> > So, apply the required steps along with some fixes to Yu Chen's and John
> > Stultz's version. The main fixes to their versions are the missing wait
> > for clocks synchronization before clearing GCTL.CoreSoftReset and only
> > apply DCTL.CSftRst when switching from host to device.
> >
> > [1] https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/
> > [2] https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/
> >
> > Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
> > Cc: Ferry Toth <fntoth@gmail.com>
> > Cc: Wesley Cheng <wcheng@codeaurora.org>
> > Cc: <stable@vger.kernel.org>
> > Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
> > Signed-off-by: Yu Chen <chenyu56@huawei.com>
> > Signed-off-by: John Stultz <john.stultz@linaro.org>
> > Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>
> I still have concerns about the soft reset, but I won't block you guys
> from fixing Hikey's problem :-)
>
> The only thing I would like to confirm is that this has been verified
> with hundreds of swaps happening as quickly as possible. DWC3 should
> still be functional after several hundred swaps.
>
> Can someone confirm this is the case? (I'm assuming this can be
> scripted)

I unfortunately don't have an easy way to automate the switching right
off. But I'll try to hack up the mux switch driver to provide an
interface we can script against.

thanks
-john
Thinh Nguyen April 16, 2021, 7:49 p.m. UTC | #8
John Stultz wrote:
> On Fri, Apr 16, 2021 at 3:47 AM Felipe Balbi <balbi@kernel.org> wrote:
>>
>>
>> Hi,
>>
>> Thinh Nguyen <Thinh.Nguyen@synopsys.com> writes:
>>
>>> From: Yu Chen <chenyu56@huawei.com>
>>> From: John Stultz <john.stultz@linaro.org>
>>>
>>> According to the programming guide, to switch mode for DRD controller,
>>> the driver needs to do the following.
>>>
>>> To switch from device to host:
>>> 1. Reset controller with GCTL.CoreSoftReset
>>> 2. Set GCTL.PrtCapDir(host mode)
>>> 3. Reset the host with USBCMD.HCRESET
>>> 4. Then follow up with the initializing host registers sequence
>>>
>>> To switch from host to device:
>>> 1. Reset controller with GCTL.CoreSoftReset
>>> 2. Set GCTL.PrtCapDir(device mode)
>>> 3. Reset the device with DCTL.CSftRst
>>> 4. Then follow up with the initializing registers sequence
>>>
>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
>>> switching from host to device. John Stult reported a lockup issue seen
>>> with HiKey960 platform without these steps[1]. Similar issue is observed
>>> with Ferry's testing platform[2].
>>>
>>> So, apply the required steps along with some fixes to Yu Chen's and John
>>> Stultz's version. The main fixes to their versions are the missing wait
>>> for clocks synchronization before clearing GCTL.CoreSoftReset and only
>>> apply DCTL.CSftRst when switching from host to device.
>>>
>>> [1] https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!L4TLb25Nkq0DF2qrCPWW13PUq4idhDn5QSZhgvnVAy7wJiYFOSSouSptwo9nOzIdPD4j$ 
>>> [2] https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!L4TLb25Nkq0DF2qrCPWW13PUq4idhDn5QSZhgvnVAy7wJiYFOSSouSptwo9nO21VT8q7$ 
>>>
>>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>>> Cc: Ferry Toth <fntoth@gmail.com>
>>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>>> Cc: <stable@vger.kernel.org>
>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
>>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>
>> I still have concerns about the soft reset, but I won't block you guys
>> from fixing Hikey's problem :-)
>>
>> The only thing I would like to confirm is that this has been verified
>> with hundreds of swaps happening as quickly as possible. DWC3 should
>> still be functional after several hundred swaps.
>>
>> Can someone confirm this is the case? (I'm assuming this can be
>> scripted)
> 
> I unfortunately don't have an easy way to automate the switching right
> off. But I'll try to hack up the mux switch driver to provide an
> interface we can script against.
> 

FYI, you can do the following:

1) Enable "usb-role-switch" DT property if not already done so
2) Add userspace control:

diff --git a/drivers/usb/dwc3/drd.c b/drivers/usb/dwc3/drd.c
index e2b68bb770d1..b203e3d87291 100644
--- a/drivers/usb/dwc3/drd.c
+++ b/drivers/usb/dwc3/drd.c
@@ -555,6 +555,7 @@ static int dwc3_setup_role_switch(struct dwc3 *dwc)
                mode = DWC3_GCTL_PRTCAP_DEVICE;
        }
 
+       dwc3_role_switch.allow_userspace_control = true;
        dwc3_role_switch.fwnode = dev_fwnode(dwc->dev);
        dwc3_role_switch.set = dwc3_usb_role_switch_set;
        dwc3_role_switch.get = dwc3_usb_role_switch_get;

3) Write a script to do the following:

# echo host > /sys/class/usb_role/<UDC>/role

and

# echo device > /sys/class/usb_role/<UDC>/role

BR,
Thinh
John Stultz April 16, 2021, 9:08 p.m. UTC | #9
On Fri, Apr 16, 2021 at 12:49 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
>
> John Stultz wrote:
> > On Fri, Apr 16, 2021 at 3:47 AM Felipe Balbi <balbi@kernel.org> wrote:
> >>
> >>
> >> Hi,
> >>
> >> Thinh Nguyen <Thinh.Nguyen@synopsys.com> writes:
> >>
> >>> From: Yu Chen <chenyu56@huawei.com>
> >>> From: John Stultz <john.stultz@linaro.org>
> >>>
> >>> According to the programming guide, to switch mode for DRD controller,
> >>> the driver needs to do the following.
> >>>
> >>> To switch from device to host:
> >>> 1. Reset controller with GCTL.CoreSoftReset
> >>> 2. Set GCTL.PrtCapDir(host mode)
> >>> 3. Reset the host with USBCMD.HCRESET
> >>> 4. Then follow up with the initializing host registers sequence
> >>>
> >>> To switch from host to device:
> >>> 1. Reset controller with GCTL.CoreSoftReset
> >>> 2. Set GCTL.PrtCapDir(device mode)
> >>> 3. Reset the device with DCTL.CSftRst
> >>> 4. Then follow up with the initializing registers sequence
> >>>
> >>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
> >>> switching from host to device. John Stult reported a lockup issue seen
> >>> with HiKey960 platform without these steps[1]. Similar issue is observed
> >>> with Ferry's testing platform[2].
> >>>
> >>> So, apply the required steps along with some fixes to Yu Chen's and John
> >>> Stultz's version. The main fixes to their versions are the missing wait
> >>> for clocks synchronization before clearing GCTL.CoreSoftReset and only
> >>> apply DCTL.CSftRst when switching from host to device.
> >>>
> >>> [1] https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!L4TLb25Nkq0DF2qrCPWW13PUq4idhDn5QSZhgvnVAy7wJiYFOSSouSptwo9nOzIdPD4j$
> >>> [2] https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!L4TLb25Nkq0DF2qrCPWW13PUq4idhDn5QSZhgvnVAy7wJiYFOSSouSptwo9nO21VT8q7$
> >>>
> >>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
> >>> Cc: Ferry Toth <fntoth@gmail.com>
> >>> Cc: Wesley Cheng <wcheng@codeaurora.org>
> >>> Cc: <stable@vger.kernel.org>
> >>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
> >>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
> >>> Signed-off-by: John Stultz <john.stultz@linaro.org>
> >>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
> >>
> >> I still have concerns about the soft reset, but I won't block you guys
> >> from fixing Hikey's problem :-)
> >>
> >> The only thing I would like to confirm is that this has been verified
> >> with hundreds of swaps happening as quickly as possible. DWC3 should
> >> still be functional after several hundred swaps.
> >>
> >> Can someone confirm this is the case? (I'm assuming this can be
> >> scripted)
> >
> > I unfortunately don't have an easy way to automate the switching right
> > off. But I'll try to hack up the mux switch driver to provide an
> > interface we can script against.
> >
>
> FYI, you can do the following:
>
> 1) Enable "usb-role-switch" DT property if not already done so
> 2) Add userspace control:
>
> diff --git a/drivers/usb/dwc3/drd.c b/drivers/usb/dwc3/drd.c
> index e2b68bb770d1..b203e3d87291 100644
> --- a/drivers/usb/dwc3/drd.c
> +++ b/drivers/usb/dwc3/drd.c
> @@ -555,6 +555,7 @@ static int dwc3_setup_role_switch(struct dwc3 *dwc)
>                 mode = DWC3_GCTL_PRTCAP_DEVICE;
>         }
>
> +       dwc3_role_switch.allow_userspace_control = true;
>         dwc3_role_switch.fwnode = dev_fwnode(dwc->dev);
>         dwc3_role_switch.set = dwc3_usb_role_switch_set;
>         dwc3_role_switch.get = dwc3_usb_role_switch_get;
>
> 3) Write a script to do the following:
>
> # echo host > /sys/class/usb_role/<UDC>/role
>
> and
>
> # echo device > /sys/class/usb_role/<UDC>/role

Thanks so much for this. So I ran both of those commands in a while
loop for awhile and didn't see any trouble.

HiKey960 is interesting as well because we have a mux switch, which is
sort of an intermediary roll switcher (it gets the role switch signal
from the tcpci_rt1711h, tweaks some gpios and then signals the dwc3).
So I also did the above tweaks to the mux-switch and had it switching
between device/none and validated the onboard hub came up and down
along with the dwc3 core.

Everything still looks good here.

thanks
-john

thanks
-john
Ferry Toth April 16, 2021, 9:17 p.m. UTC | #10
Hi

Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
> Thinh Nguyen wrote:
>> From: Yu Chen <chenyu56@huawei.com>
>> From: John Stultz <john.stultz@linaro.org>
>>
>> According to the programming guide, to switch mode for DRD controller,
>> the driver needs to do the following.
>>
>> To switch from device to host:
>> 1. Reset controller with GCTL.CoreSoftReset
>> 2. Set GCTL.PrtCapDir(host mode)
>> 3. Reset the host with USBCMD.HCRESET
>> 4. Then follow up with the initializing host registers sequence
>>
>> To switch from host to device:
>> 1. Reset controller with GCTL.CoreSoftReset
>> 2. Set GCTL.PrtCapDir(device mode)
>> 3. Reset the device with DCTL.CSftRst
>> 4. Then follow up with the initializing registers sequence
>>
>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
>> switching from host to device. John Stult reported a lockup issue seen
>> with HiKey960 platform without these steps[1]. Similar issue is observed
>> with Ferry's testing platform[2].
>>
>> So, apply the required steps along with some fixes to Yu Chen's and John
>> Stultz's version. The main fixes to their versions are the missing wait
>> for clocks synchronization before clearing GCTL.CoreSoftReset and only
>> apply DCTL.CSftRst when switching from host to device.
>>
>> [1] https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$
>> [2] https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$
>>
>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>> Cc: Ferry Toth <fntoth@gmail.com>
>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>> Cc: <stable@vger.kernel.org>
>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>> ---
>> Changes in v3:
>> - Check if the desired mode is OTG, then keep the old flow
>> - Remove condition for OTG support only since the device can still be
>>    configured DRD host/device mode only
>> - Remove redundant hw_mode check since __dwc3_set_mode() only applies when
>>    hw_mode is DRD
>> Changes in v2:
>> - Initialize mutex per device and not as global mutex.
>> - Add additional checks for DRD only mode
>>
>>   drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>>   drivers/usb/dwc3/core.h |  5 +++++
>>   2 files changed, 32 insertions(+)
>>
> Hi John,
>
> If possible, can you run a test with this version on your platform?
>
> Thanks,
> Thinh
>
I tested this on edison-arduino with this patch on top of usb-next 
(5.12-rc7 + "increase BESL baseline to 6" to prevent throttling").

On this platform there is a physical switch to switch roles. With this 
patch I find:

- switch to host mode always works fine

- switch to gadget mode I need to flip the switch 3x (gadget-host-gadget).

An error message appears on the gadget side "dwc3 dwc3.0.auto: timed out 
waiting for SETUP phase" appears, but then the device connects to my PC, 
no throttling.

- alternatively I can switch to gadget 1x and then unplug/replug the cable.

No error message and connects fine.

- if I flip the switch only once, on the PC side I get:

    kernel: usb 1-5: new high-speed USB device number 18 using xhci_hcd
    kernel: usb 1-5: New USB device found, idVendor=1d6b,
    idProduct=0104, bcdDevice= 1.00 kernel: usb 1-5: New USB device
    strings: Mfr=1, Product=2, SerialNumber=3 kernel: usb 1-5: Product:
    USBArmory Gadget kernel: usb 1-5: Manufacturer: USBArmory kernel:
    usb 1-5: SerialNumber: 0123456789abcdef kernel: usb 1-5: can't set
    config #1, error -110

Then if I wait long enough on the gadget side I get:

    root@yuna:~# ifconfig

    usb0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu 1500
    inet 169.254.119.239 netmask 255.255.0.0 broadcast 169.254.255.255
    inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid 0x20<link>
    ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX packets 490424
    bytes 735146578 (701.0 MiB) RX errors 0 dropped 191 overruns 0 frame
    0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0 dropped 0
    overruns 0 carrier 0 collisions 0

(correct would be: inet 10.42.0.221 netmask 255.255.255.0 broadcast 
10.42.0.255)

So much improved now, but it seems I am still missing something on plug.
Thinh Nguyen April 17, 2021, 2:27 a.m. UTC | #11
Ferry Toth wrote:
> Hi
> 
> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
>> Thinh Nguyen wrote:
>>> From: Yu Chen <chenyu56@huawei.com>
>>> From: John Stultz <john.stultz@linaro.org>
>>>
>>> According to the programming guide, to switch mode for DRD controller,
>>> the driver needs to do the following.
>>>
>>> To switch from device to host:
>>> 1. Reset controller with GCTL.CoreSoftReset
>>> 2. Set GCTL.PrtCapDir(host mode)
>>> 3. Reset the host with USBCMD.HCRESET
>>> 4. Then follow up with the initializing host registers sequence
>>>
>>> To switch from host to device:
>>> 1. Reset controller with GCTL.CoreSoftReset
>>> 2. Set GCTL.PrtCapDir(device mode)
>>> 3. Reset the device with DCTL.CSftRst
>>> 4. Then follow up with the initializing registers sequence
>>>
>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
>>> switching from host to device. John Stult reported a lockup issue seen
>>> with HiKey960 platform without these steps[1]. Similar issue is observed
>>> with Ferry's testing platform[2].
>>>
>>> So, apply the required steps along with some fixes to Yu Chen's and John
>>> Stultz's version. The main fixes to their versions are the missing wait
>>> for clocks synchronization before clearing GCTL.CoreSoftReset and only
>>> apply DCTL.CSftRst when switching from host to device.
>>>
>>> [1]
>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$
>>>
>>> [2]
>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$
>>>
>>>
>>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>>> Cc: Ferry Toth <fntoth@gmail.com>
>>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>>> Cc: <stable@vger.kernel.org>
>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work
>>> properly")
>>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>> ---
>>> Changes in v3:
>>> - Check if the desired mode is OTG, then keep the old flow
>>> - Remove condition for OTG support only since the device can still be
>>>    configured DRD host/device mode only
>>> - Remove redundant hw_mode check since __dwc3_set_mode() only applies
>>> when
>>>    hw_mode is DRD
>>> Changes in v2:
>>> - Initialize mutex per device and not as global mutex.
>>> - Add additional checks for DRD only mode
>>>
>>>   drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>>>   drivers/usb/dwc3/core.h |  5 +++++
>>>   2 files changed, 32 insertions(+)
>>>
>> Hi John,
>>
>> If possible, can you run a test with this version on your platform?
>>
>> Thanks,
>> Thinh
>>
> I tested this on edison-arduino with this patch on top of usb-next
> (5.12-rc7 + "increase BESL baseline to 6" to prevent throttling").
> 
> On this platform there is a physical switch to switch roles. With this
> patch I find:
> 
> - switch to host mode always works fine
> 
> - switch to gadget mode I need to flip the switch 3x (gadget-host-gadget).
> 
> An error message appears on the gadget side "dwc3 dwc3.0.auto: timed out
> waiting for SETUP phase" appears, but then the device connects to my PC,
> no throttling.
> 
> - alternatively I can switch to gadget 1x and then unplug/replug the cable.
> 
> No error message and connects fine.
> 
> - if I flip the switch only once, on the PC side I get:
> 
>   kernel: usb 1-5: new high-speed USB device number 18 using xhci_hcd
>   kernel: usb 1-5: New USB device found, idVendor=1d6b,
>   idProduct=0104, bcdDevice= 1.00 kernel: usb 1-5: New USB device
>   strings: Mfr=1, Product=2, SerialNumber=3 kernel: usb 1-5: Product:
>   USBArmory Gadget kernel: usb 1-5: Manufacturer: USBArmory kernel:
>   usb 1-5: SerialNumber: 0123456789abcdef kernel: usb 1-5: can't set
>   config #1, error -110

The device failed at set_configuration() request and timed out. It
probably timed out from the status stage looking at the device err print.

> 
> Then if I wait long enough on the gadget side I get:
> 
>   root@yuna:~# ifconfig
> 
>   usb0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu 1500
>   inet 169.254.119.239 netmask 255.255.0.0 broadcast 169.254.255.255
>   inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid 0x20<link>
>   ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX packets 490424
>   bytes 735146578 (701.0 MiB) RX errors 0 dropped 191 overruns 0 frame
>   0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0 dropped 0
>   overruns 0 carrier 0 collisions 0
> 
> (correct would be: inet 10.42.0.221 netmask 255.255.255.0 broadcast
> 10.42.0.255)
> 
> So much improved now, but it seems I am still missing something on plug.
> 

That's great! We can look at it further. Can you capture the tracepoints
of the issue. Also, can you try with mass_storage gadget to see if the
result is the same?

Thanks,
Thinh
Felipe Balbi April 17, 2021, 6:25 a.m. UTC | #12
Hi,

John Stultz <john.stultz@linaro.org> writes:
> On Fri, Apr 16, 2021 at 12:49 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
>>
>> John Stultz wrote:
>> > On Fri, Apr 16, 2021 at 3:47 AM Felipe Balbi <balbi@kernel.org> wrote:
>> >>
>> >>
>> >> Hi,
>> >>
>> >> Thinh Nguyen <Thinh.Nguyen@synopsys.com> writes:
>> >>
>> >>> From: Yu Chen <chenyu56@huawei.com>
>> >>> From: John Stultz <john.stultz@linaro.org>
>> >>>
>> >>> According to the programming guide, to switch mode for DRD controller,
>> >>> the driver needs to do the following.
>> >>>
>> >>> To switch from device to host:
>> >>> 1. Reset controller with GCTL.CoreSoftReset
>> >>> 2. Set GCTL.PrtCapDir(host mode)
>> >>> 3. Reset the host with USBCMD.HCRESET
>> >>> 4. Then follow up with the initializing host registers sequence
>> >>>
>> >>> To switch from host to device:
>> >>> 1. Reset controller with GCTL.CoreSoftReset
>> >>> 2. Set GCTL.PrtCapDir(device mode)
>> >>> 3. Reset the device with DCTL.CSftRst
>> >>> 4. Then follow up with the initializing registers sequence
>> >>>
>> >>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
>> >>> switching from host to device. John Stult reported a lockup issue seen
>> >>> with HiKey960 platform without these steps[1]. Similar issue is observed
>> >>> with Ferry's testing platform[2].
>> >>>
>> >>> So, apply the required steps along with some fixes to Yu Chen's and John
>> >>> Stultz's version. The main fixes to their versions are the missing wait
>> >>> for clocks synchronization before clearing GCTL.CoreSoftReset and only
>> >>> apply DCTL.CSftRst when switching from host to device.
>> >>>
>> >>> [1] https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!L4TLb25Nkq0DF2qrCPWW13PUq4idhDn5QSZhgvnVAy7wJiYFOSSouSptwo9nOzIdPD4j$
>> >>> [2] https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!L4TLb25Nkq0DF2qrCPWW13PUq4idhDn5QSZhgvnVAy7wJiYFOSSouSptwo9nO21VT8q7$
>> >>>
>> >>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>> >>> Cc: Ferry Toth <fntoth@gmail.com>
>> >>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>> >>> Cc: <stable@vger.kernel.org>
>> >>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
>> >>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>> >>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>> >>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>> >>
>> >> I still have concerns about the soft reset, but I won't block you guys
>> >> from fixing Hikey's problem :-)
>> >>
>> >> The only thing I would like to confirm is that this has been verified
>> >> with hundreds of swaps happening as quickly as possible. DWC3 should
>> >> still be functional after several hundred swaps.
>> >>
>> >> Can someone confirm this is the case? (I'm assuming this can be
>> >> scripted)
>> >
>> > I unfortunately don't have an easy way to automate the switching right
>> > off. But I'll try to hack up the mux switch driver to provide an
>> > interface we can script against.
>> >
>>
>> FYI, you can do the following:
>>
>> 1) Enable "usb-role-switch" DT property if not already done so
>> 2) Add userspace control:
>>
>> diff --git a/drivers/usb/dwc3/drd.c b/drivers/usb/dwc3/drd.c
>> index e2b68bb770d1..b203e3d87291 100644
>> --- a/drivers/usb/dwc3/drd.c
>> +++ b/drivers/usb/dwc3/drd.c
>> @@ -555,6 +555,7 @@ static int dwc3_setup_role_switch(struct dwc3 *dwc)
>>                 mode = DWC3_GCTL_PRTCAP_DEVICE;
>>         }
>>
>> +       dwc3_role_switch.allow_userspace_control = true;
>>         dwc3_role_switch.fwnode = dev_fwnode(dwc->dev);
>>         dwc3_role_switch.set = dwc3_usb_role_switch_set;
>>         dwc3_role_switch.get = dwc3_usb_role_switch_get;
>>
>> 3) Write a script to do the following:
>>
>> # echo host > /sys/class/usb_role/<UDC>/role
>>
>> and
>>
>> # echo device > /sys/class/usb_role/<UDC>/role
>
> Thanks so much for this. So I ran both of those commands in a while
> loop for awhile and didn't see any trouble.
>
> HiKey960 is interesting as well because we have a mux switch, which is
> sort of an intermediary roll switcher (it gets the role switch signal
> from the tcpci_rt1711h, tweaks some gpios and then signals the dwc3).
> So I also did the above tweaks to the mux-switch and had it switching
> between device/none and validated the onboard hub came up and down
> along with the dwc3 core.
>
> Everything still looks good here.

Sounds good, happy to see so many platforms supported by Thinh's
change. Thanks for doing this work, Thinh :-)
Ferry Toth April 17, 2021, 2:22 p.m. UTC | #13
Hi

Op 17-04-2021 om 04:27 schreef Thinh Nguyen:
> Ferry Toth wrote:
>> Hi
>>
>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
>>> Thinh Nguyen wrote:
>>>> From: Yu Chen <chenyu56@huawei.com>
>>>> From: John Stultz <john.stultz@linaro.org>
>>>>
>>>> According to the programming guide, to switch mode for DRD controller,
>>>> the driver needs to do the following.
>>>>
>>>> To switch from device to host:
>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>> 2. Set GCTL.PrtCapDir(host mode)
>>>> 3. Reset the host with USBCMD.HCRESET
>>>> 4. Then follow up with the initializing host registers sequence
>>>>
>>>> To switch from host to device:
>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>> 2. Set GCTL.PrtCapDir(device mode)
>>>> 3. Reset the device with DCTL.CSftRst
>>>> 4. Then follow up with the initializing registers sequence
>>>>
>>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
>>>> switching from host to device. John Stult reported a lockup issue seen
>>>> with HiKey960 platform without these steps[1]. Similar issue is observed
>>>> with Ferry's testing platform[2].
>>>>
>>>> So, apply the required steps along with some fixes to Yu Chen's and John
>>>> Stultz's version. The main fixes to their versions are the missing wait
>>>> for clocks synchronization before clearing GCTL.CoreSoftReset and only
>>>> apply DCTL.CSftRst when switching from host to device.
>>>>
>>>> [1]
>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$
>>>>
>>>> [2]
>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$
>>>>
>>>>
>>>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>>>> Cc: Ferry Toth <fntoth@gmail.com>
>>>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>>>> Cc: <stable@vger.kernel.org>
>>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work
>>>> properly")
>>>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>>>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>>> ---
>>>> Changes in v3:
>>>> - Check if the desired mode is OTG, then keep the old flow
>>>> - Remove condition for OTG support only since the device can still be
>>>>     configured DRD host/device mode only
>>>> - Remove redundant hw_mode check since __dwc3_set_mode() only applies
>>>> when
>>>>     hw_mode is DRD
>>>> Changes in v2:
>>>> - Initialize mutex per device and not as global mutex.
>>>> - Add additional checks for DRD only mode
>>>>
>>>>    drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>>>>    drivers/usb/dwc3/core.h |  5 +++++
>>>>    2 files changed, 32 insertions(+)
>>>>
>>> Hi John,
>>>
>>> If possible, can you run a test with this version on your platform?
>>>
>>> Thanks,
>>> Thinh
>>>
>> I tested this on edison-arduino with this patch on top of usb-next
>> (5.12-rc7 + "increase BESL baseline to 6" to prevent throttling").
>>
>> On this platform there is a physical switch to switch roles. With this
>> patch I find:
>>
>> - switch to host mode always works fine
>>
>> - switch to gadget mode I need to flip the switch 3x (gadget-host-gadget).
>>
>> An error message appears on the gadget side "dwc3 dwc3.0.auto: timed out
>> waiting for SETUP phase" appears, but then the device connects to my PC,
>> no throttling.
>>
>> - alternatively I can switch to gadget 1x and then unplug/replug the cable.
>>
>> No error message and connects fine.
>>
>> - if I flip the switch only once, on the PC side I get:
>>
>>    kernel: usb 1-5: new high-speed USB device number 18 using xhci_hcd
>>    kernel: usb 1-5: New USB device found, idVendor=1d6b,
>>    idProduct=0104, bcdDevice= 1.00 kernel: usb 1-5: New USB device
>>    strings: Mfr=1, Product=2, SerialNumber=3 kernel: usb 1-5: Product:
>>    USBArmory Gadget kernel: usb 1-5: Manufacturer: USBArmory kernel:
>>    usb 1-5: SerialNumber: 0123456789abcdef kernel: usb 1-5: can't set
>>    config #1, error -110
> The device failed at set_configuration() request and timed out. It
> probably timed out from the status stage looking at the device err print.
>
>> Then if I wait long enough on the gadget side I get:
>>
>>    root@yuna:~# ifconfig
>>
>>    usb0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu 1500
>>    inet 169.254.119.239 netmask 255.255.0.0 broadcast 169.254.255.255
>>    inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid 0x20<link>
>>    ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX packets 490424
>>    bytes 735146578 (701.0 MiB) RX errors 0 dropped 191 overruns 0 frame
>>    0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0 dropped 0
>>    overruns 0 carrier 0 collisions 0
>>
>> (correct would be: inet 10.42.0.221 netmask 255.255.255.0 broadcast
>> 10.42.0.255)
>>
>> So much improved now, but it seems I am still missing something on plug.
>>
> That's great! We can look at it further. Can you capture the tracepoints
> of the issue. Also, can you try with mass_storage gadget to see if the
> result is the same?

I have already gser, eem, mass_storage and uac2 combo. When eem fails, 
the mass_storage and uac2 don't appear (on KDE you get all kind of 
popups when they appear).

So either all works, or all fails.

I'll trace this later today.

> Thanks,
> Thinh
Ferry Toth April 17, 2021, 4:32 p.m. UTC | #14
Hi

Op 17-04-2021 om 16:22 schreef Ferry Toth:
> Hi
>
> Op 17-04-2021 om 04:27 schreef Thinh Nguyen:
>> Ferry Toth wrote:
>>> Hi
>>>
>>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
>>>> Thinh Nguyen wrote:
>>>>> From: Yu Chen <chenyu56@huawei.com>
>>>>> From: John Stultz <john.stultz@linaro.org>
>>>>>
>>>>> According to the programming guide, to switch mode for DRD 
>>>>> controller,
>>>>> the driver needs to do the following.
>>>>>
>>>>> To switch from device to host:
>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>> 2. Set GCTL.PrtCapDir(host mode)
>>>>> 3. Reset the host with USBCMD.HCRESET
>>>>> 4. Then follow up with the initializing host registers sequence
>>>>>
>>>>> To switch from host to device:
>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>> 2. Set GCTL.PrtCapDir(device mode)
>>>>> 3. Reset the device with DCTL.CSftRst
>>>>> 4. Then follow up with the initializing registers sequence
>>>>>
>>>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step 
>>>>> 3) of
>>>>> switching from host to device. John Stult reported a lockup issue 
>>>>> seen
>>>>> with HiKey960 platform without these steps[1]. Similar issue is 
>>>>> observed
>>>>> with Ferry's testing platform[2].
>>>>>
>>>>> So, apply the required steps along with some fixes to Yu Chen's 
>>>>> and John
>>>>> Stultz's version. The main fixes to their versions are the missing 
>>>>> wait
>>>>> for clocks synchronization before clearing GCTL.CoreSoftReset and 
>>>>> only
>>>>> apply DCTL.CSftRst when switching from host to device.
>>>>>
>>>>> [1]
>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$ 
>>>>>
>>>>>
>>>>> [2]
>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$ 
>>>>>
>>>>>
>>>>>
>>>>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>>>>> Cc: Ferry Toth <fntoth@gmail.com>
>>>>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>>>>> Cc: <stable@vger.kernel.org>
>>>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work
>>>>> properly")
>>>>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>>>>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>>>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>>>> ---
>>>>> Changes in v3:
>>>>> - Check if the desired mode is OTG, then keep the old flow
>>>>> - Remove condition for OTG support only since the device can still be
>>>>>     configured DRD host/device mode only
>>>>> - Remove redundant hw_mode check since __dwc3_set_mode() only applies
>>>>> when
>>>>>     hw_mode is DRD
>>>>> Changes in v2:
>>>>> - Initialize mutex per device and not as global mutex.
>>>>> - Add additional checks for DRD only mode
>>>>>
>>>>>    drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>>>>>    drivers/usb/dwc3/core.h |  5 +++++
>>>>>    2 files changed, 32 insertions(+)
>>>>>
>>>> Hi John,
>>>>
>>>> If possible, can you run a test with this version on your platform?
>>>>
>>>> Thanks,
>>>> Thinh
>>>>
>>> I tested this on edison-arduino with this patch on top of usb-next
>>> (5.12-rc7 + "increase BESL baseline to 6" to prevent throttling").
>>>
>>> On this platform there is a physical switch to switch roles. With this
>>> patch I find:
>>>
>>> - switch to host mode always works fine
>>>
>>> - switch to gadget mode I need to flip the switch 3x 
>>> (gadget-host-gadget).
>>>
>>> An error message appears on the gadget side "dwc3 dwc3.0.auto: timed 
>>> out
>>> waiting for SETUP phase" appears, but then the device connects to my 
>>> PC,
>>> no throttling.
>>>
>>> - alternatively I can switch to gadget 1x and then unplug/replug the 
>>> cable.
>>>
>>> No error message and connects fine.
>>>
>>> - if I flip the switch only once, on the PC side I get:
>>>
>>>    kernel: usb 1-5: new high-speed USB device number 18 using xhci_hcd
>>>    kernel: usb 1-5: New USB device found, idVendor=1d6b,
>>>    idProduct=0104, bcdDevice= 1.00 kernel: usb 1-5: New USB device
>>>    strings: Mfr=1, Product=2, SerialNumber=3 kernel: usb 1-5: Product:
>>>    USBArmory Gadget kernel: usb 1-5: Manufacturer: USBArmory kernel:
>>>    usb 1-5: SerialNumber: 0123456789abcdef kernel: usb 1-5: can't set
>>>    config #1, error -110
>> The device failed at set_configuration() request and timed out. It
>> probably timed out from the status stage looking at the device err 
>> print.
>>
>>> Then if I wait long enough on the gadget side I get:
>>>
>>>    root@yuna:~# ifconfig
>>>
>>>    usb0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu 1500
>>>    inet 169.254.119.239 netmask 255.255.0.0 broadcast 169.254.255.255
>>>    inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid 0x20<link>
>>>    ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX packets 490424
>>>    bytes 735146578 (701.0 MiB) RX errors 0 dropped 191 overruns 0 frame
>>>    0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0 dropped 0
>>>    overruns 0 carrier 0 collisions 0
>>>
>>> (correct would be: inet 10.42.0.221 netmask 255.255.255.0 broadcast
>>> 10.42.0.255)
>>>
>>> So much improved now, but it seems I am still missing something on 
>>> plug.
>>>
>> That's great! We can look at it further. Can you capture the tracepoints
>> of the issue. Also, can you try with mass_storage gadget to see if the
>> result is the same?
>
> I have already gser, eem, mass_storage and uac2 combo. When eem fails, 
> the mass_storage and uac2 don't appear (on KDE you get all kind of 
> popups when they appear).
>
> So either all works, or all fails.
>
> I'll trace this later today.

Trace capturing switch from host-> gadget  here 
https://github.com/andy-shev/linux/files/6329600/5.12-rc7%2Busb-next.zip

(Issue history: https://github.com/andy-shev/linux/issues/31)

On the PC side this resulted to:

apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device 
number 12 using xhci_hcd
apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found, 
idVendor=1d6b, idProduct=0104, bcdDevice= 1.00
apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings: Mfr=1, 
Product=2, SerialNumber=3
apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget
apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory
apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber: 0123456789abcdef
apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error -110


Thanks for all your help!

>
>> Thanks,
>> Thinh
Thinh Nguyen April 18, 2021, 11:03 p.m. UTC | #15
Ferry Toth wrote:
> Hi
> 
> Op 17-04-2021 om 16:22 schreef Ferry Toth:
>> Hi
>>
>> Op 17-04-2021 om 04:27 schreef Thinh Nguyen:
>>> Ferry Toth wrote:
>>>> Hi
>>>>
>>>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
>>>>> Thinh Nguyen wrote:
>>>>>> From: Yu Chen <chenyu56@huawei.com>
>>>>>> From: John Stultz <john.stultz@linaro.org>
>>>>>>
>>>>>> According to the programming guide, to switch mode for DRD
>>>>>> controller,
>>>>>> the driver needs to do the following.
>>>>>>
>>>>>> To switch from device to host:
>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>> 2. Set GCTL.PrtCapDir(host mode)
>>>>>> 3. Reset the host with USBCMD.HCRESET
>>>>>> 4. Then follow up with the initializing host registers sequence
>>>>>>
>>>>>> To switch from host to device:
>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>> 2. Set GCTL.PrtCapDir(device mode)
>>>>>> 3. Reset the device with DCTL.CSftRst
>>>>>> 4. Then follow up with the initializing registers sequence
>>>>>>
>>>>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step
>>>>>> 3) of
>>>>>> switching from host to device. John Stult reported a lockup issue
>>>>>> seen
>>>>>> with HiKey960 platform without these steps[1]. Similar issue is
>>>>>> observed
>>>>>> with Ferry's testing platform[2].
>>>>>>
>>>>>> So, apply the required steps along with some fixes to Yu Chen's
>>>>>> and John
>>>>>> Stultz's version. The main fixes to their versions are the missing
>>>>>> wait
>>>>>> for clocks synchronization before clearing GCTL.CoreSoftReset and
>>>>>> only
>>>>>> apply DCTL.CSftRst when switching from host to device.
>>>>>>
>>>>>> [1]
>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$
>>>>>>
>>>>>>
>>>>>> [2]
>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$
>>>>>>
>>>>>>
>>>>>>
>>>>>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>>>>>> Cc: Ferry Toth <fntoth@gmail.com>
>>>>>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>>>>>> Cc: <stable@vger.kernel.org>
>>>>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work
>>>>>> properly")
>>>>>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>>>>>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>>>>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>>>>> ---
>>>>>> Changes in v3:
>>>>>> - Check if the desired mode is OTG, then keep the old flow
>>>>>> - Remove condition for OTG support only since the device can still be
>>>>>>     configured DRD host/device mode only
>>>>>> - Remove redundant hw_mode check since __dwc3_set_mode() only applies
>>>>>> when
>>>>>>     hw_mode is DRD
>>>>>> Changes in v2:
>>>>>> - Initialize mutex per device and not as global mutex.
>>>>>> - Add additional checks for DRD only mode
>>>>>>
>>>>>>    drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>>>>>>    drivers/usb/dwc3/core.h |  5 +++++
>>>>>>    2 files changed, 32 insertions(+)
>>>>>>
>>>>> Hi John,
>>>>>
>>>>> If possible, can you run a test with this version on your platform?
>>>>>
>>>>> Thanks,
>>>>> Thinh
>>>>>
>>>> I tested this on edison-arduino with this patch on top of usb-next
>>>> (5.12-rc7 + "increase BESL baseline to 6" to prevent throttling").
>>>>
>>>> On this platform there is a physical switch to switch roles. With this
>>>> patch I find:
>>>>
>>>> - switch to host mode always works fine
>>>>
>>>> - switch to gadget mode I need to flip the switch 3x
>>>> (gadget-host-gadget).
>>>>
>>>> An error message appears on the gadget side "dwc3 dwc3.0.auto: timed
>>>> out
>>>> waiting for SETUP phase" appears, but then the device connects to my
>>>> PC,
>>>> no throttling.
>>>>
>>>> - alternatively I can switch to gadget 1x and then unplug/replug the
>>>> cable.
>>>>
>>>> No error message and connects fine.
>>>>
>>>> - if I flip the switch only once, on the PC side I get:
>>>>
>>>>    kernel: usb 1-5: new high-speed USB device number 18 usingxhci_hcd
>>>>    kernel: usb 1-5: New USB device found, idVendor=1d6b,
>>>>    idProduct=0104, bcdDevice= 1.00 kernel: usb 1-5: New USB device
>>>>    strings: Mfr=1, Product=2, SerialNumber=3 kernel: usb 1-5: Product:
>>>>    USBArmory Gadget kernel: usb 1-5: Manufacturer: USBArmory kernel:
>>>>    usb 1-5: SerialNumber: 0123456789abcdef kernel: usb 1-5: can't set
>>>>    config #1, error -110
>>> The device failed at set_configuration() request and timed out. It
>>> probably timed out from the status stage looking at the device err
>>> print.
>>>
>>>> Then if I wait long enough on the gadget side I get:
>>>>
>>>>    root@yuna:~# ifconfig
>>>>
>>>>    usb0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu 1500
>>>>    inet 169.254.119.239 netmask 255.255.0.0 broadcast 169.254.255.255
>>>>    inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid 0x20<link>
>>>>    ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX packets 490424
>>>>    bytes 735146578 (701.0 MiB) RX errors 0 dropped 191 overruns 0 frame
>>>>    0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0 dropped 0
>>>>    overruns 0 carrier 0 collisions 0
>>>>
>>>> (correct would be: inet 10.42.0.221 netmask 255.255.255.0 broadcast
>>>> 10.42.0.255)
>>>>
>>>> So much improved now, but it seems I am still missing something on
>>>> plug.
>>>>
>>> That's great! We can look at it further. Can you capture the tracepoints
>>> of the issue. Also, can you try with mass_storage gadget to see if the
>>> result is the same?
>>
>> I have already gser, eem, mass_storage and uac2 combo. When eem fails,
>> the mass_storage and uac2 don't appear (on KDE you get all kind of
>> popups when they appear).
>>
>> So either all works, or all fails.
>>
>> I'll trace this later today.
> 
> Trace capturing switch from host-> gadget  here
> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6329600/5.12-rc7*2Busb-next.zip__;JQ!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXLOiNwGT$
> 
> (Issue history:
> https://urldefense.com/v3/__https://github.com/andy-shev/linux/issues/31__;!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXNc7KgAw$
> )
> 
> On the PC side this resulted to:
> 
> apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device
> number 12 using xhci_hcd
> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found,
> idVendor=1d6b, idProduct=0104, bcdDevice= 1.00
> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings: Mfr=1,
> Product=2, SerialNumber=3
> apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget
> apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory
> apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber: 0123456789abcdef
> apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error -110
> 
> 
> Thanks for all your help!
> 

Looks like it's LPM related again. To confirm, try this:
Disable LPM with this property "snps,usb2-gadget-lpm-disable"
(Note that it's not the same as "snps,dis_enblslpm_quirk")

Make sure that your testing kernel has this patch [1]
475e8be53d04 ("usb: dwc3: gadget: Check for disabled LPM quirk")

[1] https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=475e8be53d0496f9bc6159f4abb3ff5f9b90e8de


The failure you saw was probably due the gadget function attempting
to start a delayed status stage of the SET_CONFIGURATION request.
By this time, the host already put the device in low power.

The START_TRANSFER command needs to be executed while the device
is on "ON" state (or U0 if eSS). We shouldn't use dwc->link_state
to check for link state because we only enable link state change
interrupt for some controller versions.

Once you confirms disabling LPM works, try this fix:

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 6227641f2d31..06cdec79244e 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -309,10 +309,14 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd,
 
        if (DWC3_DEPCMD_CMD(cmd) == DWC3_DEPCMD_STARTTRANSFER) {
                int             needs_wakeup;
+               u8              link_state;
 
-               needs_wakeup = (dwc->link_state == DWC3_LINK_STATE_U1 ||
-                               dwc->link_state == DWC3_LINK_STATE_U2 ||
-                               dwc->link_state == DWC3_LINK_STATE_U3);
+               reg = dwc3_readl(dwc->regs, DWC3_DSTS);
+               link_state = DWC3_DSTS_USBLNKST(reg);
+
+               needs_wakeup = (link_state == DWC3_LINK_STATE_U1 ||
+                               link_state == DWC3_LINK_STATE_U2 ||
+                               link_state == DWC3_LINK_STATE_U3);
 
                if (unlikely(needs_wakeup)) {
                        ret = __dwc3_gadget_wakeup(dwc);
@@ -1989,6 +1993,8 @@ static int __dwc3_gadget_wakeup(struct dwc3 *dwc)
        case DWC3_LINK_STATE_RESET:
        case DWC3_LINK_STATE_RX_DET:    /* in HS, means Early Suspend */
        case DWC3_LINK_STATE_U3:        /* in HS, means SUSPEND */
+       case DWC3_LINK_STATE_U2:        /* in HS, means Sleep (L1) */
+       case DWC3_LINK_STATE_U1:
        case DWC3_LINK_STATE_RESUME:
                break;
        default:




BR,
Thinh
Andy Shevchenko April 19, 2021, 8:43 a.m. UTC | #16
On Mon, Apr 19, 2021 at 2:03 AM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
> Ferry Toth wrote:
> > Op 17-04-2021 om 16:22 schreef Ferry Toth:
> >> Op 17-04-2021 om 04:27 schreef Thinh Nguyen:
> >>> Ferry Toth wrote:
> >>>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
> >>>>> Thinh Nguyen wrote:

> > On the PC side this resulted to:
> >
> > apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device
> > number 12 using xhci_hcd
> > apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found,
> > idVendor=1d6b, idProduct=0104, bcdDevice= 1.00
> > apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings: Mfr=1,
> > Product=2, SerialNumber=3
> > apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget
> > apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory
> > apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber: 0123456789abcdef
> > apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error -110
> >
> > Thanks for all your help!
>
> Looks like it's LPM related again. To confirm, try this:
> Disable LPM with this property "snps,usb2-gadget-lpm-disable"
> (Note that it's not the same as "snps,dis_enblslpm_quirk")
>
> Make sure that your testing kernel has this patch [1]
> 475e8be53d04 ("usb: dwc3: gadget: Check for disabled LPM quirk")

Thinh, Ferry, I'm a bit lost in this thread. Can you summarize what
patches I have to apply on top of v5.12-rc8 to mitigate issues,
mentioned in this thread?

(Sounds to me there are like ~5 patches floating around)

I'll try to find time to test on my side.

> [1] https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=475e8be53d0496f9bc6159f4abb3ff5f9b90e8de
Felipe Balbi April 19, 2021, 9:47 a.m. UTC | #17
Hi,

Thinh Nguyen <Thinh.Nguyen@synopsys.com> writes:
>> (Issue history:
>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/issues/31__;!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXNc7KgAw$
>> )
>> 
>> On the PC side this resulted to:
>> 
>> apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device
>> number 12 using xhci_hcd
>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found,
>> idVendor=1d6b, idProduct=0104, bcdDevice= 1.00
>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings: Mfr=1,
>> Product=2, SerialNumber=3
>> apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget
>> apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory
>> apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber: 0123456789abcdef
>> apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error -110
>> 
>> 
>> Thanks for all your help!
>> 
>
> Looks like it's LPM related again. To confirm, try this:
> Disable LPM with this property "snps,usb2-gadget-lpm-disable"
> (Note that it's not the same as "snps,dis_enblslpm_quirk")
>
> Make sure that your testing kernel has this patch [1]
> 475e8be53d04 ("usb: dwc3: gadget: Check for disabled LPM quirk")
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=475e8be53d0496f9bc6159f4abb3ff5f9b90e8de
>
>
> The failure you saw was probably due the gadget function attempting
> to start a delayed status stage of the SET_CONFIGURATION request.
> By this time, the host already put the device in low power.
>
> The START_TRANSFER command needs to be executed while the device
> is on "ON" state (or U0 if eSS). We shouldn't use dwc->link_state
> to check for link state because we only enable link state change
> interrupt for some controller versions.
>
> Once you confirms disabling LPM works, try this fix:
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 6227641f2d31..06cdec79244e 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -309,10 +309,14 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd,
>  
>         if (DWC3_DEPCMD_CMD(cmd) == DWC3_DEPCMD_STARTTRANSFER) {
>                 int             needs_wakeup;
> +               u8              link_state;
>  
> -               needs_wakeup = (dwc->link_state == DWC3_LINK_STATE_U1 ||
> -                               dwc->link_state == DWC3_LINK_STATE_U2 ||
> -                               dwc->link_state == DWC3_LINK_STATE_U3);
> +               reg = dwc3_readl(dwc->regs, DWC3_DSTS);
> +               link_state = DWC3_DSTS_USBLNKST(reg);
> +
> +               needs_wakeup = (link_state == DWC3_LINK_STATE_U1 ||
> +                               link_state == DWC3_LINK_STATE_U2 ||
> +                               link_state == DWC3_LINK_STATE_U3);

this makes sense. We used to track state using the state change
interrupts, but that's long since being disabled. I think, either way,
we need this fix.
Thinh Nguyen April 19, 2021, 7:49 p.m. UTC | #18
Felipe Balbi wrote:
> 
> Hi,
> 
> John Stultz <john.stultz@linaro.org> writes:
>> On Fri, Apr 16, 2021 at 12:49 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
>>>
>>> John Stultz wrote:
>>>> On Fri, Apr 16, 2021 at 3:47 AM Felipe Balbi <balbi@kernel.org> wrote:
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> Thinh Nguyen <Thinh.Nguyen@synopsys.com> writes:
>>>>>
>>>>>> From: Yu Chen <chenyu56@huawei.com>
>>>>>> From: John Stultz <john.stultz@linaro.org>
>>>>>>
>>>>>> According to the programming guide, to switch mode for DRD controller,
>>>>>> the driver needs to do the following.
>>>>>>
>>>>>> To switch from device to host:
>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>> 2. Set GCTL.PrtCapDir(host mode)
>>>>>> 3. Reset the host with USBCMD.HCRESET
>>>>>> 4. Then follow up with the initializing host registers sequence
>>>>>>
>>>>>> To switch from host to device:
>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>> 2. Set GCTL.PrtCapDir(device mode)
>>>>>> 3. Reset the device with DCTL.CSftRst
>>>>>> 4. Then follow up with the initializing registers sequence
>>>>>>
>>>>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
>>>>>> switching from host to device. John Stult reported a lockup issue seen
>>>>>> with HiKey960 platform without these steps[1]. Similar issue is observed
>>>>>> with Ferry's testing platform[2].
>>>>>>
>>>>>> So, apply the required steps along with some fixes to Yu Chen's and John
>>>>>> Stultz's version. The main fixes to their versions are the missing wait
>>>>>> for clocks synchronization before clearing GCTL.CoreSoftReset and only
>>>>>> apply DCTL.CSftRst when switching from host to device.
>>>>>>
>>>>>> [1] https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!L4TLb25Nkq0DF2qrCPWW13PUq4idhDn5QSZhgvnVAy7wJiYFOSSouSptwo9nOzIdPD4j$
>>>>>> [2] https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!L4TLb25Nkq0DF2qrCPWW13PUq4idhDn5QSZhgvnVAy7wJiYFOSSouSptwo9nO21VT8q7$
>>>>>>
>>>>>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>>>>>> Cc: Ferry Toth <fntoth@gmail.com>
>>>>>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>>>>>> Cc: <stable@vger.kernel.org>
>>>>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
>>>>>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>>>>>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>>>>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>>>>
>>>>> I still have concerns about the soft reset, but I won't block you guys
>>>>> from fixing Hikey's problem :-)
>>>>>
>>>>> The only thing I would like to confirm is that this has been verified
>>>>> with hundreds of swaps happening as quickly as possible. DWC3 should
>>>>> still be functional after several hundred swaps.
>>>>>
>>>>> Can someone confirm this is the case? (I'm assuming this can be
>>>>> scripted)
>>>>
>>>> I unfortunately don't have an easy way to automate the switching right
>>>> off. But I'll try to hack up the mux switch driver to provide an
>>>> interface we can script against.
>>>>
>>>
>>> FYI, you can do the following:
>>>
>>> 1) Enable "usb-role-switch" DT property if not already done so
>>> 2) Add userspace control:
>>>
>>> diff --git a/drivers/usb/dwc3/drd.c b/drivers/usb/dwc3/drd.c
>>> index e2b68bb770d1..b203e3d87291 100644
>>> --- a/drivers/usb/dwc3/drd.c
>>> +++ b/drivers/usb/dwc3/drd.c
>>> @@ -555,6 +555,7 @@ static int dwc3_setup_role_switch(struct dwc3 *dwc)
>>>                 mode = DWC3_GCTL_PRTCAP_DEVICE;
>>>         }
>>>
>>> +       dwc3_role_switch.allow_userspace_control = true;
>>>         dwc3_role_switch.fwnode = dev_fwnode(dwc->dev);
>>>         dwc3_role_switch.set = dwc3_usb_role_switch_set;
>>>         dwc3_role_switch.get = dwc3_usb_role_switch_get;
>>>
>>> 3) Write a script to do the following:
>>>
>>> # echo host > /sys/class/usb_role/<UDC>/role
>>>
>>> and
>>>
>>> # echo device > /sys/class/usb_role/<UDC>/role
>>
>> Thanks so much for this. So I ran both of those commands in a while
>> loop for awhile and didn't see any trouble.
>>
>> HiKey960 is interesting as well because we have a mux switch, which is
>> sort of an intermediary roll switcher (it gets the role switch signal
>> from the tcpci_rt1711h, tweaks some gpios and then signals the dwc3).
>> So I also did the above tweaks to the mux-switch and had it switching
>> between device/none and validated the onboard hub came up and down
>> along with the dwc3 core.
>>
>> Everything still looks good here.
> 
> Sounds good, happy to see so many platforms supported by Thinh's
> change. Thanks for doing this work, Thinh :-)
> 

Thanks for the review Felipe :)

Thinh
Ferry Toth April 19, 2021, 8:15 p.m. UTC | #19
Hi

Op 19-04-2021 om 01:03 schreef Thinh Nguyen:
> Ferry Toth wrote:
>> Hi
>>
>> Op 17-04-2021 om 16:22 schreef Ferry Toth:
>>> Hi
>>>
>>> Op 17-04-2021 om 04:27 schreef Thinh Nguyen:
>>>> Ferry Toth wrote:
>>>>> Hi
>>>>>
>>>>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
>>>>>> Thinh Nguyen wrote:
>>>>>>> From: Yu Chen <chenyu56@huawei.com>
>>>>>>> From: John Stultz <john.stultz@linaro.org>
>>>>>>>
>>>>>>> According to the programming guide, to switch mode for DRD
>>>>>>> controller,
>>>>>>> the driver needs to do the following.
>>>>>>>
>>>>>>> To switch from device to host:
>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>> 2. Set GCTL.PrtCapDir(host mode)
>>>>>>> 3. Reset the host with USBCMD.HCRESET
>>>>>>> 4. Then follow up with the initializing host registers sequence
>>>>>>>
>>>>>>> To switch from host to device:
>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>> 2. Set GCTL.PrtCapDir(device mode)
>>>>>>> 3. Reset the device with DCTL.CSftRst
>>>>>>> 4. Then follow up with the initializing registers sequence
>>>>>>>
>>>>>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step
>>>>>>> 3) of
>>>>>>> switching from host to device. John Stult reported a lockup issue
>>>>>>> seen
>>>>>>> with HiKey960 platform without these steps[1]. Similar issue is
>>>>>>> observed
>>>>>>> with Ferry's testing platform[2].
>>>>>>>
>>>>>>> So, apply the required steps along with some fixes to Yu Chen's
>>>>>>> and John
>>>>>>> Stultz's version. The main fixes to their versions are the missing
>>>>>>> wait
>>>>>>> for clocks synchronization before clearing GCTL.CoreSoftReset and
>>>>>>> only
>>>>>>> apply DCTL.CSftRst when switching from host to device.
>>>>>>>
>>>>>>> [1]
>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$
>>>>>>>
>>>>>>>
>>>>>>> [2]
>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>>>>>>> Cc: Ferry Toth <fntoth@gmail.com>
>>>>>>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>>>>>>> Cc: <stable@vger.kernel.org>
>>>>>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work
>>>>>>> properly")
>>>>>>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>>>>>>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>>>>>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>>>>>> ---
>>>>>>> Changes in v3:
>>>>>>> - Check if the desired mode is OTG, then keep the old flow
>>>>>>> - Remove condition for OTG support only since the device can still be
>>>>>>>      configured DRD host/device mode only
>>>>>>> - Remove redundant hw_mode check since __dwc3_set_mode() only applies
>>>>>>> when
>>>>>>>      hw_mode is DRD
>>>>>>> Changes in v2:
>>>>>>> - Initialize mutex per device and not as global mutex.
>>>>>>> - Add additional checks for DRD only mode
>>>>>>>
>>>>>>>     drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>>>>>>>     drivers/usb/dwc3/core.h |  5 +++++
>>>>>>>     2 files changed, 32 insertions(+)
>>>>>>>
>>>>>> Hi John,
>>>>>>
>>>>>> If possible, can you run a test with this version on your platform?
>>>>>>
>>>>>> Thanks,
>>>>>> Thinh
>>>>>>
>>>>> I tested this on edison-arduino with this patch on top of usb-next
>>>>> (5.12-rc7 + "increase BESL baseline to 6" to prevent throttling").
>>>>>
>>>>> On this platform there is a physical switch to switch roles. With this
>>>>> patch I find:
>>>>>
>>>>> - switch to host mode always works fine
>>>>>
>>>>> - switch to gadget mode I need to flip the switch 3x
>>>>> (gadget-host-gadget).
>>>>>
>>>>> An error message appears on the gadget side "dwc3 dwc3.0.auto: timed
>>>>> out
>>>>> waiting for SETUP phase" appears, but then the device connects to my
>>>>> PC,
>>>>> no throttling.
>>>>>
>>>>> - alternatively I can switch to gadget 1x and then unplug/replug the
>>>>> cable.
>>>>>
>>>>> No error message and connects fine.
>>>>>
>>>>> - if I flip the switch only once, on the PC side I get:
>>>>>
>>>>>     kernel: usb 1-5: new high-speed USB device number 18 usingxhci_hcd
>>>>>     kernel: usb 1-5: New USB device found, idVendor=1d6b,
>>>>>     idProduct=0104, bcdDevice= 1.00 kernel: usb 1-5: New USB device
>>>>>     strings: Mfr=1, Product=2, SerialNumber=3 kernel: usb 1-5: Product:
>>>>>     USBArmory Gadget kernel: usb 1-5: Manufacturer: USBArmory kernel:
>>>>>     usb 1-5: SerialNumber: 0123456789abcdef kernel: usb 1-5: can't set
>>>>>     config #1, error -110
>>>> The device failed at set_configuration() request and timed out. It
>>>> probably timed out from the status stage looking at the device err
>>>> print.
>>>>
>>>>> Then if I wait long enough on the gadget side I get:
>>>>>
>>>>>     root@yuna:~# ifconfig
>>>>>
>>>>>     usb0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu 1500
>>>>>     inet 169.254.119.239 netmask 255.255.0.0 broadcast 169.254.255.255
>>>>>     inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid 0x20<link>
>>>>>     ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX packets 490424
>>>>>     bytes 735146578 (701.0 MiB) RX errors 0 dropped 191 overruns 0 frame
>>>>>     0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0 dropped 0
>>>>>     overruns 0 carrier 0 collisions 0
>>>>>
>>>>> (correct would be: inet 10.42.0.221 netmask 255.255.255.0 broadcast
>>>>> 10.42.0.255)
>>>>>
>>>>> So much improved now, but it seems I am still missing something on
>>>>> plug.
>>>>>
>>>> That's great! We can look at it further. Can you capture the tracepoints
>>>> of the issue. Also, can you try with mass_storage gadget to see if the
>>>> result is the same?
>>> I have already gser, eem, mass_storage and uac2 combo. When eem fails,
>>> the mass_storage and uac2 don't appear (on KDE you get all kind of
>>> popups when they appear).
>>>
>>> So either all works, or all fails.
>>>
>>> I'll trace this later today.
>> Trace capturing switch from host-> gadget  here
>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6329600/5.12-rc7*2Busb-next.zip__;JQ!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXLOiNwGT$
>>
>> (Issue history:
>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/issues/31__;!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXNc7KgAw$
>> )
>>
>> On the PC side this resulted to:
>>
>> apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device
>> number 12 using xhci_hcd
>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found,
>> idVendor=1d6b, idProduct=0104, bcdDevice= 1.00
>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings: Mfr=1,
>> Product=2, SerialNumber=3
>> apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget
>> apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory
>> apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber: 0123456789abcdef
>> apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error -110
>>
>>
>> Thanks for all your help!
>>
> Looks like it's LPM related again. To confirm, try this:
> Disable LPM with this property "snps,usb2-gadget-lpm-disable"
> (Note that it's not the same as "snps,dis_enblslpm_quirk")

Yes, I confirm this helps.

Note: on startup I was in host mode, with gadget cable plugged. The 
first switch to gadget didn't work, all subsequent switches did work, as 
well as unplug/plug the cable.

> Make sure that your testing kernel has this patch [1]
> 475e8be53d04 ("usb: dwc3: gadget: Check for disabled LPM quirk")
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=475e8be53d0496f9bc6159f4abb3ff5f9b90e8de
>
>
> The failure you saw was probably due the gadget function attempting
> to start a delayed status stage of the SET_CONFIGURATION request.
> By this time, the host already put the device in low power.
>
> The START_TRANSFER command needs to be executed while the device
> is on "ON" state (or U0 if eSS). We shouldn't use dwc->link_state
> to check for link state because we only enable link state change
> interrupt for some controller versions.
>
> Once you confirms disabling LPM works, try this fix:
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 6227641f2d31..06cdec79244e 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -309,10 +309,14 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd,
>   
>          if (DWC3_DEPCMD_CMD(cmd) == DWC3_DEPCMD_STARTTRANSFER) {
>                  int             needs_wakeup;
> +               u8              link_state;
>   
> -               needs_wakeup = (dwc->link_state == DWC3_LINK_STATE_U1 ||
> -                               dwc->link_state == DWC3_LINK_STATE_U2 ||
> -                               dwc->link_state == DWC3_LINK_STATE_U3);
> +               reg = dwc3_readl(dwc->regs, DWC3_DSTS);
> +               link_state = DWC3_DSTS_USBLNKST(reg);
> +
> +               needs_wakeup = (link_state == DWC3_LINK_STATE_U1 ||
> +                               link_state == DWC3_LINK_STATE_U2 ||
> +                               link_state == DWC3_LINK_STATE_U3);
>   
>                  if (unlikely(needs_wakeup)) {
>                          ret = __dwc3_gadget_wakeup(dwc);
> @@ -1989,6 +1993,8 @@ static int __dwc3_gadget_wakeup(struct dwc3 *dwc)
>          case DWC3_LINK_STATE_RESET:
>          case DWC3_LINK_STATE_RX_DET:    /* in HS, means Early Suspend */
>          case DWC3_LINK_STATE_U3:        /* in HS, means SUSPEND */
> +       case DWC3_LINK_STATE_U2:        /* in HS, means Sleep (L1) */
> +       case DWC3_LINK_STATE_U1:
>          case DWC3_LINK_STATE_RESUME:
>                  break;
>          default:
>
Same (good) result as with "snps,usb2-gadget-lpm-disable". Including 
first switch from host->gadget not working.

After a 2 - 4 minutes the connection is dropped and reconnected.

On the gadget end journal shows:

Apr 19 22:08:42 yuna systemd-networkd[507]: usb0: Lost carrier
Apr 19 22:08:42 yuna systemd-journald[417]: Forwarding to syslog missed 
1 messages.
Apr 19 22:08:42 yuna systemd-timesyncd[469]: No network connectivity, 
watching for changes.
Apr 19 22:08:42 yuna kernel: IPv6: ADDRCONF(NETDEV_CHANGE): usb0: link 
becomes ready
Apr 19 22:08:42 yuna kernel[480]: [  624.382929] IPv6: 
ADDRCONF(NETDEV_CHANGE): usb0: link becomes ready
Apr 19 22:08:42 yuna systemd-networkd[507]: usb0: Gained carrier
Apr 19 22:08:44 yuna systemd-networkd[507]: usb0: Gained IPv6LL
Apr 19 22:08:44 yuna systemd-timesyncd[469]: Network configuration 
changed, trying to establish connection.
Apr 19 22:08:57 yuna systemd-timesyncd[469]: Initial synchronization to 
time server 216.239.35.8:123 (time3.google.com).

So, drops and immediately reconnects.

>
>
> BR,
> Thinh
Ferry Toth April 19, 2021, 8:24 p.m. UTC | #20
Op 19-04-2021 om 10:43 schreef Andy Shevchenko:
> On Mon, Apr 19, 2021 at 2:03 AM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
>> Ferry Toth wrote:
>>> Op 17-04-2021 om 16:22 schreef Ferry Toth:
>>>> Op 17-04-2021 om 04:27 schreef Thinh Nguyen:
>>>>> Ferry Toth wrote:
>>>>>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
>>>>>>> Thinh Nguyen wrote:
>>> On the PC side this resulted to:
>>>
>>> apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device
>>> number 12 using xhci_hcd
>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found,
>>> idVendor=1d6b, idProduct=0104, bcdDevice= 1.00
>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings: Mfr=1,
>>> Product=2, SerialNumber=3
>>> apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget
>>> apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory
>>> apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber: 0123456789abcdef
>>> apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error -110
>>>
>>> Thanks for all your help!
>> Looks like it's LPM related again. To confirm, try this:
>> Disable LPM with this property "snps,usb2-gadget-lpm-disable"
>> (Note that it's not the same as "snps,dis_enblslpm_quirk")
>> Make sure that your testing kernel has this patch [1]
>> 475e8be53d04 ("usb: dwc3: gadget: Check for disabled LPM quirk")
> Thinh, Ferry, I'm a bit lost in this thread. Can you summarize what
> patches I have to apply on top of v5.12-rc8 to mitigate issues,
> mentioned in this thread?
>
> (Sounds to me there are like ~5 patches floating around)

I have 3 on top usb-next (-rc7)

usb: dwc3: core: Do core softreset when switch mode

usb: dwc3: gadget: START_TRANSFER command needs to be executed while the 
device is on "ON" state

usb: gadget: increase BESL baseline to 6

See here: https://github.com/htot/linux/commits/eds-acpi-5.12-rc7

>
> I'll try to find time to test on my side.
>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=475e8be53d0496f9bc6159f4abb3ff5f9b90e8de
>
Thinh Nguyen April 19, 2021, 9:23 p.m. UTC | #21
Ferry Toth wrote:
> Hi
> 
> Op 19-04-2021 om 01:03 schreef Thinh Nguyen:
>> Ferry Toth wrote:
>>> Hi
>>>
>>> Op 17-04-2021 om 16:22 schreef Ferry Toth:
>>>> Hi
>>>>
>>>> Op 17-04-2021 om 04:27 schreef Thinh Nguyen:
>>>>> Ferry Toth wrote:
>>>>>> Hi
>>>>>>
>>>>>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
>>>>>>> Thinh Nguyen wrote:
>>>>>>>> From: Yu Chen <chenyu56@huawei.com>
>>>>>>>> From: John Stultz <john.stultz@linaro.org>
>>>>>>>>
>>>>>>>> According to the programming guide, to switch mode for DRD
>>>>>>>> controller,
>>>>>>>> the driver needs to do the following.
>>>>>>>>
>>>>>>>> To switch from device to host:
>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>> 2. Set GCTL.PrtCapDir(host mode)
>>>>>>>> 3. Reset the host with USBCMD.HCRESET
>>>>>>>> 4. Then follow up with the initializing host registers sequence
>>>>>>>>
>>>>>>>> To switch from host to device:
>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>> 2. Set GCTL.PrtCapDir(device mode)
>>>>>>>> 3. Reset the device with DCTL.CSftRst
>>>>>>>> 4. Then follow up with the initializing registers sequence
>>>>>>>>
>>>>>>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step
>>>>>>>> 3) of
>>>>>>>> switching from host to device. John Stult reported a lockup issue
>>>>>>>> seen
>>>>>>>> with HiKey960 platform without these steps[1]. Similar issue is
>>>>>>>> observed
>>>>>>>> with Ferry's testing platform[2].
>>>>>>>>
>>>>>>>> So, apply the required steps along with some fixes to Yu Chen's
>>>>>>>> and John
>>>>>>>> Stultz's version. The main fixes to their versions are the missing
>>>>>>>> wait
>>>>>>>> for clocks synchronization before clearing GCTL.CoreSoftReset and
>>>>>>>> only
>>>>>>>> apply DCTL.CSftRst when switching from host to device.
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> [2]
>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>>>>>>>> Cc: Ferry Toth <fntoth@gmail.com>
>>>>>>>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>>>>>>>> Cc: <stable@vger.kernel.org>
>>>>>>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work
>>>>>>>> properly")
>>>>>>>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>>>>>>>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>>>>>>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>>>>>>> ---
>>>>>>>> Changes in v3:
>>>>>>>> - Check if the desired mode is OTG, then keep the old flow
>>>>>>>> - Remove condition for OTG support only since the device can
>>>>>>>> still be
>>>>>>>>      configured DRD host/device mode only
>>>>>>>> - Remove redundant hw_mode check since __dwc3_set_mode() only
>>>>>>>> applies
>>>>>>>> when
>>>>>>>>      hw_mode is DRD
>>>>>>>> Changes in v2:
>>>>>>>> - Initialize mutex per device and not as global mutex.
>>>>>>>> - Add additional checks for DRD only mode
>>>>>>>>
>>>>>>>>     drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>>>>>>>>     drivers/usb/dwc3/core.h |  5 +++++
>>>>>>>>     2 files changed, 32 insertions(+)
>>>>>>>>
>>>>>>> Hi John,
>>>>>>>
>>>>>>> If possible, can you run a test with this version on your platform?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Thinh
>>>>>>>
>>>>>> I tested this on edison-arduino with this patch on top of usb-next
>>>>>> (5.12-rc7 + "increase BESL baseline to 6" to prevent throttling").
>>>>>>
>>>>>> On this platform there is a physical switch to switch roles. With
>>>>>> this
>>>>>> patch I find:
>>>>>>
>>>>>> - switch to host mode always works fine
>>>>>>
>>>>>> - switch to gadget mode I need to flip the switch 3x
>>>>>> (gadget-host-gadget).
>>>>>>
>>>>>> An error message appears on the gadget side "dwc3 dwc3.0.auto: timed
>>>>>> out
>>>>>> waiting for SETUP phase" appears, but then the device connects to my
>>>>>> PC,
>>>>>> no throttling.
>>>>>>
>>>>>> - alternatively I can switch to gadget 1x and then unplug/replug the
>>>>>> cable.
>>>>>>
>>>>>> No error message and connects fine.
>>>>>>
>>>>>> - if I flip the switch only once, on the PC side I get:
>>>>>>
>>>>>>     kernel: usb 1-5: new high-speed USB device number 18
>>>>>> usingxhci_hcd
>>>>>>     kernel: usb 1-5: New USB device found, idVendor=1d6b,
>>>>>>     idProduct=0104, bcdDevice= 1.00 kernel: usb 1-5: New USB device
>>>>>>     strings: Mfr=1, Product=2, SerialNumber=3 kernel:usb 1-5:
>>>>>> Product:
>>>>>>     USBArmory Gadget kernel: usb 1-5: Manufacturer: USBArmory kernel:
>>>>>>     usb 1-5: SerialNumber: 0123456789abcdef kernel: usb 1-5: can't
>>>>>> set
>>>>>>     config #1, error -110
>>>>> The device failed at set_configuration() request and timed out. It
>>>>> probably timed out from the status stage looking at the device err
>>>>> print.
>>>>>
>>>>>> Then if I wait long enough on the gadget side I get:
>>>>>>
>>>>>>     root@yuna:~# ifconfig
>>>>>>
>>>>>>     usb0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu
>>>>>> 1500
>>>>>>     inet 169.254.119.239 netmask 255.255.0.0 broadcast
>>>>>> 169.254.255.255
>>>>>>     inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid 0x20<link>
>>>>>>     ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX packets
>>>>>> 490424
>>>>>>     bytes 735146578 (701.0 MiB) RX errors 0 dropped 191 overruns 0
>>>>>> frame
>>>>>>     0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0 dropped 0
>>>>>>     overruns 0 carrier 0 collisions 0
>>>>>>
>>>>>> (correct would be: inet 10.42.0.221 netmask 255.255.255.0 broadcast
>>>>>> 10.42.0.255)
>>>>>>
>>>>>> So much improved now, but it seems I am still missing something on
>>>>>> plug.
>>>>>>
>>>>> That's great! We can look at it further. Can you capture the
>>>>> tracepoints
>>>>> of the issue. Also, can you try with mass_storage gadget to see if the
>>>>> result is the same?
>>>> I have already gser, eem, mass_storage and uac2 combo. When eem fails,
>>>> the mass_storage and uac2 don't appear (on KDE you get all kind of
>>>> popups when they appear).
>>>>
>>>> So either all works, or all fails.
>>>>
>>>> I'll trace this later today.
>>> Trace capturing switch from host-> gadget  here
>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6329600/5.12-rc7*2Busb-next.zip__;JQ!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXLOiNwGT$
>>>
>>>
>>> (Issue history:
>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/issues/31__;!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXNc7KgAw$
>>>
>>> )
>>>
>>> On the PC side this resulted to:
>>>
>>> apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device
>>> number 12 using xhci_hcd
>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found,
>>> idVendor=1d6b, idProduct=0104, bcdDevice= 1.00
>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings: Mfr=1,
>>> Product=2, SerialNumber=3
>>> apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget
>>> apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory
>>> apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber: 0123456789abcdef
>>> apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error -110
>>>
>>>
>>> Thanks for all your help!
>>>
>> Looks like it's LPM related again. To confirm, try this:
>> Disable LPM with this property "snps,usb2-gadget-lpm-disable"
>> (Note that it's not the same as "snps,dis_enblslpm_quirk")
> 
> Yes, I confirm this helps.
> 
> Note: on startup I was in host mode, with gadget cable plugged. The
> first switch to gadget didn't work, all subsequent switches did work, as
> well as unplug/plug the cable.
> 
>> Make sure that your testing kernel has this patch [1]
>> 475e8be53d04 ("usb: dwc3: gadget: Check for disabled LPM quirk")
>>
>> [1]
>> https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=475e8be53d0496f9bc6159f4abb3ff5f9b90e8de__;!!A4F2R9G_pg!Mvz1Am6Ka_pOBfD0TmsA3821I05Ti8stMgh5r4XzMwZ9dy1Wan-il-DB4h50DmbaU4Zw$
>>
>>
>> The failure you saw was probably due the gadget function attempting
>> to start a delayed status stage of the SET_CONFIGURATION request.
>> By this time, the host already put the device in low power.
>>
>> The START_TRANSFER command needs to be executed while the device
>> is on "ON" state (or U0 if eSS). We shouldn't use dwc->link_state
>> to check for link state because we only enable link state change
>> interrupt for some controller versions.
>>
>> Once you confirms disabling LPM works, try this fix:
>>
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index 6227641f2d31..06cdec79244e 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -309,10 +309,14 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep,
>> unsigned int cmd,
>>            if (DWC3_DEPCMD_CMD(cmd) == DWC3_DEPCMD_STARTTRANSFER) {
>>                  int             needs_wakeup;
>> +               u8              link_state;
>>   -               needs_wakeup = (dwc->link_state ==
>> DWC3_LINK_STATE_U1 ||
>> -                               dwc->link_state == DWC3_LINK_STATE_U2||
>> -                               dwc->link_state == DWC3_LINK_STATE_U3);
>> +               reg = dwc3_readl(dwc->regs, DWC3_DSTS);
>> +               link_state = DWC3_DSTS_USBLNKST(reg);
>> +
>> +               needs_wakeup = (link_state == DWC3_LINK_STATE_U1 ||
>> +                               link_state == DWC3_LINK_STATE_U2 ||
>> +                               link_state == DWC3_LINK_STATE_U3);
>>                    if (unlikely(needs_wakeup)) {
>>                          ret = __dwc3_gadget_wakeup(dwc);
>> @@ -1989,6 +1993,8 @@ static int __dwc3_gadget_wakeup(struct dwc3 *dwc)
>>          case DWC3_LINK_STATE_RESET:
>>          case DWC3_LINK_STATE_RX_DET:    /* in HS, means Early Suspend */
>>          case DWC3_LINK_STATE_U3:        /* in HS, means SUSPEND */
>> +       case DWC3_LINK_STATE_U2:        /* in HS, means Sleep (L1) */
>> +       case DWC3_LINK_STATE_U1:
>>          case DWC3_LINK_STATE_RESUME:
>>                  break;
>>          default:
>>
> Same (good) result as with "snps,usb2-gadget-lpm-disable". Including
> first switch from host->gadget not working.
> 

Great! Not sure why the first switch is not working, but it seems like
we were able to eliminate quite a few issues. If you have more dwc3
tracepoints, we can take a look further.

> After a 2 - 4 minutes the connection is dropped and reconnected.

Does this occur with LPM disabled also? We can review this issue further
with more dwc3 tracepoints.

> 
> On the gadget end journal shows:
> 
> Apr 19 22:08:42 yuna systemd-networkd[507]: usb0: Lost carrier
> Apr 19 22:08:42 yuna systemd-journald[417]: Forwarding to syslog missed
> 1 messages.
> Apr 19 22:08:42 yuna systemd-timesyncd[469]: No network connectivity,
> watching for changes.
> Apr 19 22:08:42 yuna kernel: IPv6: ADDRCONF(NETDEV_CHANGE): usb0: link
> becomes ready
> Apr 19 22:08:42 yuna kernel[480]: [  624.382929] IPv6:
> ADDRCONF(NETDEV_CHANGE): usb0: link becomes ready
> Apr 19 22:08:42 yuna systemd-networkd[507]: usb0: Gained carrier
> Apr 19 22:08:44 yuna systemd-networkd[507]: usb0: Gained IPv6LL
> Apr 19 22:08:44 yuna systemd-timesyncd[469]: Network configuration
> changed, trying to establish connection.
> Apr 19 22:08:57 yuna systemd-timesyncd[469]: Initial synchronization to
> time server 216.239.35.8:123 (time3.google.com).
> 
> So, drops and immediately reconnects.
> 

From the look at the log here, it seems to be a reset from host (and an
issue at the protocol level) unrelated to dwc3 driver or the controller.
Hopefully and maybe we can get more clues from dwc3 tracepoints.

Thanks,
Thinh
Wesley Cheng April 19, 2021, 9:26 p.m. UTC | #22
On 4/15/2021 3:20 PM, Thinh Nguyen wrote:
> From: Yu Chen <chenyu56@huawei.com>
> From: John Stultz <john.stultz@linaro.org>
> 
> According to the programming guide, to switch mode for DRD controller,
> the driver needs to do the following.
> 
> To switch from device to host:
> 1. Reset controller with GCTL.CoreSoftReset
> 2. Set GCTL.PrtCapDir(host mode)
> 3. Reset the host with USBCMD.HCRESET
> 4. Then follow up with the initializing host registers sequence
> 
> To switch from host to device:
> 1. Reset controller with GCTL.CoreSoftReset
> 2. Set GCTL.PrtCapDir(device mode)
> 3. Reset the device with DCTL.CSftRst
> 4. Then follow up with the initializing registers sequence
> 
> Currently we're missing step 1) to do GCTL.CoreSoftReset and step 3) of
> switching from host to device. John Stult reported a lockup issue seen
> with HiKey960 platform without these steps[1]. Similar issue is observed
> with Ferry's testing platform[2].
> 
> So, apply the required steps along with some fixes to Yu Chen's and John
> Stultz's version. The main fixes to their versions are the missing wait
> for clocks synchronization before clearing GCTL.CoreSoftReset and only
> apply DCTL.CSftRst when switching from host to device.
> 
> [1] https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/
> [2] https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/
> 
> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
> Cc: Ferry Toth <fntoth@gmail.com>
> Cc: Wesley Cheng <wcheng@codeaurora.org>
> Cc: <stable@vger.kernel.org>
> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
> Signed-off-by: Yu Chen <chenyu56@huawei.com>
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
> ---
> Changes in v3:
> - Check if the desired mode is OTG, then keep the old flow
> - Remove condition for OTG support only since the device can still be
>   configured DRD host/device mode only
> - Remove redundant hw_mode check since __dwc3_set_mode() only applies when
>   hw_mode is DRD
> Changes in v2:
> - Initialize mutex per device and not as global mutex.
> - Add additional checks for DRD only mode
> 
>  drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>  drivers/usb/dwc3/core.h |  5 +++++
>  2 files changed, 32 insertions(+)
> 
> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> index 5c25e6a72dbd..2f118ad43571 100644
> --- a/drivers/usb/dwc3/core.c
> +++ b/drivers/usb/dwc3/core.c
> @@ -114,6 +114,8 @@ void dwc3_set_prtcap(struct dwc3 *dwc, u32 mode)
>  	dwc->current_dr_role = mode;
>  }
>  
> +static int dwc3_core_soft_reset(struct dwc3 *dwc);
> +
>  static void __dwc3_set_mode(struct work_struct *work)
>  {
>  	struct dwc3 *dwc = work_to_dwc(work);
> @@ -121,6 +123,8 @@ static void __dwc3_set_mode(struct work_struct *work)
>  	int ret;
>  	u32 reg;
>  
> +	mutex_lock(&dwc->mutex);
> +
>  	pm_runtime_get_sync(dwc->dev);
>  
>  	if (dwc->current_dr_role == DWC3_GCTL_PRTCAP_OTG)
> @@ -154,6 +158,25 @@ static void __dwc3_set_mode(struct work_struct *work)
>  		break;
>  	}
>  
> +	/* For DRD host or device mode only */
> +	if (dwc->desired_dr_role != DWC3_GCTL_PRTCAP_OTG) {
> +		reg = dwc3_readl(dwc->regs, DWC3_GCTL);
> +		reg |= DWC3_GCTL_CORESOFTRESET;
> +		dwc3_writel(dwc->regs, DWC3_GCTL, reg);
> +
> +		/*
> +		 * Wait for internal clocks to synchronized. DWC_usb31 and
> +		 * DWC_usb32 may need at least 50ms (less for DWC_usb3). To
> +		 * keep it consistent across different IPs, let's wait up to
> +		 * 100ms before clearing GCTL.CORESOFTRESET.
> +		 */
> +		msleep(100);
> +
> +		reg = dwc3_readl(dwc->regs, DWC3_GCTL);
> +		reg &= ~DWC3_GCTL_CORESOFTRESET;
> +		dwc3_writel(dwc->regs, DWC3_GCTL, reg);
> +	}
> +
>  	spin_lock_irqsave(&dwc->lock, flags);
>  
>  	dwc3_set_prtcap(dwc, dwc->desired_dr_role);
> @@ -178,6 +201,8 @@ static void __dwc3_set_mode(struct work_struct *work)
>  		}
>  		break;
>  	case DWC3_GCTL_PRTCAP_DEVICE:
> +		dwc3_core_soft_reset(dwc);
> +
>  		dwc3_event_buffers_setup(dwc);
>  
>  		if (dwc->usb2_phy)
> @@ -200,6 +225,7 @@ static void __dwc3_set_mode(struct work_struct *work)
>  out:
>  	pm_runtime_mark_last_busy(dwc->dev);
>  	pm_runtime_put_autosuspend(dwc->dev);
> +	mutex_unlock(&dwc->mutex);
>  }
>  
>  void dwc3_set_mode(struct dwc3 *dwc, u32 mode)
> @@ -1553,6 +1579,7 @@ static int dwc3_probe(struct platform_device *pdev)
>  	dwc3_cache_hwparams(dwc);
>  
>  	spin_lock_init(&dwc->lock);
> +	mutex_init(&dwc->mutex);
>  
>  	pm_runtime_set_active(dev);
>  	pm_runtime_use_autosuspend(dev);
> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
> index 695ff2d791e4..7e3afa5378e8 100644
> --- a/drivers/usb/dwc3/core.h
> +++ b/drivers/usb/dwc3/core.h
> @@ -13,6 +13,7 @@
>  
>  #include <linux/device.h>
>  #include <linux/spinlock.h>
> +#include <linux/mutex.h>
>  #include <linux/ioport.h>
>  #include <linux/list.h>
>  #include <linux/bitops.h>
> @@ -947,6 +948,7 @@ struct dwc3_scratchpad_array {
>   * @scratch_addr: dma address of scratchbuf
>   * @ep0_in_setup: one control transfer is completed and enter setup phase
>   * @lock: for synchronizing
> + * @mutex: for mode switching
>   * @dev: pointer to our struct device
>   * @sysdev: pointer to the DMA-capable device
>   * @xhci: pointer to our xHCI child
> @@ -1088,6 +1090,9 @@ struct dwc3 {
>  	/* device lock */
>  	spinlock_t		lock;
>  
> +	/* mode switching lock */
> +	struct mutex		mutex;
> +
>  	struct device		*dev;
>  	struct device		*sysdev;
>  
> 
> base-commit: 4b853c236c7b5161a2e444bd8b3c76fe5aa5ddcb
> 
Hi Thinh,

Thanks for this change! Tested this on our platform w/ a DRD mode switch
loop and it looks good.

Tested-by: Wesley Cheng <wcheng@codeaurora.org>

Thanks
Wesley Cheng
Ferry Toth April 20, 2021, 7:55 p.m. UTC | #23
Hi

Op 19-04-2021 om 23:23 schreef Thinh Nguyen:
> Ferry Toth wrote:
>> Hi
>>
>> Op 19-04-2021 om 01:03 schreef Thinh Nguyen:
>>> Ferry Toth wrote:
>>>> Hi
>>>>
>>>> Op 17-04-2021 om 16:22 schreef Ferry Toth:
>>>>> Hi
>>>>>
>>>>> Op 17-04-2021 om 04:27 schreef Thinh Nguyen:
>>>>>> Ferry Toth wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
>>>>>>>> Thinh Nguyen wrote:
>>>>>>>>> From: Yu Chen <chenyu56@huawei.com>
>>>>>>>>> From: John Stultz <john.stultz@linaro.org>
>>>>>>>>>
>>>>>>>>> According to the programming guide, to switch mode for DRD
>>>>>>>>> controller,
>>>>>>>>> the driver needs to do the following.
>>>>>>>>>
>>>>>>>>> To switch from device to host:
>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>>> 2. Set GCTL.PrtCapDir(host mode)
>>>>>>>>> 3. Reset the host with USBCMD.HCRESET
>>>>>>>>> 4. Then follow up with the initializing host registers sequence
>>>>>>>>>
>>>>>>>>> To switch from host to device:
>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>>> 2. Set GCTL.PrtCapDir(device mode)
>>>>>>>>> 3. Reset the device with DCTL.CSftRst
>>>>>>>>> 4. Then follow up with the initializing registers sequence
>>>>>>>>>
>>>>>>>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step
>>>>>>>>> 3) of
>>>>>>>>> switching from host to device. John Stult reported a lockup issue
>>>>>>>>> seen
>>>>>>>>> with HiKey960 platform without these steps[1]. Similar issue is
>>>>>>>>> observed
>>>>>>>>> with Ferry's testing platform[2].
>>>>>>>>>
>>>>>>>>> So, apply the required steps along with some fixes to Yu Chen's
>>>>>>>>> and John
>>>>>>>>> Stultz's version. The main fixes to their versions are the missing
>>>>>>>>> wait
>>>>>>>>> for clocks synchronization before clearing GCTL.CoreSoftReset and
>>>>>>>>> only
>>>>>>>>> apply DCTL.CSftRst when switching from host to device.
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [2]
>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>>>>>>>>> Cc: Ferry Toth <fntoth@gmail.com>
>>>>>>>>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>>>>>>>>> Cc: <stable@vger.kernel.org>
>>>>>>>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work
>>>>>>>>> properly")
>>>>>>>>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>>>>>>>>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>>>>>>>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>>>>>>>> ---
>>>>>>>>> Changes in v3:
>>>>>>>>> - Check if the desired mode is OTG, then keep the old flow
>>>>>>>>> - Remove condition for OTG support only since the device can
>>>>>>>>> still be
>>>>>>>>>       configured DRD host/device mode only
>>>>>>>>> - Remove redundant hw_mode check since __dwc3_set_mode() only
>>>>>>>>> applies
>>>>>>>>> when
>>>>>>>>>       hw_mode is DRD
>>>>>>>>> Changes in v2:
>>>>>>>>> - Initialize mutex per device and not as global mutex.
>>>>>>>>> - Add additional checks for DRD only mode
>>>>>>>>>
>>>>>>>>>      drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>>>>>>>>>      drivers/usb/dwc3/core.h |  5 +++++
>>>>>>>>>      2 files changed, 32 insertions(+)
>>>>>>>>>
>>>>>>>> Hi John,
>>>>>>>>
>>>>>>>> If possible, can you run a test with this version on your platform?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Thinh
>>>>>>>>
>>>>>>> I tested this on edison-arduino with this patch on top of usb-next
>>>>>>> (5.12-rc7 + "increase BESL baseline to 6" to prevent throttling").
>>>>>>>
>>>>>>> On this platform there is a physical switch to switch roles. With
>>>>>>> this
>>>>>>> patch I find:
>>>>>>>
>>>>>>> - switch to host mode always works fine
>>>>>>>
>>>>>>> - switch to gadget mode I need to flip the switch 3x
>>>>>>> (gadget-host-gadget).
>>>>>>>
>>>>>>> An error message appears on the gadget side "dwc3 dwc3.0.auto: timed
>>>>>>> out
>>>>>>> waiting for SETUP phase" appears, but then the device connects to my
>>>>>>> PC,
>>>>>>> no throttling.
>>>>>>>
>>>>>>> - alternatively I can switch to gadget 1x and then unplug/replug the
>>>>>>> cable.
>>>>>>>
>>>>>>> No error message and connects fine.
>>>>>>>
>>>>>>> - if I flip the switch only once, on the PC side I get:
>>>>>>>
>>>>>>>      kernel: usb 1-5: new high-speed USB device number 18
>>>>>>> usingxhci_hcd
>>>>>>>      kernel: usb 1-5: New USB device found, idVendor=1d6b,
>>>>>>>      idProduct=0104, bcdDevice= 1.00 kernel: usb 1-5: New USB device
>>>>>>>      strings: Mfr=1, Product=2, SerialNumber=3 kernel:usb 1-5:
>>>>>>> Product:
>>>>>>>      USBArmory Gadget kernel: usb 1-5: Manufacturer: USBArmory kernel:
>>>>>>>      usb 1-5: SerialNumber: 0123456789abcdef kernel: usb 1-5: can't
>>>>>>> set
>>>>>>>      config #1, error -110
>>>>>> The device failed at set_configuration() request and timed out. It
>>>>>> probably timed out from the status stage looking at the device err
>>>>>> print.
>>>>>>
>>>>>>> Then if I wait long enough on the gadget side I get:
>>>>>>>
>>>>>>>      root@yuna:~# ifconfig
>>>>>>>
>>>>>>>      usb0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu
>>>>>>> 1500
>>>>>>>      inet 169.254.119.239 netmask 255.255.0.0 broadcast
>>>>>>> 169.254.255.255
>>>>>>>      inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid 0x20<link>
>>>>>>>      ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX packets
>>>>>>> 490424
>>>>>>>      bytes 735146578 (701.0 MiB) RX errors 0 dropped 191 overruns 0
>>>>>>> frame
>>>>>>>      0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0 dropped 0
>>>>>>>      overruns 0 carrier 0 collisions 0
>>>>>>>
>>>>>>> (correct would be: inet 10.42.0.221 netmask 255.255.255.0 broadcast
>>>>>>> 10.42.0.255)
>>>>>>>
>>>>>>> So much improved now, but it seems I am still missing something on
>>>>>>> plug.
>>>>>>>
>>>>>> That's great! We can look at it further. Can you capture the
>>>>>> tracepoints
>>>>>> of the issue. Also, can you try with mass_storage gadget to see if the
>>>>>> result is the same?
>>>>> I have already gser, eem, mass_storage and uac2 combo. When eem fails,
>>>>> the mass_storage and uac2 don't appear (on KDE you get all kind of
>>>>> popups when they appear).
>>>>>
>>>>> So either all works, or all fails.
>>>>>
>>>>> I'll trace this later today.
>>>> Trace capturing switch from host-> gadget  here
>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6329600/5.12-rc7*2Busb-next.zip__;JQ!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXLOiNwGT$
>>>>
>>>>
>>>> (Issue history:
>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/issues/31__;!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXNc7KgAw$
>>>>
>>>> )
>>>>
>>>> On the PC side this resulted to:
>>>>
>>>> apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device
>>>> number 12 using xhci_hcd
>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found,
>>>> idVendor=1d6b, idProduct=0104, bcdDevice= 1.00
>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings: Mfr=1,
>>>> Product=2, SerialNumber=3
>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget
>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory
>>>> apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber: 0123456789abcdef
>>>> apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error -110
>>>>
>>>>
>>>> Thanks for all your help!
>>>>
>>> Looks like it's LPM related again. To confirm, try this:
>>> Disable LPM with this property "snps,usb2-gadget-lpm-disable"
>>> (Note that it's not the same as "snps,dis_enblslpm_quirk")
>> Yes, I confirm this helps.
>>
>> Note: on startup I was in host mode, with gadget cable plugged. The
>> first switch to gadget didn't work, all subsequent switches did work, as
>> well as unplug/plug the cable.
>>
>>> Make sure that your testing kernel has this patch [1]
>>> 475e8be53d04 ("usb: dwc3: gadget: Check for disabled LPM quirk")
>>>
>>> [1]
>>> https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=475e8be53d0496f9bc6159f4abb3ff5f9b90e8de__;!!A4F2R9G_pg!Mvz1Am6Ka_pOBfD0TmsA3821I05Ti8stMgh5r4XzMwZ9dy1Wan-il-DB4h50DmbaU4Zw$
>>>
>>>
>>> The failure you saw was probably due the gadget function attempting
>>> to start a delayed status stage of the SET_CONFIGURATION request.
>>> By this time, the host already put the device in low power.
>>>
>>> The START_TRANSFER command needs to be executed while the device
>>> is on "ON" state (or U0 if eSS). We shouldn't use dwc->link_state
>>> to check for link state because we only enable link state change
>>> interrupt for some controller versions.
>>>
>>> Once you confirms disabling LPM works, try this fix:
>>>
>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>> index 6227641f2d31..06cdec79244e 100644
>>> --- a/drivers/usb/dwc3/gadget.c
>>> +++ b/drivers/usb/dwc3/gadget.c
>>> @@ -309,10 +309,14 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep,
>>> unsigned int cmd,
>>>             if (DWC3_DEPCMD_CMD(cmd) == DWC3_DEPCMD_STARTTRANSFER) {
>>>                   int             needs_wakeup;
>>> +               u8              link_state;
>>>    -               needs_wakeup = (dwc->link_state ==
>>> DWC3_LINK_STATE_U1 ||
>>> -                               dwc->link_state == DWC3_LINK_STATE_U2||
>>> -                               dwc->link_state == DWC3_LINK_STATE_U3);
>>> +               reg = dwc3_readl(dwc->regs, DWC3_DSTS);
>>> +               link_state = DWC3_DSTS_USBLNKST(reg);
>>> +
>>> +               needs_wakeup = (link_state == DWC3_LINK_STATE_U1 ||
>>> +                               link_state == DWC3_LINK_STATE_U2 ||
>>> +                               link_state == DWC3_LINK_STATE_U3);
>>>                     if (unlikely(needs_wakeup)) {
>>>                           ret = __dwc3_gadget_wakeup(dwc);
>>> @@ -1989,6 +1993,8 @@ static int __dwc3_gadget_wakeup(struct dwc3 *dwc)
>>>           case DWC3_LINK_STATE_RESET:
>>>           case DWC3_LINK_STATE_RX_DET:    /* in HS, means Early Suspend */
>>>           case DWC3_LINK_STATE_U3:        /* in HS, means SUSPEND */
>>> +       case DWC3_LINK_STATE_U2:        /* in HS, means Sleep (L1) */
>>> +       case DWC3_LINK_STATE_U1:
>>>           case DWC3_LINK_STATE_RESUME:
>>>                   break;
>>>           default:
>>>
>> Same (good) result as with "snps,usb2-gadget-lpm-disable". Including
>> first switch from host->gadget not working.
>>
> Great! Not sure why the first switch is not working, but it seems like
> we were able to eliminate quite a few issues. If you have more dwc3
> tracepoints, we can take a look further.

I traced but the file is empty. I captured the registers as well. The 
zip file is here:

https://github.com/andy-shev/linux/files/6346271/first-switch.zip

I found the gadget configuration script was not called, which normally 
gets called due to a udev rule:

ACTION=="add", KERNEL=="dwc3.0.auto", SUBSYSTEMS=="udc", 
ATTRS{state}=="not attached", RUN+="/usr/bin/conf-gadget.sh"

So I retried and see  with ~# udevadm monitor:

# flipping the switch from host->gadget

KERNEL[51.824914] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/enp0s17u1u1/queues/rx-0 
(queues)
KERNEL[51.825682] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/enp0s17u1u1/queues/tx-0 
(queues)
KERNEL[51.826226] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/enp0s17u1u1 
(net)
KERNEL[51.836041] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:003/usb-001:003:01 
(mdio_bus)
KERNEL[51.836709] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:003/usb-001:003:01 
(mdio_bus)
KERNEL[51.837342] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:003 
(mdio_bus)
KERNEL[51.837763] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0 
(usb)
KERNEL[51.838116] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0 
(usb)
KERNEL[51.873712] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1 
(usb)
KERNEL[51.874000] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1 
(usb)
KERNEL[51.874207] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0 
(usb)
KERNEL[51.874431] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0 
(usb)
KERNEL[51.897175] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
KERNEL[51.897486] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)

# stopped capture tracepoints here, then switch back to host

KERNEL[253.214406] add 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
KERNEL[253.263305] change 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
KERNEL[253.263687] add 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0 
(usb)
KERNEL[253.328354] bind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0 
(usb)
KERNEL[253.328734] bind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
KERNEL[253.699341] add 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1 
(usb)
KERNEL[253.744911] change 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1 
(usb)
KERNEL[253.745804] add 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0 
(usb)
KERNEL[253.805307] add 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005 
(mdio_bus)
KERNEL[253.812978] add 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01 
(mdio_bus)
KERNEL[253.814318] bind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01 
(mdio_bus)
KERNEL[253.815386] add 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0 
(net)
KERNEL[253.815552] add 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/rx-0 
(queues)
KERNEL[253.815778] add 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/tx-0 
(queues)
KERNEL[253.825279] bind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0 
(usb)
KERNEL[253.825667] bind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1 
(usb)

# switch to gadget again

KERNEL[314.212144] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/rx-0 
(queues)
KERNEL[314.212473] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/tx-0 
(queues)
KERNEL[314.214691] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0 
(net)

# extcon event didn't show the first time

KERNEL[314.238385] change 
/devices/pci0000:00/0000:00:13.0/INTC100E:00/mrfld_bcove_pwrsrc/extcon/extcon0 
(extcon)
KERNEL[314.238677] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01 
(mdio_bus)
KERNEL[314.238863] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01 
(mdio_bus)
KERNEL[314.239015] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005 
(mdio_bus)
KERNEL[314.239205] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0 
(usb)
KERNEL[314.239429] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0 
(usb)
KERNEL[314.239666] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2/2-0:1.0 
(usb)
KERNEL[314.239933] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2/2-0:1.0 
(usb)
KERNEL[314.262713] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2 (usb)
KERNEL[314.263030] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1 
(usb)
KERNEL[314.263298] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2 (usb)
KERNEL[314.263569] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1 
(usb)
KERNEL[314.263815] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0 
(usb)
KERNEL[314.264042] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0 
(usb)
KERNEL[314.264753] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usbmon/usbmon2 
(usbmon)
KERNEL[314.265019] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
KERNEL[314.265289] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
KERNEL[314.288792] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-0:1.0 
(usb)
KERNEL[314.289057] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-0:1.0 
(usb)
KERNEL[314.289327] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1 (usb)
KERNEL[314.289661] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1 (usb)
KERNEL[314.647375] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usbmon/usbmon1 
(usbmon)
KERNEL[314.647816] unbind 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto (platform)
KERNEL[314.648143] remove   /kernel/software_nodes/node1 (software_nodes)
KERNEL[314.648672] remove 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto (platform)

# here is the event we were waiting for

KERNEL[314.649158] add 
/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/udc/dwc3.0.auto (udc)

# after this gadget devices appear normally

Maybe this issue is due to extcon missing the event?

>> After a 2 - 4 minutes the connection is dropped and reconnected.
> Does this occur with LPM disabled also? We can review this issue further
> with more dwc3 tracepoints.

I captured connection dropping and reconnecting in this fairly long 
trace near the end of the file:

https://github.com/andy-shev/linux/files/6346323/lost-connection.zip


>
>> On the gadget end journal shows:
>>
>> Apr 19 22:08:42 yuna systemd-networkd[507]: usb0: Lost carrier
>> Apr 19 22:08:42 yuna systemd-journald[417]: Forwarding to syslog missed
>> 1 messages.
>> Apr 19 22:08:42 yuna systemd-timesyncd[469]: No network connectivity,
>> watching for changes.
>> Apr 19 22:08:42 yuna kernel: IPv6: ADDRCONF(NETDEV_CHANGE): usb0: link
>> becomes ready
>> Apr 19 22:08:42 yuna kernel[480]: [  624.382929] IPv6:
>> ADDRCONF(NETDEV_CHANGE): usb0: link becomes ready
>> Apr 19 22:08:42 yuna systemd-networkd[507]: usb0: Gained carrier
>> Apr 19 22:08:44 yuna systemd-networkd[507]: usb0: Gained IPv6LL
>> Apr 19 22:08:44 yuna systemd-timesyncd[469]: Network configuration
>> changed, trying to establish connection.
>> Apr 19 22:08:57 yuna systemd-timesyncd[469]: Initial synchronization to
>> time server 216.239.35.8:123 (time3.google.com).
>>
>> So, drops and immediately reconnects.
>>
>  From the look at the log here, it seems to be a reset from host (and an
> issue at the protocol level) unrelated to dwc3 driver or the controller.
> Hopefully and maybe we can get more clues from dwc3 tracepoints.
>
> Thanks,
> Thinh
Thinh Nguyen April 21, 2021, 7:01 p.m. UTC | #24
Ferry Toth wrote:
> Hi
> 
> Op 19-04-2021 om 23:23 schreef Thinh Nguyen:
>> Ferry Toth wrote:
>>> Hi
>>>
>>> Op 19-04-2021 om 01:03 schreef Thinh Nguyen:
>>>> Ferry Toth wrote:
>>>>> Hi
>>>>>
>>>>> Op 17-04-2021 om 16:22 schreef Ferry Toth:
>>>>>> Hi
>>>>>>
>>>>>> Op 17-04-2021 om 04:27 schreef Thinh Nguyen:
>>>>>>> Ferry Toth wrote:
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
>>>>>>>>> Thinh Nguyen wrote:
>>>>>>>>>> From: Yu Chen <chenyu56@huawei.com>
>>>>>>>>>> From: John Stultz <john.stultz@linaro.org>
>>>>>>>>>>
>>>>>>>>>> According to the programming guide, to switch mode for DRD
>>>>>>>>>> controller,
>>>>>>>>>> the driver needs to do the following.
>>>>>>>>>>
>>>>>>>>>> To switch from device to host:
>>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>>>> 2. Set GCTL.PrtCapDir(host mode)
>>>>>>>>>> 3. Reset the host with USBCMD.HCRESET
>>>>>>>>>> 4. Then follow up with the initializing host registers sequence
>>>>>>>>>>
>>>>>>>>>> To switch from host to device:
>>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>>>> 2. Set GCTL.PrtCapDir(device mode)
>>>>>>>>>> 3. Reset the device with DCTL.CSftRst
>>>>>>>>>> 4. Then follow up with the initializing registers sequence
>>>>>>>>>>
>>>>>>>>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step
>>>>>>>>>> 3) of
>>>>>>>>>> switching from host to device. John Stult reported a lockup issue
>>>>>>>>>> seen
>>>>>>>>>> with HiKey960 platform without these steps[1]. Similar issue is
>>>>>>>>>> observed
>>>>>>>>>> with Ferry's testing platform[2].
>>>>>>>>>>
>>>>>>>>>> So, apply the required steps along with some fixes to Yu Chen's
>>>>>>>>>> and John
>>>>>>>>>> Stultz's version. The main fixes to their versions are the
>>>>>>>>>> missing
>>>>>>>>>> wait
>>>>>>>>>> for clocks synchronization before clearing GCTL.CoreSoftReset and
>>>>>>>>>> only
>>>>>>>>>> apply DCTL.CSftRst when switching from host to device.
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [2]
>>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>>>>>>>>>> Cc: Ferry Toth <fntoth@gmail.com>
>>>>>>>>>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>>>>>>>>>> Cc: <stable@vger.kernel.org>
>>>>>>>>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work
>>>>>>>>>> properly")
>>>>>>>>>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>>>>>>>>>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>>>>>>>>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>>>>>>>>> ---
>>>>>>>>>> Changes in v3:
>>>>>>>>>> - Check if the desired mode is OTG, then keep the old flow
>>>>>>>>>> - Remove condition for OTG support only since the device can
>>>>>>>>>> still be
>>>>>>>>>>       configured DRD host/device mode only
>>>>>>>>>> - Remove redundant hw_mode check since __dwc3_set_mode() only
>>>>>>>>>> applies
>>>>>>>>>> when
>>>>>>>>>>       hw_mode is DRD
>>>>>>>>>> Changes in v2:
>>>>>>>>>> - Initialize mutex per device and not as global mutex.
>>>>>>>>>> - Add additional checks for DRD only mode
>>>>>>>>>>
>>>>>>>>>>      drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>>>>>>>>>>      drivers/usb/dwc3/core.h |  5 +++++
>>>>>>>>>>      2 files changed, 32 insertions(+)
>>>>>>>>>>
>>>>>>>>> Hi John,
>>>>>>>>>
>>>>>>>>> If possible, can you run a test with this version on your
>>>>>>>>> platform?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Thinh
>>>>>>>>>
>>>>>>>> I tested this on edison-arduino with this patch on top of usb-next
>>>>>>>> (5.12-rc7 + "increase BESL baseline to 6" to prevent throttling").
>>>>>>>>
>>>>>>>> On this platform there is a physical switch to switch roles. With
>>>>>>>> this
>>>>>>>> patch I find:
>>>>>>>>
>>>>>>>> - switch to host mode always works fine
>>>>>>>>
>>>>>>>> - switch to gadget mode I need to flip the switch 3x
>>>>>>>> (gadget-host-gadget).
>>>>>>>>
>>>>>>>> An error message appears on the gadget side "dwc3 dwc3.0.auto:
>>>>>>>> timed
>>>>>>>> out
>>>>>>>> waiting for SETUP phase" appears, but then the device connects
>>>>>>>> to my
>>>>>>>> PC,
>>>>>>>> no throttling.
>>>>>>>>
>>>>>>>> - alternatively I can switch to gadget 1x and then unplug/replug
>>>>>>>> the
>>>>>>>> cable.
>>>>>>>>
>>>>>>>> No error message and connects fine.
>>>>>>>>
>>>>>>>> - if I flip the switch only once, on the PC side I get:
>>>>>>>>
>>>>>>>>      kernel: usb 1-5: new high-speed USB device number 18
>>>>>>>> usingxhci_hcd
>>>>>>>>      kernel: usb 1-5: New USB device found, idVendor=1d6b,
>>>>>>>>      idProduct=0104, bcdDevice= 1.00 kernel: usb1-5: New USB device
>>>>>>>>      strings: Mfr=1, Product=2, SerialNumber=3kernel:usb 1-5:
>>>>>>>> Product:
>>>>>>>>      USBArmory Gadget kernel: usb 1-5: Manufacturer:USBArmory
>>>>>>>> kernel:
>>>>>>>>      usb 1-5: SerialNumber: 0123456789abcdef kernel:usb 1-5: can't
>>>>>>>> set
>>>>>>>>      config #1, error -110
>>>>>>> The device failed at set_configuration() request and timed out. It
>>>>>>> probably timed out from the status stage looking at the device err
>>>>>>> print.
>>>>>>>
>>>>>>>> Then if I wait long enough on the gadget side I get:
>>>>>>>>
>>>>>>>>      root@yuna:~# ifconfig
>>>>>>>>
>>>>>>>>      usb0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu
>>>>>>>> 1500
>>>>>>>>      inet 169.254.119.239 netmask 255.255.0.0 broadcast
>>>>>>>> 169.254.255.255
>>>>>>>>      inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid
>>>>>>>> 0x20<link>
>>>>>>>>      ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX packets
>>>>>>>> 490424
>>>>>>>>      bytes 735146578 (701.0 MiB) RX errors 0 dropped191 overruns 0
>>>>>>>> frame
>>>>>>>>      0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0
>>>>>>>> dropped 0
>>>>>>>>      overruns 0 carrier 0 collisions 0
>>>>>>>>
>>>>>>>> (correct would be: inet 10.42.0.221 netmask 255.255.255.0 broadcast
>>>>>>>> 10.42.0.255)
>>>>>>>>
>>>>>>>> So much improved now, but it seems I am still missing something on
>>>>>>>> plug.
>>>>>>>>
>>>>>>> That's great! We can look at it further. Can you capture the
>>>>>>> tracepoints
>>>>>>> of the issue. Also, can you try with mass_storage gadget to see
>>>>>>> if the
>>>>>>> result is the same?
>>>>>> I have already gser, eem, mass_storage and uac2 combo. When eem
>>>>>> fails,
>>>>>> the mass_storage and uac2 don't appear (on KDE you get all kind of
>>>>>> popups when they appear).
>>>>>>
>>>>>> So either all works, or all fails.
>>>>>>
>>>>>> I'll trace this later today.
>>>>> Trace capturing switch from host-> gadget  here
>>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6329600/5.12-rc7*2Busb-next.zip__;JQ!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXLOiNwGT$
>>>>>
>>>>>
>>>>>
>>>>> (Issue history:
>>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/issues/31__;!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXNc7KgAw$
>>>>>
>>>>>
>>>>> )
>>>>>
>>>>> On the PC side this resulted to:
>>>>>
>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device
>>>>> number 12 using xhci_hcd
>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found,
>>>>> idVendor=1d6b, idProduct=0104, bcdDevice= 1.00
>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings:
>>>>> Mfr=1,
>>>>> Product=2, SerialNumber=3
>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget
>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory
>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber:
>>>>> 0123456789abcdef
>>>>> apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error
>>>>> -110
>>>>>
>>>>>
>>>>> Thanks for all your help!
>>>>>
>>>> Looks like it's LPM related again. To confirm, try this:
>>>> Disable LPM with this property "snps,usb2-gadget-lpm-disable"
>>>> (Note that it's not the same as "snps,dis_enblslpm_quirk")
>>> Yes, I confirm this helps.
>>>
>>> Note: on startup I was in host mode, with gadget cable plugged. The
>>> first switch to gadget didn't work, all subsequent switches did work, as
>>> well as unplug/plug the cable.
>>>
>>>> Make sure that your testing kernel has this patch [1]
>>>> 475e8be53d04 ("usb: dwc3: gadget: Check for disabled LPM quirk")
>>>>
>>>> [1]
>>>> https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=475e8be53d0496f9bc6159f4abb3ff5f9b90e8de__;!!A4F2R9G_pg!Mvz1Am6Ka_pOBfD0TmsA3821I05Ti8stMgh5r4XzMwZ9dy1Wan-il-DB4h50DmbaU4Zw$
>>>>
>>>>
>>>>
>>>> The failure you saw was probably due the gadget function attempting
>>>> to start a delayed status stage of the SET_CONFIGURATION request.
>>>> By this time, the host already put the device in low power.
>>>>
>>>> The START_TRANSFER command needs to be executed while the device
>>>> is on "ON" state (or U0 if eSS). We shouldn't use dwc->link_state
>>>> to check for link state because we only enable link state change
>>>> interrupt for some controller versions.
>>>>
>>>> Once you confirms disabling LPM works, try this fix:
>>>>
>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>> index 6227641f2d31..06cdec79244e 100644
>>>> --- a/drivers/usb/dwc3/gadget.c
>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>> @@ -309,10 +309,14 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep,
>>>> unsigned int cmd,
>>>>             if (DWC3_DEPCMD_CMD(cmd) == DWC3_DEPCMD_STARTTRANSFER) {
>>>>                   int             needs_wakeup;
>>>> +               u8              link_state;
>>>>    -               needs_wakeup = (dwc->link_state ==
>>>> DWC3_LINK_STATE_U1 ||
>>>> -                               dwc->link_state == DWC3_LINK_STATE_U2||
>>>> -                               dwc->link_state == DWC3_LINK_STATE_U3);
>>>> +               reg = dwc3_readl(dwc->regs, DWC3_DSTS);
>>>> +               link_state = DWC3_DSTS_USBLNKST(reg);
>>>> +
>>>> +               needs_wakeup = (link_state == DWC3_LINK_STATE_U1 ||
>>>> +                               link_state == DWC3_LINK_STATE_U2 ||
>>>> +                               link_state == DWC3_LINK_STATE_U3);
>>>>                     if (unlikely(needs_wakeup)) {
>>>>                          ret = __dwc3_gadget_wakeup(dwc);
>>>> @@ -1989,6 +1993,8 @@ static int __dwc3_gadget_wakeup(struct dwc3 *dwc)
>>>>           case DWC3_LINK_STATE_RESET:
>>>>           case DWC3_LINK_STATE_RX_DET:    /* in HS, means Early
>>>> Suspend */
>>>>           case DWC3_LINK_STATE_U3:        /* in HS, means SUSPEND */
>>>> +       case DWC3_LINK_STATE_U2:        /* in HS, means Sleep (L1) */
>>>> +       case DWC3_LINK_STATE_U1:
>>>>           case DWC3_LINK_STATE_RESUME:
>>>>                   break;
>>>>           default:
>>>>
>>> Same (good) result as with "snps,usb2-gadget-lpm-disable". Including
>>> first switch from host->gadget not working.
>>>
>> Great! Not sure why the first switch is not working, but it seems like
>> we were able to eliminate quite a few issues. If you have more dwc3
>> tracepoints, we can take a look further.
> 
> I traced but the file is empty. I captured the registers as well. The
> zip file is here:
> 
> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6346271/first-switch.zip__;!!A4F2R9G_pg!KAmPA0Vw1WUiSxdc5-BKNPyD0klvdr5ucZI3E_C2ojho2rNT9wzMs8HG4qCYSDx89HFE$
> 
> I found the gadget configuration script was not called, which normally
> gets called due to a udev rule:
> 
> ACTION=="add", KERNEL=="dwc3.0.auto", SUBSYSTEMS=="udc",
> ATTRS{state}=="not attached", RUN+="/usr/bin/conf-gadget.sh"
> 
> So I retried and see  with ~# udevadm monitor:
> 
> # flipping the switch from host->gadget
> 
> KERNEL[51.824914] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/enp0s17u1u1/queues/rx-0
> (queues)
> KERNEL[51.825682] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/enp0s17u1u1/queues/tx-0
> (queues)
> KERNEL[51.826226] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/enp0s17u1u1
> (net)
> KERNEL[51.836041] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:003/usb-001:003:01
> (mdio_bus)
> KERNEL[51.836709] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:003/usb-001:003:01
> (mdio_bus)
> KERNEL[51.837342] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:003
> (mdio_bus)
> KERNEL[51.837763] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
> (usb)
> KERNEL[51.838116] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
> (usb)
> KERNEL[51.873712] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
> (usb)
> KERNEL[51.874000] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
> (usb)
> KERNEL[51.874207] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
> (usb)
> KERNEL[51.874431] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
> (usb)
> KERNEL[51.897175] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
> KERNEL[51.897486] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
> 
> # stopped capture tracepoints here, then switch back to host
> 
> KERNEL[253.214406] add
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
> KERNEL[253.263305] change
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
> KERNEL[253.263687] add
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
> (usb)
> KERNEL[253.328354] bind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
> (usb)
> KERNEL[253.328734] bind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
> KERNEL[253.699341] add
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
> (usb)
> KERNEL[253.744911] change
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
> (usb)
> KERNEL[253.745804] add
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
> (usb)
> KERNEL[253.805307] add
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005
> (mdio_bus)
> KERNEL[253.812978] add
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01
> (mdio_bus)
> KERNEL[253.814318] bind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01
> (mdio_bus)
> KERNEL[253.815386] add
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0
> (net)
> KERNEL[253.815552] add
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/rx-0
> (queues)
> KERNEL[253.815778] add
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/tx-0
> (queues)
> KERNEL[253.825279] bind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
> (usb)
> KERNEL[253.825667] bind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
> (usb)
> 
> # switch to gadget again
> 
> KERNEL[314.212144] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/rx-0
> (queues)
> KERNEL[314.212473] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/tx-0
> (queues)
> KERNEL[314.214691] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0
> (net)
> 
> # extcon event didn't show the first time
> 
> KERNEL[314.238385] change
> /devices/pci0000:00/0000:00:13.0/INTC100E:00/mrfld_bcove_pwrsrc/extcon/extcon0
> (extcon)
> KERNEL[314.238677] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01
> (mdio_bus)
> KERNEL[314.238863] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01
> (mdio_bus)
> KERNEL[314.239015] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005
> (mdio_bus)
> KERNEL[314.239205] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
> (usb)
> KERNEL[314.239429] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
> (usb)
> KERNEL[314.239666] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2/2-0:1.0 (usb)
> 
> KERNEL[314.239933] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2/2-0:1.0 (usb)
> 
> KERNEL[314.262713] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2 (usb)
> KERNEL[314.263030] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
> (usb)
> KERNEL[314.263298] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2 (usb)
> KERNEL[314.263569] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
> (usb)
> KERNEL[314.263815] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
> (usb)
> KERNEL[314.264042] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
> (usb)
> KERNEL[314.264753] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usbmon/usbmon2
> (usbmon)
> KERNEL[314.265019] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
> KERNEL[314.265289] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
> KERNEL[314.288792] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-0:1.0 (usb)
> 
> KERNEL[314.289057] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-0:1.0 (usb)
> 
> KERNEL[314.289327] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1 (usb)
> KERNEL[314.289661] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1 (usb)
> KERNEL[314.647375] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usbmon/usbmon1
> (usbmon)
> KERNEL[314.647816] unbind
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto (platform)
> KERNEL[314.648143] remove   /kernel/software_nodes/node1 (software_nodes)
> KERNEL[314.648672] remove
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto (platform)
> 
> # here is the event we were waiting for
> 
> KERNEL[314.649158] add
> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/udc/dwc3.0.auto (udc)
> 
> # after this gadget devices appear normally
> 
> Maybe this issue is due to extcon missing the event?

From the info here, it doesn't look like the host platform device was
removed on the first switch. Also, as you pointed it out, the extcon
event was not shown on the first switch either. Without a notification
to switch mode, the dwc3 driver won't do anything. You need to check why
that's the case as I can't help much here.

> 
>>> After a 2 - 4 minutes the connection is dropped and reconnected.
>> Does this occur with LPM disabled also? We can review this issue further
>> with more dwc3 tracepoints.
> 
> I captured connection dropping and reconnecting in this fairly long
> trace near the end of the file:
> 
> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6346323/lost-connection.zip__;!!A4F2R9G_pg!KAmPA0Vw1WUiSxdc5-BKNPyD0klvdr5ucZI3E_C2ojho2rNT9wzMs8HG4qCYSIuUHRz4$
> 

Nothing obvious stands out as a problem from the dwc3 driver or the
controller. I see a (port) reset after 30 seconds of inactivity, which
is a typical timeout and recovery mechanism in the upperlayer from host.

* Is this a new issue or was it always there?
* Does turning off LPM help?
* Are the other gadget functions still work during the 30 seconds
inactivity?


>>
>>> On the gadget end journal shows:
>>>
>>> Apr 19 22:08:42 yuna systemd-networkd[507]: usb0: Lost carrier
>>> Apr 19 22:08:42 yuna systemd-journald[417]: Forwarding to syslog missed
>>> 1 messages.
>>> Apr 19 22:08:42 yuna systemd-timesyncd[469]: No network connectivity,
>>> watching for changes.
>>> Apr 19 22:08:42 yuna kernel: IPv6: ADDRCONF(NETDEV_CHANGE): usb0: link
>>> becomes ready
>>> Apr 19 22:08:42 yuna kernel[480]: [  624.382929] IPv6:
>>> ADDRCONF(NETDEV_CHANGE): usb0: link becomes ready
>>> Apr 19 22:08:42 yuna systemd-networkd[507]: usb0: Gained carrier
>>> Apr 19 22:08:44 yuna systemd-networkd[507]: usb0: Gained IPv6LL
>>> Apr 19 22:08:44 yuna systemd-timesyncd[469]: Network configuration
>>> changed, trying to establish connection.
>>> Apr 19 22:08:57 yuna systemd-timesyncd[469]: Initial synchronization to
>>> time server 216.239.35.8:123 (time3.google.com).
>>>
>>> So, drops and immediately reconnects.
>>>
>>  From the look at the log here, it seems to be a reset from host (and an
>> issue at the protocol level) unrelated to dwc3 driver or the controller.
>> Hopefully and maybe we can get more clues from dwc3 tracepoints.
>>

BR,
Thinh
Ferry Toth April 21, 2021, 10:30 p.m. UTC | #25
Hi

Op 21-04-2021 om 21:01 schreef Thinh Nguyen:
> Ferry Toth wrote:
>> Hi
>>
>> Op 19-04-2021 om 23:23 schreef Thinh Nguyen:
>>> Ferry Toth wrote:
>>>> Hi
>>>>
>>>> Op 19-04-2021 om 01:03 schreef Thinh Nguyen:
>>>>> Ferry Toth wrote:
>>>>>> Hi
>>>>>>
>>>>>> Op 17-04-2021 om 16:22 schreef Ferry Toth:
>>>>>>> Hi
>>>>>>>
>>>>>>> Op 17-04-2021 om 04:27 schreef Thinh Nguyen:
>>>>>>>> Ferry Toth wrote:
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
>>>>>>>>>> Thinh Nguyen wrote:
>>>>>>>>>>> From: Yu Chen <chenyu56@huawei.com>
>>>>>>>>>>> From: John Stultz <john.stultz@linaro.org>
>>>>>>>>>>>
>>>>>>>>>>> According to the programming guide, to switch mode for DRD
>>>>>>>>>>> controller,
>>>>>>>>>>> the driver needs to do the following.
>>>>>>>>>>>
>>>>>>>>>>> To switch from device to host:
>>>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>>>>> 2. Set GCTL.PrtCapDir(host mode)
>>>>>>>>>>> 3. Reset the host with USBCMD.HCRESET
>>>>>>>>>>> 4. Then follow up with the initializing host registers sequence
>>>>>>>>>>>
>>>>>>>>>>> To switch from host to device:
>>>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>>>>> 2. Set GCTL.PrtCapDir(device mode)
>>>>>>>>>>> 3. Reset the device with DCTL.CSftRst
>>>>>>>>>>> 4. Then follow up with the initializing registers sequence
>>>>>>>>>>>
>>>>>>>>>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step
>>>>>>>>>>> 3) of
>>>>>>>>>>> switching from host to device. John Stult reported a lockup issue
>>>>>>>>>>> seen
>>>>>>>>>>> with HiKey960 platform without these steps[1]. Similar issue is
>>>>>>>>>>> observed
>>>>>>>>>>> with Ferry's testing platform[2].
>>>>>>>>>>>
>>>>>>>>>>> So, apply the required steps along with some fixes to Yu Chen's
>>>>>>>>>>> and John
>>>>>>>>>>> Stultz's version. The main fixes to their versions are the
>>>>>>>>>>> missing
>>>>>>>>>>> wait
>>>>>>>>>>> for clocks synchronization before clearing GCTL.CoreSoftReset and
>>>>>>>>>>> only
>>>>>>>>>>> apply DCTL.CSftRst when switching from host to device.
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [2]
>>>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>>>>>>>>>>> Cc: Ferry Toth <fntoth@gmail.com>
>>>>>>>>>>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>>>>>>>>>>> Cc: <stable@vger.kernel.org>
>>>>>>>>>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work
>>>>>>>>>>> properly")
>>>>>>>>>>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>>>>>>>>>>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>>>>>>>>>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>>>>>>>>>> ---
>>>>>>>>>>> Changes in v3:
>>>>>>>>>>> - Check if the desired mode is OTG, then keep the old flow
>>>>>>>>>>> - Remove condition for OTG support only since the device can
>>>>>>>>>>> still be
>>>>>>>>>>>        configured DRD host/device mode only
>>>>>>>>>>> - Remove redundant hw_mode check since __dwc3_set_mode() only
>>>>>>>>>>> applies
>>>>>>>>>>> when
>>>>>>>>>>>        hw_mode is DRD
>>>>>>>>>>> Changes in v2:
>>>>>>>>>>> - Initialize mutex per device and not as global mutex.
>>>>>>>>>>> - Add additional checks for DRD only mode
>>>>>>>>>>>
>>>>>>>>>>>       drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>>>>>>>>>>>       drivers/usb/dwc3/core.h |  5 +++++
>>>>>>>>>>>       2 files changed, 32 insertions(+)
>>>>>>>>>>>
>>>>>>>>>> Hi John,
>>>>>>>>>>
>>>>>>>>>> If possible, can you run a test with this version on your
>>>>>>>>>> platform?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Thinh
>>>>>>>>>>
>>>>>>>>> I tested this on edison-arduino with this patch on top of usb-next
>>>>>>>>> (5.12-rc7 + "increase BESL baseline to 6" to prevent throttling").
>>>>>>>>>
>>>>>>>>> On this platform there is a physical switch to switch roles. With
>>>>>>>>> this
>>>>>>>>> patch I find:
>>>>>>>>>
>>>>>>>>> - switch to host mode always works fine
>>>>>>>>>
>>>>>>>>> - switch to gadget mode I need to flip the switch 3x
>>>>>>>>> (gadget-host-gadget).
>>>>>>>>>
>>>>>>>>> An error message appears on the gadget side "dwc3 dwc3.0.auto:
>>>>>>>>> timed
>>>>>>>>> out
>>>>>>>>> waiting for SETUP phase" appears, but then the device connects
>>>>>>>>> to my
>>>>>>>>> PC,
>>>>>>>>> no throttling.
>>>>>>>>>
>>>>>>>>> - alternatively I can switch to gadget 1x and then unplug/replug
>>>>>>>>> the
>>>>>>>>> cable.
>>>>>>>>>
>>>>>>>>> No error message and connects fine.
>>>>>>>>>
>>>>>>>>> - if I flip the switch only once, on the PC side I get:
>>>>>>>>>
>>>>>>>>>       kernel: usb 1-5: new high-speed USB device number 18
>>>>>>>>> usingxhci_hcd
>>>>>>>>>       kernel: usb 1-5: New USB device found, idVendor=1d6b,
>>>>>>>>>       idProduct=0104, bcdDevice= 1.00 kernel: usb1-5: New USB device
>>>>>>>>>       strings: Mfr=1, Product=2, SerialNumber=3kernel:usb 1-5:
>>>>>>>>> Product:
>>>>>>>>>       USBArmory Gadget kernel: usb 1-5: Manufacturer:USBArmory
>>>>>>>>> kernel:
>>>>>>>>>       usb 1-5: SerialNumber: 0123456789abcdef kernel:usb 1-5: can't
>>>>>>>>> set
>>>>>>>>>       config #1, error -110
>>>>>>>> The device failed at set_configuration() request and timed out. It
>>>>>>>> probably timed out from the status stage looking at the device err
>>>>>>>> print.
>>>>>>>>
>>>>>>>>> Then if I wait long enough on the gadget side I get:
>>>>>>>>>
>>>>>>>>>       root@yuna:~# ifconfig
>>>>>>>>>
>>>>>>>>>       usb0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu
>>>>>>>>> 1500
>>>>>>>>>       inet 169.254.119.239 netmask 255.255.0.0 broadcast
>>>>>>>>> 169.254.255.255
>>>>>>>>>       inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid
>>>>>>>>> 0x20<link>
>>>>>>>>>       ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX packets
>>>>>>>>> 490424
>>>>>>>>>       bytes 735146578 (701.0 MiB) RX errors 0 dropped191 overruns 0
>>>>>>>>> frame
>>>>>>>>>       0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0
>>>>>>>>> dropped 0
>>>>>>>>>       overruns 0 carrier 0 collisions 0
>>>>>>>>>
>>>>>>>>> (correct would be: inet 10.42.0.221 netmask 255.255.255.0 broadcast
>>>>>>>>> 10.42.0.255)
>>>>>>>>>
>>>>>>>>> So much improved now, but it seems I am still missing something on
>>>>>>>>> plug.
>>>>>>>>>
>>>>>>>> That's great! We can look at it further. Can you capture the
>>>>>>>> tracepoints
>>>>>>>> of the issue. Also, can you try with mass_storage gadget to see
>>>>>>>> if the
>>>>>>>> result is the same?
>>>>>>> I have already gser, eem, mass_storage and uac2 combo. When eem
>>>>>>> fails,
>>>>>>> the mass_storage and uac2 don't appear (on KDE you get all kind of
>>>>>>> popups when they appear).
>>>>>>>
>>>>>>> So either all works, or all fails.
>>>>>>>
>>>>>>> I'll trace this later today.
>>>>>> Trace capturing switch from host-> gadget  here
>>>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6329600/5.12-rc7*2Busb-next.zip__;JQ!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXLOiNwGT$
>>>>>>
>>>>>>
>>>>>>
>>>>>> (Issue history:
>>>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/issues/31__;!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXNc7KgAw$
>>>>>>
>>>>>>
>>>>>> )
>>>>>>
>>>>>> On the PC side this resulted to:
>>>>>>
>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device
>>>>>> number 12 using xhci_hcd
>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found,
>>>>>> idVendor=1d6b, idProduct=0104, bcdDevice= 1.00
>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings:
>>>>>> Mfr=1,
>>>>>> Product=2, SerialNumber=3
>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget
>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory
>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber:
>>>>>> 0123456789abcdef
>>>>>> apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error
>>>>>> -110
>>>>>>
>>>>>>
>>>>>> Thanks for all your help!
>>>>>>
>>>>> Looks like it's LPM related again. To confirm, try this:
>>>>> Disable LPM with this property "snps,usb2-gadget-lpm-disable"
>>>>> (Note that it's not the same as "snps,dis_enblslpm_quirk")
>>>> Yes, I confirm this helps.
>>>>
>>>> Note: on startup I was in host mode, with gadget cable plugged. The
>>>> first switch to gadget didn't work, all subsequent switches did work, as
>>>> well as unplug/plug the cable.
>>>>
>>>>> Make sure that your testing kernel has this patch [1]
>>>>> 475e8be53d04 ("usb: dwc3: gadget: Check for disabled LPM quirk")
>>>>>
>>>>> [1]
>>>>> https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=475e8be53d0496f9bc6159f4abb3ff5f9b90e8de__;!!A4F2R9G_pg!Mvz1Am6Ka_pOBfD0TmsA3821I05Ti8stMgh5r4XzMwZ9dy1Wan-il-DB4h50DmbaU4Zw$
>>>>>
>>>>>
>>>>>
>>>>> The failure you saw was probably due the gadget function attempting
>>>>> to start a delayed status stage of the SET_CONFIGURATION request.
>>>>> By this time, the host already put the device in low power.
>>>>>
>>>>> The START_TRANSFER command needs to be executed while the device
>>>>> is on "ON" state (or U0 if eSS). We shouldn't use dwc->link_state
>>>>> to check for link state because we only enable link state change
>>>>> interrupt for some controller versions.
>>>>>
>>>>> Once you confirms disabling LPM works, try this fix:
>>>>>
>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>> index 6227641f2d31..06cdec79244e 100644
>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>> @@ -309,10 +309,14 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep,
>>>>> unsigned int cmd,
>>>>>              if (DWC3_DEPCMD_CMD(cmd) == DWC3_DEPCMD_STARTTRANSFER) {
>>>>>                    int             needs_wakeup;
>>>>> +               u8              link_state;
>>>>>     -               needs_wakeup = (dwc->link_state ==
>>>>> DWC3_LINK_STATE_U1 ||
>>>>> -                               dwc->link_state == DWC3_LINK_STATE_U2||
>>>>> -                               dwc->link_state == DWC3_LINK_STATE_U3);
>>>>> +               reg = dwc3_readl(dwc->regs, DWC3_DSTS);
>>>>> +               link_state = DWC3_DSTS_USBLNKST(reg);
>>>>> +
>>>>> +               needs_wakeup = (link_state == DWC3_LINK_STATE_U1 ||
>>>>> +                               link_state == DWC3_LINK_STATE_U2 ||
>>>>> +                               link_state == DWC3_LINK_STATE_U3);
>>>>>                      if (unlikely(needs_wakeup)) {
>>>>>                           ret = __dwc3_gadget_wakeup(dwc);
>>>>> @@ -1989,6 +1993,8 @@ static int __dwc3_gadget_wakeup(struct dwc3 *dwc)
>>>>>            case DWC3_LINK_STATE_RESET:
>>>>>            case DWC3_LINK_STATE_RX_DET:    /* in HS, means Early
>>>>> Suspend */
>>>>>            case DWC3_LINK_STATE_U3:        /* in HS, means SUSPEND */
>>>>> +       case DWC3_LINK_STATE_U2:        /* in HS, means Sleep (L1) */
>>>>> +       case DWC3_LINK_STATE_U1:
>>>>>            case DWC3_LINK_STATE_RESUME:
>>>>>                    break;
>>>>>            default:
>>>>>
>>>> Same (good) result as with "snps,usb2-gadget-lpm-disable". Including
>>>> first switch from host->gadget not working.
>>>>
>>> Great! Not sure why the first switch is not working, but it seems like
>>> we were able to eliminate quite a few issues. If you have more dwc3
>>> tracepoints, we can take a look further.
>> I traced but the file is empty. I captured the registers as well. The
>> zip file is here:
>>
>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6346271/first-switch.zip__;!!A4F2R9G_pg!KAmPA0Vw1WUiSxdc5-BKNPyD0klvdr5ucZI3E_C2ojho2rNT9wzMs8HG4qCYSDx89HFE$
>>
>> I found the gadget configuration script was not called, which normally
>> gets called due to a udev rule:
>>
>> ACTION=="add", KERNEL=="dwc3.0.auto", SUBSYSTEMS=="udc",
>> ATTRS{state}=="not attached", RUN+="/usr/bin/conf-gadget.sh"
>>
>> So I retried and see  with ~# udevadm monitor:
>>
>> # flipping the switch from host->gadget
>>
>> KERNEL[51.824914] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/enp0s17u1u1/queues/rx-0
>> (queues)
>> KERNEL[51.825682] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/enp0s17u1u1/queues/tx-0
>> (queues)
>> KERNEL[51.826226] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/enp0s17u1u1
>> (net)
>> KERNEL[51.836041] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:003/usb-001:003:01
>> (mdio_bus)
>> KERNEL[51.836709] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:003/usb-001:003:01
>> (mdio_bus)
>> KERNEL[51.837342] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:003
>> (mdio_bus)
>> KERNEL[51.837763] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
>> (usb)
>> KERNEL[51.838116] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
>> (usb)
>> KERNEL[51.873712] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
>> (usb)
>> KERNEL[51.874000] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
>> (usb)
>> KERNEL[51.874207] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
>> (usb)
>> KERNEL[51.874431] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
>> (usb)
>> KERNEL[51.897175] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
>> KERNEL[51.897486] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
>>
>> # stopped capture tracepoints here, then switch back to host
>>
>> KERNEL[253.214406] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
>> KERNEL[253.263305] change
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
>> KERNEL[253.263687] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
>> (usb)
>> KERNEL[253.328354] bind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
>> (usb)
>> KERNEL[253.328734] bind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
>> KERNEL[253.699341] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
>> (usb)
>> KERNEL[253.744911] change
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
>> (usb)
>> KERNEL[253.745804] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
>> (usb)
>> KERNEL[253.805307] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005
>> (mdio_bus)
>> KERNEL[253.812978] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01
>> (mdio_bus)
>> KERNEL[253.814318] bind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01
>> (mdio_bus)
>> KERNEL[253.815386] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0
>> (net)
>> KERNEL[253.815552] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/rx-0
>> (queues)
>> KERNEL[253.815778] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/tx-0
>> (queues)
>> KERNEL[253.825279] bind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
>> (usb)
>> KERNEL[253.825667] bind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
>> (usb)
>>
>> # switch to gadget again
>>
>> KERNEL[314.212144] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/rx-0
>> (queues)
>> KERNEL[314.212473] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/tx-0
>> (queues)
>> KERNEL[314.214691] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0
>> (net)
>>
>> # extcon event didn't show the first time
>>
>> KERNEL[314.238385] change
>> /devices/pci0000:00/0000:00:13.0/INTC100E:00/mrfld_bcove_pwrsrc/extcon/extcon0
>> (extcon)
>> KERNEL[314.238677] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01
>> (mdio_bus)
>> KERNEL[314.238863] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01
>> (mdio_bus)
>> KERNEL[314.239015] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005
>> (mdio_bus)
>> KERNEL[314.239205] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
>> (usb)
>> KERNEL[314.239429] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
>> (usb)
>> KERNEL[314.239666] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2/2-0:1.0 (usb)
>>
>> KERNEL[314.239933] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2/2-0:1.0 (usb)
>>
>> KERNEL[314.262713] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2 (usb)
>> KERNEL[314.263030] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
>> (usb)
>> KERNEL[314.263298] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2 (usb)
>> KERNEL[314.263569] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
>> (usb)
>> KERNEL[314.263815] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
>> (usb)
>> KERNEL[314.264042] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
>> (usb)
>> KERNEL[314.264753] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usbmon/usbmon2
>> (usbmon)
>> KERNEL[314.265019] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
>> KERNEL[314.265289] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
>> KERNEL[314.288792] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-0:1.0 (usb)
>>
>> KERNEL[314.289057] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-0:1.0 (usb)
>>
>> KERNEL[314.289327] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1 (usb)
>> KERNEL[314.289661] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1 (usb)
>> KERNEL[314.647375] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usbmon/usbmon1
>> (usbmon)
>> KERNEL[314.647816] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto (platform)
>> KERNEL[314.648143] remove   /kernel/software_nodes/node1 (software_nodes)
>> KERNEL[314.648672] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto (platform)
>>
>> # here is the event we were waiting for
>>
>> KERNEL[314.649158] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/udc/dwc3.0.auto (udc)
>>
>> # after this gadget devices appear normally
>>
>> Maybe this issue is due to extcon missing the event?
>  From the info here, it doesn't look like the host platform device was
> removed on the first switch. Also, as you pointed it out, the extcon
> event was not shown on the first switch either. Without a notification
> to switch mode, the dwc3 driver won't do anything. You need to check why
> that's the case as I can't help much here.
Thanks, I am looking into our extcon driver now for this.
>>>> After a 2 - 4 minutes the connection is dropped and reconnected.
>>> Does this occur with LPM disabled also? We can review this issue further
>>> with more dwc3 tracepoints.
>> I captured connection dropping and reconnecting in this fairly long
>> trace near the end of the file:
>>
>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6346323/lost-connection.zip__;!!A4F2R9G_pg!KAmPA0Vw1WUiSxdc5-BKNPyD0klvdr5ucZI3E_C2ojho2rNT9wzMs8HG4qCYSIuUHRz4$
>>
> Nothing obvious stands out as a problem from the dwc3 driver or the
> controller. I see a (port) reset after 30 seconds of inactivity, which
> is a typical timeout and recovery mechanism in the upperlayer from host.
>
> * Is this a new issue or was it always there?
I've seen it before, but because of all the other issues it wasn't obvious.
> * Does turning off LPM help?
I think so, but maybe didn't wait long enough. This I will retest 
tomorrow evening.
> * Are the other gadget functions still work during the 30 seconds
> inactivity?
No. The behavior seems same as unplug/plug the cable.
>
>>>> On the gadget end journal shows:
>>>>
>>>> Apr 19 22:08:42 yuna systemd-networkd[507]: usb0: Lost carrier
>>>> Apr 19 22:08:42 yuna systemd-journald[417]: Forwarding to syslog missed
>>>> 1 messages.
>>>> Apr 19 22:08:42 yuna systemd-timesyncd[469]: No network connectivity,
>>>> watching for changes.
>>>> Apr 19 22:08:42 yuna kernel: IPv6: ADDRCONF(NETDEV_CHANGE): usb0: link
>>>> becomes ready
>>>> Apr 19 22:08:42 yuna kernel[480]: [  624.382929] IPv6:
>>>> ADDRCONF(NETDEV_CHANGE): usb0: link becomes ready
>>>> Apr 19 22:08:42 yuna systemd-networkd[507]: usb0: Gained carrier
>>>> Apr 19 22:08:44 yuna systemd-networkd[507]: usb0: Gained IPv6LL
>>>> Apr 19 22:08:44 yuna systemd-timesyncd[469]: Network configuration
>>>> changed, trying to establish connection.
>>>> Apr 19 22:08:57 yuna systemd-timesyncd[469]: Initial synchronization to
>>>> time server 216.239.35.8:123 (time3.google.com).
>>>>
>>>> So, drops and immediately reconnects.
>>>>
>>>   From the look at the log here, it seems to be a reset from host (and an
>>> issue at the protocol level) unrelated to dwc3 driver or the controller.
>>> Hopefully and maybe we can get more clues from dwc3 tracepoints.
>>>
> BR,
> Thinh
Ferry Toth April 22, 2021, 8:55 p.m. UTC | #26
Hi

Op 21-04-2021 om 21:01 schreef Thinh Nguyen:
> Ferry Toth wrote:
>> Hi
>>
>> Op 19-04-2021 om 23:23 schreef Thinh Nguyen:
>>> Ferry Toth wrote:
>>>> Hi
>>>>
>>>> Op 19-04-2021 om 01:03 schreef Thinh Nguyen:
>>>>> Ferry Toth wrote:
>>>>>> Hi
>>>>>>
>>>>>> Op 17-04-2021 om 16:22 schreef Ferry Toth:
>>>>>>> Hi
>>>>>>>
>>>>>>> Op 17-04-2021 om 04:27 schreef Thinh Nguyen:
>>>>>>>> Ferry Toth wrote:
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
>>>>>>>>>> Thinh Nguyen wrote:
>>>>>>>>>>> From: Yu Chen <chenyu56@huawei.com>
>>>>>>>>>>> From: John Stultz <john.stultz@linaro.org>
>>>>>>>>>>>
>>>>>>>>>>> According to the programming guide, to switch mode for DRD
>>>>>>>>>>> controller,
>>>>>>>>>>> the driver needs to do the following.
>>>>>>>>>>>
>>>>>>>>>>> To switch from device to host:
>>>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>>>>> 2. Set GCTL.PrtCapDir(host mode)
>>>>>>>>>>> 3. Reset the host with USBCMD.HCRESET
>>>>>>>>>>> 4. Then follow up with the initializing host registers sequence
>>>>>>>>>>>
>>>>>>>>>>> To switch from host to device:
>>>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>>>>> 2. Set GCTL.PrtCapDir(device mode)
>>>>>>>>>>> 3. Reset the device with DCTL.CSftRst
>>>>>>>>>>> 4. Then follow up with the initializing registers sequence
>>>>>>>>>>>
>>>>>>>>>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and step
>>>>>>>>>>> 3) of
>>>>>>>>>>> switching from host to device. John Stult reported a lockup issue
>>>>>>>>>>> seen
>>>>>>>>>>> with HiKey960 platform without these steps[1]. Similar issue is
>>>>>>>>>>> observed
>>>>>>>>>>> with Ferry's testing platform[2].
>>>>>>>>>>>
>>>>>>>>>>> So, apply the required steps along with some fixes to Yu Chen's
>>>>>>>>>>> and John
>>>>>>>>>>> Stultz's version. The main fixes to their versions are the
>>>>>>>>>>> missing
>>>>>>>>>>> wait
>>>>>>>>>>> for clocks synchronization before clearing GCTL.CoreSoftReset and
>>>>>>>>>>> only
>>>>>>>>>>> apply DCTL.CSftRst when switching from host to device.
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [2]
>>>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>>>>>>>>>>> Cc: Ferry Toth <fntoth@gmail.com>
>>>>>>>>>>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>>>>>>>>>>> Cc: <stable@vger.kernel.org>
>>>>>>>>>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work
>>>>>>>>>>> properly")
>>>>>>>>>>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>>>>>>>>>>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>>>>>>>>>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>>>>>>>>>> ---
>>>>>>>>>>> Changes in v3:
>>>>>>>>>>> - Check if the desired mode is OTG, then keep the old flow
>>>>>>>>>>> - Remove condition for OTG support only since the device can
>>>>>>>>>>> still be
>>>>>>>>>>>        configured DRD host/device mode only
>>>>>>>>>>> - Remove redundant hw_mode check since __dwc3_set_mode() only
>>>>>>>>>>> applies
>>>>>>>>>>> when
>>>>>>>>>>>        hw_mode is DRD
>>>>>>>>>>> Changes in v2:
>>>>>>>>>>> - Initialize mutex per device and not as global mutex.
>>>>>>>>>>> - Add additional checks for DRD only mode
>>>>>>>>>>>
>>>>>>>>>>>       drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>>>>>>>>>>>       drivers/usb/dwc3/core.h |  5 +++++
>>>>>>>>>>>       2 files changed, 32 insertions(+)
>>>>>>>>>>>
>>>>>>>>>> Hi John,
>>>>>>>>>>
>>>>>>>>>> If possible, can you run a test with this version on your
>>>>>>>>>> platform?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Thinh
>>>>>>>>>>
>>>>>>>>> I tested this on edison-arduino with this patch on top of usb-next
>>>>>>>>> (5.12-rc7 + "increase BESL baseline to 6" to prevent throttling").
>>>>>>>>>
>>>>>>>>> On this platform there is a physical switch to switch roles. With
>>>>>>>>> this
>>>>>>>>> patch I find:
>>>>>>>>>
>>>>>>>>> - switch to host mode always works fine
>>>>>>>>>
>>>>>>>>> - switch to gadget mode I need to flip the switch 3x
>>>>>>>>> (gadget-host-gadget).
>>>>>>>>>
>>>>>>>>> An error message appears on the gadget side "dwc3 dwc3.0.auto:
>>>>>>>>> timed
>>>>>>>>> out
>>>>>>>>> waiting for SETUP phase" appears, but then the device connects
>>>>>>>>> to my
>>>>>>>>> PC,
>>>>>>>>> no throttling.
>>>>>>>>>
>>>>>>>>> - alternatively I can switch to gadget 1x and then unplug/replug
>>>>>>>>> the
>>>>>>>>> cable.
>>>>>>>>>
>>>>>>>>> No error message and connects fine.
>>>>>>>>>
>>>>>>>>> - if I flip the switch only once, on the PC side I get:
>>>>>>>>>
>>>>>>>>>       kernel: usb 1-5: new high-speed USB device number 18
>>>>>>>>> usingxhci_hcd
>>>>>>>>>       kernel: usb 1-5: New USB device found, idVendor=1d6b,
>>>>>>>>>       idProduct=0104, bcdDevice= 1.00 kernel: usb1-5: New USB device
>>>>>>>>>       strings: Mfr=1, Product=2, SerialNumber=3kernel:usb 1-5:
>>>>>>>>> Product:
>>>>>>>>>       USBArmory Gadget kernel: usb 1-5: Manufacturer:USBArmory
>>>>>>>>> kernel:
>>>>>>>>>       usb 1-5: SerialNumber: 0123456789abcdef kernel:usb 1-5: can't
>>>>>>>>> set
>>>>>>>>>       config #1, error -110
>>>>>>>> The device failed at set_configuration() request and timed out. It
>>>>>>>> probably timed out from the status stage looking at the device err
>>>>>>>> print.
>>>>>>>>
>>>>>>>>> Then if I wait long enough on the gadget side I get:
>>>>>>>>>
>>>>>>>>>       root@yuna:~# ifconfig
>>>>>>>>>
>>>>>>>>>       usb0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu
>>>>>>>>> 1500
>>>>>>>>>       inet 169.254.119.239 netmask 255.255.0.0 broadcast
>>>>>>>>> 169.254.255.255
>>>>>>>>>       inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid
>>>>>>>>> 0x20<link>
>>>>>>>>>       ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX packets
>>>>>>>>> 490424
>>>>>>>>>       bytes 735146578 (701.0 MiB) RX errors 0 dropped191 overruns 0
>>>>>>>>> frame
>>>>>>>>>       0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0
>>>>>>>>> dropped 0
>>>>>>>>>       overruns 0 carrier 0 collisions 0
>>>>>>>>>
>>>>>>>>> (correct would be: inet 10.42.0.221 netmask 255.255.255.0 broadcast
>>>>>>>>> 10.42.0.255)
>>>>>>>>>
>>>>>>>>> So much improved now, but it seems I am still missing something on
>>>>>>>>> plug.
>>>>>>>>>
>>>>>>>> That's great! We can look at it further. Can you capture the
>>>>>>>> tracepoints
>>>>>>>> of the issue. Also, can you try with mass_storage gadget to see
>>>>>>>> if the
>>>>>>>> result is the same?
>>>>>>> I have already gser, eem, mass_storage and uac2 combo. When eem
>>>>>>> fails,
>>>>>>> the mass_storage and uac2 don't appear (on KDE you get all kind of
>>>>>>> popups when they appear).
>>>>>>>
>>>>>>> So either all works, or all fails.
>>>>>>>
>>>>>>> I'll trace this later today.
>>>>>> Trace capturing switch from host-> gadget  here
>>>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6329600/5.12-rc7*2Busb-next.zip__;JQ!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXLOiNwGT$
>>>>>>
>>>>>>
>>>>>>
>>>>>> (Issue history:
>>>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/issues/31__;!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXNc7KgAw$
>>>>>>
>>>>>>
>>>>>> )
>>>>>>
>>>>>> On the PC side this resulted to:
>>>>>>
>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device
>>>>>> number 12 using xhci_hcd
>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found,
>>>>>> idVendor=1d6b, idProduct=0104, bcdDevice= 1.00
>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings:
>>>>>> Mfr=1,
>>>>>> Product=2, SerialNumber=3
>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget
>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory
>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber:
>>>>>> 0123456789abcdef
>>>>>> apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error
>>>>>> -110
>>>>>>
>>>>>>
>>>>>> Thanks for all your help!
>>>>>>
>>>>> Looks like it's LPM related again. To confirm, try this:
>>>>> Disable LPM with this property "snps,usb2-gadget-lpm-disable"
>>>>> (Note that it's not the same as "snps,dis_enblslpm_quirk")
>>>> Yes, I confirm this helps.
>>>>
>>>> Note: on startup I was in host mode, with gadget cable plugged. The
>>>> first switch to gadget didn't work, all subsequent switches did work, as
>>>> well as unplug/plug the cable.
>>>>
>>>>> Make sure that your testing kernel has this patch [1]
>>>>> 475e8be53d04 ("usb: dwc3: gadget: Check for disabled LPM quirk")
>>>>>
>>>>> [1]
>>>>> https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=475e8be53d0496f9bc6159f4abb3ff5f9b90e8de__;!!A4F2R9G_pg!Mvz1Am6Ka_pOBfD0TmsA3821I05Ti8stMgh5r4XzMwZ9dy1Wan-il-DB4h50DmbaU4Zw$
>>>>>
>>>>>
>>>>>
>>>>> The failure you saw was probably due the gadget function attempting
>>>>> to start a delayed status stage of the SET_CONFIGURATION request.
>>>>> By this time, the host already put the device in low power.
>>>>>
>>>>> The START_TRANSFER command needs to be executed while the device
>>>>> is on "ON" state (or U0 if eSS). We shouldn't use dwc->link_state
>>>>> to check for link state because we only enable link state change
>>>>> interrupt for some controller versions.
>>>>>
>>>>> Once you confirms disabling LPM works, try this fix:
>>>>>
>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>> index 6227641f2d31..06cdec79244e 100644
>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>> @@ -309,10 +309,14 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep,
>>>>> unsigned int cmd,
>>>>>              if (DWC3_DEPCMD_CMD(cmd) == DWC3_DEPCMD_STARTTRANSFER) {
>>>>>                    int             needs_wakeup;
>>>>> +               u8              link_state;
>>>>>     -               needs_wakeup = (dwc->link_state ==
>>>>> DWC3_LINK_STATE_U1 ||
>>>>> -                               dwc->link_state == DWC3_LINK_STATE_U2||
>>>>> -                               dwc->link_state == DWC3_LINK_STATE_U3);
>>>>> +               reg = dwc3_readl(dwc->regs, DWC3_DSTS);
>>>>> +               link_state = DWC3_DSTS_USBLNKST(reg);
>>>>> +
>>>>> +               needs_wakeup = (link_state == DWC3_LINK_STATE_U1 ||
>>>>> +                               link_state == DWC3_LINK_STATE_U2 ||
>>>>> +                               link_state == DWC3_LINK_STATE_U3);
>>>>>                      if (unlikely(needs_wakeup)) {
>>>>>                           ret = __dwc3_gadget_wakeup(dwc);
>>>>> @@ -1989,6 +1993,8 @@ static int __dwc3_gadget_wakeup(struct dwc3 *dwc)
>>>>>            case DWC3_LINK_STATE_RESET:
>>>>>            case DWC3_LINK_STATE_RX_DET:    /* in HS, means Early
>>>>> Suspend */
>>>>>            case DWC3_LINK_STATE_U3:        /* in HS, means SUSPEND */
>>>>> +       case DWC3_LINK_STATE_U2:        /* in HS, means Sleep (L1) */
>>>>> +       case DWC3_LINK_STATE_U1:
>>>>>            case DWC3_LINK_STATE_RESUME:
>>>>>                    break;
>>>>>            default:
>>>>>
>>>> Same (good) result as with "snps,usb2-gadget-lpm-disable". Including
>>>> first switch from host->gadget not working.
>>>>
>>> Great! Not sure why the first switch is not working, but it seems like
>>> we were able to eliminate quite a few issues. If you have more dwc3
>>> tracepoints, we can take a look further.
>> I traced but the file is empty. I captured the registers as well. The
>> zip file is here:
>>
>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6346271/first-switch.zip__;!!A4F2R9G_pg!KAmPA0Vw1WUiSxdc5-BKNPyD0klvdr5ucZI3E_C2ojho2rNT9wzMs8HG4qCYSDx89HFE$
>>
>> I found the gadget configuration script was not called, which normally
>> gets called due to a udev rule:
>>
>> ACTION=="add", KERNEL=="dwc3.0.auto", SUBSYSTEMS=="udc",
>> ATTRS{state}=="not attached", RUN+="/usr/bin/conf-gadget.sh"
>>
>> So I retried and see  with ~# udevadm monitor:
>>
>> # flipping the switch from host->gadget
>>
>> KERNEL[51.824914] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/enp0s17u1u1/queues/rx-0
>> (queues)
>> KERNEL[51.825682] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/enp0s17u1u1/queues/tx-0
>> (queues)
>> KERNEL[51.826226] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/enp0s17u1u1
>> (net)
>> KERNEL[51.836041] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:003/usb-001:003:01
>> (mdio_bus)
>> KERNEL[51.836709] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:003/usb-001:003:01
>> (mdio_bus)
>> KERNEL[51.837342] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:003
>> (mdio_bus)
>> KERNEL[51.837763] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
>> (usb)
>> KERNEL[51.838116] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
>> (usb)
>> KERNEL[51.873712] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
>> (usb)
>> KERNEL[51.874000] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
>> (usb)
>> KERNEL[51.874207] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
>> (usb)
>> KERNEL[51.874431] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
>> (usb)
>> KERNEL[51.897175] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
>> KERNEL[51.897486] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
>>
>> # stopped capture tracepoints here, then switch back to host
>>
>> KERNEL[253.214406] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
>> KERNEL[253.263305] change
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
>> KERNEL[253.263687] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
>> (usb)
>> KERNEL[253.328354] bind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
>> (usb)
>> KERNEL[253.328734] bind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
>> KERNEL[253.699341] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
>> (usb)
>> KERNEL[253.744911] change
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
>> (usb)
>> KERNEL[253.745804] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
>> (usb)
>> KERNEL[253.805307] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005
>> (mdio_bus)
>> KERNEL[253.812978] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01
>> (mdio_bus)
>> KERNEL[253.814318] bind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01
>> (mdio_bus)
>> KERNEL[253.815386] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0
>> (net)
>> KERNEL[253.815552] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/rx-0
>> (queues)
>> KERNEL[253.815778] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/tx-0
>> (queues)
>> KERNEL[253.825279] bind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
>> (usb)
>> KERNEL[253.825667] bind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
>> (usb)
>>
>> # switch to gadget again
>>
>> KERNEL[314.212144] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/rx-0
>> (queues)
>> KERNEL[314.212473] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0/queues/tx-0
>> (queues)
>> KERNEL[314.214691] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0/net/eth0
>> (net)
>>
>> # extcon event didn't show the first time
>>
>> KERNEL[314.238385] change
>> /devices/pci0000:00/0000:00:13.0/INTC100E:00/mrfld_bcove_pwrsrc/extcon/extcon0
>> (extcon)
>> KERNEL[314.238677] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01
>> (mdio_bus)
>> KERNEL[314.238863] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005/usb-001:005:01
>> (mdio_bus)
>> KERNEL[314.239015] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/mdio_bus/usb-001:005
>> (mdio_bus)
>> KERNEL[314.239205] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
>> (usb)
>> KERNEL[314.239429] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1/1-1.1:1.0
>> (usb)
>> KERNEL[314.239666] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2/2-0:1.0 (usb)
>>
>> KERNEL[314.239933] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2/2-0:1.0 (usb)
>>
>> KERNEL[314.262713] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2 (usb)
>> KERNEL[314.263030] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
>> (usb)
>> KERNEL[314.263298] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb2 (usb)
>> KERNEL[314.263569] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1.1
>> (usb)
>> KERNEL[314.263815] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
>> (usb)
>> KERNEL[314.264042] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1/1-1:1.0
>> (usb)
>> KERNEL[314.264753] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usbmon/usbmon2
>> (usbmon)
>> KERNEL[314.265019] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
>> KERNEL[314.265289] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-1 (usb)
>> KERNEL[314.288792] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-0:1.0 (usb)
>>
>> KERNEL[314.289057] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1/1-0:1.0 (usb)
>>
>> KERNEL[314.289327] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1 (usb)
>> KERNEL[314.289661] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usb1 (usb)
>> KERNEL[314.647375] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto/usbmon/usbmon1
>> (usbmon)
>> KERNEL[314.647816] unbind
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto (platform)
>> KERNEL[314.648143] remove   /kernel/software_nodes/node1 (software_nodes)
>> KERNEL[314.648672] remove
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/xhci-hcd.2.auto (platform)
>>
>> # here is the event we were waiting for
>>
>> KERNEL[314.649158] add
>> /devices/pci0000:00/0000:00:11.0/dwc3.0.auto/udc/dwc3.0.auto (udc)
>>
>> # after this gadget devices appear normally
>>
>> Maybe this issue is due to extcon missing the event?
>  From the info here, it doesn't look like the host platform device was
> removed on the first switch. Also, as you pointed it out, the extcon
> event was not shown on the first switch either. Without a notification
> to switch mode, the dwc3 driver won't do anything. You need to check why
> that's the case as I can't help much here.
>
>>>> After a 2 - 4 minutes the connection is dropped and reconnected.
>>> Does this occur with LPM disabled also? We can review this issue further
>>> with more dwc3 tracepoints.
>> I captured connection dropping and reconnecting in this fairly long
>> trace near the end of the file:
>>
>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6346323/lost-connection.zip__;!!A4F2R9G_pg!KAmPA0Vw1WUiSxdc5-BKNPyD0klvdr5ucZI3E_C2ojho2rNT9wzMs8HG4qCYSIuUHRz4$
>>
> Nothing obvious stands out as a problem from the dwc3 driver or the
> controller. I see a (port) reset after 30 seconds of inactivity, which
> is a typical timeout and recovery mechanism in the upperlayer from host.
>
> * Is this a new issue or was it always there?
> * Does turning off LPM help?

I reverted my "usb: gadget: increase BESL baseline to 6" and picked 
"usb: dwc3: pci: Enable dis_enblslpm_quirk for Intel Merrifield".

So this is again the big hammer you suggested earlier to turn off LPM.


After 15 min (at least 4x longer then normal to get a port reset) I have 
still not seen a port reset.


So for now I conclude, yes turning off LPM helps.


> * Are the other gadget functions still work during the 30 seconds
> inactivity?
>
>
>>>> On the gadget end journal shows:
>>>>
>>>> Apr 19 22:08:42 yuna systemd-networkd[507]: usb0: Lost carrier
>>>> Apr 19 22:08:42 yuna systemd-journald[417]: Forwarding to syslog missed
>>>> 1 messages.
>>>> Apr 19 22:08:42 yuna systemd-timesyncd[469]: No network connectivity,
>>>> watching for changes.
>>>> Apr 19 22:08:42 yuna kernel: IPv6: ADDRCONF(NETDEV_CHANGE): usb0: link
>>>> becomes ready
>>>> Apr 19 22:08:42 yuna kernel[480]: [  624.382929] IPv6:
>>>> ADDRCONF(NETDEV_CHANGE): usb0: link becomes ready
>>>> Apr 19 22:08:42 yuna systemd-networkd[507]: usb0: Gained carrier
>>>> Apr 19 22:08:44 yuna systemd-networkd[507]: usb0: Gained IPv6LL
>>>> Apr 19 22:08:44 yuna systemd-timesyncd[469]: Network configuration
>>>> changed, trying to establish connection.
>>>> Apr 19 22:08:57 yuna systemd-timesyncd[469]: Initial synchronization to
>>>> time server 216.239.35.8:123 (time3.google.com).
>>>>
>>>> So, drops and immediately reconnects.
>>>>
>>>   From the look at the log here, it seems to be a reset from host (and an
>>> issue at the protocol level) unrelated to dwc3 driver or the controller.
>>> Hopefully and maybe we can get more clues from dwc3 tracepoints.
>>>
> BR,
> Thinh
Thinh Nguyen April 22, 2021, 9:58 p.m. UTC | #27
Ferry Toth wrote:
> Hi
> 
> Op 21-04-2021 om 21:01 schreef Thinh Nguyen:
>> Ferry Toth wrote:
>>> Hi
>>>
>>> Op 19-04-2021 om 23:23 schreef Thinh Nguyen:
>>>> Ferry Toth wrote:
>>>>> Hi
>>>>>
>>>>> Op 19-04-2021 om 01:03 schreef Thinh Nguyen:
>>>>>> Ferry Toth wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>> Op 17-04-2021 om 16:22 schreef Ferry Toth:
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> Op 17-04-2021 om 04:27 schreef Thinh Nguyen:
>>>>>>>>> Ferry Toth wrote:
>>>>>>>>>> Hi
>>>>>>>>>>
>>>>>>>>>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
>>>>>>>>>>> Thinh Nguyen wrote:
>>>>>>>>>>>> From: Yu Chen <chenyu56@huawei.com>
>>>>>>>>>>>> From: John Stultz <john.stultz@linaro.org>
>>>>>>>>>>>>
>>>>>>>>>>>> According to the programming guide, to switch mode for DRD
>>>>>>>>>>>> controller,
>>>>>>>>>>>> the driver needs to do the following.
>>>>>>>>>>>>
>>>>>>>>>>>> To switch from device to host:
>>>>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>>>>>> 2. Set GCTL.PrtCapDir(host mode)
>>>>>>>>>>>> 3. Reset the host with USBCMD.HCRESET
>>>>>>>>>>>> 4. Then follow up with the initializing host registers sequence
>>>>>>>>>>>>
>>>>>>>>>>>> To switch from host to device:
>>>>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>>>>>> 2. Set GCTL.PrtCapDir(device mode)
>>>>>>>>>>>> 3. Reset the device with DCTL.CSftRst
>>>>>>>>>>>> 4. Then follow up with the initializing registers sequence
>>>>>>>>>>>>
>>>>>>>>>>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and
>>>>>>>>>>>> step
>>>>>>>>>>>> 3) of
>>>>>>>>>>>> switching from host to device. John Stult reported a lockup
>>>>>>>>>>>> issue
>>>>>>>>>>>> seen
>>>>>>>>>>>> with HiKey960 platform without these steps[1]. Similar issue is
>>>>>>>>>>>> observed
>>>>>>>>>>>> with Ferry's testing platform[2].
>>>>>>>>>>>>
>>>>>>>>>>>> So, apply the required steps along with some fixes to Yu Chen's
>>>>>>>>>>>> and John
>>>>>>>>>>>> Stultz's version. The main fixes to their versions are the
>>>>>>>>>>>> missing
>>>>>>>>>>>> wait
>>>>>>>>>>>> for clocks synchronization before clearing
>>>>>>>>>>>> GCTL.CoreSoftReset and
>>>>>>>>>>>> only
>>>>>>>>>>>> apply DCTL.CSftRst when switching from host to device.
>>>>>>>>>>>>
>>>>>>>>>>>> [1]
>>>>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [2]
>>>>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>>>>>>>>>>>> Cc: Ferry Toth <fntoth@gmail.com>
>>>>>>>>>>>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>>>>>>>>>>>> Cc: <stable@vger.kernel.org>
>>>>>>>>>>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode()
>>>>>>>>>>>> work
>>>>>>>>>>>> properly")
>>>>>>>>>>>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>>>>>>>>>>>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>>>>>>>>>>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>> Changes in v3:
>>>>>>>>>>>> - Check if the desired mode is OTG, then keep the old flow
>>>>>>>>>>>> - Remove condition for OTG support only since the device can
>>>>>>>>>>>> still be
>>>>>>>>>>>>        configured DRD host/device mode only
>>>>>>>>>>>> - Remove redundant hw_mode check since __dwc3_set_mode() only
>>>>>>>>>>>> applies
>>>>>>>>>>>> when
>>>>>>>>>>>>        hw_mode is DRD
>>>>>>>>>>>> Changes in v2:
>>>>>>>>>>>> - Initialize mutex per device and not as global mutex.
>>>>>>>>>>>> - Add additional checks for DRD only mode
>>>>>>>>>>>>
>>>>>>>>>>>>       drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>>>>>>>>>>>>       drivers/usb/dwc3/core.h |  5 +++++
>>>>>>>>>>>>       2 files changed, 32 insertions(+)
>>>>>>>>>>>>
>>>>>>>>>>> Hi John,
>>>>>>>>>>>
>>>>>>>>>>> If possible, can you run a test with this version on your
>>>>>>>>>>> platform?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Thinh
>>>>>>>>>>>
>>>>>>>>>> I tested this on edison-arduino with this patch on top of
>>>>>>>>>> usb-next
>>>>>>>>>> (5.12-rc7 + "increase BESL baseline to 6" to prevent
>>>>>>>>>> throttling").
>>>>>>>>>>
>>>>>>>>>> On this platform there is a physical switch to switch roles. With
>>>>>>>>>> this
>>>>>>>>>> patch I find:
>>>>>>>>>>
>>>>>>>>>> - switch to host mode always works fine
>>>>>>>>>>
>>>>>>>>>> - switch to gadget mode I need to flip the switch 3x
>>>>>>>>>> (gadget-host-gadget).
>>>>>>>>>>
>>>>>>>>>> An error message appears on the gadget side "dwc3 dwc3.0.auto:
>>>>>>>>>> timed
>>>>>>>>>> out
>>>>>>>>>> waiting for SETUP phase" appears, but then the device connects
>>>>>>>>>> to my
>>>>>>>>>> PC,
>>>>>>>>>> no throttling.
>>>>>>>>>>
>>>>>>>>>> - alternatively I can switch to gadget 1x and then unplug/replug
>>>>>>>>>> the
>>>>>>>>>> cable.
>>>>>>>>>>
>>>>>>>>>> No error message and connects fine.
>>>>>>>>>>
>>>>>>>>>> - if I flip the switch only once, on the PC side I get:
>>>>>>>>>>
>>>>>>>>>>       kernel: usb 1-5: new high-speed USB device number 18
>>>>>>>>>> usingxhci_hcd
>>>>>>>>>>       kernel: usb 1-5: New USB device found, idVendor=1d6b,
>>>>>>>>>>       idProduct=0104, bcdDevice= 1.00 kernel: usb1-5: New USB
>>>>>>>>>> device
>>>>>>>>>>       strings: Mfr=1, Product=2, SerialNumber=3kernel:usb 1-5:
>>>>>>>>>> Product:
>>>>>>>>>>       USBArmory Gadget kernel: usb 1-5: Manufacturer:USBArmory
>>>>>>>>>> kernel:
>>>>>>>>>>       usb 1-5: SerialNumber: 0123456789abcdef kernel:usb 1-5:
>>>>>>>>>> can't
>>>>>>>>>> set
>>>>>>>>>>       config #1, error -110
>>>>>>>>> The device failed at set_configuration() request and timed out. It
>>>>>>>>> probably timed out from the status stage looking at the device err
>>>>>>>>> print.
>>>>>>>>>
>>>>>>>>>> Then if I wait long enough on the gadget side I get:
>>>>>>>>>>
>>>>>>>>>>       root@yuna:~# ifconfig
>>>>>>>>>>
>>>>>>>>>>       usb0:
>>>>>>>>>> flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu
>>>>>>>>>> 1500
>>>>>>>>>>       inet 169.254.119.239 netmask 255.255.0.0 broadcast
>>>>>>>>>> 169.254.255.255
>>>>>>>>>>       inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid
>>>>>>>>>> 0x20<link>
>>>>>>>>>>       ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX
>>>>>>>>>> packets
>>>>>>>>>> 490424
>>>>>>>>>>       bytes 735146578 (701.0 MiB) RX errors 0 dropped191
>>>>>>>>>> overruns 0
>>>>>>>>>> frame
>>>>>>>>>>       0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0
>>>>>>>>>> dropped 0
>>>>>>>>>>       overruns 0 carrier 0 collisions 0
>>>>>>>>>>
>>>>>>>>>> (correct would be: inet 10.42.0.221 netmask 255.255.255.0
>>>>>>>>>> broadcast
>>>>>>>>>> 10.42.0.255)
>>>>>>>>>>
>>>>>>>>>> So much improved now, but it seems I am still missing
>>>>>>>>>> something on
>>>>>>>>>> plug.
>>>>>>>>>>
>>>>>>>>> That's great! We can look at it further. Can you capture the
>>>>>>>>> tracepoints
>>>>>>>>> of the issue. Also, can you try with mass_storage gadget to see
>>>>>>>>> if the
>>>>>>>>> result is the same?
>>>>>>>> I have already gser, eem, mass_storage and uac2 combo. When eem
>>>>>>>> fails,
>>>>>>>> the mass_storage and uac2 don't appear (on KDE you get all kind of
>>>>>>>> popups when they appear).
>>>>>>>>
>>>>>>>> So either all works, or all fails.
>>>>>>>>
>>>>>>>> I'll trace this later today.
>>>>>>> Trace capturing switch from host-> gadget  here
>>>>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6329600/5.12-rc7*2Busb-next.zip__;JQ!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXLOiNwGT$
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> (Issue history:
>>>>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/issues/31__;!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXNc7KgAw$
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> )
>>>>>>>
>>>>>>> On the PC side this resulted to:
>>>>>>>
>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device
>>>>>>> number 12 using xhci_hcd
>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found,
>>>>>>> idVendor=1d6b, idProduct=0104, bcdDevice= 1.00
>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings:
>>>>>>> Mfr=1,
>>>>>>> Product=2, SerialNumber=3
>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget
>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory
>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber:
>>>>>>> 0123456789abcdef
>>>>>>> apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error
>>>>>>> -110
>>>>>>>
>>>>>>>
>>>>>>> Thanks for all your help!
>>>>>>>
>>>>>> Looks like it's LPM related again. To confirm, try this:
>>>>>> Disable LPM with this property "snps,usb2-gadget-lpm-disable"
>>>>>> (Note that it's not the same as "snps,dis_enblslpm_quirk")
>>>>> Yes, I confirm this helps.
>>>>>
>>>>> Note: on startup I was in host mode, with gadget cable plugged. The
>>>>> first switch to gadget didn't work, all subsequent switches did
>>>>> work, as
>>>>> well as unplug/plug the cable.
>>>>>
>>>>>> Make sure that your testing kernel has this patch [1]
>>>>>> 475e8be53d04 ("usb: dwc3: gadget: Check for disabled LPM quirk")
>>>>>>
>>>>>> [1]
>>>>>> https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=475e8be53d0496f9bc6159f4abb3ff5f9b90e8de__;!!A4F2R9G_pg!Mvz1Am6Ka_pOBfD0TmsA3821I05Ti8stMgh5r4XzMwZ9dy1Wan-il-DB4h50DmbaU4Zw$
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> The failure you saw was probably due the gadget function attempting
>>>>>> to start a delayed status stage of the SET_CONFIGURATION request.
>>>>>> By this time, the host already put the device in low power.
>>>>>>
>>>>>> The START_TRANSFER command needs to be executed while the device
>>>>>> is on "ON" state (or U0 if eSS). We shouldn't use dwc->link_state
>>>>>> to check for link state because we only enable link state change
>>>>>> interrupt for some controller versions.
>>>>>>
>>>>>> Once you confirms disabling LPM works, try this fix:
>>>>>>
>>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>>> index 6227641f2d31..06cdec79244e 100644
>>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>>> @@ -309,10 +309,14 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep
>>>>>> *dep,
>>>>>> unsigned int cmd,
>>>>>>              if (DWC3_DEPCMD_CMD(cmd) == DWC3_DEPCMD_STARTTRANSFER) {
>>>>>>                    int             needs_wakeup;
>>>>>> +               u8              link_state;
>>>>>>     -               needs_wakeup = (dwc->link_state ==
>>>>>> DWC3_LINK_STATE_U1 ||
>>>>>> -                               dwc->link_state ==
>>>>>> DWC3_LINK_STATE_U2||
>>>>>> -                               dwc->link_state ==
>>>>>> DWC3_LINK_STATE_U3);
>>>>>> +               reg = dwc3_readl(dwc->regs, DWC3_DSTS);
>>>>>> +               link_state = DWC3_DSTS_USBLNKST(reg);
>>>>>> +
>>>>>> +               needs_wakeup = (link_state == DWC3_LINK_STATE_U1 ||
>>>>>> +                               link_state == DWC3_LINK_STATE_U2 ||
>>>>>> +                               link_state == DWC3_LINK_STATE_U3);
>>>>>>                      if (unlikely(needs_wakeup)) {
>>>>>>                           ret = __dwc3_gadget_wakeup(dwc);
>>>>>> @@ -1989,6 +1993,8 @@ static int __dwc3_gadget_wakeup(struct dwc3
>>>>>> *dwc)
>>>>>>            case DWC3_LINK_STATE_RESET:
>>>>>>            case DWC3_LINK_STATE_RX_DET:    /* in HS, means Early
>>>>>> Suspend */
>>>>>>            case DWC3_LINK_STATE_U3:        /* in HS, means SUSPEND */
>>>>>> +       case DWC3_LINK_STATE_U2:        /* in HS, means Sleep (L1) */
>>>>>> +       case DWC3_LINK_STATE_U1:
>>>>>>            case DWC3_LINK_STATE_RESUME:
>>>>>>                    break;
>>>>>>            default:
>>>>>>
>>>>> Same (good) result as with "snps,usb2-gadget-lpm-disable". Including
>>>>> first switch from host->gadget not working.
>>>>>
>>>> Great! Not sure why the first switch is not working, but it seems like
>>>> we were able to eliminate quite a few issues. If you have more dwc3
>>>> tracepoints, we can take a look further.
>>> I traced but the file is empty. I captured the registers as well. The
>>> zip file is here:
>>>
>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6346271/first-switch.zip__;!!A4F2R9G_pg!KAmPA0Vw1WUiSxdc5-BKNPyD0klvdr5ucZI3E_C2ojho2rNT9wzMs8HG4qCYSDx89HFE$
>>>

<snip>

>>>
>>> Maybe this issue is due to extcon missing the event?
>>  From the info here, it doesn't look like the host platform device was
>> removed on the first switch. Also, as you pointed it out, the extcon
>> event was not shown on the first switch either. Without a notification
>> to switch mode, the dwc3 driver won't do anything. You need to check why
>> that's the case as I can't help much here.
>>
>>>>> After a 2 - 4 minutes the connection is dropped and reconnected.
>>>> Does this occur with LPM disabled also? We can review this issue
>>>> further
>>>> with more dwc3 tracepoints.
>>> I captured connection dropping and reconnecting in this fairly long
>>> trace near the end of the file:
>>>
>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6346323/lost-connection.zip__;!!A4F2R9G_pg!KAmPA0Vw1WUiSxdc5-BKNPyD0klvdr5ucZI3E_C2ojho2rNT9wzMs8HG4qCYSIuUHRz4$
>>>
>>>
>> Nothing obvious stands out as a problem from the dwc3 driver or the
>> controller. I see a (port) reset after 30 seconds of inactivity, which
>> is a typical timeout and recovery mechanism in the upperlayer from host.
>>
>> * Is this a new issue or was it always there?
>> * Does turning off LPM help?
> 
> I reverted my "usb: gadget: increase BESL baseline to 6" and picked
> "usb: dwc3: pci: Enable dis_enblslpm_quirk for Intel Merrifield".
> 
> So this is again the big hammer you suggested earlier to turn off LPM.
> 
> 
> After 15 min (at least 4x longer then normal to get a port reset) I have
> still not seen a port reset.
> 
> 
> So for now I conclude, yes turning off LPM helps.
> 

Ok. Thanks for the updates. Looks like we may have to use the hammer
afterall.

Btw, earlier I mistakenly say "dis_enblslpm_quirk" will disable LPM
completely, it only disables the device going into "sleep" state. If you
want to completely disable LPM, use "usb2-gadget-lpm-disable"

Thanks,
Thinh
Ferry Toth April 23, 2021, 7:18 a.m. UTC | #28
Hi,

Op 22-04-2021 om 23:58 schreef Thinh Nguyen:
> Ferry Toth wrote:
>> Hi
>>
>> Op 21-04-2021 om 21:01 schreef Thinh Nguyen:
>>> Ferry Toth wrote:
>>>> Hi
>>>>
>>>> Op 19-04-2021 om 23:23 schreef Thinh Nguyen:
>>>>> Ferry Toth wrote:
>>>>>> Hi
>>>>>>
>>>>>> Op 19-04-2021 om 01:03 schreef Thinh Nguyen:
>>>>>>> Ferry Toth wrote:
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> Op 17-04-2021 om 16:22 schreef Ferry Toth:
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> Op 17-04-2021 om 04:27 schreef Thinh Nguyen:
>>>>>>>>>> Ferry Toth wrote:
>>>>>>>>>>> Hi
>>>>>>>>>>>
>>>>>>>>>>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
>>>>>>>>>>>> Thinh Nguyen wrote:
>>>>>>>>>>>>> From: Yu Chen <chenyu56@huawei.com>
>>>>>>>>>>>>> From: John Stultz <john.stultz@linaro.org>
>>>>>>>>>>>>>
>>>>>>>>>>>>> According to the programming guide, to switch mode for DRD
>>>>>>>>>>>>> controller,
>>>>>>>>>>>>> the driver needs to do the following.
>>>>>>>>>>>>>
>>>>>>>>>>>>> To switch from device to host:
>>>>>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>>>>>>> 2. Set GCTL.PrtCapDir(host mode)
>>>>>>>>>>>>> 3. Reset the host with USBCMD.HCRESET
>>>>>>>>>>>>> 4. Then follow up with the initializing host registers sequence
>>>>>>>>>>>>>
>>>>>>>>>>>>> To switch from host to device:
>>>>>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>>>>>>> 2. Set GCTL.PrtCapDir(device mode)
>>>>>>>>>>>>> 3. Reset the device with DCTL.CSftRst
>>>>>>>>>>>>> 4. Then follow up with the initializing registers sequence
>>>>>>>>>>>>>
>>>>>>>>>>>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and
>>>>>>>>>>>>> step
>>>>>>>>>>>>> 3) of
>>>>>>>>>>>>> switching from host to device. John Stult reported a lockup
>>>>>>>>>>>>> issue
>>>>>>>>>>>>> seen
>>>>>>>>>>>>> with HiKey960 platform without these steps[1]. Similar issue is
>>>>>>>>>>>>> observed
>>>>>>>>>>>>> with Ferry's testing platform[2].
>>>>>>>>>>>>>
>>>>>>>>>>>>> So, apply the required steps along with some fixes to Yu Chen's
>>>>>>>>>>>>> and John
>>>>>>>>>>>>> Stultz's version. The main fixes to their versions are the
>>>>>>>>>>>>> missing
>>>>>>>>>>>>> wait
>>>>>>>>>>>>> for clocks synchronization before clearing
>>>>>>>>>>>>> GCTL.CoreSoftReset and
>>>>>>>>>>>>> only
>>>>>>>>>>>>> apply DCTL.CSftRst when switching from host to device.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]
>>>>>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@linaro.org/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> [2]
>>>>>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@gmail.com/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>>>>>>>>>>>>> Cc: Ferry Toth <fntoth@gmail.com>
>>>>>>>>>>>>> Cc: Wesley Cheng <wcheng@codeaurora.org>
>>>>>>>>>>>>> Cc: <stable@vger.kernel.org>
>>>>>>>>>>>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode()
>>>>>>>>>>>>> work
>>>>>>>>>>>>> properly")
>>>>>>>>>>>>> Signed-off-by: Yu Chen <chenyu56@huawei.com>
>>>>>>>>>>>>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>>>>>>>>>>>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> Changes in v3:
>>>>>>>>>>>>> - Check if the desired mode is OTG, then keep the old flow
>>>>>>>>>>>>> - Remove condition for OTG support only since the device can
>>>>>>>>>>>>> still be
>>>>>>>>>>>>>         configured DRD host/device mode only
>>>>>>>>>>>>> - Remove redundant hw_mode check since __dwc3_set_mode() only
>>>>>>>>>>>>> applies
>>>>>>>>>>>>> when
>>>>>>>>>>>>>         hw_mode is DRD
>>>>>>>>>>>>> Changes in v2:
>>>>>>>>>>>>> - Initialize mutex per device and not as global mutex.
>>>>>>>>>>>>> - Add additional checks for DRD only mode
>>>>>>>>>>>>>
>>>>>>>>>>>>>        drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>>>>>>>>>>>>>        drivers/usb/dwc3/core.h |  5 +++++
>>>>>>>>>>>>>        2 files changed, 32 insertions(+)
>>>>>>>>>>>>>
>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>
>>>>>>>>>>>> If possible, can you run a test with this version on your
>>>>>>>>>>>> platform?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Thinh
>>>>>>>>>>>>
>>>>>>>>>>> I tested this on edison-arduino with this patch on top of
>>>>>>>>>>> usb-next
>>>>>>>>>>> (5.12-rc7 + "increase BESL baseline to 6" to prevent
>>>>>>>>>>> throttling").
>>>>>>>>>>>
>>>>>>>>>>> On this platform there is a physical switch to switch roles. With
>>>>>>>>>>> this
>>>>>>>>>>> patch I find:
>>>>>>>>>>>
>>>>>>>>>>> - switch to host mode always works fine
>>>>>>>>>>>
>>>>>>>>>>> - switch to gadget mode I need to flip the switch 3x
>>>>>>>>>>> (gadget-host-gadget).
>>>>>>>>>>>
>>>>>>>>>>> An error message appears on the gadget side "dwc3 dwc3.0.auto:
>>>>>>>>>>> timed
>>>>>>>>>>> out
>>>>>>>>>>> waiting for SETUP phase" appears, but then the device connects
>>>>>>>>>>> to my
>>>>>>>>>>> PC,
>>>>>>>>>>> no throttling.
>>>>>>>>>>>
>>>>>>>>>>> - alternatively I can switch to gadget 1x and then unplug/replug
>>>>>>>>>>> the
>>>>>>>>>>> cable.
>>>>>>>>>>>
>>>>>>>>>>> No error message and connects fine.
>>>>>>>>>>>
>>>>>>>>>>> - if I flip the switch only once, on the PC side I get:
>>>>>>>>>>>
>>>>>>>>>>>        kernel: usb 1-5: new high-speed USB device number 18
>>>>>>>>>>> usingxhci_hcd
>>>>>>>>>>>        kernel: usb 1-5: New USB device found, idVendor=1d6b,
>>>>>>>>>>>        idProduct=0104, bcdDevice= 1.00 kernel: usb1-5: New USB
>>>>>>>>>>> device
>>>>>>>>>>>        strings: Mfr=1, Product=2, SerialNumber=3kernel:usb 1-5:
>>>>>>>>>>> Product:
>>>>>>>>>>>        USBArmory Gadget kernel: usb 1-5: Manufacturer:USBArmory
>>>>>>>>>>> kernel:
>>>>>>>>>>>        usb 1-5: SerialNumber: 0123456789abcdef kernel:usb 1-5:
>>>>>>>>>>> can't
>>>>>>>>>>> set
>>>>>>>>>>>        config #1, error -110
>>>>>>>>>> The device failed at set_configuration() request and timed out. It
>>>>>>>>>> probably timed out from the status stage looking at the device err
>>>>>>>>>> print.
>>>>>>>>>>
>>>>>>>>>>> Then if I wait long enough on the gadget side I get:
>>>>>>>>>>>
>>>>>>>>>>>        root@yuna:~# ifconfig
>>>>>>>>>>>
>>>>>>>>>>>        usb0:
>>>>>>>>>>> flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu
>>>>>>>>>>> 1500
>>>>>>>>>>>        inet 169.254.119.239 netmask 255.255.0.0 broadcast
>>>>>>>>>>> 169.254.255.255
>>>>>>>>>>>        inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid
>>>>>>>>>>> 0x20<link>
>>>>>>>>>>>        ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX
>>>>>>>>>>> packets
>>>>>>>>>>> 490424
>>>>>>>>>>>        bytes 735146578 (701.0 MiB) RX errors 0 dropped191
>>>>>>>>>>> overruns 0
>>>>>>>>>>> frame
>>>>>>>>>>>        0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0
>>>>>>>>>>> dropped 0
>>>>>>>>>>>        overruns 0 carrier 0 collisions 0
>>>>>>>>>>>
>>>>>>>>>>> (correct would be: inet 10.42.0.221 netmask 255.255.255.0
>>>>>>>>>>> broadcast
>>>>>>>>>>> 10.42.0.255)
>>>>>>>>>>>
>>>>>>>>>>> So much improved now, but it seems I am still missing
>>>>>>>>>>> something on
>>>>>>>>>>> plug.
>>>>>>>>>>>
>>>>>>>>>> That's great! We can look at it further. Can you capture the
>>>>>>>>>> tracepoints
>>>>>>>>>> of the issue. Also, can you try with mass_storage gadget to see
>>>>>>>>>> if the
>>>>>>>>>> result is the same?
>>>>>>>>> I have already gser, eem, mass_storage and uac2 combo. When eem
>>>>>>>>> fails,
>>>>>>>>> the mass_storage and uac2 don't appear (on KDE you get all kind of
>>>>>>>>> popups when they appear).
>>>>>>>>>
>>>>>>>>> So either all works, or all fails.
>>>>>>>>>
>>>>>>>>> I'll trace this later today.
>>>>>>>> Trace capturing switch from host-> gadget  here
>>>>>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6329600/5.12-rc7*2Busb-next.zip__;JQ!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXLOiNwGT$
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> (Issue history:
>>>>>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/issues/31__;!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXNc7KgAw$
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> )
>>>>>>>>
>>>>>>>> On the PC side this resulted to:
>>>>>>>>
>>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device
>>>>>>>> number 12 using xhci_hcd
>>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found,
>>>>>>>> idVendor=1d6b, idProduct=0104, bcdDevice= 1.00
>>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings:
>>>>>>>> Mfr=1,
>>>>>>>> Product=2, SerialNumber=3
>>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget
>>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory
>>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber:
>>>>>>>> 0123456789abcdef
>>>>>>>> apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error
>>>>>>>> -110
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks for all your help!
>>>>>>>>
>>>>>>> Looks like it's LPM related again. To confirm, try this:
>>>>>>> Disable LPM with this property "snps,usb2-gadget-lpm-disable"
>>>>>>> (Note that it's not the same as "snps,dis_enblslpm_quirk")
>>>>>> Yes, I confirm this helps.
>>>>>>
>>>>>> Note: on startup I was in host mode, with gadget cable plugged. The
>>>>>> first switch to gadget didn't work, all subsequent switches did
>>>>>> work, as
>>>>>> well as unplug/plug the cable.
>>>>>>
>>>>>>> Make sure that your testing kernel has this patch [1]
>>>>>>> 475e8be53d04 ("usb: dwc3: gadget: Check for disabled LPM quirk")
>>>>>>>
>>>>>>> [1]
>>>>>>> https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=475e8be53d0496f9bc6159f4abb3ff5f9b90e8de__;!!A4F2R9G_pg!Mvz1Am6Ka_pOBfD0TmsA3821I05Ti8stMgh5r4XzMwZ9dy1Wan-il-DB4h50DmbaU4Zw$
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The failure you saw was probably due the gadget function attempting
>>>>>>> to start a delayed status stage of the SET_CONFIGURATION request.
>>>>>>> By this time, the host already put the device in low power.
>>>>>>>
>>>>>>> The START_TRANSFER command needs to be executed while the device
>>>>>>> is on "ON" state (or U0 if eSS). We shouldn't use dwc->link_state
>>>>>>> to check for link state because we only enable link state change
>>>>>>> interrupt for some controller versions.
>>>>>>>
>>>>>>> Once you confirms disabling LPM works, try this fix:
>>>>>>>
>>>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>>>> index 6227641f2d31..06cdec79244e 100644
>>>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>>>> @@ -309,10 +309,14 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep
>>>>>>> *dep,
>>>>>>> unsigned int cmd,
>>>>>>>               if (DWC3_DEPCMD_CMD(cmd) == DWC3_DEPCMD_STARTTRANSFER) {
>>>>>>>                     int             needs_wakeup;
>>>>>>> +               u8              link_state;
>>>>>>>      -               needs_wakeup = (dwc->link_state ==
>>>>>>> DWC3_LINK_STATE_U1 ||
>>>>>>> -                               dwc->link_state ==
>>>>>>> DWC3_LINK_STATE_U2||
>>>>>>> -                               dwc->link_state ==
>>>>>>> DWC3_LINK_STATE_U3);
>>>>>>> +               reg = dwc3_readl(dwc->regs, DWC3_DSTS);
>>>>>>> +               link_state = DWC3_DSTS_USBLNKST(reg);
>>>>>>> +
>>>>>>> +               needs_wakeup = (link_state == DWC3_LINK_STATE_U1 ||
>>>>>>> +                               link_state == DWC3_LINK_STATE_U2 ||
>>>>>>> +                               link_state == DWC3_LINK_STATE_U3);
>>>>>>>                       if (unlikely(needs_wakeup)) {
>>>>>>>                            ret = __dwc3_gadget_wakeup(dwc);
>>>>>>> @@ -1989,6 +1993,8 @@ static int __dwc3_gadget_wakeup(struct dwc3
>>>>>>> *dwc)
>>>>>>>             case DWC3_LINK_STATE_RESET:
>>>>>>>             case DWC3_LINK_STATE_RX_DET:    /* in HS, means Early
>>>>>>> Suspend */
>>>>>>>             case DWC3_LINK_STATE_U3:        /* in HS, means SUSPEND */
>>>>>>> +       case DWC3_LINK_STATE_U2:        /* in HS, means Sleep (L1) */
>>>>>>> +       case DWC3_LINK_STATE_U1:
>>>>>>>             case DWC3_LINK_STATE_RESUME:
>>>>>>>                     break;
>>>>>>>             default:
>>>>>>>
>>>>>> Same (good) result as with "snps,usb2-gadget-lpm-disable". Including
>>>>>> first switch from host->gadget not working.
>>>>>>
>>>>> Great! Not sure why the first switch is not working, but it seems like
>>>>> we were able to eliminate quite a few issues. If you have more dwc3
>>>>> tracepoints, we can take a look further.
>>>> I traced but the file is empty. I captured the registers as well. The
>>>> zip file is here:
>>>>
>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6346271/first-switch.zip__;!!A4F2R9G_pg!KAmPA0Vw1WUiSxdc5-BKNPyD0klvdr5ucZI3E_C2ojho2rNT9wzMs8HG4qCYSDx89HFE$
>>>>
> <snip>
>
>>>> Maybe this issue is due to extcon missing the event?
>>>   From the info here, it doesn't look like the host platform device was
>>> removed on the first switch. Also, as you pointed it out, the extcon
>>> event was not shown on the first switch either. Without a notification
>>> to switch mode, the dwc3 driver won't do anything. You need to check why
>>> that's the case as I can't help much here.
>>>
I was able to resolve this in extcon.
>>>>>> After a 2 - 4 minutes the connection is dropped and reconnected.
>>>>> Does this occur with LPM disabled also? We can review this issue
>>>>> further
>>>>> with more dwc3 tracepoints.
>>>> I captured connection dropping and reconnecting in this fairly long
>>>> trace near the end of the file:
>>>>
>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6346323/lost-connection.zip__;!!A4F2R9G_pg!KAmPA0Vw1WUiSxdc5-BKNPyD0klvdr5ucZI3E_C2ojho2rNT9wzMs8HG4qCYSIuUHRz4$
>>>>
>>>>
>>> Nothing obvious stands out as a problem from the dwc3 driver or the
>>> controller. I see a (port) reset after 30 seconds of inactivity, which
>>> is a typical timeout and recovery mechanism in the upperlayer from host.
>>>
>>> * Is this a new issue or was it always there?
>>> * Does turning off LPM help?
>> I reverted my "usb: gadget: increase BESL baseline to 6" and picked
>> "usb: dwc3: pci: Enable dis_enblslpm_quirk for Intel Merrifield".
>>
>> So this is again the big hammer you suggested earlier to turn off LPM.
For future reference: the hammer is not dis_enblslpm_quirk but 
usb2-gadget-lpm-disable. Sorry for the mixup.
>>
>> After 15 min (at least 4x longer then normal to get a port reset) I have
>> still not seen a port reset.
>>
>>
>> So for now I conclude, yes turning off LPM helps.
>>
> Ok. Thanks for the updates. Looks like we may have to use the hammer
> afterall.

Ok, so we leave it at this then. With your fixes all issues seem to be 
resolved. Thanks so much for all your help!

> Btw, earlier I mistakenly say "dis_enblslpm_quirk" will disable LPM
> completely, it only disables the device going into "sleep" state. If you
> want to completely disable LPM, use "usb2-gadget-lpm-disable"
>
> Thanks,
> Thinh
diff mbox series

Patch

diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 5c25e6a72dbd..2f118ad43571 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -114,6 +114,8 @@  void dwc3_set_prtcap(struct dwc3 *dwc, u32 mode)
 	dwc->current_dr_role = mode;
 }
 
+static int dwc3_core_soft_reset(struct dwc3 *dwc);
+
 static void __dwc3_set_mode(struct work_struct *work)
 {
 	struct dwc3 *dwc = work_to_dwc(work);
@@ -121,6 +123,8 @@  static void __dwc3_set_mode(struct work_struct *work)
 	int ret;
 	u32 reg;
 
+	mutex_lock(&dwc->mutex);
+
 	pm_runtime_get_sync(dwc->dev);
 
 	if (dwc->current_dr_role == DWC3_GCTL_PRTCAP_OTG)
@@ -154,6 +158,25 @@  static void __dwc3_set_mode(struct work_struct *work)
 		break;
 	}
 
+	/* For DRD host or device mode only */
+	if (dwc->desired_dr_role != DWC3_GCTL_PRTCAP_OTG) {
+		reg = dwc3_readl(dwc->regs, DWC3_GCTL);
+		reg |= DWC3_GCTL_CORESOFTRESET;
+		dwc3_writel(dwc->regs, DWC3_GCTL, reg);
+
+		/*
+		 * Wait for internal clocks to synchronized. DWC_usb31 and
+		 * DWC_usb32 may need at least 50ms (less for DWC_usb3). To
+		 * keep it consistent across different IPs, let's wait up to
+		 * 100ms before clearing GCTL.CORESOFTRESET.
+		 */
+		msleep(100);
+
+		reg = dwc3_readl(dwc->regs, DWC3_GCTL);
+		reg &= ~DWC3_GCTL_CORESOFTRESET;
+		dwc3_writel(dwc->regs, DWC3_GCTL, reg);
+	}
+
 	spin_lock_irqsave(&dwc->lock, flags);
 
 	dwc3_set_prtcap(dwc, dwc->desired_dr_role);
@@ -178,6 +201,8 @@  static void __dwc3_set_mode(struct work_struct *work)
 		}
 		break;
 	case DWC3_GCTL_PRTCAP_DEVICE:
+		dwc3_core_soft_reset(dwc);
+
 		dwc3_event_buffers_setup(dwc);
 
 		if (dwc->usb2_phy)
@@ -200,6 +225,7 @@  static void __dwc3_set_mode(struct work_struct *work)
 out:
 	pm_runtime_mark_last_busy(dwc->dev);
 	pm_runtime_put_autosuspend(dwc->dev);
+	mutex_unlock(&dwc->mutex);
 }
 
 void dwc3_set_mode(struct dwc3 *dwc, u32 mode)
@@ -1553,6 +1579,7 @@  static int dwc3_probe(struct platform_device *pdev)
 	dwc3_cache_hwparams(dwc);
 
 	spin_lock_init(&dwc->lock);
+	mutex_init(&dwc->mutex);
 
 	pm_runtime_set_active(dev);
 	pm_runtime_use_autosuspend(dev);
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 695ff2d791e4..7e3afa5378e8 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -13,6 +13,7 @@ 
 
 #include <linux/device.h>
 #include <linux/spinlock.h>
+#include <linux/mutex.h>
 #include <linux/ioport.h>
 #include <linux/list.h>
 #include <linux/bitops.h>
@@ -947,6 +948,7 @@  struct dwc3_scratchpad_array {
  * @scratch_addr: dma address of scratchbuf
  * @ep0_in_setup: one control transfer is completed and enter setup phase
  * @lock: for synchronizing
+ * @mutex: for mode switching
  * @dev: pointer to our struct device
  * @sysdev: pointer to the DMA-capable device
  * @xhci: pointer to our xHCI child
@@ -1088,6 +1090,9 @@  struct dwc3 {
 	/* device lock */
 	spinlock_t		lock;
 
+	/* mode switching lock */
+	struct mutex		mutex;
+
 	struct device		*dev;
 	struct device		*sysdev;