[v3] usb: dwc3: gadget: Fix NULL pointer dereference in dwc3_gadget_suspend

Message ID 20240119094825.26530-1-quic_uaggarwa@quicinc.com (mailing list archive)
State Accepted
Commit 61a348857e869432e6a920ad8ea9132e8d44c316
Series [v3] usb: dwc3: gadget: Fix NULL pointer dereference in dwc3_gadget_suspend

Commit Message

UTTKARSH AGGARWAL Jan. 19, 2024, 9:48 a.m. UTC
If plug-out and plug-in are performed continuously, there is a chance
that a NULL pointer dereference occurs in the dwc3_gadget_suspend path,
because dwc->gadget_driver can be cleared after it has been checked.

Call Stack:

	CPU1:                           CPU2:
	gadget_unbind_driver            dwc3_suspend_common
	dwc3_gadget_stop                dwc3_gadget_suspend
                                        dwc3_disconnect_gadget

CPU1 clears the variable while CPU2 checks it. Suppose CPU1 is running
and is about to clear gadget_driver. In parallel, CPU2 executes
dwc3_gadget_suspend, finds dwc->gadget_driver non-NULL and continues.
CPU1 then completes and gadget_driver is cleared. When CPU2 reaches
dwc3_disconnect_gadget, dwc->gadget_driver is already NULL, which
causes the NULL pointer dereference.

Cc: <stable@vger.kernel.org>
Fixes: 9772b47a4c29 ("usb: dwc3: gadget: Fix suspend/resume during device mode")
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Signed-off-by: Uttkarsh Aggarwal <quic_uaggarwa@quicinc.com>
---

Changes in v3:
Corrected the Fixes tag and a typo in the commit message (dw3_gadget_stop -> dwc3_gadget_stop).

Link to v2:
https://lore.kernel.org/linux-usb/CAKzKK0r8RUqgXy1o5dndU21KuTKtyZ5rn5Fb9sZqTPZqAjT_9A@mail.gmail.com/T/#t

Changes in v2:
Added the Cc and Fixes tags that were missing in v1.

Link to v1:
https://lore.kernel.org/linux-usb/20240110095532.4776-1-quic_uaggarwa@quicinc.com/T/#u

 drivers/usb/dwc3/gadget.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

Comments

Marek Szyprowski Feb. 7, 2024, 11:59 a.m. UTC | #1
Dear All,

On 19.01.2024 10:48, Uttkarsh Aggarwal wrote:
> In current scenario if Plug-out and Plug-In performed continuously
> there could be a chance while checking for dwc->gadget_driver in
> dwc3_gadget_suspend, a NULL pointer dereference may occur.
>
> Call Stack:
>
> 	CPU1:                           CPU2:
> 	gadget_unbind_driver            dwc3_suspend_common
> 	dwc3_gadget_stop                dwc3_gadget_suspend
>                                          dwc3_disconnect_gadget
>
> CPU1 basically clears the variable and CPU2 checks the variable.
> Consider CPU1 is running and right before gadget_driver is cleared
> and in parallel CPU2 executes dwc3_gadget_suspend where it finds
> dwc->gadget_driver which is not NULL and resumes execution and then
> CPU1 completes execution. CPU2 executes dwc3_disconnect_gadget where
> it checks dwc->gadget_driver is already NULL because of which the
> NULL pointer deference occur.
>
> Cc: <stable@vger.kernel.org>
> Fixes: 9772b47a4c29 ("usb: dwc3: gadget: Fix suspend/resume during device mode")
> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
> Signed-off-by: Uttkarsh Aggarwal <quic_uaggarwa@quicinc.com>

This patch landed some time ago in linux-next as commit 61a348857e86 
("usb: dwc3: gadget: Fix NULL pointer dereference in 
dwc3_gadget_suspend"). Recently I found that it causes the following 
warning when no USB gadget is bound to the DWC3 driver and a system 
suspend/resume cycle is performed:

dwc3 12400000.usb: wait for SETUP phase timed out
dwc3 12400000.usb: failed to set STALL on ep0out
------------[ cut here ]------------
WARNING: CPU: 4 PID: 604 at drivers/usb/dwc3/ep0.c:289 dwc3_ep0_out_start+0xc8/0xcc
Modules linked in:
CPU: 4 PID: 604 Comm: rtcwake Not tainted 6.8.0-rc3-next-20240207 #7979
Hardware name: Samsung Exynos (Flattened Device Tree)
  unwind_backtrace from show_stack+0x10/0x14
  show_stack from dump_stack_lvl+0x58/0x70
  dump_stack_lvl from __warn+0x7c/0x1bc
  __warn from warn_slowpath_fmt+0x1a0/0x1a8
  warn_slowpath_fmt from dwc3_ep0_out_start+0xc8/0xcc
  dwc3_ep0_out_start from dwc3_gadget_soft_disconnect+0x16c/0x230
  dwc3_gadget_soft_disconnect from dwc3_gadget_suspend+0xc/0x90
  dwc3_gadget_suspend from dwc3_suspend_common+0x44/0x30c
  dwc3_suspend_common from dwc3_suspend+0x14/0x2c
  dwc3_suspend from dpm_run_callback+0x94/0x288
  dpm_run_callback from device_suspend+0x130/0x6d0
  device_suspend from dpm_suspend+0x124/0x35c
  dpm_suspend from dpm_suspend_start+0x64/0x6c
  dpm_suspend_start from suspend_devices_and_enter+0x134/0xbd8
  suspend_devices_and_enter from pm_suspend+0x2ec/0x380
  pm_suspend from state_store+0x68/0xc8
  state_store from kernfs_fop_write_iter+0x110/0x1d4
  kernfs_fop_write_iter from vfs_write+0x2e8/0x430
  vfs_write from ksys_write+0x5c/0xd4
  ksys_write from ret_fast_syscall+0x0/0x1c
Exception stack(0xf1421fa8 to 0xf1421ff0)
...
irq event stamp: 14304
hardirqs last  enabled at (14303): [<c01a599c>] console_unlock+0x108/0x114
hardirqs last disabled at (14304): [<c0c229d8>] _raw_spin_lock_irqsave+0x64/0x68
softirqs last  enabled at (13030): [<c010163c>] __do_softirq+0x318/0x4f4
softirqs last disabled at (13025): [<c012dd40>] __irq_exit_rcu+0x130/0x184
---[ end trace 0000000000000000 ]---

IMHO dwc3_gadget_soft_disconnect() requires some kind of a check if 
dwc->gadget_driver is present or not, as it really makes no sense to do 
any ep0 related operations if there is no gadget driver at all.


> ---
>
> changes in v3:
> Corrected fixes tag and typo mistake in commit message dw3_gadget_stop -> dwc3_gadget_stop.
>
> Link to v2:
> https://lore.kernel.org/linux-usb/CAKzKK0r8RUqgXy1o5dndU21KuTKtyZ5rn5Fb9sZqTPZqAjT_9A@mail.gmail.com/T/#t
>
> Changes in v2:
> Added cc and fixes tag missing in v1.
>
> Link to v1:
> https://lore.kernel.org/linux-usb/20240110095532.4776-1-quic_uaggarwa@quicinc.com/T/#u
>
>   drivers/usb/dwc3/gadget.c | 6 ++----
>   1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 019368f8e9c4..564976b3e2b9 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -4709,15 +4709,13 @@ int dwc3_gadget_suspend(struct dwc3 *dwc)
>   	unsigned long flags;
>   	int ret;
>   
> -	if (!dwc->gadget_driver)
> -		return 0;
> -
>   	ret = dwc3_gadget_soft_disconnect(dwc);
>   	if (ret)
>   		goto err;
>   
>   	spin_lock_irqsave(&dwc->lock, flags);
> -	dwc3_disconnect_gadget(dwc);
> +	if (dwc->gadget_driver)
> +		dwc3_disconnect_gadget(dwc);
>   	spin_unlock_irqrestore(&dwc->lock, flags);
>   
>   	return 0;

Best regards

Thinh Nguyen Feb. 8, 2024, 10:54 p.m. UTC | #2
On Wed, Feb 07, 2024, Marek Szyprowski wrote:
> Dear All,
> 
> This patch landed some time ago in linux-next as commit 61a348857e86 
> ("usb: dwc3: gadget: Fix NULL pointer dereference in 
> dwc3_gadget_suspend"). Recently I found that it causes the following 
> warning when no USB gadget is bound to the DWC3 driver and a system 
> suspend/resume cycle is performed:
> 
> dwc3 12400000.usb: wait for SETUP phase timed out
> dwc3 12400000.usb: failed to set STALL on ep0out
> ------------[ cut here ]------------
> WARNING: CPU: 4 PID: 604 at drivers/usb/dwc3/ep0.c:289 
> dwc3_ep0_out_start+0xc8/0xcc
> Modules linked in:
> CPU: 4 PID: 604 Comm: rtcwake Not tainted 6.8.0-rc3-next-20240207 #7979
> Hardware name: Samsung Exynos (Flattened Device Tree)
>   unwind_backtrace from show_stack+0x10/0x14
>   show_stack from dump_stack_lvl+0x58/0x70
>   dump_stack_lvl from __warn+0x7c/0x1bc
>   __warn from warn_slowpath_fmt+0x1a0/0x1a8
>   warn_slowpath_fmt from dwc3_ep0_out_start+0xc8/0xcc
>   dwc3_ep0_out_start from dwc3_gadget_soft_disconnect+0x16c/0x230
>   dwc3_gadget_soft_disconnect from dwc3_gadget_suspend+0xc/0x90
>   dwc3_gadget_suspend from dwc3_suspend_common+0x44/0x30c
>   dwc3_suspend_common from dwc3_suspend+0x14/0x2c
>   dwc3_suspend from dpm_run_callback+0x94/0x288
>   dpm_run_callback from device_suspend+0x130/0x6d0
>   device_suspend from dpm_suspend+0x124/0x35c
>   dpm_suspend from dpm_suspend_start+0x64/0x6c
>   dpm_suspend_start from suspend_devices_and_enter+0x134/0xbd8
>   suspend_devices_and_enter from pm_suspend+0x2ec/0x380
>   pm_suspend from state_store+0x68/0xc8
>   state_store from kernfs_fop_write_iter+0x110/0x1d4
>   kernfs_fop_write_iter from vfs_write+0x2e8/0x430
>   vfs_write from ksys_write+0x5c/0xd4
>   ksys_write from ret_fast_syscall+0x0/0x1c
> Exception stack(0xf1421fa8 to 0xf1421ff0)
> ...
> irq event stamp: 14304
> hardirqs last  enabled at (14303): [<c01a599c>] console_unlock+0x108/0x114
> hardirqs last disabled at (14304): [<c0c229d8>] 
> _raw_spin_lock_irqsave+0x64/0x68
> softirqs last  enabled at (13030): [<c010163c>] __do_softirq+0x318/0x4f4
> softirqs last disabled at (13025): [<c012dd40>] __irq_exit_rcu+0x130/0x184
> ---[ end trace 0000000000000000 ]---
> 
> IMHO dwc3_gadget_soft_disconnect() requires some kind of a check if 
> dwc->gadget_driver is present or not, as it really makes no sense to do 

I don't think checking that is sufficient, and I don't think that's the
case here.

> any ep0 related operations if there is no gadget driver at all.
> 

If there were indeed no gadget_driver present, we wouldn't get this
stack trace (i.e. dwc3_ep0_out_start should only occur when gadget_driver
is present). This looks like a race between binding and suspend.

I think something like this should be sufficient. Would you mind giving
it a try?

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 564976b3e2b9..1990d6371066 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2656,6 +2656,11 @@ static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
 	int ret;
 
 	spin_lock_irqsave(&dwc->lock, flags);
+	if (!dwc->pullups_connected) {
+		spin_unlock_irqrestore(&dwc->lock, flags);
+		return 0;
+	}
+
 	dwc->connected = false;
 
 	/*


Thanks,
Thinh

Marek Szyprowski Feb. 9, 2024, 9:41 a.m. UTC | #3
On 08.02.2024 23:54, Thinh Nguyen wrote:
> On Wed, Feb 07, 2024, Marek Szyprowski wrote:
>> IMHO dwc3_gadget_soft_disconnect() requires some kind of a check if
>> dwc->gadget_driver is present or not, as it really makes no sense to do
> I don't think checking that is sufficient, and I don't think that's the
> case here.
>
>> any ep0 related operations if there is no gadget driver at all.
>>
> If there's indeed no gadget_driver present, then we wouldn't get this
> stack trace. (ie. dwc3_ep0_out_start should occurs when gadget_driver is
> present). This is a race happened between binding + suspend.

I have no gadget compiled into the kernel and none created via
configfs, so how can this be caused by a race?



> I think something like this should be sufficient. Would you mind giving
> it a try?
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 564976b3e2b9..1990d6371066 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -2656,6 +2656,11 @@ static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
>   	int ret;
>   
>   	spin_lock_irqsave(&dwc->lock, flags);
> +	if (!dwc->pullups_connected) {
> +		spin_unlock_irqrestore(&dwc->lock, flags);
> +		return 0;
> +	}
> +
>   	dwc->connected = false;
>   
>   	/*
>
This patch fixes the reported issue. Feel free to add:

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>


Best regards

Thinh Nguyen Feb. 15, 2024, 11:35 p.m. UTC | #4
Sorry for the late reply.

On Fri, Feb 09, 2024, Marek Szyprowski wrote:
> On 08.02.2024 23:54, Thinh Nguyen wrote:
> > On Wed, Feb 07, 2024, Marek Szyprowski wrote:
> >> IMHO dwc3_gadget_soft_disconnect() requires some kind of a check if
> >> dwc->gadget_driver is present or not, as it really makes no sense to do
> > I don't think checking that is sufficient, and I don't think that's the
> > case here.
> >
> >> any ep0 related operations if there is no gadget driver at all.
> >>
> > If there's indeed no gadget_driver present, then we wouldn't get this
> > stack trace. (ie. dwc3_ep0_out_start should occurs when gadget_driver is
> > present). This is a race happened between binding + suspend.
> 
> I have no gadget compiled into the kernel and no such created via 
> configfs, so how can this be caused by a race?

Ah... In that case, we got through the incomplete/wrong check in
dwc3_gadget_soft_disconnect():
	if (dwc->ep0state != EP0_SETUP_PHASE)

Since there's no gadget driver, the controller never started and
ep0state defaults to EP0_UNCONNECTED, which explains why it hit the
timeout condition above and incorrectly attempted to start the
control transfer.
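
To make the failure mode concrete, here is a minimal user-space sketch of
the two behaviours, not driver code. It assumes a reduced model in which
struct ctrl_model, soft_disconnect_unguarded() and
soft_disconnect_guarded() are hypothetical names, the enum lists only the
two states named above, and a counter stands in for the SETUP-phase wait
that timed out in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced model: only the two ep0 states mentioned in the thread. */
enum ep0_state {
	EP0_UNCONNECTED = 0,	/* default when the controller never started */
	EP0_SETUP_PHASE,
};

/* Hypothetical stand-in for the few dwc3 fields that matter here. */
struct ctrl_model {
	bool pullups_connected;
	enum ep0_state ep0state;
	int setup_waits;	/* counts waits for the SETUP phase */
};

/* Only looks at ep0state, so an unstarted controller ends up waiting. */
static int soft_disconnect_unguarded(struct ctrl_model *c)
{
	assert(c != NULL);
	if (c->ep0state != EP0_SETUP_PHASE)
		c->setup_waits++;	/* stands in for the wait that timed out */
	return 0;
}

/* Returns early when the pull-ups were never connected. */
static int soft_disconnect_guarded(struct ctrl_model *c)
{
	assert(c != NULL);
	if (!c->pullups_connected)
		return 0;
	return soft_disconnect_unguarded(c);
}
```

With no gadget driver bound (pullups_connected false, ep0state left at
EP0_UNCONNECTED), the unguarded variant records a wait and the guarded one
does not, which matches the reported behaviour before and after the
proposed change.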

> 
> 
> 
> > I think something like this should be sufficient. Would you mind giving
> > it a try?
> >
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index 564976b3e2b9..1990d6371066 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -2656,6 +2656,11 @@ static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
> >   	int ret;
> >   
> >   	spin_lock_irqsave(&dwc->lock, flags);
> > +	if (!dwc->pullups_connected) {
> > +		spin_unlock_irqrestore(&dwc->lock, flags);
> > +		return 0;
> > +	}
> > +
> >   	dwc->connected = false;
> >   
> >   	/*
> >
> This patch fixes the reported issue. Feel free to add:
> 
> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> 

Thanks for the report and Tested-by! I'll send a fix patch soon.

BR,
Thinh

Patch

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 019368f8e9c4..564976b3e2b9 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -4709,15 +4709,13 @@  int dwc3_gadget_suspend(struct dwc3 *dwc)
 	unsigned long flags;
 	int ret;
 
-	if (!dwc->gadget_driver)
-		return 0;
-
 	ret = dwc3_gadget_soft_disconnect(dwc);
 	if (ret)
 		goto err;
 
 	spin_lock_irqsave(&dwc->lock, flags);
-	dwc3_disconnect_gadget(dwc);
+	if (dwc->gadget_driver)
+		dwc3_disconnect_gadget(dwc);
 	spin_unlock_irqrestore(&dwc->lock, flags);
 
 	return 0;
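
For readers following the change, the pattern the patch moves to (test
dwc->gadget_driver and act on it while dwc->lock is held) can be modelled
in user space. This is only a sketch under stated assumptions:
struct controller_model, struct driver_model, controller_unbind() and
controller_suspend() are hypothetical stand-ins, a pthread mutex plays the
role of the spinlock, and the unbind path is assumed to clear the pointer
under the same lock. It is not the driver code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct controller_model;

/* Hypothetical stand-in for the gadget driver and its disconnect hook. */
struct driver_model {
	void (*disconnect)(struct controller_model *c);
};

/* Hypothetical stand-in for the controller state. */
struct controller_model {
	pthread_mutex_t lock;		/* plays the role of dwc->lock */
	struct driver_model *driver;	/* plays the role of dwc->gadget_driver */
	int disconnect_calls;
};

/* Example callback: just counts how often it was invoked. */
static void count_disconnect(struct controller_model *c)
{
	c->disconnect_calls++;
}

static void controller_init(struct controller_model *c, struct driver_model *drv)
{
	assert(c != NULL);
	pthread_mutex_init(&c->lock, NULL);
	c->driver = drv;
	c->disconnect_calls = 0;
}

/* Unbind path: clear the pointer while holding the lock. */
static void controller_unbind(struct controller_model *c)
{
	pthread_mutex_lock(&c->lock);
	c->driver = NULL;
	pthread_mutex_unlock(&c->lock);
}

/*
 * Suspend path: the NULL test and the use of the pointer sit in the same
 * critical section, so unbind cannot clear it in between.
 * Returns 1 if the disconnect callback ran, 0 otherwise.
 */
static int controller_suspend(struct controller_model *c)
{
	int called = 0;

	pthread_mutex_lock(&c->lock);
	if (c->driver && c->driver->disconnect) {
		c->driver->disconnect(c);
		called = 1;
	}
	pthread_mutex_unlock(&c->lock);
	return called;
}
```

In this model, calling controller_suspend() after controller_unbind()
skips the callback instead of dereferencing a NULL pointer. The pre-patch
shape, with the test outside the lock and the use inside it, leaves a
window in which the pointer can change between the two.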