diff mbox series

NULL pointer issue due to .pullup timeout at dwc3

Message ID VI1PR04MB5327779C65037389EB8400148BA10@VI1PR04MB5327.eurprd04.prod.outlook.com (mailing list archive)
State New, archived
Headers show
Series NULL pointer issue due to .pullup timeout at dwc3 | expand

Commit Message

Peter Chen Aug. 26, 2019, 9:51 a.m. UTC
Hi Balbi,

When do configfs function add and remove stress test, I find dwc3 gadget .pullup will
timeout if there is a request on the way. Even I enlarge the delay, there is still
timeout for .pullup.


This timeout causes the NULL pointer issue for ethernet gadget, eg. NCM, the reason
is ncm->notify_req is NULL, but the ncm_notify_complete is not called before .unbind is
called. See below oops:

using random self ethernet address
using random host ethernet address
udc 38100000.usb: registering UDC driver [g1]
configfs-gadget gadget: adding 'cdc_network'/000000001a44bfe6 to config 'c'/000000008bd6a9b3
usb0: HOST MAC 12:a5:cf:42:92:fd
usb0: MAC 5e:bc:ca:27:92:b1
configfs-gadget gadget: CDC Network: super speed IN/ep1in OUT/ep1out NOTIFY/ep2in
configfs-gadget gadget: resume
configfs-gadget gadget: high-speed config #1: c
configfs-gadget gadget: reset ncm control 0
configfs-gadget gadget: init ncm ctrl 0
configfs-gadget gadget: notify speed 425984000
configfs-gadget 38100000.usb: unregistering UDC driver [g1]
dwc3 38100000.usb: dwc3_gadget_pullup:ret = -110
configfs-gadget gadget: unbind function 'cdc_network'/000000001a44bfe6
configfs-gadget gadget: ncm unbind
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Mem abort info:
  ESR = 0x96000006
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000006
  CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=00000000f29ca000
[0000000000000000] pgd=00000000f4ee8003, pud=00000000f4eec003, pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 3 PID: 673 Comm: sh Not tainted 5.3.0-rc3-next-20190809-01677-g1fb652444ffb-dirty #89
Hardware name: NXP i.MX8MQ EVK (DT)
pstate: 80000005 (Nzcv daif -PAN -UAO)
pc : ncm_unbind+0x48/0x90
lr : ncm_unbind+0x44/0x90
sp : ffff00001280bbd0
x29: ffff00001280bbd0 x28: ffff8000b289aa00 
x27: ffff8000b4f0e4b8 x26: ffff8000b4f0e520 
x25: ffff8000b4dc9c98 x24: ffff00001169ade8 
x23: ffff00001229ba28 x22: ffff8000b4dc9800 
x21: ffff8000b4f0e600 x20: ffff8000b4f0e520 
x19: ffff800001d2be00 x18: 0000000000000030 
x17: 0000000000000000 x16: 0000000000000000 
x15: ffffffffffffffff x14: ffff000011ff98c8 
x13: 0000000000000000 x12: ffff0000122db000 
x11: ffff000012018000 x10: ffff0000122db5b8 
x9 : 0000000000000000 x8 : 0000000000000007 
x7 : 00000000000001eb x6 : ffff0000122db000 
x5 : ffff000010bc2100 x4 : ffff7e0002dd52a0 
x3 : 0000000080100008 x2 : ffff8000b754ae00 
x1 : 6621d6314c7f8200 x0 : 0000000000000000 
Call trace:
 ncm_unbind+0x48/0x90
 purge_configs_funcs+0xa4/0x138
 configfs_composite_unbind+0x60/0xa0
 usb_gadget_remove_driver+0x40/0xa0
 usb_gadget_unregister_driver+0xc4/0xf8
 unregister_gadget+0x28/0x58
 gadget_dev_desc_UDC_store+0xa0/0x100
 configfs_write_file+0xe8/0x180
 __vfs_write+0x48/0x90
 vfs_write+0xb8/0x1c0
 ksys_write+0x74/0xf8
 __arm64_sys_write+0x24/0x30
 el0_svc_common.constprop.0+0x94/0x158
 el0_svc_handler+0x34/0x90
 el0_svc+0x8/0xc
Code: aa1303e0 392c203f 97ff8732 f940a260 (f9400000) 

It is very easy to reproduce it, just create ncm function,
and sleep 1 second, then, remove it.

I am not sure if it is the common issue for dwc3, there is no such
requirement for chipidea and cadence3. Would you please have a check?
Thanks.

Best regards,
Peter Chen

Comments

Felipe Balbi Aug. 26, 2019, 10:01 a.m. UTC | #1
Hi,

Peter Chen <peter.chen@nxp.com> writes:

> Hi Balbi,
>
> When do configfs function add and remove stress test, I find dwc3
> gadget .pullup will timeout if there is a request on the way. Even I

what do you mean by "a request on the way"?

> enlarge the delay, there is still timeout for .pullup.
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index f13bef950951..e95955b6a225 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -1827,6 +1827,7 @@ static int dwc3_gadget_run_stop(struct dwc3 *dwc, int is_on, int suspend)
>         do {
>                 reg = dwc3_readl(dwc->regs, DWC3_DSTS);
>                 reg &= DWC3_DSTS_DEVCTRLHLT;
> +               udelay(1);
>         } while (--timeout && !(!is_on ^ !reg));
>  
>         if (!timeout)
> @@ -1861,6 +1862,8 @@ static int dwc3_gadget_pullup(struct usb_gadget *g, int is_on)
>         spin_lock_irqsave(&dwc->lock, flags);
>         ret = dwc3_gadget_run_stop(dwc, is_on, false);
>         spin_unlock_irqrestore(&dwc->lock, flags);
> +       if (ret)
> +               dev_err(dwc->dev, "%s:ret = %d\n", __func__, ret);
>  
>         return ret;
>  }
>
> This timeout causes the NULL pointer issue for ethernet gadget, eg. NCM, the reason
> is ncm->notify_req is NULL, but the ncm_notify_complete is not called before .unbind is
> called. See below oops:

sounds like a bug in ncm. If we haven't connected to host side yet, we
can't really complete the request.

> It is very easy to reproduce it, just create ncm function,
> and sleep 1 second, then, remove it.
>
> I am not sure if it is the common issue for dwc3, there is no such
> requirement for chipidea and cadence3. Would you please have a check?

could you capture tracepoints of what's happening here?
diff mbox series

Patch

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index f13bef950951..e95955b6a225 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1827,6 +1827,7 @@  static int dwc3_gadget_run_stop(struct dwc3 *dwc, int is_on, int suspend)
        do {
                reg = dwc3_readl(dwc->regs, DWC3_DSTS);
                reg &= DWC3_DSTS_DEVCTRLHLT;
+               udelay(1);
        } while (--timeout && !(!is_on ^ !reg));
 
        if (!timeout)
@@ -1861,6 +1862,8 @@  static int dwc3_gadget_pullup(struct usb_gadget *g, int is_on)
        spin_lock_irqsave(&dwc->lock, flags);
        ret = dwc3_gadget_run_stop(dwc, is_on, false);
        spin_unlock_irqrestore(&dwc->lock, flags);
+       if (ret)
+               dev_err(dwc->dev, "%s:ret = %d\n", __func__, ret);
 
        return ret;
 }