Message ID | TYUPR06MB621737D16F68B5ABD6CF5772D2272@TYUPR06MB6217.apcprd06.prod.outlook.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | usb: gadget: u_serial: Fix the issue that gs_start_io crashed due to accessing null pointer | expand |
On 18-11-24 08:42 am, 胡连勤 wrote: > From: Lianqin Hu <hulianqin@vivo.com> > > Considering that in some extreme cases, > when u_serial driver is accessed by multiple threads, > Thread A is executing the open operation and calling the gs_open, > Thread B is executing the disconnect operation and calling the > gserial_disconnect function,The port->port_usb pointer will be set to NULL. > [...] > --- > drivers/usb/gadget/function/u_serial.c | 25 +++++++++++++++---------- > 1 file changed, 15 insertions(+), 10 deletions(-) > > diff --git a/drivers/usb/gadget/function/u_serial.c b/drivers/usb/gadget/function/u_serial.c > index 0a8c05b2746b..9ab2dbed60a8 100644 > --- a/drivers/usb/gadget/function/u_serial.c > +++ b/drivers/usb/gadget/function/u_serial.c > @@ -124,6 +124,7 @@ struct gs_port { > struct kfifo port_write_buf; > wait_queue_head_t drain_wait; /* wait while writes drain */ > bool write_busy; > + bool read_busy; > wait_queue_head_t close_wait; > bool suspended; /* port suspended */ > bool start_delayed; /* delay start when suspended */ > @@ -331,9 +332,11 @@ __acquires(&port->port_lock) > /* drop lock while we call out; the controller driver > * may need to call us back (e.g. for disconnect) > */ > + port->read_busy = true; > spin_unlock(&port->port_lock); > status = usb_ep_queue(out, req, GFP_ATOMIC); > spin_lock(&port->port_lock); > + port->read_busy = false; > > if (status) { > pr_debug("%s: %s %s err %d\n", > @@ -1412,19 +1415,21 @@ void gserial_disconnect(struct gserial *gser) > /* tell the TTY glue not to do I/O here any more */ > spin_lock(&port->port_lock); > > - gs_console_disconnect(port); > + if (!port->read_busy) { start_tx/rx rely on port->port_usb for queuing the requests, and if its not null during disconnect, tx/rx would keep on queuing requests to UDC even after disconnect (which is not ideal). Here in your case, after read_busy is set, start_rx would queue something outside of spinlock, meanwhile disconnect happens but port_usb is still valid (because read_busy is set) and start_rx would break early. But start_tx could continue queuing into disconnected UDC (if 'started' is non-zero, which could happen due to timing). Can't you try something like this, --- a/drivers/usb/gadget/function/u_serial.c +++ b/drivers/usb/gadget/function/u_serial.c @@ -579,9 +579,12 @@ static int gs_start_io(struct gs_port *port) * we didn't in gs_start_tx() */ tty_wakeup(port->port.tty); } else { - gs_free_requests(ep, head, &port->read_allocated); - gs_free_requests(port->port_usb->in, &port->write_pool, - &port->write_allocated); + /* Free reqs only if we are still connected */ + if (port->port_usb) { + gs_free_requests(ep, head, &port->read_allocated); + gs_free_requests(port->port_usb->in, &port->write_pool, + &port->write_allocated); + } status = -EIO; } This will skip freeing reqs (and your crash) if port_usb is null and freeing would be taken care by disconnect callback. > + gs_console_disconnect(port); > > - /* REVISIT as above: how best to track this? */ > - port->port_line_coding = gser->port_line_coding; > + /* REVISIT as above: how best to track this? */ > + port->port_line_coding = gser->port_line_coding; > > - port->port_usb = NULL; > - gser->ioport = NULL; > - if (port->port.count > 0) { > - wake_up_interruptible(&port->drain_wait); > - if (port->port.tty) > - tty_hangup(port->port.tty); > + port->port_usb = NULL; > + gser->ioport = NULL; > + if (port->port.count > 0) { > + wake_up_interruptible(&port->drain_wait); > + if (port->port.tty) > + tty_hangup(port->port.tty); > + } > + port->suspended = false; > } > - port->suspended = false; > spin_unlock(&port->port_lock); > spin_unlock_irqrestore(&serial_port_lock, flags); > Regards, Prashanth K
Hello Prashanth: > > Considering that in some extreme cases, when u_serial driver is > > accessed by multiple threads, Thread A is executing the open operation > > and calling the gs_open, Thread B is executing the disconnect > > operation and calling the gserial_disconnect function,The > > port->port_usb pointer will be set to NULL. > > > [...] > > --- > > drivers/usb/gadget/function/u_serial.c | 25 +++++++++++++++---------- > > 1 file changed, 15 insertions(+), 10 deletions(-) > > > > diff --git a/drivers/usb/gadget/function/u_serial.c > > b/drivers/usb/gadget/function/u_serial.c > > index 0a8c05b2746b..9ab2dbed60a8 100644 > > --- a/drivers/usb/gadget/function/u_serial.c > > +++ b/drivers/usb/gadget/function/u_serial.c > > @@ -124,6 +124,7 @@ struct gs_port { > > struct kfifo port_write_buf; > > wait_queue_head_t drain_wait; /* wait while writes drain */ > > bool write_busy; > > + bool read_busy; > > wait_queue_head_t close_wait; > > bool suspended; /* port suspended */ > > bool start_delayed; /* delay start when > suspended */ > > @@ -331,9 +332,11 @@ __acquires(&port->port_lock) > > /* drop lock while we call out; the controller driver > > * may need to call us back (e.g. for disconnect) > > */ > > + port->read_busy = true; > > spin_unlock(&port->port_lock); > > status = usb_ep_queue(out, req, GFP_ATOMIC); > > spin_lock(&port->port_lock); > > + port->read_busy = false; > > > > if (status) { > > pr_debug("%s: %s %s err %d\n", > > @@ -1412,19 +1415,21 @@ void gserial_disconnect(struct gserial *gser) > > /* tell the TTY glue not to do I/O here any more */ > > spin_lock(&port->port_lock); > > > > - gs_console_disconnect(port); > > + if (!port->read_busy) { > start_tx/rx rely on port->port_usb for queuing the requests, and if its not > null during disconnect, tx/rx would keep on queuing requests to UDC even > after disconnect (which is not ideal). Here in your case, after read_busy is set, > start_rx would queue something outside of spinlock, meanwhile disconnect > happens but port_usb is still valid (because read_busy is set) and start_rx > would break early. But start_tx could continue queuing into disconnected > UDC (if 'started' is non-zero, which could happen due to timing). Can't you try > something like this, > > --- a/drivers/usb/gadget/function/u_serial.c > +++ b/drivers/usb/gadget/function/u_serial.c > @@ -579,9 +579,12 @@ static int gs_start_io(struct gs_port *port) > * we didn't in gs_start_tx() */ > tty_wakeup(port->port.tty); > } else { > - gs_free_requests(ep, head, &port->read_allocated); > - gs_free_requests(port->port_usb->in, &port->write_pool, > - &port->write_allocated); > + /* Free reqs only if we are still connected */ > + if (port->port_usb) { > + gs_free_requests(ep, head, &port->read_allocated); > + gs_free_requests(port->port_usb->in, > &port->write_pool, > + &port->write_allocated); > + } > status = -EIO; > } > > This will skip freeing reqs (and your crash) if port_usb is null and freeing > would be taken care by disconnect callback. > > First of all, the patch you gave can solve the problem we are currently facing. When we first encountered this problem, we also thought about adding a null check operation to deal with it, but we saw that the entry of this function (gs_start_io) had a null check operation for port->port_usb, so I gave up the idea of null check during free_req (maybe I made a simple problem complicated), and thought about optimizing it from the software logic, so that the port->port_usb pointer is always valid before gs_start_io is executed. Thanks
On 19-11-24 06:03 pm, 胡连勤 wrote: > Hello Prashanth: > >>> Considering that in some extreme cases, when u_serial driver is >>> accessed by multiple threads, Thread A is executing the open operation >>> and calling the gs_open, Thread B is executing the disconnect >>> operation and calling the gserial_disconnect function,The >>> port->port_usb pointer will be set to NULL. >>> >> [...] >>> --- >>> drivers/usb/gadget/function/u_serial.c | 25 +++++++++++++++---------- >>> 1 file changed, 15 insertions(+), 10 deletions(-) >>> >>> diff --git a/drivers/usb/gadget/function/u_serial.c >>> b/drivers/usb/gadget/function/u_serial.c >>> index 0a8c05b2746b..9ab2dbed60a8 100644 >>> --- a/drivers/usb/gadget/function/u_serial.c >>> +++ b/drivers/usb/gadget/function/u_serial.c >>> @@ -124,6 +124,7 @@ struct gs_port { >>> struct kfifo port_write_buf; >>> wait_queue_head_t drain_wait; /* wait while writes drain */ >>> bool write_busy; >>> + bool read_busy; >>> wait_queue_head_t close_wait; >>> bool suspended; /* port suspended */ >>> bool start_delayed; /* delay start when >> suspended */ >>> @@ -331,9 +332,11 @@ __acquires(&port->port_lock) >>> /* drop lock while we call out; the controller driver >>> * may need to call us back (e.g. for disconnect) >>> */ >>> + port->read_busy = true; >>> spin_unlock(&port->port_lock); >>> status = usb_ep_queue(out, req, GFP_ATOMIC); >>> spin_lock(&port->port_lock); >>> + port->read_busy = false; >>> >>> if (status) { >>> pr_debug("%s: %s %s err %d\n", >>> @@ -1412,19 +1415,21 @@ void gserial_disconnect(struct gserial *gser) >>> /* tell the TTY glue not to do I/O here any more */ >>> spin_lock(&port->port_lock); >>> >>> - gs_console_disconnect(port); >>> + if (!port->read_busy) { >> start_tx/rx rely on port->port_usb for queuing the requests, and if its not >> null during disconnect, tx/rx would keep on queuing requests to UDC even >> after disconnect (which is not ideal). Here in your case, after read_busy is set, >> start_rx would queue something outside of spinlock, meanwhile disconnect >> happens but port_usb is still valid (because read_busy is set) and start_rx >> would break early. But start_tx could continue queuing into disconnected >> UDC (if 'started' is non-zero, which could happen due to timing). Can't you try >> something like this, >> >> --- a/drivers/usb/gadget/function/u_serial.c >> +++ b/drivers/usb/gadget/function/u_serial.c >> @@ -579,9 +579,12 @@ static int gs_start_io(struct gs_port *port) >> * we didn't in gs_start_tx() */ >> tty_wakeup(port->port.tty); >> } else { >> - gs_free_requests(ep, head, &port->read_allocated); >> - gs_free_requests(port->port_usb->in, &port->write_pool, >> - &port->write_allocated); >> + /* Free reqs only if we are still connected */ >> + if (port->port_usb) { >> + gs_free_requests(ep, head, &port->read_allocated); >> + gs_free_requests(port->port_usb->in, >> &port->write_pool, >> + &port->write_allocated); >> + } >> status = -EIO; >> } >> >> This will skip freeing reqs (and your crash) if port_usb is null and freeing >> would be taken care by disconnect callback. >> >> > First of all, the patch you gave can solve the problem we are currently facing. > > When we first encountered this problem, we also thought about adding a null check operation to deal with it, > but we saw that the entry of this function (gs_start_io) had a null check operation for port->port_usb, so I gave up > the idea of null check during free_req (maybe I made a simple problem complicated), > and thought about optimizing it from the software logic, so that the port->port_usb pointer is always valid before gs_start_io is executed. > If it solves the problem, i guess you can use the null pointer check as I suggested and send a new patchset, the current one will introduce new problems. Keep the issue analysis as it is in commit text since its descriptive enough to understand the problem. Regards, Prashanth K
Hello Prashanth: > > > >>> Considering that in some extreme cases, when u_serial driver is > >>> accessed by multiple threads, Thread A is executing the open > >>> operation and calling the gs_open, Thread B is executing the > >>> disconnect operation and calling the gserial_disconnect function,The > >>> port->port_usb pointer will be set to NULL. > >>> > >> [...] > >>> --- > >>> drivers/usb/gadget/function/u_serial.c | 25 > >>> +++++++++++++++---------- > >>> 1 file changed, 15 insertions(+), 10 deletions(-) > >>> > >>> diff --git a/drivers/usb/gadget/function/u_serial.c > >>> b/drivers/usb/gadget/function/u_serial.c > >>> index 0a8c05b2746b..9ab2dbed60a8 100644 > >>> --- a/drivers/usb/gadget/function/u_serial.c > >>> +++ b/drivers/usb/gadget/function/u_serial.c > >>> @@ -124,6 +124,7 @@ struct gs_port { > >>> struct kfifo port_write_buf; > >>> wait_queue_head_t drain_wait; /* wait while writes drain */ > >>> bool write_busy; > >>> + bool read_busy; > >>> wait_queue_head_t close_wait; > >>> bool suspended; /* port suspended */ > >>> bool start_delayed; /* delay start when > >> suspended */ > >>> @@ -331,9 +332,11 @@ __acquires(&port->port_lock) > >>> /* drop lock while we call out; the controller driver > >>> * may need to call us back (e.g. for disconnect) > >>> */ > >>> + port->read_busy = true; > >>> spin_unlock(&port->port_lock); > >>> status = usb_ep_queue(out, req, GFP_ATOMIC); > >>> spin_lock(&port->port_lock); > >>> + port->read_busy = false; > >>> > >>> if (status) { > >>> pr_debug("%s: %s %s err %d\n", > >>> @@ -1412,19 +1415,21 @@ void gserial_disconnect(struct gserial *gser) > >>> /* tell the TTY glue not to do I/O here any more */ > >>> spin_lock(&port->port_lock); > >>> > >>> - gs_console_disconnect(port); > >>> + if (!port->read_busy) { > >> start_tx/rx rely on port->port_usb for queuing the requests, and if > >> its not null during disconnect, tx/rx would keep on queuing requests > >> to UDC even after disconnect (which is not ideal). Here in your case, > >> after read_busy is set, start_rx would queue something outside of > >> spinlock, meanwhile disconnect happens but port_usb is still valid > >> (because read_busy is set) and start_rx would break early. But > >> start_tx could continue queuing into disconnected UDC (if 'started' > >> is non-zero, which could happen due to timing). Can't you try > >> something like this, > >> > >> --- a/drivers/usb/gadget/function/u_serial.c > >> +++ b/drivers/usb/gadget/function/u_serial.c > >> @@ -579,9 +579,12 @@ static int gs_start_io(struct gs_port *port) > >> * we didn't in gs_start_tx() */ > >> tty_wakeup(port->port.tty); > >> } else { > >> - gs_free_requests(ep, head, &port->read_allocated); > >> - gs_free_requests(port->port_usb->in, &port->write_pool, > >> - &port->write_allocated); > >> + /* Free reqs only if we are still connected */ > >> + if (port->port_usb) { > >> + gs_free_requests(ep, head, &port->read_allocated); > >> + gs_free_requests(port->port_usb->in, > >> &port->write_pool, > >> + &port->write_allocated); > >> + } > >> status = -EIO; > >> } > >> > >> This will skip freeing reqs (and your crash) if port_usb is null and > >> freeing would be taken care by disconnect callback. > >> > >> > > First of all, the patch you gave can solve the problem we are currently > facing. > > > > When we first encountered this problem, we also thought about adding a > > null check operation to deal with it, but we saw that the entry of > > this function (gs_start_io) had a null check operation for > > port->port_usb, so I gave up the idea of null check during free_req (maybe > I made a simple problem complicated), and thought about optimizing it from > the software logic, so that the port->port_usb pointer is always valid before > gs_start_io is executed. > > > > If it solves the problem, i guess you can use the null pointer check as I > suggested and send a new patchset, the current one will introduce new > problems. Keep the issue analysis as it is in commit text since its descriptive > enough to understand the problem. OK, I'll do a stress test and send out a new patch later. Thanks
diff --git a/drivers/usb/gadget/function/u_serial.c b/drivers/usb/gadget/function/u_serial.c index 0a8c05b2746b..9ab2dbed60a8 100644 --- a/drivers/usb/gadget/function/u_serial.c +++ b/drivers/usb/gadget/function/u_serial.c @@ -124,6 +124,7 @@ struct gs_port { struct kfifo port_write_buf; wait_queue_head_t drain_wait; /* wait while writes drain */ bool write_busy; + bool read_busy; wait_queue_head_t close_wait; bool suspended; /* port suspended */ bool start_delayed; /* delay start when suspended */ @@ -331,9 +332,11 @@ __acquires(&port->port_lock) /* drop lock while we call out; the controller driver * may need to call us back (e.g. for disconnect) */ + port->read_busy = true; spin_unlock(&port->port_lock); status = usb_ep_queue(out, req, GFP_ATOMIC); spin_lock(&port->port_lock); + port->read_busy = false; if (status) { pr_debug("%s: %s %s err %d\n", @@ -1412,19 +1415,21 @@ void gserial_disconnect(struct gserial *gser) /* tell the TTY glue not to do I/O here any more */ spin_lock(&port->port_lock); - gs_console_disconnect(port); + if (!port->read_busy) { + gs_console_disconnect(port); - /* REVISIT as above: how best to track this? */ - port->port_line_coding = gser->port_line_coding; + /* REVISIT as above: how best to track this? */ + port->port_line_coding = gser->port_line_coding; - port->port_usb = NULL; - gser->ioport = NULL; - if (port->port.count > 0) { - wake_up_interruptible(&port->drain_wait); - if (port->port.tty) - tty_hangup(port->port.tty); + port->port_usb = NULL; + gser->ioport = NULL; + if (port->port.count > 0) { + wake_up_interruptible(&port->drain_wait); + if (port->port.tty) + tty_hangup(port->port.tty); + } + port->suspended = false; } - port->suspended = false; spin_unlock(&port->port_lock); spin_unlock_irqrestore(&serial_port_lock, flags);