Message ID | 20211201100205.25448-1-quic_wcheng@quicinc.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | usb: gadget: f_fs: Wake up IO thread during disconnect | expand |
On Wed, Dec 01, 2021 at 02:02:05AM -0800, Wesley Cheng wrote: > During device disconnect or composition unbind, applications should be > notified that the endpoints are no longer enabled, so that it can take > the proper actions to handle its IO threads. Otherwise, they can be > left waiting for endpoints until EPs are re-enabled. > > Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com> > --- > drivers/usb/gadget/function/f_fs.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c > index 3c584da9118c..0b0747d96378 100644 > --- a/drivers/usb/gadget/function/f_fs.c > +++ b/drivers/usb/gadget/function/f_fs.c > @@ -957,10 +957,12 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data) > if (file->f_flags & O_NONBLOCK) > return -EAGAIN; > > - ret = wait_event_interruptible( > - epfile->ffs->wait, (ep = epfile->ep)); > + ret = wait_event_interruptible(epfile->ffs->wait, > + (ep = epfile->ep) || !epfile->ffs->func); > if (ret) > return -EINTR; > + if (!epfile->ffs->func) > + return -ENODEV; This seems strange - we are inside the case where the endpoint is not initially enabled, if we're returning ENODEV here shouldn't that happen in all cases? Beyond that, there is no locking for accessing ffs->func here; modification happens in gadget callbacks so it's guarded by the gadget core (the existing case in ffs_ep0_ioctl() looks suspicious as well). But I can't see why this change is necessary - there are event notifications through ep0 when this happens, as can be seen in the hunk below from the ffs_event_add(ffs, FUNCTIONFS_DISABLE) line. If userspace cares about this, then it can read the events from ep0. > } > > /* Do we halt? */ > @@ -3292,6 +3294,7 @@ static int ffs_func_set_alt(struct usb_function *f, > if (alt == (unsigned)-1) { > ffs->func = NULL; > ffs_event_add(ffs, FUNCTIONFS_DISABLE); > + wake_up_interruptible(&ffs->wait); > return 0; > } >
On Wed, Dec 01, 2021 at 04:41:10PM +0000, John Keeping wrote: > On Wed, Dec 01, 2021 at 02:02:05AM -0800, Wesley Cheng wrote: > > During device disconnect or composition unbind, applications should be > > notified that the endpoints are no longer enabled, so that it can take > > the proper actions to handle its IO threads. Otherwise, they can be > > left waiting for endpoints until EPs are re-enabled. > > > > Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com> > > --- > > drivers/usb/gadget/function/f_fs.c | 7 +++++-- > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c > > index 3c584da9118c..0b0747d96378 100644 > > --- a/drivers/usb/gadget/function/f_fs.c > > +++ b/drivers/usb/gadget/function/f_fs.c > > @@ -957,10 +957,12 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data) > > if (file->f_flags & O_NONBLOCK) > > return -EAGAIN; > > > > - ret = wait_event_interruptible( > > - epfile->ffs->wait, (ep = epfile->ep)); > > + ret = wait_event_interruptible(epfile->ffs->wait, > > + (ep = epfile->ep) || !epfile->ffs->func); I looked at this again, and doesn't this totally break the wait condition? epfile->ep is set to non-null in ffs_func_eps_enable() which is called from ffs_func_set_alt() just after ffs->func is set to non-null, and then those are also set back to null at the same time. So the condition boils down to a || !a and this never blocks. Or am I missing something? > > if (ret) > > return -EINTR; > > + if (!epfile->ffs->func) > > + return -ENODEV; > > This seems strange - we are inside the case where the endpoint is not > initially enabled, if we're returning ENODEV here shouldn't that happen > in all cases? > > Beyond that, there is no locking for accessing ffs->func here; > modification happens in gadget callbacks so it's guarded by the gadget > core (the existing case in ffs_ep0_ioctl() looks suspicious as well). > > But I can't see why this change is necessary - there are event > notifications through ep0 when this happens, as can be seen in the hunk > below from the ffs_event_add(ffs, FUNCTIONFS_DISABLE) line. If > userspace cares about this, then it can read the events from ep0. > > > } > > > > /* Do we halt? */ > > @@ -3292,6 +3294,7 @@ static int ffs_func_set_alt(struct usb_function *f, > > if (alt == (unsigned)-1) { > > ffs->func = NULL; > > ffs_event_add(ffs, FUNCTIONFS_DISABLE); > > + wake_up_interruptible(&ffs->wait); > > return 0; > > } > >
Hi John, On 12/2/2021 6:49 AM, John Keeping wrote: > On Wed, Dec 01, 2021 at 04:41:10PM +0000, John Keeping wrote: >> On Wed, Dec 01, 2021 at 02:02:05AM -0800, Wesley Cheng wrote: >>> During device disconnect or composition unbind, applications should be >>> notified that the endpoints are no longer enabled, so that it can take >>> the proper actions to handle its IO threads. Otherwise, they can be >>> left waiting for endpoints until EPs are re-enabled. >>> >>> Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com> >>> --- >>> drivers/usb/gadget/function/f_fs.c | 7 +++++-- >>> 1 file changed, 5 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c >>> index 3c584da9118c..0b0747d96378 100644 >>> --- a/drivers/usb/gadget/function/f_fs.c >>> +++ b/drivers/usb/gadget/function/f_fs.c >>> @@ -957,10 +957,12 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data) >>> if (file->f_flags & O_NONBLOCK) >>> return -EAGAIN; >>> >>> - ret = wait_event_interruptible( >>> - epfile->ffs->wait, (ep = epfile->ep)); >>> + ret = wait_event_interruptible(epfile->ffs->wait, >>> + (ep = epfile->ep) || !epfile->ffs->func); > > I looked at this again, and doesn't this totally break the wait > condition? > > epfile->ep is set to non-null in ffs_func_eps_enable() which is called > from ffs_func_set_alt() just after ffs->func is set to non-null, and > then those are also set back to null at the same time. > > So the condition boils down to a || !a and this never blocks. Or am I > missing something? Thanks for the feedback. Hmm...yes, I get what you're saying. The EPfiles and func is basically being set/cleared together, so the above change wouldn't be any different than checking for ep != epfile->ep. Let me see if there's another way we can address the issue this change is trying to resolve. > >>> if (ret) >>> return -EINTR; >>> + if (!epfile->ffs->func) >>> + return -ENODEV; >> >> This seems strange - we are inside the case where the endpoint is not >> initially enabled, if we're returning ENODEV here shouldn't that happen >> in all cases? >> >> Beyond that, there is no locking for accessing ffs->func here; >> modification happens in gadget callbacks so it's guarded by the gadget >> core (the existing case in ffs_ep0_ioctl() looks suspicious as well). >> >> But I can't see why this change is necessary - there are event >> notifications through ep0 when this happens, as can be seen in the hunk >> below from the ffs_event_add(ffs, FUNCTIONFS_DISABLE) line. If >> userspace cares about this, then it can read the events from ep0. >> In short, the change is basically trying to resolve an issue in an application that has a separate thread handling the IO ops. When the USB cable is disconnected, the application would expect for this IO thread to be completed and exit gracefully, and restarting it on the next connect. However, since we are stuck in the read() it can not proceed further. I guess in these situations, we should utilize the O_NONBLOCK file parameter? Thanks Wesley Cheng >>> } >>> >>> /* Do we halt? */ >>> @@ -3292,6 +3294,7 @@ static int ffs_func_set_alt(struct usb_function *f, >>> if (alt == (unsigned)-1) { >>> ffs->func = NULL; >>> ffs_event_add(ffs, FUNCTIONFS_DISABLE); >>> + wake_up_interruptible(&ffs->wait); >>> return 0; >>> } >>>
Hi Wesley, On Thu, Dec 02, 2021 at 11:33:40PM -0800, Wesley Cheng wrote: > On 12/2/2021 6:49 AM, John Keeping wrote: > > On Wed, Dec 01, 2021 at 04:41:10PM +0000, John Keeping wrote: > >> On Wed, Dec 01, 2021 at 02:02:05AM -0800, Wesley Cheng wrote: > >>> During device disconnect or composition unbind, applications should be > >>> notified that the endpoints are no longer enabled, so that it can take > >>> the proper actions to handle its IO threads. Otherwise, they can be > >>> left waiting for endpoints until EPs are re-enabled. > >>> > >>> Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com> > >>> --- > >>> drivers/usb/gadget/function/f_fs.c | 7 +++++-- > >>> 1 file changed, 5 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c > >>> index 3c584da9118c..0b0747d96378 100644 > >>> --- a/drivers/usb/gadget/function/f_fs.c > >>> +++ b/drivers/usb/gadget/function/f_fs.c > >>> @@ -957,10 +957,12 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data) > >>> if (file->f_flags & O_NONBLOCK) > >>> return -EAGAIN; > >>> > >>> - ret = wait_event_interruptible( > >>> - epfile->ffs->wait, (ep = epfile->ep)); > >>> + ret = wait_event_interruptible(epfile->ffs->wait, > >>> + (ep = epfile->ep) || !epfile->ffs->func); > > > > I looked at this again, and doesn't this totally break the wait > > condition? > > > > epfile->ep is set to non-null in ffs_func_eps_enable() which is called > > from ffs_func_set_alt() just after ffs->func is set to non-null, and > > then those are also set back to null at the same time. > > > > So the condition boils down to a || !a and this never blocks. Or am I > > missing something? > > Thanks for the feedback. Hmm...yes, I get what you're saying. The > EPfiles and func is basically being set/cleared together, so the above > change wouldn't be any different than checking for ep != epfile->ep. > Let me see if there's another way we can address the issue this change > is trying to resolve. > > > > >>> if (ret) > >>> return -EINTR; > >>> + if (!epfile->ffs->func) > >>> + return -ENODEV; > >> > >> This seems strange - we are inside the case where the endpoint is not > >> initially enabled, if we're returning ENODEV here shouldn't that happen > >> in all cases? > >> > >> Beyond that, there is no locking for accessing ffs->func here; > >> modification happens in gadget callbacks so it's guarded by the gadget > >> core (the existing case in ffs_ep0_ioctl() looks suspicious as well). > >> > >> But I can't see why this change is necessary - there are event > >> notifications through ep0 when this happens, as can be seen in the hunk > >> below from the ffs_event_add(ffs, FUNCTIONFS_DISABLE) line. If > >> userspace cares about this, then it can read the events from ep0. > >> > In short, the change is basically trying to resolve an issue in an > application that has a separate thread handling the IO ops. When the > USB cable is disconnected, the application would expect for this IO > thread to be completed and exit gracefully, and restarting it on the > next connect. However, since we are stuck in the read() it can not > proceed further. Did you see the other recent thread on FunctionFS [1]? It seems that the separate I/O thread is not required and in fact there is a race here in general even using AIO via io_submit(). [1] https://lore.kernel.org/linux-usb/CAJkDvNk4G3WJuitZa+oPuNhkrCPNiwwG-zyNET+urWVWAda+cw@mail.gmail.com/ > I guess in these situations, we should utilize the O_NONBLOCK file > parameter? Yes, using O_NONBLOCK does avoid the problems. I'm not sure what the general solution is - right now I don't see any way to resolve the requirements to wait for the host to connect and handle disconnection without blocking here. The simplest thing would be to refuse epfile I/O when FunctionFS is not enabled, which neatly resolves the race in favour of returning an error. But it means that applications need to wait for a FUNCTIONFS_ENABLE event on ep0 before submitting any transfers on other endpoints, which is a change from the current behaviour. And there's no way to know how many applications rely on that. So I don't think that's an option, at least not without providing some way for userspace to opt in to the new behaviour.
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c index 3c584da9118c..0b0747d96378 100644 --- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -957,10 +957,12 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data) if (file->f_flags & O_NONBLOCK) return -EAGAIN; - ret = wait_event_interruptible( - epfile->ffs->wait, (ep = epfile->ep)); + ret = wait_event_interruptible(epfile->ffs->wait, + (ep = epfile->ep) || !epfile->ffs->func); if (ret) return -EINTR; + if (!epfile->ffs->func) + return -ENODEV; } /* Do we halt? */ @@ -3292,6 +3294,7 @@ static int ffs_func_set_alt(struct usb_function *f, if (alt == (unsigned)-1) { ffs->func = NULL; ffs_event_add(ffs, FUNCTIONFS_DISABLE); + wake_up_interruptible(&ffs->wait); return 0; }
During device disconnect or composition unbind, applications should be notified that the endpoints are no longer enabled, so that it can take the proper actions to handle its IO threads. Otherwise, they can be left waiting for endpoints until EPs are re-enabled. Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com> --- drivers/usb/gadget/function/f_fs.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)