diff mbox series

usb: gadget: uvc: Move usb_ep_disable() to uvcg_video_enable()

Message ID 20230615093406.80195-1-dan.scally@ideasonboard.com (mailing list archive)
State New, archived
Headers show
Series usb: gadget: uvc: Move usb_ep_disable() to uvcg_video_enable() | expand

Commit Message

Dan Scally June 15, 2023, 9:34 a.m. UTC
Currently the UVC Gadget's .set_alt() callback disables the USB
endpoint, following which a V4L2 event is queued that closes
down the workqueue. This ordering results in repeated calls to
usb_ep_queue() from the workqueue whilst usb_ep_disable() is
running - behaviour that the documentation for usb_ep_disable()
specifically prohibits.

Move the call to usb_ep_disable() until after cancel_work_sync(),
which will guarantee the endpoint is no longer in use when
usb_ep_disable() is called.

Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com>
---
 drivers/usb/gadget/function/f_uvc.c     | 3 ---
 drivers/usb/gadget/function/uvc_video.c | 4 ++++
 2 files changed, 4 insertions(+), 3 deletions(-)

Comments

Laurent Pinchart June 15, 2023, 5:15 p.m. UTC | #1
Hi Dan,

Thank you for the patch.

On Thu, Jun 15, 2023 at 10:34:06AM +0100, Daniel Scally wrote:
> Currently the UVC Gadget's .set_alt() callback disables the USB
> endpoint, following which a V4L2 event is queued that closes
> down the workqueue. This ordering results in repeated calls to
> usb_ep_queue() from the workqueue whilst usb_ep_disable() is
> running - behaviour that the documentation for usb_ep_disable()
> specifically prohibits.
> 
> Move the call to usb_ep_disable() until after cancel_work_sync(),
> which will guarantee the endpoint is no longer in use when
> usb_ep_disable() is called.
> 
> Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com>
> ---
>  drivers/usb/gadget/function/f_uvc.c     | 3 ---
>  drivers/usb/gadget/function/uvc_video.c | 4 ++++
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c
> index 5e919fb65833..4b91bd572a83 100644
> --- a/drivers/usb/gadget/function/f_uvc.c
> +++ b/drivers/usb/gadget/function/f_uvc.c
> @@ -337,9 +337,6 @@ uvc_function_set_alt(struct usb_function *f, unsigned interface, unsigned alt)
>  		if (uvc->state != UVC_STATE_STREAMING)
>  			return 0;
>  
> -		if (uvc->video.ep)
> -			usb_ep_disable(uvc->video.ep);
> -
>  		memset(&v4l2_event, 0, sizeof(v4l2_event));
>  		v4l2_event.type = UVC_EVENT_STREAMOFF;
>  		v4l2_event_queue(&uvc->vdev, &v4l2_event);

If we don't disable the endpoint here, should we return
USB_GADGET_DELAYED_STATUS here (and call uvc_function_setup_continue()
in uvc_v4l2_streamoff()) or is that not needed ? The uvc->state updated
should then possibly be moved to uvc_v4l2_streamoff() (checking if this
would cause any race condition would also be a nice bonus :-)).

> diff --git a/drivers/usb/gadget/function/uvc_video.c b/drivers/usb/gadget/function/uvc_video.c
> index 91af3b1ef0d4..c7e38fa26492 100644
> --- a/drivers/usb/gadget/function/uvc_video.c
> +++ b/drivers/usb/gadget/function/uvc_video.c
> @@ -499,6 +499,10 @@ int uvcg_video_enable(struct uvc_video *video, int enable)
>  
>  	if (!enable) {
>  		cancel_work_sync(&video->pump);
> +
> +		if (video->ep)
> +			usb_ep_disable(video->ep);
> +
>  		uvcg_queue_cancel(&video->queue, 0);
>  
>  		for (i = 0; i < video->uvc_num_requests; ++i)
Avichal Rakesh Aug. 30, 2023, 8:38 p.m. UTC | #2
On 6/15/23 10:15, Laurent Pinchart wrote:
> Hi Dan,
> 
> Thank you for the patch.
> 
> On Thu, Jun 15, 2023 at 10:34:06AM +0100, Daniel Scally wrote:
>> Currently the UVC Gadget's .set_alt() callback disables the USB
>> endpoint, following which a V4L2 event is queued that closes
>> down the workqueue. This ordering results in repeated calls to
>> usb_ep_queue() from the workqueue whilst usb_ep_disable() is
>> running - behaviour that the documentation for usb_ep_disable()
>> specifically prohibits.
>>
>> Move the call to usb_ep_disable() until after cancel_work_sync(),
>> which will guarantee the endpoint is no longer in use when
>> usb_ep_disable() is called.
>>
>> Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com>
>> ---
>>  drivers/usb/gadget/function/f_uvc.c     | 3 ---
>>  drivers/usb/gadget/function/uvc_video.c | 4 ++++
>>  2 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c
>> index 5e919fb65833..4b91bd572a83 100644
>> --- a/drivers/usb/gadget/function/f_uvc.c
>> +++ b/drivers/usb/gadget/function/f_uvc.c
>> @@ -337,9 +337,6 @@ uvc_function_set_alt(struct usb_function *f, unsigned interface, unsigned alt)
>>  		if (uvc->state != UVC_STATE_STREAMING)
>>  			return 0;
>>  
>> -		if (uvc->video.ep)
>> -			usb_ep_disable(uvc->video.ep);
>> -
>>  		memset(&v4l2_event, 0, sizeof(v4l2_event));
>>  		v4l2_event.type = UVC_EVENT_STREAMOFF;
>>  		v4l2_event_queue(&uvc->vdev, &v4l2_event);
> 
> If we don't disable the endpoint here, should we return
> USB_GADGET_DELAYED_STATUS here (and call uvc_function_setup_continue()
> in uvc_v4l2_streamoff()) or is that not needed ? The uvc->state updated
> should then possibly be moved to uvc_v4l2_streamoff() (checking if this
> would cause any race condition would also be a nice bonus :-)).
> 

Hey all,

First off, apologies for reviving an old thread. We've also been seeing the
"no resource for ep2in" warning from dwc3 controller, followed by the UVC gadget
not streaming any frames, when there is a quick STREAMOFF->STREAMON sequence.

It looks like this is the same root cause as what Dan mentioned in
https://lore.kernel.org/20230531085544.253363-1-dan.scally@ideasonboard.com/
and this patch seems to solve. (Thank you Dan, for posting the stacktrace in
that email thread! I had been banging my head for a couple of days before
thinking of looking up the exact stack :|)

IIUC, this stems from workqueue not fully shutting down by the time the usb
endpoint is disabled and we need stronger guarantees that the workqueue pumping
usb_requests doesn't accidentally queue usb_requests _after_ we've disabled
the streaming endpoint on the usb controller.

Attached is a patch that attempts to address the concerns raised here and sets
up some stronger timing and memory guarantees.

So here are the list of changes over what Dan had already started:

 - Return USB_GADGET_DELAYED_STATUS from set_alt. This is to ensure there are no
   more control transfers while we're waiting on the workqueue to wind down.

 - Move updating uvc->state to uvc_v4l2_streamoff as Laurent suggested.
   This ensures that setting uvc->state to STREAMING or CONNECTED
   happens as a result of streamoff and streamon events which makes things
   easier to reason about.

 - Guard video_pump with uvc->state so the thread can be stopped by setting
   uvc->state to anything other than UVC_STATE_STREAMING. This effectively makes
   uvc->state a flag for the video_pump loop to exit. This is the same flag that
   the complete callback uses to restart the workqueue, so all interactions
   with the controller are effectively guarded by the same flag now.

 - Set uvc->state to UVC_STATE_CONNECTED before winding down the work queue.
   Now that all usb logic is guarded by the same flag, setting the flag should
   stop all usb packet queuing once current execution finishes.

 - Add some memory guarantees in uvcg_video_enable(). At the moment, we don't
   really consider the fact that usb_ep_dequeue is async, which means that the
   usb_requests may not have been returned to the gadget before we start
   deallocating them. With this patch, we wait until all usb_requests have been
   returned to us before deallocating them (this adds a little bit of
   bookkeeping, but nothing too complicated).

   This also guarantees that the complete callback won't accidentally
   re-queue work on the workqueue. Earlier, it was possible that with bad
   enough luck (or scheduler), the complete callback re-triggered the
   workqueue after uvcg_video_enable has already returned.

 - And finally, call usb_ep_disable after uvcg_video_enable returns, which, with
   the new guarantees should ensure that it is safe to disable the endpoint.

I am not sure if this is the best way to do it, but in local testing it seems
to solve the issue. Let me know what you think or if there are better ways to
achieve this.

Thank you!

- Avi.

---
 drivers/usb/gadget/function/f_uvc.c     | 11 ++++----
 drivers/usb/gadget/function/f_uvc.h     |  2 +-
 drivers/usb/gadget/function/uvc.h       |  5 +++-
 drivers/usb/gadget/function/uvc_v4l2.c  | 21 ++++++++++++---
 drivers/usb/gadget/function/uvc_video.c | 34 +++++++++++++++++++++++--
 5 files changed, 60 insertions(+), 13 deletions(-)

diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c
index faa398109431..75c9f9a3f884 100644
--- a/drivers/usb/gadget/function/f_uvc.c
+++ b/drivers/usb/gadget/function/f_uvc.c
@@ -263,10 +263,13 @@ uvc_function_setup(struct usb_function *f, const struct usb_ctrlrequest *ctrl)
 	return 0;
 }

-void uvc_function_setup_continue(struct uvc_device *uvc)
+void uvc_function_setup_continue(struct uvc_device *uvc, int disable_ep)
 {
 	struct usb_composite_dev *cdev = uvc->func.config->cdev;

+	if (disable_ep && uvc->video.ep) {
+		usb_ep_disable(uvc->video.ep);
+	}
 	usb_composite_setup_continue(cdev);
 }

@@ -337,15 +340,11 @@ uvc_function_set_alt(struct usb_function *f, unsigned interface, unsigned alt)
 		if (uvc->state != UVC_STATE_STREAMING)
 			return 0;

-		if (uvc->video.ep)
-			usb_ep_disable(uvc->video.ep);
-
 		memset(&v4l2_event, 0, sizeof(v4l2_event));
 		v4l2_event.type = UVC_EVENT_STREAMOFF;
 		v4l2_event_queue(&uvc->vdev, &v4l2_event);

-		uvc->state = UVC_STATE_CONNECTED;
-		return 0;
+		return USB_GADGET_DELAYED_STATUS;

 	case 1:
 		if (uvc->state != UVC_STATE_CONNECTED)
diff --git a/drivers/usb/gadget/function/f_uvc.h b/drivers/usb/gadget/function/f_uvc.h
index 1db972d4beeb..e7f9f13f14dc 100644
--- a/drivers/usb/gadget/function/f_uvc.h
+++ b/drivers/usb/gadget/function/f_uvc.h
@@ -11,7 +11,7 @@

 struct uvc_device;

-void uvc_function_setup_continue(struct uvc_device *uvc);
+void uvc_function_setup_continue(struct uvc_device *uvc, int disale_ep);

 void uvc_function_connect(struct uvc_device *uvc);

diff --git a/drivers/usb/gadget/function/uvc.h b/drivers/usb/gadget/function/uvc.h
index 6751de8b63ad..e40e702a7074 100644
--- a/drivers/usb/gadget/function/uvc.h
+++ b/drivers/usb/gadget/function/uvc.h
@@ -104,8 +104,11 @@ struct uvc_video {
 	unsigned int req_size;
 	struct uvc_request *ureq;
 	struct list_head req_free;
+	unsigned int req_free_count; /* number of requests in req_free */
 	spinlock_t req_lock;

+	wait_queue_head_t req_free_queue;
+
 	unsigned int req_int_count;

 	void (*encode) (struct usb_request *req, struct uvc_video *video,
@@ -177,7 +180,7 @@ struct uvc_file_handle {
  * Functions
  */

-extern void uvc_function_setup_continue(struct uvc_device *uvc);
+extern void uvc_function_setup_continue(struct uvc_device *uvc, int disable_ep);
 extern void uvc_function_connect(struct uvc_device *uvc);
 extern void uvc_function_disconnect(struct uvc_device *uvc);

diff --git a/drivers/usb/gadget/function/uvc_v4l2.c b/drivers/usb/gadget/function/uvc_v4l2.c
index 3f0a9795c0d4..3d3469883ed0 100644
--- a/drivers/usb/gadget/function/uvc_v4l2.c
+++ b/drivers/usb/gadget/function/uvc_v4l2.c
@@ -451,7 +451,7 @@ uvc_v4l2_streamon(struct file *file, void *fh, enum v4l2_buf_type type)
 	 * Complete the alternate setting selection setup phase now that
 	 * userspace is ready to provide video frames.
 	 */
-	uvc_function_setup_continue(uvc);
+	uvc_function_setup_continue(uvc, 0);
 	uvc->state = UVC_STATE_STREAMING;

 	return 0;
@@ -463,11 +463,19 @@ uvc_v4l2_streamoff(struct file *file, void *fh, enum v4l2_buf_type type)
 	struct video_device *vdev = video_devdata(file);
 	struct uvc_device *uvc = video_get_drvdata(vdev);
 	struct uvc_video *video = &uvc->video;
+	int ret = 0;

 	if (type != video->queue.queue.type)
 		return -EINVAL;

-	return uvcg_video_enable(video, 0);
+	uvc->state = UVC_STATE_CONNECTED;
+	ret = uvcg_video_enable(video, 0);
+	if (ret < 0) {
+		return ret;
+	}
+
+	uvc_function_setup_continue(uvc, 1);
+	return 0;
 }

 static int
@@ -500,6 +508,14 @@ uvc_v4l2_subscribe_event(struct v4l2_fh *fh,
 static void uvc_v4l2_disable(struct uvc_device *uvc)
 {
 	uvc_function_disconnect(uvc);
+	if (uvc->state == UVC_STATE_STREAMING) {
+		/*
+		 * Drop uvc->state to CONNECTED if it was streaming before.
+		 * This ensures that the usb_requests are no longer queued
+		 * to the controller.
+		 */
+		uvc->state = UVC_STATE_CONNECTED;
+	}
 	uvcg_video_enable(&uvc->video, 0);
 	uvcg_free_buffers(&uvc->video.queue);
 	uvc->func_connected = false;
@@ -647,4 +663,3 @@ const struct v4l2_file_operations uvc_v4l2_fops = {
 	.get_unmapped_area = uvcg_v4l2_get_unmapped_area,
 #endif
 };
-
diff --git a/drivers/usb/gadget/function/uvc_video.c b/drivers/usb/gadget/function/uvc_video.c
index 91af3b1ef0d4..3ea7d52df80d 100644
--- a/drivers/usb/gadget/function/uvc_video.c
+++ b/drivers/usb/gadget/function/uvc_video.c
@@ -284,10 +284,18 @@ uvc_video_complete(struct usb_ep *ep, struct usb_request *req)

 	spin_lock_irqsave(&video->req_lock, flags);
 	list_add_tail(&req->list, &video->req_free);
+	video->req_free_count++;
 	spin_unlock_irqrestore(&video->req_lock, flags);

-	if (uvc->state == UVC_STATE_STREAMING)
+	if (uvc->state == UVC_STATE_STREAMING) {
 		queue_work(video->async_wq, &video->pump);
+	} else if (video->req_free_count == video->req_size) {
+		/*
+		 * Wake up thread waiting for all requests to be returned to
+		 * the gadget driver.
+		 */
+		wake_up_interruptible(&video->req_free_queue);
+	}
 }

 static int
@@ -316,6 +324,7 @@ uvc_video_free_requests(struct uvc_video *video)

 	INIT_LIST_HEAD(&video->req_free);
 	video->req_size = 0;
+	video->req_free_count = 0;
 	return 0;
 }

@@ -360,6 +369,7 @@ uvc_video_alloc_requests(struct uvc_video *video)
 	}

 	video->req_size = req_size;
+	video->req_free_count = req_size; /* all requests are currently free */

 	return 0;

@@ -382,6 +392,7 @@ static void uvcg_video_pump(struct work_struct *work)
 {
 	struct uvc_video *video = container_of(work, struct uvc_video, pump);
 	struct uvc_video_queue *queue = &video->queue;
+	struct uvc_device *uvc = video->uvc;
 	/* video->max_payload_size is only set when using bulk transfer */
 	bool is_bulk = video->max_payload_size;
 	struct usb_request *req = NULL;
@@ -390,7 +401,7 @@ static void uvcg_video_pump(struct work_struct *work)
 	bool buf_done;
 	int ret;

-	while (video->ep->enabled) {
+	while (uvc->state == UVC_STATE_STREAMING && video->ep->enabled) {
 		/*
 		 * Retrieve the first available USB request, protected by the
 		 * request lock.
@@ -403,6 +414,7 @@ static void uvcg_video_pump(struct work_struct *work)
 		req = list_first_entry(&video->req_free, struct usb_request,
 					list);
 		list_del(&req->list);
+		video->req_free_count--;
 		spin_unlock_irqrestore(&video->req_lock, flags);

 		/*
@@ -479,6 +491,7 @@ static void uvcg_video_pump(struct work_struct *work)

 	spin_lock_irqsave(&video->req_lock, flags);
 	list_add_tail(&req->list, &video->req_free);
+	video->req_free_count++;
 	spin_unlock_irqrestore(&video->req_lock, flags);
 	return;
 }
@@ -505,6 +518,22 @@ int uvcg_video_enable(struct uvc_video *video, int enable)
 			if (video->ureq && video->ureq[i].req)
 				usb_ep_dequeue(video->ep, video->ureq[i].req);

+		/*
+		 * Wait 500ms for the usb_requests to be given back to the
+		 * gadget driver. This ensures that we don't accidentally
+		 * reference de-allocated usb_requests in the complete callback.
+		 */
+		if (video->req_free_count != video->req_size) {
+			uvcg_info(&video->uvc->func,
+					"Waiting 500ms for usb_request complete callbacks.\n");
+			ret = wait_event_interruptible_timeout(
+					video->req_free_queue,
+					video->req_free_count == video->req_size,
+					msecs_to_jiffies(500));
+			uvcg_info(&video->uvc->func,
+					"Done waiting for complete callbacks: %d\n", ret);
+		}
+
 		uvc_video_free_requests(video);
 		uvcg_queue_enable(&video->queue, 0);
 		return 0;
@@ -537,6 +566,7 @@ int uvcg_video_init(struct uvc_video *video, struct uvc_device *uvc)
 {
 	INIT_LIST_HEAD(&video->req_free);
 	spin_lock_init(&video->req_lock);
+	init_waitqueue_head(&video->req_free_queue);
 	INIT_WORK(&video->pump, uvcg_video_pump);

 	/* Allocate a work queue for asynchronous video pump handler. */
--
Michael Grzeschik Sept. 4, 2023, 2:36 p.m. UTC | #3
Hi Avichal

Cc'ing: Thinh Nguyen <Thinh.Nguyen@synopsys.com>

On Wed, Aug 30, 2023 at 01:38:23PM -0700, Avichal Rakesh wrote:
>On 6/15/23 10:15, Laurent Pinchart wrote:
>> On Thu, Jun 15, 2023 at 10:34:06AM +0100, Daniel Scally wrote:
>>> Currently the UVC Gadget's .set_alt() callback disables the USB
>>> endpoint, following which a V4L2 event is queued that closes
>>> down the workqueue. This ordering results in repeated calls to
>>> usb_ep_queue() from the workqueue whilst usb_ep_disable() is
>>> running - behaviour that the documentation for usb_ep_disable()
>>> specifically prohibits.
>>>
>>> Move the call to usb_ep_disable() until after cancel_work_sync(),
>>> which will guarantee the endpoint is no longer in use when
>>> usb_ep_disable() is called.
>>>
>>> Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com>
>>> ---
>>>  drivers/usb/gadget/function/f_uvc.c     | 3 ---
>>>  drivers/usb/gadget/function/uvc_video.c | 4 ++++
>>>  2 files changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c
>>> index 5e919fb65833..4b91bd572a83 100644
>>> --- a/drivers/usb/gadget/function/f_uvc.c
>>> +++ b/drivers/usb/gadget/function/f_uvc.c
>>> @@ -337,9 +337,6 @@ uvc_function_set_alt(struct usb_function *f, unsigned interface, unsigned alt)
>>>  		if (uvc->state != UVC_STATE_STREAMING)
>>>  			return 0;
>>>
>>> -		if (uvc->video.ep)
>>> -			usb_ep_disable(uvc->video.ep);
>>> -
>>>  		memset(&v4l2_event, 0, sizeof(v4l2_event));
>>>  		v4l2_event.type = UVC_EVENT_STREAMOFF;
>>>  		v4l2_event_queue(&uvc->vdev, &v4l2_event);
>>
>> If we don't disable the endpoint here, should we return
>> USB_GADGET_DELAYED_STATUS here (and call uvc_function_setup_continue()
>> in uvc_v4l2_streamoff()) or is that not needed ? The uvc->state updated
>> should then possibly be moved to uvc_v4l2_streamoff() (checking if this
>> would cause any race condition would also be a nice bonus :-)).
>>
>
>Hey all,
>
>First off, apologies for reviving an old thread. We've also been seeing the
>"no resource for ep2in" warning from dwc3 controller, followed by the UVC gadget
>not streaming any frames, when there is a quick STREAMOFF->STREAMON sequence.
>
>It looks like this is the same root cause as what Dan mentioned in
>https://lore.kernel.org/20230531085544.253363-1-dan.scally@ideasonboard.com/
>and this patch seems to solve. (Thank you Dan, for posting the stacktrace in
>that email thread! I had been banging my head for a couple of days before
>thinking of looking up the exact stack :|)
>
>IIUC, this stems from workqueue not fully shutting down by the time the usb
>endpoint is disabled and we need stronger guarantees that the workqueue pumping
>usb_requests doesn't accidentally queue usb_requests _after_ we've disabled
>the streaming endpoint on the usb controller.
>
>Attached is a patch that attempts to address the concerns raised here and sets
>up some stronger timing and memory guarantees.
>
>So here are the list of changes over what Dan had already started:
>
> - Return USB_GADGET_DELAYED_STATUS from set_alt. This is to ensure there are no
>   more control transfers while we're waiting on the workqueue to wind down.
>
> - Move updating uvc->state to uvc_v4l2_streamoff as Laurent suggested.
>   This ensures that setting uvc->state to STREAMING or CONNECTED
>   happens as a result of streamoff and streamon events which makes things
>   easier to reason about.
>
> - Guard video_pump with uvc->state so the thread can be stopped by setting
>   uvc->state to anything other than UVC_STATE_STREAMING. This effectively makes
>   uvc->state a flag for the video_pump loop to exit. This is the same flag that
>   the complete callback uses to restart the workqueue, so all interactions
>   with the controller are effectively guarded by the same flag now.
>
> - Set uvc->state to UVC_STATE_CONNECTED before winding down the work queue.
>   Now that all usb logic is guarded by the same flag, setting the flag should
>   stop all usb packet queuing once current execution finishes.
>
> - Add some memory guarantees in uvcg_video_enable(). At the moment, we don't
>   really consider the fact that usb_ep_dequeue is async, which means that the
>   usb_requests may not have been returned to the gadget before we start
>   deallocating them. With this patch, we wait until all usb_requests have been
>   returned to us before deallocating them (this adds a little bit of
>   bookkeeping, but nothing too complicated).

I am currently trying to solve that by preparing a patch that is
fixing the use of the requests when deallocating them. Since currently
the uvc_gadget is also running into wild use after free issues because
of exactly that async dequeue and dealloc situation. IMHO it should be
save to call dealloc after calling dequeue. Which is probably true for
other usb device controller driver other then dwc3.

For some background. The dwc3 is putting the requests into a cancelled list
that will be cleared by the interrupt handler and that is dequeuing them
instead. In between the dequeue call and the interrupt call the uvc layer could
dealloc the request which leads the interrupt handler to dequeue an
already freed request.

>   This also guarantees that the complete callback won't accidentally
>   re-queue work on the workqueue. Earlier, it was possible that with bad
>   enough luck (or scheduler), the complete callback re-triggered the
>   workqueue after uvcg_video_enable has already returned.
>
> - And finally, call usb_ep_disable after uvcg_video_enable returns, which, with
>   the new guarantees should ensure that it is safe to disable the endpoint.
>
>I am not sure if this is the best way to do it, but in local testing it seems
>to solve the issue. Let me know what you think or if there are better ways to
>achieve this.
>
>Thank you!
>
>- Avi.
>
>---
> drivers/usb/gadget/function/f_uvc.c     | 11 ++++----
> drivers/usb/gadget/function/f_uvc.h     |  2 +-
> drivers/usb/gadget/function/uvc.h       |  5 +++-
> drivers/usb/gadget/function/uvc_v4l2.c  | 21 ++++++++++++---
> drivers/usb/gadget/function/uvc_video.c | 34 +++++++++++++++++++++++--
> 5 files changed, 60 insertions(+), 13 deletions(-)
>
>diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c
>index faa398109431..75c9f9a3f884 100644
>--- a/drivers/usb/gadget/function/f_uvc.c
>+++ b/drivers/usb/gadget/function/f_uvc.c
>@@ -263,10 +263,13 @@ uvc_function_setup(struct usb_function *f, const struct usb_ctrlrequest *ctrl)
> 	return 0;
> }
>
>-void uvc_function_setup_continue(struct uvc_device *uvc)
>+void uvc_function_setup_continue(struct uvc_device *uvc, int disable_ep)
> {
> 	struct usb_composite_dev *cdev = uvc->func.config->cdev;
>
>+	if (disable_ep && uvc->video.ep) {
>+		usb_ep_disable(uvc->video.ep);
>+	}
> 	usb_composite_setup_continue(cdev);
> }
>
>@@ -337,15 +340,11 @@ uvc_function_set_alt(struct usb_function *f, unsigned interface, unsigned alt)
> 		if (uvc->state != UVC_STATE_STREAMING)
> 			return 0;
>
>-		if (uvc->video.ep)
>-			usb_ep_disable(uvc->video.ep);
>-
> 		memset(&v4l2_event, 0, sizeof(v4l2_event));
> 		v4l2_event.type = UVC_EVENT_STREAMOFF;
> 		v4l2_event_queue(&uvc->vdev, &v4l2_event);
>
>-		uvc->state = UVC_STATE_CONNECTED;
>-		return 0;
>+		return USB_GADGET_DELAYED_STATUS;
>
> 	case 1:
> 		if (uvc->state != UVC_STATE_CONNECTED)
>diff --git a/drivers/usb/gadget/function/f_uvc.h b/drivers/usb/gadget/function/f_uvc.h
>index 1db972d4beeb..e7f9f13f14dc 100644
>--- a/drivers/usb/gadget/function/f_uvc.h
>+++ b/drivers/usb/gadget/function/f_uvc.h
>@@ -11,7 +11,7 @@
>
> struct uvc_device;
>
>-void uvc_function_setup_continue(struct uvc_device *uvc);
>+void uvc_function_setup_continue(struct uvc_device *uvc, int disale_ep);
>
> void uvc_function_connect(struct uvc_device *uvc);
>
>diff --git a/drivers/usb/gadget/function/uvc.h b/drivers/usb/gadget/function/uvc.h
>index 6751de8b63ad..e40e702a7074 100644
>--- a/drivers/usb/gadget/function/uvc.h
>+++ b/drivers/usb/gadget/function/uvc.h
>@@ -104,8 +104,11 @@ struct uvc_video {
> 	unsigned int req_size;
> 	struct uvc_request *ureq;
> 	struct list_head req_free;
>+	unsigned int req_free_count; /* number of requests in req_free */
> 	spinlock_t req_lock;
>
>+	wait_queue_head_t req_free_queue;
>+
> 	unsigned int req_int_count;
>
> 	void (*encode) (struct usb_request *req, struct uvc_video *video,
>@@ -177,7 +180,7 @@ struct uvc_file_handle {
>  * Functions
>  */
>
>-extern void uvc_function_setup_continue(struct uvc_device *uvc);
>+extern void uvc_function_setup_continue(struct uvc_device *uvc, int disable_ep);
> extern void uvc_function_connect(struct uvc_device *uvc);
> extern void uvc_function_disconnect(struct uvc_device *uvc);
>
>diff --git a/drivers/usb/gadget/function/uvc_v4l2.c b/drivers/usb/gadget/function/uvc_v4l2.c
>index 3f0a9795c0d4..3d3469883ed0 100644
>--- a/drivers/usb/gadget/function/uvc_v4l2.c
>+++ b/drivers/usb/gadget/function/uvc_v4l2.c
>@@ -451,7 +451,7 @@ uvc_v4l2_streamon(struct file *file, void *fh, enum v4l2_buf_type type)
> 	 * Complete the alternate setting selection setup phase now that
> 	 * userspace is ready to provide video frames.
> 	 */
>-	uvc_function_setup_continue(uvc);
>+	uvc_function_setup_continue(uvc, 0);
> 	uvc->state = UVC_STATE_STREAMING;
>
> 	return 0;
>@@ -463,11 +463,19 @@ uvc_v4l2_streamoff(struct file *file, void *fh, enum v4l2_buf_type type)
> 	struct video_device *vdev = video_devdata(file);
> 	struct uvc_device *uvc = video_get_drvdata(vdev);
> 	struct uvc_video *video = &uvc->video;
>+	int ret = 0;
>
> 	if (type != video->queue.queue.type)
> 		return -EINVAL;
>
>-	return uvcg_video_enable(video, 0);
>+	uvc->state = UVC_STATE_CONNECTED;
>+	ret = uvcg_video_enable(video, 0);
>+	if (ret < 0) {
>+		return ret;
>+	}
>+
>+	uvc_function_setup_continue(uvc, 1);
>+	return 0;
> }
>
> static int
>@@ -500,6 +508,14 @@ uvc_v4l2_subscribe_event(struct v4l2_fh *fh,
> static void uvc_v4l2_disable(struct uvc_device *uvc)
> {
> 	uvc_function_disconnect(uvc);
>+	if (uvc->state == UVC_STATE_STREAMING) {
>+		/*
>+		 * Drop uvc->state to CONNECTED if it was streaming before.
>+		 * This ensures that the usb_requests are no longer queued
>+		 * to the controller.
>+		 */
>+		uvc->state = UVC_STATE_CONNECTED;
>+	}
> 	uvcg_video_enable(&uvc->video, 0);
> 	uvcg_free_buffers(&uvc->video.queue);
> 	uvc->func_connected = false;
>@@ -647,4 +663,3 @@ const struct v4l2_file_operations uvc_v4l2_fops = {
> 	.get_unmapped_area = uvcg_v4l2_get_unmapped_area,
> #endif
> };
>-
>diff --git a/drivers/usb/gadget/function/uvc_video.c b/drivers/usb/gadget/function/uvc_video.c
>index 91af3b1ef0d4..3ea7d52df80d 100644
>--- a/drivers/usb/gadget/function/uvc_video.c
>+++ b/drivers/usb/gadget/function/uvc_video.c
>@@ -284,10 +284,18 @@ uvc_video_complete(struct usb_ep *ep, struct usb_request *req)
>
> 	spin_lock_irqsave(&video->req_lock, flags);
> 	list_add_tail(&req->list, &video->req_free);
>+	video->req_free_count++;
> 	spin_unlock_irqrestore(&video->req_lock, flags);
>
>-	if (uvc->state == UVC_STATE_STREAMING)
>+	if (uvc->state == UVC_STATE_STREAMING) {
> 		queue_work(video->async_wq, &video->pump);
>+	} else if (video->req_free_count == video->req_size) {
>+		/*
>+		 * Wake up thread waiting for all requests to be returned to
>+		 * the gadget driver.
>+		 */
>+		wake_up_interruptible(&video->req_free_queue);
>+	}
> }
>
> static int
>@@ -316,6 +324,7 @@ uvc_video_free_requests(struct uvc_video *video)
>
> 	INIT_LIST_HEAD(&video->req_free);
> 	video->req_size = 0;
>+	video->req_free_count = 0;
> 	return 0;
> }
>
>@@ -360,6 +369,7 @@ uvc_video_alloc_requests(struct uvc_video *video)
> 	}
>
> 	video->req_size = req_size;
>+	video->req_free_count = req_size; /* all requests are currently free */
>
> 	return 0;
>
>@@ -382,6 +392,7 @@ static void uvcg_video_pump(struct work_struct *work)
> {
> 	struct uvc_video *video = container_of(work, struct uvc_video, pump);
> 	struct uvc_video_queue *queue = &video->queue;
>+	struct uvc_device *uvc = video->uvc;
> 	/* video->max_payload_size is only set when using bulk transfer */
> 	bool is_bulk = video->max_payload_size;
> 	struct usb_request *req = NULL;
>@@ -390,7 +401,7 @@ static void uvcg_video_pump(struct work_struct *work)
> 	bool buf_done;
> 	int ret;
>
>-	while (video->ep->enabled) {
>+	while (uvc->state == UVC_STATE_STREAMING && video->ep->enabled) {
> 		/*
> 		 * Retrieve the first available USB request, protected by the
> 		 * request lock.
>@@ -403,6 +414,7 @@ static void uvcg_video_pump(struct work_struct *work)
> 		req = list_first_entry(&video->req_free, struct usb_request,
> 					list);
> 		list_del(&req->list);
>+		video->req_free_count--;
> 		spin_unlock_irqrestore(&video->req_lock, flags);
>
> 		/*
>@@ -479,6 +491,7 @@ static void uvcg_video_pump(struct work_struct *work)
>
> 	spin_lock_irqsave(&video->req_lock, flags);
> 	list_add_tail(&req->list, &video->req_free);
>+	video->req_free_count++;
> 	spin_unlock_irqrestore(&video->req_lock, flags);
> 	return;
> }
>@@ -505,6 +518,22 @@ int uvcg_video_enable(struct uvc_video *video, int enable)
> 			if (video->ureq && video->ureq[i].req)
> 				usb_ep_dequeue(video->ep, video->ureq[i].req);
>
>+		/*
>+		 * Wait 500ms for the usb_requests to be given back to the
>+		 * gadget driver. This ensures that we don't accidentally
>+		 * reference de-allocated usb_requests in the complete callback.
>+		 */
>+		if (video->req_free_count != video->req_size) {
>+			uvcg_info(&video->uvc->func,
>+					"Waiting 500ms for usb_request complete callbacks.\n");
>+			ret = wait_event_interruptible_timeout(
>+					video->req_free_queue,
>+					video->req_free_count == video->req_size,
>+					msecs_to_jiffies(500));
>+			uvcg_info(&video->uvc->func,
>+					"Done waiting for complete callbacks: %d\n", ret);
>+		}
>+
> 		uvc_video_free_requests(video);
> 		uvcg_queue_enable(&video->queue, 0);
> 		return 0;
>@@ -537,6 +566,7 @@ int uvcg_video_init(struct uvc_video *video, struct uvc_device *uvc)
> {
> 	INIT_LIST_HEAD(&video->req_free);
> 	spin_lock_init(&video->req_lock);
>+	init_waitqueue_head(&video->req_free_queue);
> 	INIT_WORK(&video->pump, uvcg_video_pump);
>
> 	/* Allocate a work queue for asynchronous video pump handler. */
>--
Avichal Rakesh Sept. 8, 2023, 3:54 p.m. UTC | #4
Hi Michael,

Apologies for the late reply, I have been out travelling.

On Mon, Sep 4, 2023 at 10:36 PM Michael Grzeschik <mgr@pengutronix.de> wrote:
>
> Hi Avichal
>
> Cc'ing: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>
> On Wed, Aug 30, 2023 at 01:38:23PM -0700, Avichal Rakesh wrote:
> >On 6/15/23 10:15, Laurent Pinchart wrote:
> >> On Thu, Jun 15, 2023 at 10:34:06AM +0100, Daniel Scally wrote:
> >>> Currently the UVC Gadget's .set_alt() callback disables the USB
> >>> endpoint, following which a V4L2 event is queued that closes
> >>> down the workqueue. This ordering results in repeated calls to
> >>> usb_ep_queue() from the workqueue whilst usb_ep_disable() is
> >>> running - behaviour that the documentation for usb_ep_disable()
> >>> specifically prohibits.
> >>>
> >>> Move the call to usb_ep_disable() until after cancel_work_sync(),
> >>> which will guarantee the endpoint is no longer in use when
> >>> usb_ep_disable() is called.
> >>>
> >>> Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com>
> >>> ---
> >>>  drivers/usb/gadget/function/f_uvc.c     | 3 ---
> >>>  drivers/usb/gadget/function/uvc_video.c | 4 ++++
> >>>  2 files changed, 4 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c
> >>> index 5e919fb65833..4b91bd572a83 100644
> >>> --- a/drivers/usb/gadget/function/f_uvc.c
> >>> +++ b/drivers/usb/gadget/function/f_uvc.c
> >>> @@ -337,9 +337,6 @@ uvc_function_set_alt(struct usb_function *f, unsigned interface, unsigned alt)
> >>>             if (uvc->state != UVC_STATE_STREAMING)
> >>>                     return 0;
> >>>
> >>> -           if (uvc->video.ep)
> >>> -                   usb_ep_disable(uvc->video.ep);
> >>> -
> >>>             memset(&v4l2_event, 0, sizeof(v4l2_event));
> >>>             v4l2_event.type = UVC_EVENT_STREAMOFF;
> >>>             v4l2_event_queue(&uvc->vdev, &v4l2_event);
> >>
> >> If we don't disable the endpoint here, should we return
> >> USB_GADGET_DELAYED_STATUS here (and call uvc_function_setup_continue()
> >> in uvc_v4l2_streamoff()) or is that not needed ? The uvc->state updated
> >> should then possibly be moved to uvc_v4l2_streamoff() (checking if this
> >> would cause any race condition would also be a nice bonus :-)).
> >>
> >
> >Hey all,
> >
> >First off, apologies for reviving an old thread. We've also been seeing the
> >"no resource for ep2in" warning from dwc3 controller, followed by the UVC gadget
> >not streaming any frames, when there is a quick STREAMOFF->STREAMON sequence.
> >
> >It looks like this is the same root cause as what Dan mentioned in
> >https://lore.kernel.org/20230531085544.253363-1-dan.scally@ideasonboard.com/
> >and this patch seems to solve. (Thank you Dan, for posting the stacktrace in
> >that email thread! I had been banging my head for a couple of days before
> >thinking of looking up the exact stack :|)
> >
> >IIUC, this stems from workqueue not fully shutting down by the time the usb
> >endpoint is disabled and we need stronger guarantees that the workqueue pumping
> >usb_requests doesn't accidentally queue usb_requests _after_ we've disabled
> >the streaming endpoint on the usb controller.
> >
> >Attached is a patch that attempts to address the concerns raised here and sets
> >up some stronger timing and memory guarantees.
> >
> >So here are the list of changes over what Dan had already started:
> >
> > - Return USB_GADGET_DELAYED_STATUS from set_alt. This is to ensure there are no
> >   more control transfers while we're waiting on the workqueue to wind down.
> >
> > - Move updating uvc->state to uvc_v4l2_streamoff as Laurent suggested.
> >   This ensures that setting uvc->state to STREAMING or CONNECTED
> >   happens as a result of streamoff and streamon events which makes things
> >   easier to reason about.
> >
> > - Guard video_pump with uvc->state so the thread can be stopped by setting
> >   uvc->state to anything other than UVC_STATE_STREAMING. This effectively makes
> >   uvc->state a flag for the video_pump loop to exit. This is the same flag that
> >   the complete callback uses to restart the workqueue, so all interactions
> >   with the controller are effectively guarded by the same flag now.
> >
> > - Set uvc->state to UVC_STATE_CONNECTED before winding down the work queue.
> >   Now that all usb logic is guarded by the same flag, setting the flag should
> >   stop all usb packet queuing once current execution finishes.
> >
> > - Add some memory guarantees in uvcg_video_enable(). At the moment, we don't
> >   really consider the fact that usb_ep_dequeue is async, which means that the
> >   usb_requests may not have been returned to the gadget before we start
> >   deallocating them. With this patch, we wait until all usb_requests have been
> >   returned to us before deallocating them (this adds a little bit of
> >   bookkeeping, but nothing too complicated).
>
> I am currently trying to solve that by preparing a patch that is
> fixing the use of the requests when deallocating them. Since currently
> the uvc_gadget is also running into wild use after free issues because
> of exactly that async dequeue and dealloc situation.

Do you already have a patch up for this? It seems my LKML-fu is
failing and I can't seem to find the thread. If you aren't too deep
into the patch, can you take a look at the request counting mechanism
added in my patch? If you have a (somewhat) consistent repro of the
use-after-dealloc issue, runnin it through the whole patch would be
very appreciated! It is supposed to fix the exact problem you've
described.

> IMHO it should be
> save to call dealloc after calling dequeue. Which is probably true for
> other usb device controller driver other then dwc3.

Perhaps Thinh or someone better versed in Gadget API can chime in on
this, but as it stands usb_ep_dequeue specifically says that it is
async, and gadget drivers must wait on the complete callbacks to
regain ownership of the usb_request. Until the API change is made, UVC
should adhere to the current API?

>
> For some background. The dwc3 is putting the requests into a cancelled list
> that will be cleared by the interrupt handler and that is dequeuing them
> instead. In between the dequeue call and the interrupt call the uvc layer could
> dealloc the request which leads the interrupt handler to dequeue an
> already freed request.

This roughly tracks with what I gleaned from skimming the DWC3 code as
well. In local tests the complete calls were always timely and I never
actually ran into the situation where UVC deallocated an unowned
request, but as someone (I think it was Alan?)  said in a previous
thread: technically possible just means it will happen eventually

Please do review/test the patch. I'll send out a formal patch on
Monday once I am back, but would love to have some early eyes take a
look in case there is something obvious I missed.

Thank you.

- Avi
Michael Grzeschik Sept. 11, 2023, 12:50 a.m. UTC | #5
Hi Avichal

On Fri, Sep 08, 2023 at 11:54:40PM +0800, Avichal Rakesh wrote:
>Apologies for the late reply, I have been out travelling.
>On Mon, Sep 4, 2023 at 10:36 PM Michael Grzeschik <mgr@pengutronix.de> wrote:
>> Cc'ing: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>
>> On Wed, Aug 30, 2023 at 01:38:23PM -0700, Avichal Rakesh wrote:
>> >On 6/15/23 10:15, Laurent Pinchart wrote:
>> >> On Thu, Jun 15, 2023 at 10:34:06AM +0100, Daniel Scally wrote:
>> >>> Currently the UVC Gadget's .set_alt() callback disables the USB
>> >>> endpoint, following which a V4L2 event is queued that closes
>> >>> down the workqueue. This ordering results in repeated calls to
>> >>> usb_ep_queue() from the workqueue whilst usb_ep_disable() is
>> >>> running - behaviour that the documentation for usb_ep_disable()
>> >>> specifically prohibits.
>> >>>
>> >>> Move the call to usb_ep_disable() until after cancel_work_sync(),
>> >>> which will guarantee the endpoint is no longer in use when
>> >>> usb_ep_disable() is called.
>> >>>
>> >>> Signed-off-by: Daniel Scally <dan.scally@ideasonboard.com>
>> >>> ---
>> >>>  drivers/usb/gadget/function/f_uvc.c     | 3 ---
>> >>>  drivers/usb/gadget/function/uvc_video.c | 4 ++++
>> >>>  2 files changed, 4 insertions(+), 3 deletions(-)
>> >>>
>> >>> diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c
>> >>> index 5e919fb65833..4b91bd572a83 100644
>> >>> --- a/drivers/usb/gadget/function/f_uvc.c
>> >>> +++ b/drivers/usb/gadget/function/f_uvc.c
>> >>> @@ -337,9 +337,6 @@ uvc_function_set_alt(struct usb_function *f, unsigned interface, unsigned alt)
>> >>>             if (uvc->state != UVC_STATE_STREAMING)
>> >>>                     return 0;
>> >>>
>> >>> -           if (uvc->video.ep)
>> >>> -                   usb_ep_disable(uvc->video.ep);
>> >>> -
>> >>>             memset(&v4l2_event, 0, sizeof(v4l2_event));
>> >>>             v4l2_event.type = UVC_EVENT_STREAMOFF;
>> >>>             v4l2_event_queue(&uvc->vdev, &v4l2_event);
>> >>
>> >> If we don't disable the endpoint here, should we return
>> >> USB_GADGET_DELAYED_STATUS here (and call uvc_function_setup_continue()
>> >> in uvc_v4l2_streamoff()) or is that not needed ? The uvc->state updated
>> >> should then possibly be moved to uvc_v4l2_streamoff() (checking if this
>> >> would cause any race condition would also be a nice bonus :-)).
>> >>
>> >
>> >Hey all,
>> >
>> >First off, apologies for reviving an old thread. We've also been seeing the
>> >"no resource for ep2in" warning from dwc3 controller, followed by the UVC gadget
>> >not streaming any frames, when there is a quick STREAMOFF->STREAMON sequence.
>> >
>> >It looks like this is the same root cause as what Dan mentioned in
>> >https://lore.kernel.org/20230531085544.253363-1-dan.scally@ideasonboard.com/
>> >and this patch seems to solve. (Thank you Dan, for posting the stacktrace in
>> >that email thread! I had been banging my head for a couple of days before
>> >thinking of looking up the exact stack :|)
>> >
>> >IIUC, this stems from workqueue not fully shutting down by the time the usb
>> >endpoint is disabled and we need stronger guarantees that the workqueue pumping
>> >usb_requests doesn't accidentally queue usb_requests _after_ we've disabled
>> >the streaming endpoint on the usb controller.
>> >
>> >Attached is a patch that attempts to address the concerns raised here and sets
>> >up some stronger timing and memory guarantees.
>> >
>> >So here are the list of changes over what Dan had already started:
>> >
>> > - Return USB_GADGET_DELAYED_STATUS from set_alt. This is to ensure there are no
>> >   more control transfers while we're waiting on the workqueue to wind down.
>> >
>> > - Move updating uvc->state to uvc_v4l2_streamoff as Laurent suggested.
>> >   This ensures that setting uvc->state to STREAMING or CONNECTED
>> >   happens as a result of streamoff and streamon events which makes things
>> >   easier to reason about.
>> >
>> > - Guard video_pump with uvc->state so the thread can be stopped by setting
>> >   uvc->state to anything other than UVC_STATE_STREAMING. This effectively makes
>> >   uvc->state a flag for the video_pump loop to exit. This is the same flag that
>> >   the complete callback uses to restart the workqueue, so all interactions
>> >   with the controller are effectively guarded by the same flag now.
>> >
>> > - Set uvc->state to UVC_STATE_CONNECTED before winding down the work queue.
>> >   Now that all usb logic is guarded by the same flag, setting the flag should
>> >   stop all usb packet queuing once current execution finishes.
>> >
>> > - Add some memory guarantees in uvcg_video_enable(). At the moment, we don't
>> >   really consider the fact that usb_ep_dequeue is async, which means that the
>> >   usb_requests may not have been returned to the gadget before we start
>> >   deallocating them. With this patch, we wait until all usb_requests have been
>> >   returned to us before deallocating them (this adds a little bit of
>> >   bookkeeping, but nothing too complicated).
>>
>> I am currently trying to solve that by preparing a patch that is
>> fixing the use of the requests when deallocating them. Since currently
>> the uvc_gadget is also running into wild use after free issues because
>> of exactly that async dequeue and dealloc situation.
>
>Do you already have a patch up for this? It seems my LKML-fu is
>failing and I can't seem to find the thread. If you aren't too deep
>into the patch, can you take a look at the request counting mechanism
>added in my patch? If you have a (somewhat) consistent repro of the
>use-after-dealloc issue, runnin it through the whole patch would be
>very appreciated! It is supposed to fix the exact problem you've
>described.

I just send out v1:

https://lore.kernel.org/linux-usb/20230911002451.2860049-1-m.grzeschik@pengutronix.de/

My patches did go back and forth between changes in the uvc-gadget
driver and the device-controller driver. My latest version was including
calling free_request from the complete handler. I found that option
while looking into the uac2 gadget code. It took away a lot of
pain while trying to fix the issue in the dwc3 gadget driver.

>> IMHO it should be
>> save to call dealloc after calling dequeue. Which is probably true for
>> other usb device controller driver other then dwc3.
>
>Perhaps Thinh or someone better versed in Gadget API can chime in on
>this, but as it stands usb_ep_dequeue specifically says that it is
>async, and gadget drivers must wait on the complete callbacks to
>regain ownership of the usb_request. Until the API change is made, UVC
>should adhere to the current API?

Since you mention that usb_ep_dequeue is async I am very confident
that it is safe to free the request in the completion handler.

Although we could cleanup and improve the uvc_video_free_requests
function itself. But with the patches I have here the use
after free was gone so far. So they should be good so far.

>> For some background. The dwc3 is putting the requests into a cancelled list
>> that will be cleared by the interrupt handler and that is dequeuing them
>> instead. In between the dequeue call and the interrupt call the uvc layer could
>> dealloc the request which leads the interrupt handler to dequeue an
>> already freed request.
>
>This roughly tracks with what I gleaned from skimming the DWC3 code as
>well. In local tests the complete calls were always timely and I never
>actually ran into the situation where UVC deallocated an unowned
>request, but as someone (I think it was Alan?)  said in a previous
>thread: technically possible just means it will happen eventually
>
>Please do review/test the patch. I'll send out a formal patch on
>Monday once I am back, but would love to have some early eyes take a
>look in case there is something obvious I missed.

First I tested your patch with my sketchy setup where I more then once
run into the use after free condition. But with the patch this was not
gone.

I also looked over the patch. As what I saw this is a possible
alternative to my patches. The changes are doing some similar things.
But the code is changing to many things at once. Please split the code
up into more logical chunks. Perhaps you could try to rebase it on
my patches. And start from there.

Thanks,
Michael
Avichal Rakesh Sept. 12, 2023, 4:26 a.m. UTC | #6
Hi Michael,

On 9/10/23 17:50, Michael Grzeschik wrote:
> Hi Avichal
> 
> On Fri, Sep 08, 2023 at 11:54:40PM +0800, Avichal Rakesh wrote:
>> Apologies for the late reply, I have been out travelling.
>> On Mon, Sep 4, 2023 at 10:36 PM Michael Grzeschik <mgr@pengutronix.de> wrote:
>>> Cc'ing: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>>
>>> ...
>>> I am currently trying to solve that by preparing a patch that is
>>> fixing the use of the requests when deallocating them. Since currently
>>> the uvc_gadget is also running into wild use after free issues because
>>> of exactly that async dequeue and dealloc situation.
>>
>> Do you already have a patch up for this? It seems my LKML-fu is
>> failing and I can't seem to find the thread. If you aren't too deep
>> into the patch, can you take a look at the request counting mechanism
>> added in my patch? If you have a (somewhat) consistent repro of the
>> use-after-dealloc issue, runnin it through the whole patch would be
>> very appreciated! It is supposed to fix the exact problem you've
>> described.
> 
> I just send out v1:
> 
> https://lore.kernel.org/linux-usb/20230911002451.2860049-1-m.grzeschik@pengutronix.de/

Thank you for the patch. I do have a few comments on it, will respond on that thread.

> 
> My patches did go back and forth between changes in the uvc-gadget
> driver and the device-controller driver. My latest version was including
> calling free_request from the complete handler. I found that option
> while looking into the uac2 gadget code. It took away a lot of
> pain while trying to fix the issue in the dwc3 gadget driver.
> 
>>> IMHO it should be
>>> save to call dealloc after calling dequeue. Which is probably true for
>>> other usb device controller driver other then dwc3.
>>
>> Perhaps Thinh or someone better versed in Gadget API can chime in on
>> this, but as it stands usb_ep_dequeue specifically says that it is
>> async, and gadget drivers must wait on the complete callbacks to
>> regain ownership of the usb_request. Until the API change is made, UVC
>> should adhere to the current API?
> 
> Since you mention that usb_ep_dequeue is async I am very confident
> that it is safe to free the request in the completion handler.
> 
> Although we could cleanup and improve the uvc_video_free_requests
> function itself. But with the patches I have here the use
> after free was gone so far. So they should be good so far.
> 
>>> For some background. The dwc3 is putting the requests into a cancelled list
>>> that will be cleared by the interrupt handler and that is dequeuing them
>>> instead. In between the dequeue call and the interrupt call the uvc layer could
>>> dealloc the request which leads the interrupt handler to dequeue an
>>> already freed request.
>>
>> This roughly tracks with what I gleaned from skimming the DWC3 code as
>> well. In local tests the complete calls were always timely and I never
>> actually ran into the situation where UVC deallocated an unowned
>> request, but as someone (I think it was Alan?)  said in a previous
>> thread: technically possible just means it will happen eventually
>>
>> Please do review/test the patch. I'll send out a formal patch on
>> Monday once I am back, but would love to have some early eyes take a
>> look in case there is something obvious I missed.
> 
> First I tested your patch with my sketchy setup where I more then once
> run into the use after free condition. But with the patch this was not
> gone.

Just to confirm, use-after-free issue was _not_ fixed with this patch?

> 
> I also looked over the patch. As what I saw this is a possible
> alternative to my patches. The changes are doing some similar things.
> But the code is changing to many things at once. Please split the code
> up into more logical chunks. Perhaps you could try to rebase it on
> my patches. And start from there.

You're right, there are two issues this patch fixes. One of which is the 
same as that fixed by your series of patches. Uploaded v1 of the series at
https://lore.kernel.org/20230912041910.726442-1-arakesh@google.com/ 
(cc'ed to you) which splits the fixes into two separate patches.

Thanks!
- Avi.
Michael Grzeschik Sept. 15, 2023, 11:13 p.m. UTC | #7
Hi Avichal

On Mon, Sep 11, 2023 at 09:26:09PM -0700, Avichal Rakesh wrote:
>On 9/10/23 17:50, Michael Grzeschik wrote:
>> On Fri, Sep 08, 2023 at 11:54:40PM +0800, Avichal Rakesh wrote:
>>> Apologies for the late reply, I have been out travelling.
>>> On Mon, Sep 4, 2023 at 10:36 PM Michael Grzeschik <mgr@pengutronix.de> wrote:
>>>> Cc'ing: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
>>>>
>>>> ...
>>>> I am currently trying to solve that by preparing a patch that is
>>>> fixing the use of the requests when deallocating them. Since currently
>>>> the uvc_gadget is also running into wild use after free issues because
>>>> of exactly that async dequeue and dealloc situation.
>>>
>>> Do you already have a patch up for this? It seems my LKML-fu is
>>> failing and I can't seem to find the thread. If you aren't too deep
>>> into the patch, can you take a look at the request counting mechanism
>>> added in my patch? If you have a (somewhat) consistent repro of the
>>> use-after-dealloc issue, runnin it through the whole patch would be
>>> very appreciated! It is supposed to fix the exact problem you've
>>> described.
>>
>> I just send out v1:
>>
>> https://lore.kernel.org/linux-usb/20230911002451.2860049-1-m.grzeschik@pengutronix.de/
>
>Thank you for the patch. I do have a few comments on it, will respond on that thread.

Thanks for the comments.

>>
>> My patches did go back and forth between changes in the uvc-gadget
>> driver and the device-controller driver. My latest version was including
>> calling free_request from the complete handler. I found that option
>> while looking into the uac2 gadget code. It took away a lot of
>> pain while trying to fix the issue in the dwc3 gadget driver.
>>
>>>> IMHO it should be
>>>> save to call dealloc after calling dequeue. Which is probably true for
>>>> other usb device controller driver other then dwc3.
>>>
>>> Perhaps Thinh or someone better versed in Gadget API can chime in on
>>> this, but as it stands usb_ep_dequeue specifically says that it is
>>> async, and gadget drivers must wait on the complete callbacks to
>>> regain ownership of the usb_request. Until the API change is made, UVC
>>> should adhere to the current API?
>>
>> Since you mention that usb_ep_dequeue is async I am very confident
>> that it is safe to free the request in the completion handler.
>>
>> Although we could cleanup and improve the uvc_video_free_requests
>> function itself. But with the patches I have here the use
>> after free was gone so far. So they should be good so far.
>>
>>>> For some background. The dwc3 is putting the requests into a cancelled list
>>>> that will be cleared by the interrupt handler and that is dequeuing them
>>>> instead. In between the dequeue call and the interrupt call the uvc layer could
>>>> dealloc the request which leads the interrupt handler to dequeue an
>>>> already freed request.
>>>
>>> This roughly tracks with what I gleaned from skimming the DWC3 code as
>>> well. In local tests the complete calls were always timely and I never
>>> actually ran into the situation where UVC deallocated an unowned
>>> request, but as someone (I think it was Alan?)  said in a previous
>>> thread: technically possible just means it will happen eventually
>>>
>>> Please do review/test the patch. I'll send out a formal patch on
>>> Monday once I am back, but would love to have some early eyes take a
>>> look in case there is something obvious I missed.
>>
>> First I tested your patch with my sketchy setup where I more then once
>> run into the use after free condition. But with the patch this was not
>> gone.
>
>Just to confirm, use-after-free issue was _not_ fixed with this patch?

Yes, it did somehow not help the issues I see.

I will come back with the issues I saw on your actual patch series.

>> I also looked over the patch. As what I saw this is a possible
>> alternative to my patches. The changes are doing some similar things.
>> But the code is changing to many things at once. Please split the code
>> up into more logical chunks. Perhaps you could try to rebase it on
>> my patches. And start from there.
>
>You're right, there are two issues this patch fixes. One of which is the
>same as that fixed by your series of patches. Uploaded v1 of the series at
>https://lore.kernel.org/20230912041910.726442-1-arakesh@google.com/
>(cc'ed to you) which splits the fixes into two separate patches.

Thanks for seperating them.

Regards,
Michael
diff mbox series

Patch

diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c
index 5e919fb65833..4b91bd572a83 100644
--- a/drivers/usb/gadget/function/f_uvc.c
+++ b/drivers/usb/gadget/function/f_uvc.c
@@ -337,9 +337,6 @@  uvc_function_set_alt(struct usb_function *f, unsigned interface, unsigned alt)
 		if (uvc->state != UVC_STATE_STREAMING)
 			return 0;
 
-		if (uvc->video.ep)
-			usb_ep_disable(uvc->video.ep);
-
 		memset(&v4l2_event, 0, sizeof(v4l2_event));
 		v4l2_event.type = UVC_EVENT_STREAMOFF;
 		v4l2_event_queue(&uvc->vdev, &v4l2_event);
diff --git a/drivers/usb/gadget/function/uvc_video.c b/drivers/usb/gadget/function/uvc_video.c
index 91af3b1ef0d4..c7e38fa26492 100644
--- a/drivers/usb/gadget/function/uvc_video.c
+++ b/drivers/usb/gadget/function/uvc_video.c
@@ -499,6 +499,10 @@  int uvcg_video_enable(struct uvc_video *video, int enable)
 
 	if (!enable) {
 		cancel_work_sync(&video->pump);
+
+		if (video->ep)
+			usb_ep_disable(video->ep);
+
 		uvcg_queue_cancel(&video->queue, 0);
 
 		for (i = 0; i < video->uvc_num_requests; ++i)