diff mbox series

usb: gadget: f_fs: Prevent race between functionfs_unbind & ffs_ep0_queue_wait

Message ID 20221103073821.8210-1-quic_ugoswami@quicinc.com (mailing list archive)
State Superseded
Headers show
Series usb: gadget: f_fs: Prevent race between functionfs_unbind & ffs_ep0_queue_wait | expand

Commit Message

Udipto Goswami Nov. 3, 2022, 7:38 a.m. UTC
While performing fast composition switch, there is a possibility that the
process of ffs_ep0_write/ffs_ep0_read get into a race condition
due to ep0req being freed up from functionfs_unbind.

Consider the scenario that the ffs_ep0_write calls the ffs_ep0_queue_wait
by taking a lock &ffs->ev.waitq.lock. However, the functionfs_unbind isn't
bounded so it can go ahead and mark the ep0req to NULL, and since there
is no NULL check in ffs_ep0_queue_wait we will end up in use-after-free.

Fix this by introducing a NULL check before any req operation.
Also to ensure the serialization, perform the ep0req ops inside
spinlock &ffs->ev.waitq.lock.

Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
---
 drivers/usb/gadget/function/f_fs.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

John Keeping Nov. 3, 2022, 9:30 a.m. UTC | #1
On Thu, Nov 03, 2022 at 01:08:21PM +0530, Udipto Goswami wrote:
> While performing fast composition switch, there is a possibility that the
> process of ffs_ep0_write/ffs_ep0_read get into a race condition
> due to ep0req being freed up from functionfs_unbind.
> 
> Consider the scenario that the ffs_ep0_write calls the ffs_ep0_queue_wait
> by taking a lock &ffs->ev.waitq.lock. However, the functionfs_unbind isn't
> bounded so it can go ahead and mark the ep0req to NULL, and since there
> is no NULL check in ffs_ep0_queue_wait we will end up in use-after-free.
> 
> Fix this by introducing a NULL check before any req operation.
> Also to ensure the serialization, perform the ep0req ops inside
> spinlock &ffs->ev.waitq.lock.
> 
> Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
> Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
> ---
>  drivers/usb/gadget/function/f_fs.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
> index 73dc10a77cde..39980b2bf285 100644
> --- a/drivers/usb/gadget/function/f_fs.c
> +++ b/drivers/usb/gadget/function/f_fs.c
> @@ -279,6 +279,13 @@ static int __ffs_ep0_queue_wait(struct ffs_data *ffs, char *data, size_t len)
>  	struct usb_request *req = ffs->ep0req;
>  	int ret;
>  
> +	if (!req)
> +		return -EINVAL;
> +	/*
> +	 * Even if ep0req is freed won't be a problem since the local
> +	 * copy of the request will be used here.
> +	 */

This doesn't sound right - if we set ep0req to NULL then we've called
usb_ep_free_request() on it so the request is not longer valid.

>  	req->zero     = len < le16_to_cpu(ffs->ev.setup.wLength);
>  
>  	spin_unlock_irq(&ffs->ev.waitq.lock);
> @@ -1892,11 +1899,13 @@ static void functionfs_unbind(struct ffs_data *ffs)
>  	ENTER();
>  
>  	if (!WARN_ON(!ffs->gadget)) {
> +		spin_lock_irq(&ffs->ev.waitq.lock);
>  		usb_ep_free_request(ffs->gadget->ep0, ffs->ep0req);
>  		ffs->ep0req = NULL;
>  		ffs->gadget = NULL;
>  		clear_bit(FFS_FL_BOUND, &ffs->flags);
>  		ffs_data_put(ffs);
> +		spin_unlock_irq(&ffs->ev.waitq.lock);

ffs may have been freed in ffs_data_put() so accessing the lock here is
unsafe.
Udipto Goswami Nov. 3, 2022, 10:27 a.m. UTC | #2
Hi John,

On 11/3/22 3:00 PM, John Keeping wrote:
> On Thu, Nov 03, 2022 at 01:08:21PM +0530, Udipto Goswami wrote:
>> While performing fast composition switch, there is a possibility that the
>> process of ffs_ep0_write/ffs_ep0_read get into a race condition
>> due to ep0req being freed up from functionfs_unbind.
>>
>> Consider the scenario that the ffs_ep0_write calls the ffs_ep0_queue_wait
>> by taking a lock &ffs->ev.waitq.lock. However, the functionfs_unbind isn't
>> bounded so it can go ahead and mark the ep0req to NULL, and since there
>> is no NULL check in ffs_ep0_queue_wait we will end up in use-after-free.
>>
>> Fix this by introducing a NULL check before any req operation.
>> Also to ensure the serialization, perform the ep0req ops inside
>> spinlock &ffs->ev.waitq.lock.
>>
>> Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
>> Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
>> ---
>>   drivers/usb/gadget/function/f_fs.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
>> index 73dc10a77cde..39980b2bf285 100644
>> --- a/drivers/usb/gadget/function/f_fs.c
>> +++ b/drivers/usb/gadget/function/f_fs.c
>> @@ -279,6 +279,13 @@ static int __ffs_ep0_queue_wait(struct ffs_data *ffs, char *data, size_t len)
>>   	struct usb_request *req = ffs->ep0req;
>>   	int ret;
>>   
>> +	if (!req)
>> +		return -EINVAL;
>> +	/*
>> +	 * Even if ep0req is freed won't be a problem since the local
>> +	 * copy of the request will be used here.
>> +	 */
> 
> This doesn't sound right - if we set ep0req to NULL then we've called
> usb_ep_free_request() on it so the request is not longer valid.

Yes I agree as soon as we spin_unlock it the functionfs_unbind will 
execute and free_up the req, so performing and ep_queue after that even 
if it is a local copy could be fatal.

          ret = usb_ep_queue(ffs->gadget->ep0, req, GFP_ATOMIC);
          if (unlikely(ret < 0))
                  return ret;

         spin_unlock_irq(&ffs->ev.waitq.lock);
  We can move the spin_unlock after to queue operation perhaps ?

> 
>>   	req->zero     = len < le16_to_cpu(ffs->ev.setup.wLength);
>>   
>>   	spin_unlock_irq(&ffs->ev.waitq.lock);
>> @@ -1892,11 +1899,13 @@ static void functionfs_unbind(struct ffs_data *ffs)
>>   	ENTER();
>>   
>>   	if (!WARN_ON(!ffs->gadget)) {
>> +		spin_lock_irq(&ffs->ev.waitq.lock);
>>   		usb_ep_free_request(ffs->gadget->ep0, ffs->ep0req);
>>   		ffs->ep0req = NULL;
>>   		ffs->gadget = NULL;
>>   		clear_bit(FFS_FL_BOUND, &ffs->flags);
>>   		ffs_data_put(ffs);
>> +		spin_unlock_irq(&ffs->ev.waitq.lock);
> 
> ffs may have been freed in ffs_data_put() so accessing the lock here is
> unsafe.
maybe we can move it before data_put ?
    		clear_bit(FFS_FL_BOUND, &ffs->flags);
  +		spin_unlock_irq(&ffs->ev.waitq.lock);
    		ffs_data_put(ffs);

the intention here is to protect the ep0req only since the 
ep0_queue_wait is also doing the assignments after taking the lock.

Thanks,
-Udipto
John Keeping Nov. 3, 2022, 10:52 a.m. UTC | #3
On Thu, Nov 03, 2022 at 03:57:02PM +0530, Udipto Goswami wrote:
> On 11/3/22 3:00 PM, John Keeping wrote:
> > On Thu, Nov 03, 2022 at 01:08:21PM +0530, Udipto Goswami wrote:
> > > While performing fast composition switch, there is a possibility that the
> > > process of ffs_ep0_write/ffs_ep0_read get into a race condition
> > > due to ep0req being freed up from functionfs_unbind.
> > > 
> > > Consider the scenario that the ffs_ep0_write calls the ffs_ep0_queue_wait
> > > by taking a lock &ffs->ev.waitq.lock. However, the functionfs_unbind isn't
> > > bounded so it can go ahead and mark the ep0req to NULL, and since there
> > > is no NULL check in ffs_ep0_queue_wait we will end up in use-after-free.
> > > 
> > > Fix this by introducing a NULL check before any req operation.
> > > Also to ensure the serialization, perform the ep0req ops inside
> > > spinlock &ffs->ev.waitq.lock.
> > > 
> > > Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
> > > Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
> > > ---
> > >   drivers/usb/gadget/function/f_fs.c | 9 +++++++++
> > >   1 file changed, 9 insertions(+)
> > > 
> > > diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
> > > index 73dc10a77cde..39980b2bf285 100644
> > > --- a/drivers/usb/gadget/function/f_fs.c
> > > +++ b/drivers/usb/gadget/function/f_fs.c
> > > @@ -279,6 +279,13 @@ static int __ffs_ep0_queue_wait(struct ffs_data *ffs, char *data, size_t len)
> > >   	struct usb_request *req = ffs->ep0req;
> > >   	int ret;
> > > +	if (!req)
> > > +		return -EINVAL;
> > > +	/*
> > > +	 * Even if ep0req is freed won't be a problem since the local
> > > +	 * copy of the request will be used here.
> > > +	 */
> > 
> > This doesn't sound right - if we set ep0req to NULL then we've called
> > usb_ep_free_request() on it so the request is not longer valid.
> 
> Yes I agree as soon as we spin_unlock it the functionfs_unbind will execute
> and free_up the req, so performing and ep_queue after that even if it is a
> local copy could be fatal.
> 
>          ret = usb_ep_queue(ffs->gadget->ep0, req, GFP_ATOMIC);
>          if (unlikely(ret < 0))
>                  return ret;
> 
>         spin_unlock_irq(&ffs->ev.waitq.lock);
>  We can move the spin_unlock after to queue operation perhaps ?

I don't think it's that simple.  The documentation for
usb_ep_free_request() says:

	* Caller guarantees the request is not queued, and that it will
	* no longer be requeued (or otherwise used).

so some extra synchronisation is required here.

By the time we get to functionfs_unbind() everything should be disabled
by ffs_func_disable() and ffs_func_unbind() has drained the workqueue,
but none of that applies to ep0.

I think ffs_unbind() needs to dequeue the ep0 request.

In addition to that, I think we need a new ep0 status variable in struct
ffs_data so that req is not accessed after wait_for_completion() in
__ffs_ep0_queue_wait() and that status is set in ffs_ep0_complete().

With the spin_unlock_irq() moved to immediately before
wait_for_completion() in __ffs_ep0_queue_wait() it looks like everything
is then safe.
Udipto Goswami Nov. 3, 2022, 11:29 a.m. UTC | #4
Hi John

On 11/3/22 4:22 PM, John Keeping wrote:
> On Thu, Nov 03, 2022 at 03:57:02PM +0530, Udipto Goswami wrote:
>> On 11/3/22 3:00 PM, John Keeping wrote:
>>> On Thu, Nov 03, 2022 at 01:08:21PM +0530, Udipto Goswami wrote:
>>>> While performing fast composition switch, there is a possibility that the
>>>> process of ffs_ep0_write/ffs_ep0_read get into a race condition
>>>> due to ep0req being freed up from functionfs_unbind.
>>>>
>>>> Consider the scenario that the ffs_ep0_write calls the ffs_ep0_queue_wait
>>>> by taking a lock &ffs->ev.waitq.lock. However, the functionfs_unbind isn't
>>>> bounded so it can go ahead and mark the ep0req to NULL, and since there
>>>> is no NULL check in ffs_ep0_queue_wait we will end up in use-after-free.
>>>>
>>>> Fix this by introducing a NULL check before any req operation.
>>>> Also to ensure the serialization, perform the ep0req ops inside
>>>> spinlock &ffs->ev.waitq.lock.
>>>>
>>>> Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
>>>> Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
>>>> ---
>>>>    drivers/usb/gadget/function/f_fs.c | 9 +++++++++
>>>>    1 file changed, 9 insertions(+)
>>>>
>>>> diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
>>>> index 73dc10a77cde..39980b2bf285 100644
>>>> --- a/drivers/usb/gadget/function/f_fs.c
>>>> +++ b/drivers/usb/gadget/function/f_fs.c
>>>> @@ -279,6 +279,13 @@ static int __ffs_ep0_queue_wait(struct ffs_data *ffs, char *data, size_t len)
>>>>    	struct usb_request *req = ffs->ep0req;
>>>>    	int ret;
>>>> +	if (!req)
>>>> +		return -EINVAL;
>>>> +	/*
>>>> +	 * Even if ep0req is freed won't be a problem since the local
>>>> +	 * copy of the request will be used here.
>>>> +	 */
>>>
>>> This doesn't sound right - if we set ep0req to NULL then we've called
>>> usb_ep_free_request() on it so the request is not longer valid.
>>
>> Yes I agree as soon as we spin_unlock it the functionfs_unbind will execute
>> and free_up the req, so performing and ep_queue after that even if it is a
>> local copy could be fatal.
>>
>>           ret = usb_ep_queue(ffs->gadget->ep0, req, GFP_ATOMIC);
>>           if (unlikely(ret < 0))
>>                   return ret;
>>
>>          spin_unlock_irq(&ffs->ev.waitq.lock);
>>   We can move the spin_unlock after to queue operation perhaps ?
> 
> I don't think it's that simple.  The documentation for
> usb_ep_free_request() says:
> 
> 	* Caller guarantees the request is not queued, and that it will
> 	* no longer be requeued (or otherwise used).
> 
> so some extra synchronisation is required here.
> 
> By the time we get to functionfs_unbind() everything should be disabled
> by ffs_func_disable() and ffs_func_unbind() has drained the workqueue,
> but none of that applies to ep0.
> 
> I think ffs_unbind() needs to dequeue the ep0 request.
> 
> In addition to that, I think we need a new ep0 status variable in struct
> ffs_data so that req is not accessed after wait_for_completion() in
> __ffs_ep0_queue_wait() and that status is set in ffs_ep0_complete().
> 
> With the spin_unlock_irq() moved to immediately before
> wait_for_completion() in __ffs_ep0_queue_wait() it looks like everything
> is then safe.

Thanks for the suggestions, let me try implementing it.

-Udipto
Udipto Goswami Nov. 4, 2022, 10:14 a.m. UTC | #5
Hi John,

On 11/3/22 4:59 PM, Udipto Goswami wrote:
> Hi John
> 
> On 11/3/22 4:22 PM, John Keeping wrote:
>> On Thu, Nov 03, 2022 at 03:57:02PM +0530, Udipto Goswami wrote:
>>> On 11/3/22 3:00 PM, John Keeping wrote:
>>>> On Thu, Nov 03, 2022 at 01:08:21PM +0530, Udipto Goswami wrote:
>>>>> While performing fast composition switch, there is a possibility 
>>>>> that the
>>>>> process of ffs_ep0_write/ffs_ep0_read get into a race condition
>>>>> due to ep0req being freed up from functionfs_unbind.
>>>>>
>>>>> Consider the scenario that the ffs_ep0_write calls the 
>>>>> ffs_ep0_queue_wait
>>>>> by taking a lock &ffs->ev.waitq.lock. However, the 
>>>>> functionfs_unbind isn't
>>>>> bounded so it can go ahead and mark the ep0req to NULL, and since 
>>>>> there
>>>>> is no NULL check in ffs_ep0_queue_wait we will end up in 
>>>>> use-after-free.
>>>>>
>>>>> Fix this by introducing a NULL check before any req operation.
>>>>> Also to ensure the serialization, perform the ep0req ops inside
>>>>> spinlock &ffs->ev.waitq.lock.
>>>>>
>>>>> Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
>>>>> Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
>>>>> ---
>>>>>    drivers/usb/gadget/function/f_fs.c | 9 +++++++++
>>>>>    1 file changed, 9 insertions(+)
>>>>>
>>>>> diff --git a/drivers/usb/gadget/function/f_fs.c 
>>>>> b/drivers/usb/gadget/function/f_fs.c
>>>>> index 73dc10a77cde..39980b2bf285 100644
>>>>> --- a/drivers/usb/gadget/function/f_fs.c
>>>>> +++ b/drivers/usb/gadget/function/f_fs.c
>>>>> @@ -279,6 +279,13 @@ static int __ffs_ep0_queue_wait(struct 
>>>>> ffs_data *ffs, char *data, size_t len)
>>>>>        struct usb_request *req = ffs->ep0req;
>>>>>        int ret;
>>>>> +    if (!req)
>>>>> +        return -EINVAL;
>>>>> +    /*
>>>>> +     * Even if ep0req is freed won't be a problem since the local
>>>>> +     * copy of the request will be used here.
>>>>> +     */
>>>>
>>>> This doesn't sound right - if we set ep0req to NULL then we've called
>>>> usb_ep_free_request() on it so the request is not longer valid.
>>>
>>> Yes I agree as soon as we spin_unlock it the functionfs_unbind will 
>>> execute
>>> and free_up the req, so performing and ep_queue after that even if it 
>>> is a
>>> local copy could be fatal.
>>>
>>>           ret = usb_ep_queue(ffs->gadget->ep0, req, GFP_ATOMIC);
>>>           if (unlikely(ret < 0))
>>>                   return ret;
>>>
>>>          spin_unlock_irq(&ffs->ev.waitq.lock);
>>>   We can move the spin_unlock after to queue operation perhaps ?
>>
>> I don't think it's that simple.  The documentation for
>> usb_ep_free_request() says:
>>
>>     * Caller guarantees the request is not queued, and that it will
>>     * no longer be requeued (or otherwise used).
>>
>> so some extra synchronisation is required here.
>>
>> By the time we get to functionfs_unbind() everything should be disabled
>> by ffs_func_disable() and ffs_func_unbind() has drained the workqueue,
>> but none of that applies to ep0.
>>
>> I think ffs_unbind() needs to dequeue the ep0 request.
>>
>> In addition to that, I think we need a new ep0 status variable in struct
>> ffs_data so that req is not accessed after wait_for_completion() in
>> __ffs_ep0_queue_wait() and that status is set in ffs_ep0_complete().
>>
>> With the spin_unlock_irq() moved to immediately before
>> wait_for_completion() in __ffs_ep0_queue_wait() it looks like everything
>> is then safe.
> 
> Thanks for the suggestions, let me try implementing it.
> 
Just curious because i saw __ffs_ep0_queue_wait will only be called from 
  ffs_ep0_read & ffs_ep0_write, both using a mutex_lock(&ffs->mutex)

So if we protect the functionfs_unbind with this mutex, it will be 
better right?

@@ -1889,12 +1889,13 @@ static int functionfs_bind(struct ffs_data *ffs, 
struct usb_composite_dev *cdev)
  static void functionfs_unbind(struct ffs_data *ffs)
  {
         ENTER();

         if (!WARN_ON(!ffs->gadget)) {
+               ffs_mutex_lock(&ffs->mutex, file->f_flags & O_NONBLOCK);
                 usb_ep_free_request(ffs->gadget->ep0, ffs->ep0req);
                 ffs->ep0req = NULL;
                 ffs->gadget = NULL;
                 clear_bit(FFS_FL_BOUND, &ffs->flags);
+               mutex_unlock(&ffs->mutex);
                 ffs_data_put(ffs);
         }
  }

Perhaps we don't have to take care of the the serialization in that case 
since it will exit the function __ffs_ep0_queue_wait only after 
everything is done and hence functionfs_unbind will get the control 
after the ep0_write/read has completed?

What do you think ?

-Udipto
John Keeping Nov. 4, 2022, 11:49 a.m. UTC | #6
On Fri, Nov 04, 2022 at 03:44:09PM +0530, Udipto Goswami wrote:
> On 11/3/22 4:59 PM, Udipto Goswami wrote:
> > On 11/3/22 4:22 PM, John Keeping wrote:
> > > On Thu, Nov 03, 2022 at 03:57:02PM +0530, Udipto Goswami wrote:
> > > > On 11/3/22 3:00 PM, John Keeping wrote:
> > > > > On Thu, Nov 03, 2022 at 01:08:21PM +0530, Udipto Goswami wrote:
> > > > > > While performing fast composition switch, there is a
> > > > > > possibility that the
> > > > > > process of ffs_ep0_write/ffs_ep0_read get into a race condition
> > > > > > due to ep0req being freed up from functionfs_unbind.
> > > > > > 
> > > > > > Consider the scenario that the ffs_ep0_write calls the
> > > > > > ffs_ep0_queue_wait
> > > > > > by taking a lock &ffs->ev.waitq.lock. However, the
> > > > > > functionfs_unbind isn't
> > > > > > bounded so it can go ahead and mark the ep0req to NULL,
> > > > > > and since there
> > > > > > is no NULL check in ffs_ep0_queue_wait we will end up in
> > > > > > use-after-free.
> > > > > > 
> > > > > > Fix this by introducing a NULL check before any req operation.
> > > > > > Also to ensure the serialization, perform the ep0req ops inside
> > > > > > spinlock &ffs->ev.waitq.lock.
> > > > > > 
> > > > > > Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
> > > > > > Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
> > > > > > ---
> > > > > >    drivers/usb/gadget/function/f_fs.c | 9 +++++++++
> > > > > >    1 file changed, 9 insertions(+)
> > > > > > 
> > > > > > diff --git a/drivers/usb/gadget/function/f_fs.c
> > > > > > b/drivers/usb/gadget/function/f_fs.c
> > > > > > index 73dc10a77cde..39980b2bf285 100644
> > > > > > --- a/drivers/usb/gadget/function/f_fs.c
> > > > > > +++ b/drivers/usb/gadget/function/f_fs.c
> > > > > > @@ -279,6 +279,13 @@ static int
> > > > > > __ffs_ep0_queue_wait(struct ffs_data *ffs, char *data,
> > > > > > size_t len)
> > > > > >        struct usb_request *req = ffs->ep0req;
> > > > > >        int ret;
> > > > > > +    if (!req)
> > > > > > +        return -EINVAL;
> > > > > > +    /*
> > > > > > +     * Even if ep0req is freed won't be a problem since the local
> > > > > > +     * copy of the request will be used here.
> > > > > > +     */
> > > > > 
> > > > > This doesn't sound right - if we set ep0req to NULL then we've called
> > > > > usb_ep_free_request() on it so the request is not longer valid.
> > > > 
> > > > Yes I agree as soon as we spin_unlock it the functionfs_unbind
> > > > will execute
> > > > and free_up the req, so performing and ep_queue after that even
> > > > if it is a
> > > > local copy could be fatal.
> > > > 
> > > >           ret = usb_ep_queue(ffs->gadget->ep0, req, GFP_ATOMIC);
> > > >           if (unlikely(ret < 0))
> > > >                   return ret;
> > > > 
> > > >          spin_unlock_irq(&ffs->ev.waitq.lock);
> > > >   We can move the spin_unlock after to queue operation perhaps ?
> > > 
> > > I don't think it's that simple.  The documentation for
> > > usb_ep_free_request() says:
> > > 
> > >     * Caller guarantees the request is not queued, and that it will
> > >     * no longer be requeued (or otherwise used).
> > > 
> > > so some extra synchronisation is required here.
> > > 
> > > By the time we get to functionfs_unbind() everything should be disabled
> > > by ffs_func_disable() and ffs_func_unbind() has drained the workqueue,
> > > but none of that applies to ep0.
> > > 
> > > I think ffs_unbind() needs to dequeue the ep0 request.
> > > 
> > > In addition to that, I think we need a new ep0 status variable in struct
> > > ffs_data so that req is not accessed after wait_for_completion() in
> > > __ffs_ep0_queue_wait() and that status is set in ffs_ep0_complete().
> > > 
> > > With the spin_unlock_irq() moved to immediately before
> > > wait_for_completion() in __ffs_ep0_queue_wait() it looks like everything
> > > is then safe.
> > 
> > Thanks for the suggestions, let me try implementing it.
> > 
> Just curious because i saw __ffs_ep0_queue_wait will only be called from
> ffs_ep0_read & ffs_ep0_write, both using a mutex_lock(&ffs->mutex)
> 
> So if we protect the functionfs_unbind with this mutex, it will be better
> right?
> 
> @@ -1889,12 +1889,13 @@ static int functionfs_bind(struct ffs_data *ffs,
> struct usb_composite_dev *cdev)
>  static void functionfs_unbind(struct ffs_data *ffs)
>  {
>         ENTER();
> 
>         if (!WARN_ON(!ffs->gadget)) {
> +               ffs_mutex_lock(&ffs->mutex, file->f_flags & O_NONBLOCK);
>                 usb_ep_free_request(ffs->gadget->ep0, ffs->ep0req);
>                 ffs->ep0req = NULL;
>                 ffs->gadget = NULL;
>                 clear_bit(FFS_FL_BOUND, &ffs->flags);
> +               mutex_unlock(&ffs->mutex);
>                 ffs_data_put(ffs);
>         }
>  }
> 
> Perhaps we don't have to take care of the the serialization in that case
> since it will exit the function __ffs_ep0_queue_wait only after everything
> is done and hence functionfs_unbind will get the control after the
> ep0_write/read has completed?
> 
> What do you think ?

The documentation does say it protects ep0req so this might make sense.

But I think you need to ensure ep0req is dequeued before locking the
mutex in order to avoid a deadlock as nothing else is going to complete
an outstanding request at this point.
Udipto Goswami Nov. 7, 2022, 4:10 a.m. UTC | #7
Hi John,

On 11/4/22 5:19 PM, John Keeping wrote:
> On Fri, Nov 04, 2022 at 03:44:09PM +0530, Udipto Goswami wrote:
>> On 11/3/22 4:59 PM, Udipto Goswami wrote:
>>> On 11/3/22 4:22 PM, John Keeping wrote:
>>>> On Thu, Nov 03, 2022 at 03:57:02PM +0530, Udipto Goswami wrote:
>>>>> On 11/3/22 3:00 PM, John Keeping wrote:
>>>>>> On Thu, Nov 03, 2022 at 01:08:21PM +0530, Udipto Goswami wrote:
>>>>>>> While performing fast composition switch, there is a
>>>>>>> possibility that the
>>>>>>> process of ffs_ep0_write/ffs_ep0_read get into a race condition
>>>>>>> due to ep0req being freed up from functionfs_unbind.
>>>>>>>
>>>>>>> Consider the scenario that the ffs_ep0_write calls the
>>>>>>> ffs_ep0_queue_wait
>>>>>>> by taking a lock &ffs->ev.waitq.lock. However, the
>>>>>>> functionfs_unbind isn't
>>>>>>> bounded so it can go ahead and mark the ep0req to NULL,
>>>>>>> and since there
>>>>>>> is no NULL check in ffs_ep0_queue_wait we will end up in
>>>>>>> use-after-free.
>>>>>>>
>>>>>>> Fix this by introducing a NULL check before any req operation.
>>>>>>> Also to ensure the serialization, perform the ep0req ops inside
>>>>>>> spinlock &ffs->ev.waitq.lock.
>>>>>>>
>>>>>>> Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
>>>>>>> Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
>>>>>>> ---
>>>>>>>     drivers/usb/gadget/function/f_fs.c | 9 +++++++++
>>>>>>>     1 file changed, 9 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/usb/gadget/function/f_fs.c
>>>>>>> b/drivers/usb/gadget/function/f_fs.c
>>>>>>> index 73dc10a77cde..39980b2bf285 100644
>>>>>>> --- a/drivers/usb/gadget/function/f_fs.c
>>>>>>> +++ b/drivers/usb/gadget/function/f_fs.c
>>>>>>> @@ -279,6 +279,13 @@ static int
>>>>>>> __ffs_ep0_queue_wait(struct ffs_data *ffs, char *data,
>>>>>>> size_t len)
>>>>>>>         struct usb_request *req = ffs->ep0req;
>>>>>>>         int ret;
>>>>>>> +    if (!req)
>>>>>>> +        return -EINVAL;
>>>>>>> +    /*
>>>>>>> +     * Even if ep0req is freed won't be a problem since the local
>>>>>>> +     * copy of the request will be used here.
>>>>>>> +     */
>>>>>>
>>>>>> This doesn't sound right - if we set ep0req to NULL then we've called
>>>>>> usb_ep_free_request() on it so the request is not longer valid.
>>>>>
>>>>> Yes I agree as soon as we spin_unlock it the functionfs_unbind
>>>>> will execute
>>>>> and free_up the req, so performing and ep_queue after that even
>>>>> if it is a
>>>>> local copy could be fatal.
>>>>>
>>>>>            ret = usb_ep_queue(ffs->gadget->ep0, req, GFP_ATOMIC);
>>>>>            if (unlikely(ret < 0))
>>>>>                    return ret;
>>>>>
>>>>>           spin_unlock_irq(&ffs->ev.waitq.lock);
>>>>>    We can move the spin_unlock after to queue operation perhaps ?
>>>>
>>>> I don't think it's that simple.  The documentation for
>>>> usb_ep_free_request() says:
>>>>
>>>>      * Caller guarantees the request is not queued, and that it will
>>>>      * no longer be requeued (or otherwise used).
>>>>
>>>> so some extra synchronisation is required here.
>>>>
>>>> By the time we get to functionfs_unbind() everything should be disabled
>>>> by ffs_func_disable() and ffs_func_unbind() has drained the workqueue,
>>>> but none of that applies to ep0.
>>>>
>>>> I think ffs_unbind() needs to dequeue the ep0 request.
>>>>
>>>> In addition to that, I think we need a new ep0 status variable in struct
>>>> ffs_data so that req is not accessed after wait_for_completion() in
>>>> __ffs_ep0_queue_wait() and that status is set in ffs_ep0_complete().
>>>>
>>>> With the spin_unlock_irq() moved to immediately before
>>>> wait_for_completion() in __ffs_ep0_queue_wait() it looks like everything
>>>> is then safe.
>>>
>>> Thanks for the suggestions, let me try implementing it.
>>>
>> Just curious because i saw __ffs_ep0_queue_wait will only be called from
>> ffs_ep0_read & ffs_ep0_write, both using a mutex_lock(&ffs->mutex)
>>
>> So if we protect the functionfs_unbind with this mutex, it will be better
>> right?
>>
>> @@ -1889,12 +1889,13 @@ static int functionfs_bind(struct ffs_data *ffs,
>> struct usb_composite_dev *cdev)
>>   static void functionfs_unbind(struct ffs_data *ffs)
>>   {
>>          ENTER();
>>
>>          if (!WARN_ON(!ffs->gadget)) {
>> +               ffs_mutex_lock(&ffs->mutex, file->f_flags & O_NONBLOCK);
>>                  usb_ep_free_request(ffs->gadget->ep0, ffs->ep0req);
>>                  ffs->ep0req = NULL;
>>                  ffs->gadget = NULL;
>>                  clear_bit(FFS_FL_BOUND, &ffs->flags);
>> +               mutex_unlock(&ffs->mutex);
>>                  ffs_data_put(ffs);
>>          }
>>   }
>>
>> Perhaps we don't have to take care of the the serialization in that case
>> since it will exit the function __ffs_ep0_queue_wait only after everything
>> is done and hence functionfs_unbind will get the control after the
>> ep0_write/read has completed?
>>
>> What do you think ?
> 
> The documentation does say it protects ep0req so this might make sense.
> 
> But I think you need to ensure ep0req is dequeued before locking the
> mutex in order to avoid a deadlock as nothing else is going to complete
> an outstanding request at this point.
Got it, thanks for clarification, i'll make sure to dequeue it  to avoid 
any deadlocks in v2.

-Udipto
diff mbox series

Patch

diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 73dc10a77cde..39980b2bf285 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -279,6 +279,13 @@  static int __ffs_ep0_queue_wait(struct ffs_data *ffs, char *data, size_t len)
 	struct usb_request *req = ffs->ep0req;
 	int ret;
 
+	if (!req)
+		return -EINVAL;
+	/*
+	 * Even if ep0req is freed won't be a problem since the local
+	 * copy of the request will be used here.
+	 */
+
 	req->zero     = len < le16_to_cpu(ffs->ev.setup.wLength);
 
 	spin_unlock_irq(&ffs->ev.waitq.lock);
@@ -1892,11 +1899,13 @@  static void functionfs_unbind(struct ffs_data *ffs)
 	ENTER();
 
 	if (!WARN_ON(!ffs->gadget)) {
+		spin_lock_irq(&ffs->ev.waitq.lock);
 		usb_ep_free_request(ffs->gadget->ep0, ffs->ep0req);
 		ffs->ep0req = NULL;
 		ffs->gadget = NULL;
 		clear_bit(FFS_FL_BOUND, &ffs->flags);
 		ffs_data_put(ffs);
+		spin_unlock_irq(&ffs->ev.waitq.lock);
 	}
 }