diff mbox series

usb: gadget: f_fs: Fix NULL pointer dereference in ffs_epfile_async_io_complete()

Message ID 20240223054809.2379-1-quic_selvaras@quicinc.com (mailing list archive)
State New, archived

Commit Message

Selvarasu Ganesan Feb. 23, 2024, 5:48 a.m. UTC
In scenarios of continuous and parallel usage of multiple FFS interfaces
and concurrent adb operations (e.g., adb root, adb reboot), there's a
chance that ffs_epfile_async_io_complete() might be processed after
ffs_epfile_release(). This could lead to a NULL pointer dereference of
ffs when accessing the ffs pointer in ffs_epfile_async_io_complete(), as
ffs is freed as part of ffs_epfile_release(). This epfile release is
part of file operation and is triggered when user space daemons restart
themselves or a reboot is initiated.

Fix this issue by adding a NULL pointer check for ffs in
ffs_epfile_async_io_complete().

[  9981.393115] Unable to handle kernel NULL pointer dereference at virtual address 00000000000001e0
[  9981.402854] Mem abort info:
...
[  9981.532540] Hardware name: Qualcomm Technologies,
[  9981.540579] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  9981.548438] pc : ffs_epfile_async_io_complete+0x38/0x4c
[  9981.554529] lr : usb_gadget_giveback_request+0x30/0xd0
...
[  9981.645057] Call trace:
[  9981.648282]  ffs_epfile_async_io_complete+0x38/0x4c
[  9981.654004]  usb_gadget_giveback_request+0x30/0xd0
[  9981.659637]  dwc3_gadget_endpoint_trbs_complete+0x1a8/0x48c
[  9981.666074]  dwc3_process_event_entry+0x378/0x648
[  9981.671622]  dwc3_process_event_buf+0x6c/0x288
[  9981.676903]  dwc3_thread_interrupt+0x3c/0x68
[  9981.682003]  irq_thread_fn+0x2c/0x8c
[  9981.686388]  irq_thread+0x198/0x2ac
[  9981.690685]  kthread+0x154/0x218
[  9981.694717]  ret_from_fork+0x10/0x20

Signed-off-by: Selvarasu Ganesan <quic_selvaras@quicinc.com>
---
 drivers/usb/gadget/function/f_fs.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Greg KH Feb. 23, 2024, 5:58 a.m. UTC | #1
On Thu, Feb 22, 2024 at 09:48:09PM -0800, Selvarasu Ganesan wrote:
> In scenarios of continuous and parallel usage of multiple FFS interfaces
> and concurrent adb operations (e.g., adb root, adb reboot), there's a
> chance that ffs_epfile_async_io_complete() might be processed after
> ffs_epfile_release(). This could lead to a NULL pointer dereference of
> ffs when accessing the ffs pointer in ffs_epfile_async_io_complete(), as
> ffs is freed as part of ffs_epfile_release(). This epfile release is
> part of file operation and is triggered when user space daemons restart
> themselves or a reboot is initiated.
> 
> Fix this issue by adding a NULL pointer check for ffs in
> ffs_epfile_async_io_complete().
> 
> [  9981.393115] Unable to handle kernel NULL pointer dereference at virtual address 00000000000001e0
> [  9981.402854] Mem abort info:
> ...
> [  9981.532540] Hardware name: Qualcomm Technologies,
> [  9981.540579] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [  9981.548438] pc : ffs_epfile_async_io_complete+0x38/0x4c
> [  9981.554529] lr : usb_gadget_giveback_request+0x30/0xd0
> ...
> [  9981.645057] Call trace:
> [  9981.648282]  ffs_epfile_async_io_complete+0x38/0x4c
> [  9981.654004]  usb_gadget_giveback_request+0x30/0xd0
> [  9981.659637]  dwc3_gadget_endpoint_trbs_complete+0x1a8/0x48c
> [  9981.666074]  dwc3_process_event_entry+0x378/0x648
> [  9981.671622]  dwc3_process_event_buf+0x6c/0x288
> [  9981.676903]  dwc3_thread_interrupt+0x3c/0x68
> [  9981.682003]  irq_thread_fn+0x2c/0x8c
> [  9981.686388]  irq_thread+0x198/0x2ac
> [  9981.690685]  kthread+0x154/0x218
> [  9981.694717]  ret_from_fork+0x10/0x20
> 
> Signed-off-by: Selvarasu Ganesan <quic_selvaras@quicinc.com>

What commit id does this fix?  Should it go to stable kernels?

> ---
>  drivers/usb/gadget/function/f_fs.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
> index be3851cffb73..d8c8e88628f9 100644
> --- a/drivers/usb/gadget/function/f_fs.c
> +++ b/drivers/usb/gadget/function/f_fs.c
> @@ -849,7 +849,9 @@ static void ffs_epfile_async_io_complete(struct usb_ep *_ep,
>  	usb_ep_free_request(_ep, req);
>  
>  	INIT_WORK(&io_data->work, ffs_user_copy_worker);
> -	queue_work(ffs->io_completion_wq, &io_data->work);
> +
> +	if (ffs && ffs->io_completion_wq)
> +		queue_work(ffs->io_completion_wq, &io_data->work);

What happens if ffs->io_completion_wq goes away right after you test
it but before you call queue_work()?

Where is the locking here to prevent that?

thanks,

greg k-h

Selvarasu Ganesan Feb. 23, 2024, 11:35 a.m. UTC | #2
On 2/23/2024 11:28 AM, Greg KH wrote:
> On Thu, Feb 22, 2024 at 09:48:09PM -0800, Selvarasu Ganesan wrote:
>> In scenarios of continuous and parallel usage of multiple FFS interfaces
>> and concurrent adb operations (e.g., adb root, adb reboot), there's a
>> chance that ffs_epfile_async_io_complete() might be processed after
>> ffs_epfile_release(). This could lead to a NULL pointer dereference of
>> ffs when accessing the ffs pointer in ffs_epfile_async_io_complete(), as
>> ffs is freed as part of ffs_epfile_release(). This epfile release is
>> part of file operation and is triggered when user space daemons restart
>> themselves or a reboot is initiated.
>>
>> Fix this issue by adding a NULL pointer check for ffs in
>> ffs_epfile_async_io_complete().
>>
>> [  9981.393115] Unable to handle kernel NULL pointer dereference at virtual address 00000000000001e0
>> [  9981.402854] Mem abort info:
>> ...
>> [  9981.532540] Hardware name: Qualcomm Technologies,
>> [  9981.540579] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> [  9981.548438] pc : ffs_epfile_async_io_complete+0x38/0x4c
>> [  9981.554529] lr : usb_gadget_giveback_request+0x30/0xd0
>> ...
>> [  9981.645057] Call trace:
>> [  9981.648282]  ffs_epfile_async_io_complete+0x38/0x4c
>> [  9981.654004]  usb_gadget_giveback_request+0x30/0xd0
>> [  9981.659637]  dwc3_gadget_endpoint_trbs_complete+0x1a8/0x48c
>> [  9981.666074]  dwc3_process_event_entry+0x378/0x648
>> [  9981.671622]  dwc3_process_event_buf+0x6c/0x288
>> [  9981.676903]  dwc3_thread_interrupt+0x3c/0x68
>> [  9981.682003]  irq_thread_fn+0x2c/0x8c
>> [  9981.686388]  irq_thread+0x198/0x2ac
>> [  9981.690685]  kthread+0x154/0x218
>> [  9981.694717]  ret_from_fork+0x10/0x20
>>
>> Signed-off-by: Selvarasu Ganesan <quic_selvaras@quicinc.com>
> 
> What commit id does this fix?  Should it go to stable kernels?

Fixes: 2e4c7553cd6f9 ("usb: gadget: f_fs: add aio support"). Yes, it
also needs to go to the stable kernels.
> 
>> ---
>>   drivers/usb/gadget/function/f_fs.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
>> index be3851cffb73..d8c8e88628f9 100644
>> --- a/drivers/usb/gadget/function/f_fs.c
>> +++ b/drivers/usb/gadget/function/f_fs.c
>> @@ -849,7 +849,9 @@ static void ffs_epfile_async_io_complete(struct usb_ep *_ep,
>>   	usb_ep_free_request(_ep, req);
>>   
>>   	INIT_WORK(&io_data->work, ffs_user_copy_worker);
>> -	queue_work(ffs->io_completion_wq, &io_data->work);
>> +
>> +	if (ffs && ffs->io_completion_wq)
>> +		queue_work(ffs->io_completion_wq, &io_data->work);
> 
> What happens if ffs->io_completion_wq goes away right after you test
> it but before you call queue_work()?
> 
> Where is the locking here to prevent that?
> 
> thanks,
> 
> greg k-h

Hi Greg,

Thank you for your feedback. I understand your concern about the
potential race condition with ffs->io_completion_wq. I’m considering
introducing a lock to protect this section of the code, but I wanted to
get your opinion on this.
In the f_fs.c driver, there are pre-existing locks. Would it be suitable
to use these locks, or do you suggest creating a new lock specifically
for ffs->io_completion_wq? We anticipate a performance impact if we use
an existing lock, as it might be held by different threads. What are
your thoughts on this?

Here’s what the code might look like with a new lock:

static void ffs_epfile_async_io_complete(struct usb_ep *_ep,
                                          struct usb_request *req)
{
....
spin_lock(&ffs->new_lock);
if (ffs && ffs->io_completion_wq)
     queue_work(ffs->io_completion_wq, &io_data->work);
spin_unlock(&ffs->new_lock);
....
}



static void ffs_data_put(struct ffs_data *ffs) {
...
destroy_workqueue(ffs->io_completion_wq);
kfree(ffs->dev_name);
spin_lock(&ffs->new_lock);
kfree(ffs);
spin_unlock(&ffs->new_lock);
...
}

Thanks,
Selva

Greg KH Feb. 23, 2024, 12:40 p.m. UTC | #3
On Fri, Feb 23, 2024 at 05:05:59PM +0530, Selvarasu Ganesan wrote:
> 
> On 2/23/2024 11:28 AM, Greg KH wrote:
> > On Thu, Feb 22, 2024 at 09:48:09PM -0800, Selvarasu Ganesan wrote:
> > > In scenarios of continuous and parallel usage of multiple FFS interfaces
> > > and concurrent adb operations (e.g., adb root, adb reboot), there's a
> > > chance that ffs_epfile_async_io_complete() might be processed after
> > > ffs_epfile_release(). This could lead to a NULL pointer dereference of
> > > ffs when accessing the ffs pointer in ffs_epfile_async_io_complete(), as
> > > ffs is freed as part of ffs_epfile_release(). This epfile release is
> > > part of file operation and is triggered when user space daemons restart
> > > themselves or a reboot is initiated.
> > > 
> > > Fix this issue by adding a NULL pointer check for ffs in
> > > ffs_epfile_async_io_complete().
> > > 
> > > [  9981.393115] Unable to handle kernel NULL pointer dereference at virtual address 00000000000001e0
> > > [  9981.402854] Mem abort info:
> > > ...
> > > [  9981.532540] Hardware name: Qualcomm Technologies,
> > > [  9981.540579] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > [  9981.548438] pc : ffs_epfile_async_io_complete+0x38/0x4c
> > > [  9981.554529] lr : usb_gadget_giveback_request+0x30/0xd0
> > > ...
> > > [  9981.645057] Call trace:
> > > [  9981.648282]  ffs_epfile_async_io_complete+0x38/0x4c
> > > [  9981.654004]  usb_gadget_giveback_request+0x30/0xd0
> > > [  9981.659637]  dwc3_gadget_endpoint_trbs_complete+0x1a8/0x48c
> > > [  9981.666074]  dwc3_process_event_entry+0x378/0x648
> > > [  9981.671622]  dwc3_process_event_buf+0x6c/0x288
> > > [  9981.676903]  dwc3_thread_interrupt+0x3c/0x68
> > > [  9981.682003]  irq_thread_fn+0x2c/0x8c
> > > [  9981.686388]  irq_thread+0x198/0x2ac
> > > [  9981.690685]  kthread+0x154/0x218
> > > [  9981.694717]  ret_from_fork+0x10/0x20
> > > 
> > > Signed-off-by: Selvarasu Ganesan <quic_selvaras@quicinc.com>
> > 
> > What commit id does this fix?  Should it go to stable kernels?
> 
> Fixes: 2e4c7553cd6f9 ("usb: gadget: f_fs: add aio support"). Yes, it
> also needs to go to the stable kernels.

Great, when you resend the next version, please include both proper
tags.

> > > ---
> > >   drivers/usb/gadget/function/f_fs.c | 4 +++-
> > >   1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
> > > index be3851cffb73..d8c8e88628f9 100644
> > > --- a/drivers/usb/gadget/function/f_fs.c
> > > +++ b/drivers/usb/gadget/function/f_fs.c
> > > @@ -849,7 +849,9 @@ static void ffs_epfile_async_io_complete(struct usb_ep *_ep,
> > >   	usb_ep_free_request(_ep, req);
> > >   	INIT_WORK(&io_data->work, ffs_user_copy_worker);
> > > -	queue_work(ffs->io_completion_wq, &io_data->work);
> > > +
> > > +	if (ffs && ffs->io_completion_wq)
> > > +		queue_work(ffs->io_completion_wq, &io_data->work);
> > 
> > > What happens if ffs->io_completion_wq goes away right after you test
> > it but before you call queue_work()?
> > 
> > Where is the locking here to prevent that?
> > 
> > thanks,
> > 
> > greg k-h
> 
> Hi Greg,
> 
> Thank you for your feedback. I understand your concern about the
> potential race condition with ffs->io_completion_wq. I’m considering
> introducing a lock to protect this section of the code, but I wanted to
> get your opinion on this.
> In the f_fs.c driver, there are pre-existing locks. Would it be suitable
> to use these locks, or do you suggest creating a new lock specifically
> for ffs->io_completion_wq? We anticipate a performance impact if we use
> an existing lock, as it might be held by different threads. What are
> your thoughts on this?

Test it out yourself and see what works best!

thanks,

greg k-h

Jens Axboe Feb. 23, 2024, 2:43 p.m. UTC | #4
On 2/23/24 4:35 AM, Selvarasu Ganesan wrote:
> Here's what the code might look like with a new lock:
> 
> static void ffs_epfile_async_io_complete(struct usb_ep *_ep,
>                                          struct usb_request *req)
> {
> ....
> spin_lock(&ffs->new_lock);
> if (ffs && ffs->io_completion_wq)
>     queue_work(ffs->io_completion_wq, &io_data->work);
> spin_unlock(&ffs->new_lock);
> ....
> }
> 
> 
> 
> static void ffs_data_put(struct ffs_data *ffs) {
> ...
> destroy_workqueue(ffs->io_completion_wq);
> kfree(ffs->dev_name);
> spin_lock(&ffs->new_lock);
> kfree(ffs);
> spin_unlock(&ffs->new_lock);
> ...
> }

This obviously won't work at all, and it's not the right way to fix it
at all. It needs a ref count.
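
[Editorial illustration, not part of the thread] The reference-count
approach suggested above can be sketched in plain user-space C. This is a
minimal model under stated assumptions, not kernel code and not the fix
that was eventually proposed: struct ffs_model, struct io_model,
submit_io(), complete_io() and the live_objects counter are hypothetical
names invented for the example, and C11 atomics stand in for the kernel's
reference-counting helpers. The idea is that each in-flight request takes
its own reference at submit time, so the release path can drop the file's
reference without freeing an object that a late completion still uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ffs_data; not the kernel structure. */
struct ffs_model {
	atomic_int ref;        /* number of holders keeping the object alive */
	int completions_seen;  /* stands in for work queued on io_completion_wq */
};

/* Hypothetical stand-in for the per-request I/O bookkeeping. */
struct io_model {
	struct ffs_model *ffs; /* pinned by a reference while the I/O is in flight */
};

static int live_objects;   /* demo-only counter of objects not yet freed */

static struct ffs_model *ffs_model_new(void)
{
	struct ffs_model *ffs = calloc(1, sizeof(*ffs));

	if (!ffs)
		return NULL;
	atomic_init(&ffs->ref, 1);      /* the creator (the open file) holds one ref */
	live_objects++;
	return ffs;
}

static void ffs_model_get(struct ffs_model *ffs)
{
	atomic_fetch_add(&ffs->ref, 1);
}

static void ffs_model_put(struct ffs_model *ffs)
{
	/* atomic_fetch_sub() returns the old value: 1 means this was the last ref. */
	if (atomic_fetch_sub(&ffs->ref, 1) == 1) {
		live_objects--;
		free(ffs);
	}
}

/* Submit path: pin the object before the request can complete. */
static void submit_io(struct io_model *io, struct ffs_model *ffs)
{
	ffs_model_get(ffs);
	io->ffs = ffs;
}

/* Completion path: the pinned object is guaranteed valid; unpin when done. */
static void complete_io(struct io_model *io)
{
	io->ffs->completions_seen++;    /* single-threaded demo, so no lock here */
	ffs_model_put(io->ffs);
	io->ffs = NULL;
}

/* Replays the reported ordering: release first, completion afterwards. */
static int demo_release_before_completion(void)
{
	struct ffs_model *ffs = ffs_model_new();
	struct io_model io;

	if (!ffs)
		return -1;
	submit_io(&io, ffs);
	ffs_model_put(ffs);             /* release path drops the file's reference */
	if (live_objects != 1)
		return -1;              /* object must survive for the completion */
	complete_io(&io);               /* last reference dropped, object freed */
	return live_objects;            /* 0 when the object was freed exactly once */
}
```

Calling demo_release_before_completion() from a main() replays the
release-then-complete ordering from the report. Unlike the spinlock
proposal, nothing here depends on a lock stored inside the object being
freed, and no NULL check is needed in the completion path. How this would
map onto the existing reference handling in f_fs.c is not settled in the
thread.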

Patch

diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index be3851cffb73..d8c8e88628f9 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -849,7 +849,9 @@  static void ffs_epfile_async_io_complete(struct usb_ep *_ep,
 	usb_ep_free_request(_ep, req);
 
 	INIT_WORK(&io_data->work, ffs_user_copy_worker);
-	queue_work(ffs->io_completion_wq, &io_data->work);
+
+	if (ffs && ffs->io_completion_wq)
+		queue_work(ffs->io_completion_wq, &io_data->work);
 }
 
 static void __ffs_epfile_read_buffer_free(struct ffs_epfile *epfile)