Message ID | 20221117120813.1257583-1-lee@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/1] usb: gadget: f_hid: Conduct proper refcounting on shared f_hidg pointer | expand |
On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote: > A reference to struct f_hidg is shared with this driver's associated > character device handling component without provision for life-time > handling. In some circumstances, this can lead to unwanted > behaviour depending on the order in which things are torn down. > > Utilise, the reference counting functionality already provided by the > implanted character device structure to ensure the struct f_hidg's > memory is only freed once the character device handling has finished > with it. > > Signed-off-by: Lee Jones <lee@kernel.org> > --- > drivers/usb/gadget/function/f_hid.c | 47 +++++++++++++++++++++++------ > 1 file changed, 37 insertions(+), 10 deletions(-) Hi, This is the friendly patch-bot of Greg Kroah-Hartman. You have sent him a patch that has triggered this response. He used to manually respond to these common problems, but in order to save his sanity (he kept writing the same thing over and over, yet to different people), I was created. Hopefully you will not take offence and will fix the problem in your patch and resubmit it so that it can be accepted into the Linux kernel tree. You are receiving this message because of the following common error(s) as indicated below: - This looks like a new version of a previously submitted patch, but you did not list below the --- line any changes from the previous version. Please read the section entitled "The canonical patch format" in the kernel file, Documentation/SubmittingPatches for what needs to be done here to properly describe this. If you wish to discuss this problem further, or you have questions about how to resolve this issue, please feel free to respond to this email and Greg will reply once he has dug out from the pending patches received from other developers. thanks, greg k-h's patch email bot
On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote: > +static inline bool f_hidg_is_open(struct f_hidg *hidg) > +{ > + return !!kref_read(&hidg->cdev.kobj.kref); > +} Ick, sorry, no, that's not going to work and is not allowed at all. That's some major layering violations there, AND it can change after you get the value as well. greg k-h
On Thu, 17 Nov 2022, Greg KH wrote: > On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote: > > A reference to struct f_hidg is shared with this driver's associated > > character device handling component without provision for life-time > > handling. In some circumstances, this can lead to unwanted > > behaviour depending on the order in which things are torn down. > > > > Utilise, the reference counting functionality already provided by the > > implanted character device structure to ensure the struct f_hidg's > > memory is only freed once the character device handling has finished > > with it. > > > > Signed-off-by: Lee Jones <lee@kernel.org> > > --- > > drivers/usb/gadget/function/f_hid.c | 47 +++++++++++++++++++++++------ > > 1 file changed, 37 insertions(+), 10 deletions(-) > > Hi, > > This is the friendly patch-bot of Greg Kroah-Hartman. You have sent him > a patch that has triggered this response. He used to manually respond > to these common problems, but in order to save his sanity (he kept > writing the same thing over and over, yet to different people), I was > created. Hopefully you will not take offence and will fix the problem > in your patch and resubmit it so that it can be accepted into the Linux > kernel tree. > > You are receiving this message because of the following common error(s) > as indicated below: > > - This looks like a new version of a previously submitted patch, but you > did not list below the --- line any changes from the previous version. > Please read the section entitled "The canonical patch format" in the > kernel file, Documentation/SubmittingPatches for what needs to be done > here to properly describe this. > > If you wish to discuss this problem further, or you have questions about > how to resolve this issue, please feel free to respond to this email and > Greg will reply once he has dug out from the pending patches received > from other developers. > > thanks, > > greg k-h's patch email bot This is a completely new solution to the same problem. I'm treating this as a brand new submission.
On Thu, 17 Nov 2022, Greg KH wrote: > On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote: > > +static inline bool f_hidg_is_open(struct f_hidg *hidg) > > +{ > > + return !!kref_read(&hidg->cdev.kobj.kref); > > +} > > Ick, sorry, no, that's not going to work and is not allowed at all. > That's some major layering violations there, AND it can change after you > get the value as well. This cdev belongs solely to this driver. Hence the *.*.* and not *->*->*. What is preventing us from reading our own data? If we cannot do this directly, can I create an API to do it 'officially'? I do, however, appreciate that a little locking wouldn't go amiss. If this solution is not acceptable either, then we're left up the creak without a paddle. The rules you've communicated are not compatible with each other. Rule 1: Only one item in a data structure can reference count. Due to the embedded cdev struct, this rules out my first solution of giving f_hidg its own kref so that it can conduct its own life-time management. A potential option to satisfy this rule would be to remove the cdev attribute and create its data dynamically instead. However, the staticness of cdev is used to obtain f_hidg (with container_of()) in the character device handling component, so it cannot be removed. Rule 2: Reading the present refcount causes a laying violation So we're essentially saying, if data is already being reference counted, you have to use the present implementation instead of adding additional counts, but there is no way to actually do that, which kind of puts us at stalemate. Since this is a genuine issue, doing anything is not really an option either. So where do we go from here?
On Thu, Nov 17, 2022 at 01:46:26PM +0000, Lee Jones wrote: > On Thu, 17 Nov 2022, Greg KH wrote: > > > On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote: > > > +static inline bool f_hidg_is_open(struct f_hidg *hidg) > > > +{ > > > + return !!kref_read(&hidg->cdev.kobj.kref); > > > +} > > > > Ick, sorry, no, that's not going to work and is not allowed at all. > > That's some major layering violations there, AND it can change after you > > get the value as well. > > This cdev belongs solely to this driver. Hence the *.*.* and not > *->*->*. What is preventing us from reading our own data? If we > cannot do this directly, can I create an API to do it 'officially'? > > I do, however, appreciate that a little locking wouldn't go amiss. > > If this solution is not acceptable either, then we're left up the > creak without a paddle. The rules you've communicated are not > compatible with each other. > > Rule 1: Only one item in a data structure can reference count. > > Due to the embedded cdev struct, this rules out my first solution of > giving f_hidg its own kref so that it can conduct its own life-time > management. > > A potential option to satisfy this rule would be to remove the cdev > attribute and create its data dynamically instead. However, the > staticness of cdev is used to obtain f_hidg (with container_of()) in > the character device handling component, so it cannot be removed. You have not understood this rule correctly. Only one item in a data structure can hold a reference count _for that structure_. But several items in a structure can hold reference counts for themselves. So for example, you could put a kref in f_hidg which would hold the reference count for the f_hidg structure, while at the same time including an embedded cdev with its own reference counter. The point is that the refcount in the embedded cdev refers to the lifetime of the cdev, not the lifetime of the f_hidg. To make this work properly, you have to do two additional things: When the cdev's refcount is initialized, increment the kref in f_hidg. When the cdev's refcount drops to 0, decrement the kref (and release f_hidg if the kref hits 0). Alan Stern > Rule 2: Reading the present refcount causes a laying violation > > So we're essentially saying, if data is already being reference > counted, you have to use the present implementation instead of adding > additional counts, but there is no way to actually do that, which > kind of puts us at stalemate. > > Since this is a genuine issue, doing anything is not really an option > either. So where do we go from here? > > -- > Lee Jones [李琼斯]
On Thu, Nov 17, 2022 at 01:26:21PM +0000, Lee Jones wrote: > On Thu, 17 Nov 2022, Greg KH wrote: > > > On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote: > > > A reference to struct f_hidg is shared with this driver's associated > > > character device handling component without provision for life-time > > > handling. In some circumstances, this can lead to unwanted > > > behaviour depending on the order in which things are torn down. > > > > > > Utilise, the reference counting functionality already provided by the > > > implanted character device structure to ensure the struct f_hidg's > > > memory is only freed once the character device handling has finished > > > with it. > > > > > > Signed-off-by: Lee Jones <lee@kernel.org> > > > --- > > > drivers/usb/gadget/function/f_hid.c | 47 +++++++++++++++++++++++------ > > > 1 file changed, 37 insertions(+), 10 deletions(-) > > > > Hi, > > > > This is the friendly patch-bot of Greg Kroah-Hartman. You have sent him > > a patch that has triggered this response. He used to manually respond > > to these common problems, but in order to save his sanity (he kept > > writing the same thing over and over, yet to different people), I was > > created. Hopefully you will not take offence and will fix the problem > > in your patch and resubmit it so that it can be accepted into the Linux > > kernel tree. > > > > You are receiving this message because of the following common error(s) > > as indicated below: > > > > - This looks like a new version of a previously submitted patch, but you > > did not list below the --- line any changes from the previous version. > > Please read the section entitled "The canonical patch format" in the > > kernel file, Documentation/SubmittingPatches for what needs to be done > > here to properly describe this. > > > > If you wish to discuss this problem further, or you have questions about > > how to resolve this issue, please feel free to respond to this email and > > Greg will reply once he has dug out from the pending patches received > > from other developers. > > > > thanks, > > > > greg k-h's patch email bot > > This is a completely new solution to the same problem. > > I'm treating this as a brand new submission. With the identical subject line, which plays havoc with tools :( This is a v2, next should be v3 please. thanks, greg k-h
On Thu, 17 Nov 2022, Alan Stern wrote: > On Thu, Nov 17, 2022 at 01:46:26PM +0000, Lee Jones wrote: > > On Thu, 17 Nov 2022, Greg KH wrote: > > > > > On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote: > > > > +static inline bool f_hidg_is_open(struct f_hidg *hidg) > > > > +{ > > > > + return !!kref_read(&hidg->cdev.kobj.kref); > > > > +} > > > > > > Ick, sorry, no, that's not going to work and is not allowed at all. > > > That's some major layering violations there, AND it can change after you > > > get the value as well. > > > > This cdev belongs solely to this driver. Hence the *.*.* and not > > *->*->*. What is preventing us from reading our own data? If we > > cannot do this directly, can I create an API to do it 'officially'? > > > > I do, however, appreciate that a little locking wouldn't go amiss. > > > > If this solution is not acceptable either, then we're left up the > > creak without a paddle. The rules you've communicated are not > > compatible with each other. > > > > Rule 1: Only one item in a data structure can reference count. > > > > Due to the embedded cdev struct, this rules out my first solution of > > giving f_hidg its own kref so that it can conduct its own life-time > > management. > > > > A potential option to satisfy this rule would be to remove the cdev > > attribute and create its data dynamically instead. However, the > > staticness of cdev is used to obtain f_hidg (with container_of()) in > > the character device handling component, so it cannot be removed. > > You have not understood this rule correctly. Only one item in a data > structure can hold a reference count _for that structure_. But several > items in a structure can hold reference counts for themselves. Here was the review comment I was working to on this patch [0]: "While at first glance, it seems that f_hidg is not reference counted, it really is, with the embedded "struct cdev" a few lines above this. That is the reference count that should control the lifecycle of this object, not another reference here in the "outer layer" structure." > So for example, you could put a kref in f_hidg which would hold the > reference count for the f_hidg structure, while at the same time > including an embedded cdev with its own reference counter. The point is > that the refcount in the embedded cdev refers to the lifetime of the > cdev, not the lifetime of the f_hidg. This was the approach in the original submission [1], which during review I was told was unacceptable for the aforementioned reason. [0] https://lore.kernel.org/all/Y1PnoMvDmZMqXScw@kroah.com/ [1] https://lore.kernel.org/all/20221017112737.230772-1-lee@kernel.org/ > To make this work properly, you have to do two additional things: > > When the cdev's refcount is initialized, increment the kref > in f_hidg. > > When the cdev's refcount drops to 0, decrement the kref (and > release f_hidg if the kref hits 0). More than happy to revisit the first solution with Greg's blessing.
On Fri, Nov 18, 2022 at 08:54:53AM +0000, Lee Jones wrote: > On Thu, 17 Nov 2022, Alan Stern wrote: > > > On Thu, Nov 17, 2022 at 01:46:26PM +0000, Lee Jones wrote: > > > On Thu, 17 Nov 2022, Greg KH wrote: > > > > > > > On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote: > > > > > +static inline bool f_hidg_is_open(struct f_hidg *hidg) > > > > > +{ > > > > > + return !!kref_read(&hidg->cdev.kobj.kref); > > > > > +} > > > > > > > > Ick, sorry, no, that's not going to work and is not allowed at all. > > > > That's some major layering violations there, AND it can change after you > > > > get the value as well. > > > > > > This cdev belongs solely to this driver. Hence the *.*.* and not > > > *->*->*. What is preventing us from reading our own data? If we > > > cannot do this directly, can I create an API to do it 'officially'? > > > > > > I do, however, appreciate that a little locking wouldn't go amiss. > > > > > > If this solution is not acceptable either, then we're left up the > > > creak without a paddle. The rules you've communicated are not > > > compatible with each other. > > > > > > Rule 1: Only one item in a data structure can reference count. > > > > > > Due to the embedded cdev struct, this rules out my first solution of > > > giving f_hidg its own kref so that it can conduct its own life-time > > > management. > > > > > > A potential option to satisfy this rule would be to remove the cdev > > > attribute and create its data dynamically instead. However, the > > > staticness of cdev is used to obtain f_hidg (with container_of()) in > > > the character device handling component, so it cannot be removed. > > > > You have not understood this rule correctly. Only one item in a data > > structure can hold a reference count _for that structure_. But several > > items in a structure can hold reference counts for themselves. > > Here was the review comment I was working to on this patch [0]: > > "While at first glance, it seems that f_hidg is not reference > counted, it really is, with the embedded "struct cdev" a few lines > above this. > > That is the reference count that should control the lifecycle of > this object, not another reference here in the "outer layer" > structure." It's worth noting that the review comment goes on to say: "But, the cdev api is tricky and messy and not really set up to control the lifecycle of objects it is embedded in." This is a good indication that a separate reference counter really is needed (in fact it almost contradicts what was written above). > > So for example, you could put a kref in f_hidg which would hold the > > reference count for the f_hidg structure, while at the same time > > including an embedded cdev with its own reference counter. The point is > > that the refcount in the embedded cdev refers to the lifetime of the > > cdev, not the lifetime of the f_hidg. > > This was the approach in the original submission [1], which during > review I was told was unacceptable for the aforementioned reason. > > [0] https://lore.kernel.org/all/Y1PnoMvDmZMqXScw@kroah.com/ > [1] https://lore.kernel.org/all/20221017112737.230772-1-lee@kernel.org/ > > > To make this work properly, you have to do two additional things: > > > > When the cdev's refcount is initialized, increment the kref > > in f_hidg. > > > > When the cdev's refcount drops to 0, decrement the kref (and > > release f_hidg if the kref hits 0). > > More than happy to revisit the first solution with Greg's blessing. Okay, let's see what Greg thinks after he reads this discussion. Alan Stern
On Fri, Nov 18, 2022 at 10:59:42AM -0500, Alan Stern wrote: > On Fri, Nov 18, 2022 at 08:54:53AM +0000, Lee Jones wrote: > > On Thu, 17 Nov 2022, Alan Stern wrote: > > > > > On Thu, Nov 17, 2022 at 01:46:26PM +0000, Lee Jones wrote: > > > > On Thu, 17 Nov 2022, Greg KH wrote: > > > > > > > > > On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote: > > > > > > +static inline bool f_hidg_is_open(struct f_hidg *hidg) > > > > > > +{ > > > > > > + return !!kref_read(&hidg->cdev.kobj.kref); > > > > > > +} > > > > > > > > > > Ick, sorry, no, that's not going to work and is not allowed at all. > > > > > That's some major layering violations there, AND it can change after you > > > > > get the value as well. > > > > > > > > This cdev belongs solely to this driver. Hence the *.*.* and not > > > > *->*->*. What is preventing us from reading our own data? If we > > > > cannot do this directly, can I create an API to do it 'officially'? > > > > > > > > I do, however, appreciate that a little locking wouldn't go amiss. > > > > > > > > If this solution is not acceptable either, then we're left up the > > > > creak without a paddle. The rules you've communicated are not > > > > compatible with each other. > > > > > > > > Rule 1: Only one item in a data structure can reference count. > > > > > > > > Due to the embedded cdev struct, this rules out my first solution of > > > > giving f_hidg its own kref so that it can conduct its own life-time > > > > management. > > > > > > > > A potential option to satisfy this rule would be to remove the cdev > > > > attribute and create its data dynamically instead. However, the > > > > staticness of cdev is used to obtain f_hidg (with container_of()) in > > > > the character device handling component, so it cannot be removed. > > > > > > You have not understood this rule correctly. Only one item in a data > > > structure can hold a reference count _for that structure_. But several > > > items in a structure can hold reference counts for themselves. > > > > Here was the review comment I was working to on this patch [0]: > > > > "While at first glance, it seems that f_hidg is not reference > > counted, it really is, with the embedded "struct cdev" a few lines > > above this. > > > > That is the reference count that should control the lifecycle of > > this object, not another reference here in the "outer layer" > > structure." > > It's worth noting that the review comment goes on to say: > > "But, the cdev api is tricky and messy and not really set up to control > the lifecycle of objects it is embedded in." > > This is a good indication that a separate reference counter really is > needed (in fact it almost contradicts what was written above). I don't think it's at all simple to fix this - I posted a series addressing the lifetime issues here a few years ago but didn't chase it up and there was no feedback: https://lore.kernel.org/linux-usb/20191028114228.3679219-1-john@metanate.com/ That includes a patch to remove the embedded struct cdev and manage its lifetime separately, which I think is needed as there are two different struct device objects here and we cannot tie their lifetimes together.
On Fri, Nov 18, 2022 at 04:37:32PM +0000, John Keeping wrote: > I don't think it's at all simple to fix this - I posted a series > addressing the lifetime issues here a few years ago but didn't chase it > up and there was no feedback: > > https://lore.kernel.org/linux-usb/20191028114228.3679219-1-john@metanate.com/ > > That includes a patch to remove the embedded struct cdev and manage its > lifetime separately, which I think is needed as there are two different > struct device objects here and we cannot tie their lifetimes together. I still don't have a clear picture of what the real problem is. Lee's original patch description just said "external references are presently not tracked", with no details about what those external references are. Why not add just proper cdev_get() and cdev_put() calls to whatever code handles those external references, so that they _are_ tracked? What are the two different struct device objects? Why do their lifetimes need to be tied together? If you do need to tie their lifetimes somehow, why not simply make one of them (the one which is logically allowed to be shorter-lived) hold a reference to the other? Alan Stern
On Fri, Nov 18, 2022 at 04:07:24PM -0500, Alan Stern wrote: > On Fri, Nov 18, 2022 at 04:37:32PM +0000, John Keeping wrote: > > I don't think it's at all simple to fix this - I posted a series > > addressing the lifetime issues here a few years ago but didn't chase it > > up and there was no feedback: > > > > https://lore.kernel.org/linux-usb/20191028114228.3679219-1-john@metanate.com/ > > > > That includes a patch to remove the embedded struct cdev and manage its > > lifetime separately, which I think is needed as there are two different > > struct device objects here and we cannot tie their lifetimes together. > > I still don't have a clear picture of what the real problem is. Lee's > original patch description just said "external references are presently > not tracked", with no details about what those external references are. > Why not add just proper cdev_get() and cdev_put() calls to whatever code > handles those external references, so that they _are_ tracked? > > What are the two different struct device objects? Why do their > lifetimes need to be tied together? If you do need to tie their > lifetimes somehow, why not simply make one of them (the one which is > logically allowed to be shorter-lived) hold a reference to the other? The problem is that we have a struct cdev embedded in f_hidg but the lifetime of f_hidg is not tied to any kobject so we can't solve this in the right way by setting the parent kobject of the cdev. While refcounting struct f_hidg is necessary, it's not sufficient because the only way to keep it alive long enough for the final kobject_put() on the embedded cdev is to tie the lifetime to a kobject of its own and there is no suitable object as this is not the model followed by gadget function instances. To show the problem (using libusbgx's example commands for conciseness, but obviously this can be done via configfs directly): $ gadget-hid $ exec 3<> /dev/hidg0 $ gadget-vid-pid-remove $ exec 3<&- ================================================================== BUG: KASAN: use-after-free in kobject_put+0x24/0x250 Read of size 1 at addr c61784a0 by task sh/264 CPU: 1 PID: 264 Comm: sh Tainted: G W 6.0.5 #1 Hardware name: Rockchip (Device Tree) unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x58/0x70 dump_stack_lvl from print_report+0x58/0x4dc print_report from kasan_report+0x88/0xa8 kasan_report from kobject_put+0x24/0x250 kobject_put from __fput+0x1e4/0x358 __fput from task_work_run+0xb4/0xc4 task_work_run from do_work_pending+0x4d4/0x524 do_work_pending from slow_work_pending+0xc/0x20 Exception stack(0xf284bfb0 to 0xf284bff8) bfa0: 00000000 00000003 02292190 00000000 bfc0: 02293fc8 aed4e1d0 00000001 00000006 022925a0 00000000 022925a0 022923f4 bfe0: 00532f38 b6864840 0049c218 aec57e38 60000010 0000000b Allocated by task 341: kasan_set_track+0x20/0x28 ____kasan_kmalloc+0x80/0x88 hidg_alloc+0x24/0x1f0 usb_get_function+0x28/0x48 config_usb_cfg_link+0x90/0x114 configfs_symlink+0x24c/0x5d8 vfs_symlink+0x58/0x74 do_symlinkat+0xdc/0x16c ret_fast_syscall+0x0/0x1c Freed by task 352: kasan_set_track+0x20/0x28 kasan_set_free_info+0x28/0x34 ____kasan_slab_free+0xf8/0x108 __kasan_slab_free+0x10/0x18 slab_free_freelist_hook+0x9c/0xfc kfree+0x118/0x258 hidg_free+0x44/0x6c config_usb_cfg_unlink+0xd4/0xf4 configfs_unlink+0x118/0x15c vfs_unlink+0xd8/0x1c4 do_unlinkat+0x180/0x28c ret_fast_syscall+0x0/0x1c The buggy address belongs to the object at c6178400 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 160 bytes inside of 512-byte region [c6178400, c6178600)
On Sun, Nov 20, 2022 at 05:22:19PM +0000, John Keeping wrote: > On Fri, Nov 18, 2022 at 04:07:24PM -0500, Alan Stern wrote: > > On Fri, Nov 18, 2022 at 04:37:32PM +0000, John Keeping wrote: > > > I don't think it's at all simple to fix this - I posted a series > > > addressing the lifetime issues here a few years ago but didn't chase it > > > up and there was no feedback: > > > > > > https://lore.kernel.org/linux-usb/20191028114228.3679219-1-john@metanate.com/ > > > > > > That includes a patch to remove the embedded struct cdev and manage its > > > lifetime separately, which I think is needed as there are two different > > > struct device objects here and we cannot tie their lifetimes together. > > > > I still don't have a clear picture of what the real problem is. Lee's > > original patch description just said "external references are presently > > not tracked", with no details about what those external references are. > > Why not add just proper cdev_get() and cdev_put() calls to whatever code > > handles those external references, so that they _are_ tracked? > > > > What are the two different struct device objects? Why do their > > lifetimes need to be tied together? If you do need to tie their > > lifetimes somehow, why not simply make one of them (the one which is > > logically allowed to be shorter-lived) hold a reference to the other? > > The problem is that we have a struct cdev embedded in f_hidg but the > lifetime of f_hidg is not tied to any kobject so we can't solve this in > the right way by setting the parent kobject of the cdev. > > While refcounting struct f_hidg is necessary, it's not sufficient > because the only way to keep it alive long enough for the final > kobject_put() on the embedded cdev is to tie the lifetime to a kobject > of its own and there is no suitable object as this is not the model > followed by gadget function instances. I see. The solution is simple: Embed a struct device in struct f_hidg, and call cdev_device_add() to add the device and the cdev. This will automatically make the device the parent of the cdev, so the device's refcount won't go to 0 until the cdev's refcount does. Then you can tie the f_hidg's lifetime to the device's, so the device's release routine can safely deallocate the entire f_hidg structure. The parent of the new struct device should be set to &gadget->dev. If you can't think of a better name for the device, you could simply append ":I" to the parent's name, where I is the interface number, or even append ":C.I" where C is the config number (like we do on the host side). Alan Stern
On Sun, 20 Nov 2022, Alan Stern wrote: > On Sun, Nov 20, 2022 at 05:22:19PM +0000, John Keeping wrote: > > On Fri, Nov 18, 2022 at 04:07:24PM -0500, Alan Stern wrote: > > > On Fri, Nov 18, 2022 at 04:37:32PM +0000, John Keeping wrote: > > > > I don't think it's at all simple to fix this - I posted a series > > > > addressing the lifetime issues here a few years ago but didn't chase it > > > > up and there was no feedback: > > > > > > > > https://lore.kernel.org/linux-usb/20191028114228.3679219-1-john@metanate.com/ > > > > > > > > That includes a patch to remove the embedded struct cdev and manage its > > > > lifetime separately, which I think is needed as there are two different > > > > struct device objects here and we cannot tie their lifetimes together. > > > > > > I still don't have a clear picture of what the real problem is. Lee's > > > original patch description just said "external references are presently > > > not tracked", with no details about what those external references are. > > > Why not add just proper cdev_get() and cdev_put() calls to whatever code > > > handles those external references, so that they _are_ tracked? > > > > > > What are the two different struct device objects? Why do their > > > lifetimes need to be tied together? If you do need to tie their > > > lifetimes somehow, why not simply make one of them (the one which is > > > logically allowed to be shorter-lived) hold a reference to the other? > > > > The problem is that we have a struct cdev embedded in f_hidg but the > > lifetime of f_hidg is not tied to any kobject so we can't solve this in > > the right way by setting the parent kobject of the cdev. > > > > While refcounting struct f_hidg is necessary, it's not sufficient > > because the only way to keep it alive long enough for the final > > kobject_put() on the embedded cdev is to tie the lifetime to a kobject > > of its own and there is no suitable object as this is not the model > > followed by gadget function instances. > > I see. The solution is simple: Embed a struct device in struct f_hidg, > and call cdev_device_add() to add the device and the cdev. This will > automatically make the device the parent of the cdev, so the device's > refcount won't go to 0 until the cdev's refcount does. Then you can tie > the f_hidg's lifetime to the device's, so the device's release routine > can safely deallocate the entire f_hidg structure. > > The parent of the new struct device should be set to &gadget->dev. If > you can't think of a better name for the device, you could simply append > ":I" to the parent's name, where I is the interface number, or even > append ":C.I" where C is the config number (like we do on the host > side). Thanks for the suggestions Alan. Not long finished speaking with Greg about this. He seems okay with the approach. I'll knock something together and get a "v1" (*wink*) out shortly.
On Sun, Nov 20, 2022 at 03:46:26PM -0500, Alan Stern wrote: > On Sun, Nov 20, 2022 at 05:22:19PM +0000, John Keeping wrote: > > On Fri, Nov 18, 2022 at 04:07:24PM -0500, Alan Stern wrote: > > > On Fri, Nov 18, 2022 at 04:37:32PM +0000, John Keeping wrote: > > > > I don't think it's at all simple to fix this - I posted a series > > > > addressing the lifetime issues here a few years ago but didn't chase it > > > > up and there was no feedback: > > > > > > > > https://lore.kernel.org/linux-usb/20191028114228.3679219-1-john@metanate.com/ > > > > > > > > That includes a patch to remove the embedded struct cdev and manage its > > > > lifetime separately, which I think is needed as there are two different > > > > struct device objects here and we cannot tie their lifetimes together. > > > > > > I still don't have a clear picture of what the real problem is. Lee's > > > original patch description just said "external references are presently > > > not tracked", with no details about what those external references are. > > > Why not add just proper cdev_get() and cdev_put() calls to whatever code > > > handles those external references, so that they _are_ tracked? > > > > > > What are the two different struct device objects? Why do their > > > lifetimes need to be tied together? If you do need to tie their > > > lifetimes somehow, why not simply make one of them (the one which is > > > logically allowed to be shorter-lived) hold a reference to the other? > > > > The problem is that we have a struct cdev embedded in f_hidg but the > > lifetime of f_hidg is not tied to any kobject so we can't solve this in > > the right way by setting the parent kobject of the cdev. > > > > While refcounting struct f_hidg is necessary, it's not sufficient > > because the only way to keep it alive long enough for the final > > kobject_put() on the embedded cdev is to tie the lifetime to a kobject > > of its own and there is no suitable object as this is not the model > > followed by gadget function instances. > > I see. The solution is simple: Embed a struct device in struct f_hidg, > and call cdev_device_add() to add the device and the cdev. This will > automatically make the device the parent of the cdev, so the device's > refcount won't go to 0 until the cdev's refcount does. Then you can tie > the f_hidg's lifetime to the device's, so the device's release routine > can safely deallocate the entire f_hidg structure. > > The parent of the new struct device should be set to &gadget->dev. If > you can't think of a better name for the device, you could simply append > ":I" to the parent's name, where I is the interface number, or even > append ":C.I" where C is the config number (like we do on the host > side). There is no gadget->dev at the time struct f_hidg is allocated. AFAICT the device is the UDC, which is only associated with the gadget when it is bound. The functions are allocated earlier than this and I can't see any device associated with struct usb_function_instance. The patch below does fix the issue, but I'm wondering if there's a deeper problem here that can only be properly solved by adding some device/kobject hierarchy to the config side of things. -- >8 -- Subject: [PATCH] usb: gadget: f_hid: fix f_hidg lifetime vs cdev The embedded struct cdev does not have its lifetime correctly tied to the enclosing struct f_hidg, so there is a use-after-free if /dev/hidgN is held open while the gadget is deleted. This can readily be replicated with libusbgx's example programs (for conciseness - operating directly via configfs is equivalent): gadget-hid exec 3<> /dev/hidg0 gadget-vid-pid-remove exec 3<&- Add a device to f_hidg so that the cdev can properly take a reference to this and the structure will be freed only when the child device is released. [Also fix refcount leak on an error path.] Signed-off-by: John Keeping <john@metanate.com> --- drivers/usb/gadget/function/f_hid.c | 35 ++++++++++++++++++++++++----- 1 file changed, 30 insertions(+), 5 deletions(-) diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c index ca0a7d9eaa34..7ae97e5c4d20 100644 --- a/drivers/usb/gadget/function/f_hid.c +++ b/drivers/usb/gadget/function/f_hid.c @@ -71,6 +71,7 @@ struct f_hidg { wait_queue_head_t write_queue; struct usb_request *req; + struct device dev; int minor; struct cdev cdev; struct usb_function func; @@ -84,6 +85,14 @@ static inline struct f_hidg *func_to_hidg(struct usb_function *f) return container_of(f, struct f_hidg, func); } +static void hidg_release(struct device *dev) +{ + struct f_hidg *hidg = container_of(dev, struct f_hidg, dev); + + kfree(hidg->set_report_buf); + kfree(hidg); +} + /*-------------------------------------------------------------------------*/ /* Static descriptors */ @@ -999,6 +1008,7 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f) /* create char device */ cdev_init(&hidg->cdev, &f_hidg_fops); + cdev_set_parent(&hidg->cdev, &hidg->dev.kobj); dev = MKDEV(major, hidg->minor); status = cdev_add(&hidg->cdev, dev, 1); if (status) @@ -1244,9 +1254,7 @@ static void hidg_free(struct usb_function *f) hidg = func_to_hidg(f); opts = container_of(f->fi, struct f_hid_opts, func_inst); - kfree(hidg->report_desc); - kfree(hidg->set_report_buf); - kfree(hidg); + device_unregister(&hidg->dev); mutex_lock(&opts->lock); --opts->refcnt; mutex_unlock(&opts->lock); @@ -1266,6 +1274,7 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi) { struct f_hidg *hidg; struct f_hid_opts *opts; + int ret; /* allocate and initialize one new instance */ hidg = kzalloc(sizeof(*hidg), GFP_KERNEL); @@ -1277,17 +1286,27 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi) mutex_lock(&opts->lock); ++opts->refcnt; + device_initialize(&hidg->dev); + hidg->dev.release = hidg_release; + ret = dev_set_name(&hidg->dev, "hidg%d", hidg->minor); + if (ret) { + --opts->refcnt; + mutex_unlock(&opts->lock); + return ERR_PTR(ret); + } + hidg->minor = opts->minor; hidg->bInterfaceSubClass = opts->subclass; hidg->bInterfaceProtocol = opts->protocol; hidg->report_length = opts->report_length; hidg->report_desc_length = opts->report_desc_length; if (opts->report_desc) { - hidg->report_desc = kmemdup(opts->report_desc, + hidg->report_desc = devm_kmemdup(&hidg->dev, opts->report_desc, opts->report_desc_length, GFP_KERNEL); if (!hidg->report_desc) { - kfree(hidg); + put_device(&hidg->dev); + --opts->refcnt; mutex_unlock(&opts->lock); return ERR_PTR(-ENOMEM); } @@ -1307,6 +1326,12 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi) /* this could be made configurable at some point */ hidg->qlen = 4; + ret = device_add(&hidg->dev); + if (ret) { + put_device(&hidg->dev); + return ERR_PTR(ret); + } + return &hidg->func; }
On Mon, Nov 21, 2022 at 12:38:37PM +0000, John Keeping wrote: > On Sun, Nov 20, 2022 at 03:46:26PM -0500, Alan Stern wrote: > > I see. The solution is simple: Embed a struct device in struct f_hidg, > > and call cdev_device_add() to add the device and the cdev. This will > > automatically make the device the parent of the cdev, so the device's > > refcount won't go to 0 until the cdev's refcount does. Then you can tie > > the f_hidg's lifetime to the device's, so the device's release routine > > can safely deallocate the entire f_hidg structure. > > > > The parent of the new struct device should be set to &gadget->dev. If > > you can't think of a better name for the device, you could simply append > > ":I" to the parent's name, where I is the interface number, or even > > append ":C.I" where C is the config number (like we do on the host > > side). > > There is no gadget->dev at the time struct f_hidg is allocated. > > AFAICT the device is the UDC, which is only associated with the gadget > when it is bound. The functions are allocated earlier than this and I > can't see any device associated with struct usb_function_instance. Ah, that's a shame. All right, so be it. > The patch below does fix the issue, but I'm wondering if there's a > deeper problem here that can only be properly solved by adding some > device/kobject hierarchy to the config side of things. I don't believe there's any deeper problem. If someone wants to have an fhidg device as a child of the gadget, I think it could be added and removed in the ->set_alt() and ->disable() callbacks. Or maybe the ->bind() and ->unbind() callbacks (I've never had to work with configfs so I'm not clear on the details). > -- >8 -- > Subject: [PATCH] usb: gadget: f_hid: fix f_hidg lifetime vs cdev > > The embedded struct cdev does not have its lifetime correctly tied to > the enclosing struct f_hidg, so there is a use-after-free if /dev/hidgN > is held open while the gadget is deleted. > > This can readily be replicated with libusbgx's example programs (for > conciseness - operating directly via configfs is equivalent): > > gadget-hid > exec 3<> /dev/hidg0 > gadget-vid-pid-remove > exec 3<&- > > Add a device to f_hidg so that the cdev can properly take a reference to > this and the structure will be freed only when the child device is > released. > > [Also fix refcount leak on an error path.] > > Signed-off-by: John Keeping <john@metanate.com> > --- > drivers/usb/gadget/function/f_hid.c | 35 ++++++++++++++++++++++++----- > 1 file changed, 30 insertions(+), 5 deletions(-) > > diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c > index ca0a7d9eaa34..7ae97e5c4d20 100644 > --- a/drivers/usb/gadget/function/f_hid.c > +++ b/drivers/usb/gadget/function/f_hid.c > @@ -71,6 +71,7 @@ struct f_hidg { > wait_queue_head_t write_queue; > struct usb_request *req; > > + struct device dev; > int minor; > struct cdev cdev; > struct usb_function func; > @@ -84,6 +85,14 @@ static inline struct f_hidg *func_to_hidg(struct usb_function *f) > return container_of(f, struct f_hidg, func); > } > > +static void hidg_release(struct device *dev) > +{ > + struct f_hidg *hidg = container_of(dev, struct f_hidg, dev); > + > + kfree(hidg->set_report_buf); > + kfree(hidg); > +} > + > /*-------------------------------------------------------------------------*/ > /* Static descriptors */ > > @@ -999,6 +1008,7 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f) > > /* create char device */ > cdev_init(&hidg->cdev, &f_hidg_fops); > + cdev_set_parent(&hidg->cdev, &hidg->dev.kobj); > dev = MKDEV(major, hidg->minor); > status = cdev_add(&hidg->cdev, dev, 1); Instead of doing it by hand like this, why not call cdev_device_add()? > if (status) > @@ -1244,9 +1254,7 @@ static void hidg_free(struct usb_function *f) > > hidg = func_to_hidg(f); > opts = container_of(f->fi, struct f_hid_opts, func_inst); > - kfree(hidg->report_desc); > - kfree(hidg->set_report_buf); > - kfree(hidg); > + device_unregister(&hidg->dev); Are you certain at this point that hidg->dev has been registered? Even on all the error paths? > mutex_lock(&opts->lock); > --opts->refcnt; > mutex_unlock(&opts->lock); > @@ -1266,6 +1274,7 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi) > { > struct f_hidg *hidg; > struct f_hid_opts *opts; > + int ret; > > /* allocate and initialize one new instance */ > hidg = kzalloc(sizeof(*hidg), GFP_KERNEL); > @@ -1277,17 +1286,27 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi) > mutex_lock(&opts->lock); > ++opts->refcnt; > > + device_initialize(&hidg->dev); > + hidg->dev.release = hidg_release; > + ret = dev_set_name(&hidg->dev, "hidg%d", hidg->minor); > + if (ret) { > + --opts->refcnt; > + mutex_unlock(&opts->lock); > + return ERR_PTR(ret); > + } > + > hidg->minor = opts->minor; > hidg->bInterfaceSubClass = opts->subclass; > hidg->bInterfaceProtocol = opts->protocol; > hidg->report_length = opts->report_length; > hidg->report_desc_length = opts->report_desc_length; > if (opts->report_desc) { > - hidg->report_desc = kmemdup(opts->report_desc, > + hidg->report_desc = devm_kmemdup(&hidg->dev, opts->report_desc, > opts->report_desc_length, > GFP_KERNEL); > if (!hidg->report_desc) { > - kfree(hidg); > + put_device(&hidg->dev); > + --opts->refcnt; > mutex_unlock(&opts->lock); > return ERR_PTR(-ENOMEM); > } > @@ -1307,6 +1326,12 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi) > /* this could be made configurable at some point */ > hidg->qlen = 4; > > + ret = device_add(&hidg->dev); > + if (ret) { > + put_device(&hidg->dev); > + return ERR_PTR(ret); > + } Why do this here instead of when the cdev is registered? Or to put it another way, why not add the two of them at the same time by calling cdev_device_add() rather than adding them at two separate places? Alan Stern > + > return &hidg->func; > } > > -- > 2.38.1 >
On Mon, Nov 21, 2022 at 11:18:34AM -0500, Alan Stern wrote: > On Mon, Nov 21, 2022 at 12:38:37PM +0000, John Keeping wrote: > > On Sun, Nov 20, 2022 at 03:46:26PM -0500, Alan Stern wrote: > > > I see. The solution is simple: Embed a struct device in struct f_hidg, > > > and call cdev_device_add() to add the device and the cdev. This will > > > automatically make the device the parent of the cdev, so the device's > > > refcount won't go to 0 until the cdev's refcount does. Then you can tie > > > the f_hidg's lifetime to the device's, so the device's release routine > > > can safely deallocate the entire f_hidg structure. > > > > > > The parent of the new struct device should be set to &gadget->dev. If > > > you can't think of a better name for the device, you could simply append > > > ":I" to the parent's name, where I is the interface number, or even > > > append ":C.I" where C is the config number (like we do on the host > > > side). > > > > There is no gadget->dev at the time struct f_hidg is allocated. > > > > AFAICT the device is the UDC, which is only associated with the gadget > > when it is bound. The functions are allocated earlier than this and I > > can't see any device associated with struct usb_function_instance. > > Ah, that's a shame. All right, so be it. > > > The patch below does fix the issue, but I'm wondering if there's a > > deeper problem here that can only be properly solved by adding some > > device/kobject hierarchy to the config side of things. > > I don't believe there's any deeper problem. If someone wants to have an > fhidg device as a child of the gadget, I think it could be added and > removed in the ->set_alt() and ->disable() callbacks. Or maybe the > ->bind() and ->unbind() callbacks (I've never had to work with configfs > so I'm not clear on the details). It turns out there's already a device being created here, just not associated with the structure. Your suggestions around cdev_device_add() made me spot what's going on with that so the actual fix is to pull its lifetime up to match struct f_hidg. -- >8 -- Subject: [PATCH] usb: gadget: f_hid: fix f_hidg lifetime vs cdev The embedded struct cdev does not have its lifetime correctly tied to the enclosing struct f_hidg, so there is a use-after-free if /dev/hidgN is held open while the gadget is deleted. This can readily be replicated with libusbgx's example programs (for conciseness - operating directly via configfs is equivalent): gadget-hid exec 3<> /dev/hidg0 gadget-vid-pid-remove exec 3<&- Pull the existing device up in to struct f_hidg and make use of the cdev_device_{add,del}() helpers. This changes the lifetime of the device object to match struct f_hidg, but note that it is still added and deleted at the same time. [Also fix refcount leak on an error path.] Signed-off-by: John Keeping <john@metanate.com> --- drivers/usb/gadget/function/f_hid.c | 50 ++++++++++++++++------------- 1 file changed, 28 insertions(+), 22 deletions(-) diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c index ca0a7d9eaa34..0b94668a3812 100644 --- a/drivers/usb/gadget/function/f_hid.c +++ b/drivers/usb/gadget/function/f_hid.c @@ -71,7 +71,7 @@ struct f_hidg { wait_queue_head_t write_queue; struct usb_request *req; - int minor; + struct device dev; struct cdev cdev; struct usb_function func; @@ -84,6 +84,14 @@ static inline struct f_hidg *func_to_hidg(struct usb_function *f) return container_of(f, struct f_hidg, func); } +static void hidg_release(struct device *dev) +{ + struct f_hidg *hidg = container_of(dev, struct f_hidg, dev); + + kfree(hidg->set_report_buf); + kfree(hidg); +} + /*-------------------------------------------------------------------------*/ /* Static descriptors */ @@ -904,9 +912,7 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f) struct usb_ep *ep; struct f_hidg *hidg = func_to_hidg(f); struct usb_string *us; - struct device *device; int status; - dev_t dev; /* maybe allocate device-global string IDs, and patch descriptors */ us = usb_gstrings_attach(c->cdev, ct_func_strings, @@ -999,21 +1005,12 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f) /* create char device */ cdev_init(&hidg->cdev, &f_hidg_fops); - dev = MKDEV(major, hidg->minor); - status = cdev_add(&hidg->cdev, dev, 1); + cdev_set_parent(&hidg->cdev, &hidg->dev.kobj); + status = cdev_device_add(&hidg->cdev, &hidg->dev); if (status) goto fail_free_descs; - device = device_create(hidg_class, NULL, dev, NULL, - "%s%d", "hidg", hidg->minor); - if (IS_ERR(device)) { - status = PTR_ERR(device); - goto del; - } - return 0; -del: - cdev_del(&hidg->cdev); fail_free_descs: usb_free_all_descriptors(f); fail: @@ -1244,9 +1241,7 @@ static void hidg_free(struct usb_function *f) hidg = func_to_hidg(f); opts = container_of(f->fi, struct f_hid_opts, func_inst); - kfree(hidg->report_desc); - kfree(hidg->set_report_buf); - kfree(hidg); + put_device(&hidg->dev); mutex_lock(&opts->lock); --opts->refcnt; mutex_unlock(&opts->lock); @@ -1256,8 +1251,7 @@ static void hidg_unbind(struct usb_configuration *c, struct usb_function *f) { struct f_hidg *hidg = func_to_hidg(f); - device_destroy(hidg_class, MKDEV(major, hidg->minor)); - cdev_del(&hidg->cdev); + cdev_device_del(&hidg->cdev); usb_free_all_descriptors(f); } @@ -1266,6 +1260,7 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi) { struct f_hidg *hidg; struct f_hid_opts *opts; + int ret; /* allocate and initialize one new instance */ hidg = kzalloc(sizeof(*hidg), GFP_KERNEL); @@ -1277,17 +1272,28 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi) mutex_lock(&opts->lock); ++opts->refcnt; - hidg->minor = opts->minor; + device_initialize(&hidg->dev); + hidg->dev.release = hidg_release; + hidg->dev.class = hidg_class; + hidg->dev.devt = MKDEV(major, opts->minor); + ret = dev_set_name(&hidg->dev, "hidg%d", opts->minor); + if (ret) { + --opts->refcnt; + mutex_unlock(&opts->lock); + return ERR_PTR(ret); + } + hidg->bInterfaceSubClass = opts->subclass; hidg->bInterfaceProtocol = opts->protocol; hidg->report_length = opts->report_length; hidg->report_desc_length = opts->report_desc_length; if (opts->report_desc) { - hidg->report_desc = kmemdup(opts->report_desc, + hidg->report_desc = devm_kmemdup(&hidg->dev, opts->report_desc, opts->report_desc_length, GFP_KERNEL); if (!hidg->report_desc) { - kfree(hidg); + put_device(&hidg->dev); + --opts->refcnt; mutex_unlock(&opts->lock); return ERR_PTR(-ENOMEM); }
On Mon, Nov 21, 2022 at 06:54:55PM +0000, John Keeping wrote: > It turns out there's already a device being created here, just not > associated with the structure. Your suggestions around > cdev_device_add() made me spot what's going on with that so the actual > fix is to pull its lifetime up to match struct f_hidg. It's not obvious from the patch what device was already being created, but never mind... > -- >8 -- > Subject: [PATCH] usb: gadget: f_hid: fix f_hidg lifetime vs cdev > > The embedded struct cdev does not have its lifetime correctly tied to > the enclosing struct f_hidg, so there is a use-after-free if /dev/hidgN > is held open while the gadget is deleted. > > This can readily be replicated with libusbgx's example programs (for > conciseness - operating directly via configfs is equivalent): > > gadget-hid > exec 3<> /dev/hidg0 > gadget-vid-pid-remove > exec 3<&- > > Pull the existing device up in to struct f_hidg and make use of the > cdev_device_{add,del}() helpers. This changes the lifetime of the > device object to match struct f_hidg, but note that it is still added > and deleted at the same time. > > [Also fix refcount leak on an error path.] > > Signed-off-by: John Keeping <john@metanate.com> > --- > drivers/usb/gadget/function/f_hid.c | 50 ++++++++++++++++------------- > 1 file changed, 28 insertions(+), 22 deletions(-) > > diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c > index ca0a7d9eaa34..0b94668a3812 100644 > --- a/drivers/usb/gadget/function/f_hid.c > +++ b/drivers/usb/gadget/function/f_hid.c > @@ -999,21 +1005,12 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f) > > /* create char device */ > cdev_init(&hidg->cdev, &f_hidg_fops); > - dev = MKDEV(major, hidg->minor); > - status = cdev_add(&hidg->cdev, dev, 1); > + cdev_set_parent(&hidg->cdev, &hidg->dev.kobj); This line isn't needed; cdev_device_add() does it for you because hidg->dev.devt has been set. > + status = cdev_device_add(&hidg->cdev, &hidg->dev); > if (status) > goto fail_free_descs; > @@ -1277,17 +1272,28 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi) > mutex_lock(&opts->lock); > ++opts->refcnt; > > - hidg->minor = opts->minor; > + device_initialize(&hidg->dev); > + hidg->dev.release = hidg_release; > + hidg->dev.class = hidg_class; > + hidg->dev.devt = MKDEV(major, opts->minor); > + ret = dev_set_name(&hidg->dev, "hidg%d", opts->minor); > + if (ret) { > + --opts->refcnt; > + mutex_unlock(&opts->lock); > + return ERR_PTR(ret); > + } > + Otherwise this looks okay (although I don't know any of the details of how fhidg works, so you shouldn't take my word for it). Alan Stern
On Mon, 21 Nov 2022, John Keeping wrote: 65;6999;1c > On Mon, Nov 21, 2022 at 11:18:34AM -0500, Alan Stern wrote: > > On Mon, Nov 21, 2022 at 12:38:37PM +0000, John Keeping wrote: > > > On Sun, Nov 20, 2022 at 03:46:26PM -0500, Alan Stern wrote: > > > > I see. The solution is simple: Embed a struct device in struct f_hidg, > > > > and call cdev_device_add() to add the device and the cdev. This will > > > > automatically make the device the parent of the cdev, so the device's > > > > refcount won't go to 0 until the cdev's refcount does. Then you can tie > > > > the f_hidg's lifetime to the device's, so the device's release routine > > > > can safely deallocate the entire f_hidg structure. > > > > > > > > The parent of the new struct device should be set to &gadget->dev. If > > > > you can't think of a better name for the device, you could simply append > > > > ":I" to the parent's name, where I is the interface number, or even > > > > append ":C.I" where C is the config number (like we do on the host > > > > side). > > > > > > There is no gadget->dev at the time struct f_hidg is allocated. > > > > > > AFAICT the device is the UDC, which is only associated with the gadget > > > when it is bound. The functions are allocated earlier than this and I > > > can't see any device associated with struct usb_function_instance. > > > > Ah, that's a shame. All right, so be it. > > > > > The patch below does fix the issue, but I'm wondering if there's a > > > deeper problem here that can only be properly solved by adding some > > > device/kobject hierarchy to the config side of things. > > > > I don't believe there's any deeper problem. If someone wants to have an > > fhidg device as a child of the gadget, I think it could be added and > > removed in the ->set_alt() and ->disable() callbacks. Or maybe the > > ->bind() and ->unbind() callbacks (I've never had to work with configfs > > so I'm not clear on the details). > > It turns out there's already a device being created here, just not > associated with the structure. Your suggestions around > cdev_device_add() made me spot what's going on with that so the actual > fix is to pull its lifetime up to match struct f_hidg. > > -- >8 -- > Subject: [PATCH] usb: gadget: f_hid: fix f_hidg lifetime vs cdev > > The embedded struct cdev does not have its lifetime correctly tied to > the enclosing struct f_hidg, so there is a use-after-free if /dev/hidgN > is held open while the gadget is deleted. > > This can readily be replicated with libusbgx's example programs (for > conciseness - operating directly via configfs is equivalent): > > gadget-hid > exec 3<> /dev/hidg0 > gadget-vid-pid-remove > exec 3<&- > > Pull the existing device up in to struct f_hidg and make use of the > cdev_device_{add,del}() helpers. This changes the lifetime of the > device object to match struct f_hidg, but note that it is still added > and deleted at the same time. This is much better, thanks for re-spinning. > [Also fix refcount leak on an error path.] > > Signed-off-by: John Keeping <john@metanate.com> > --- > drivers/usb/gadget/function/f_hid.c | 50 ++++++++++++++++------------- > 1 file changed, 28 insertions(+), 22 deletions(-) > > diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c > index ca0a7d9eaa34..0b94668a3812 100644 > --- a/drivers/usb/gadget/function/f_hid.c > +++ b/drivers/usb/gadget/function/f_hid.c > @@ -71,7 +71,7 @@ struct f_hidg { > wait_queue_head_t write_queue; > struct usb_request *req; > > - int minor; > + struct device dev; > struct cdev cdev; > struct usb_function func; > > @@ -84,6 +84,14 @@ static inline struct f_hidg *func_to_hidg(struct usb_function *f) > return container_of(f, struct f_hidg, func); > } > > +static void hidg_release(struct device *dev) > +{ > + struct f_hidg *hidg = container_of(dev, struct f_hidg, dev); Could we store/fetch this with dev_{set,get}_drvdata(), and make hidg->dev a pointer reducing the size of the struct f_hidg. > + kfree(hidg->set_report_buf); > + kfree(hidg); > +} > + > /*-------------------------------------------------------------------------*/ > /* Static descriptors */ > > @@ -904,9 +912,7 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f) > struct usb_ep *ep; > struct f_hidg *hidg = func_to_hidg(f); > struct usb_string *us; > - struct device *device; > int status; > - dev_t dev; > > /* maybe allocate device-global string IDs, and patch descriptors */ > us = usb_gstrings_attach(c->cdev, ct_func_strings, > @@ -999,21 +1005,12 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f) > > /* create char device */ > cdev_init(&hidg->cdev, &f_hidg_fops); > - dev = MKDEV(major, hidg->minor); > - status = cdev_add(&hidg->cdev, dev, 1); > + cdev_set_parent(&hidg->cdev, &hidg->dev.kobj); cdev_device_add() should take care of this, so long as: if (dev->devt) dev_set_parent(cdev, &dev->kobj); > + status = cdev_device_add(&hidg->cdev, &hidg->dev); > if (status) > goto fail_free_descs; > > - device = device_create(hidg_class, NULL, dev, NULL, > - "%s%d", "hidg", hidg->minor); > - if (IS_ERR(device)) { > - status = PTR_ERR(device); > - goto del; > - } > - > return 0; > -del: > - cdev_del(&hidg->cdev); > fail_free_descs: > usb_free_all_descriptors(f); > fail: > @@ -1244,9 +1241,7 @@ static void hidg_free(struct usb_function *f) > > hidg = func_to_hidg(f); > opts = container_of(f->fi, struct f_hid_opts, func_inst); > - kfree(hidg->report_desc); > - kfree(hidg->set_report_buf); > - kfree(hidg); > + put_device(&hidg->dev); > mutex_lock(&opts->lock); > --opts->refcnt; > mutex_unlock(&opts->lock); > @@ -1256,8 +1251,7 @@ static void hidg_unbind(struct usb_configuration *c, struct usb_function *f) > { > struct f_hidg *hidg = func_to_hidg(f); > > - device_destroy(hidg_class, MKDEV(major, hidg->minor)); > - cdev_del(&hidg->cdev); > + cdev_device_del(&hidg->cdev); > > usb_free_all_descriptors(f); > } > @@ -1266,6 +1260,7 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi) > { > struct f_hidg *hidg; > struct f_hid_opts *opts; > + int ret; > > /* allocate and initialize one new instance */ > hidg = kzalloc(sizeof(*hidg), GFP_KERNEL); > @@ -1277,17 +1272,28 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi) > mutex_lock(&opts->lock); > ++opts->refcnt; > > - hidg->minor = opts->minor; > + device_initialize(&hidg->dev); > + hidg->dev.release = hidg_release; > + hidg->dev.class = hidg_class; > + hidg->dev.devt = MKDEV(major, opts->minor); > + ret = dev_set_name(&hidg->dev, "hidg%d", opts->minor); > + if (ret) { > + --opts->refcnt; Since we're holding the opts lock at this point, is there anything preventing us from incrementing the refcnt at the end, just before giving up the lock, thus saving 2 decrements in the error paths? > + mutex_unlock(&opts->lock); > + return ERR_PTR(ret); > + } > + > hidg->bInterfaceSubClass = opts->subclass; > hidg->bInterfaceProtocol = opts->protocol; > hidg->report_length = opts->report_length; > hidg->report_desc_length = opts->report_desc_length; > if (opts->report_desc) { > - hidg->report_desc = kmemdup(opts->report_desc, > + hidg->report_desc = devm_kmemdup(&hidg->dev, opts->report_desc, Nice. > opts->report_desc_length, > GFP_KERNEL); > if (!hidg->report_desc) { > - kfree(hidg); > + put_device(&hidg->dev); > + --opts->refcnt; > mutex_unlock(&opts->lock); > return ERR_PTR(-ENOMEM); > } Thanks for doing this John, your work is appreciated.
On Mon, Nov 21, 2022 at 02:17:56PM -0500, Alan Stern wrote: > On Mon, Nov 21, 2022 at 06:54:55PM +0000, John Keeping wrote: > > It turns out there's already a device being created here, just not > > associated with the structure. Your suggestions around > > cdev_device_add() made me spot what's going on with that so the actual > > fix is to pull its lifetime up to match struct f_hidg. > > It's not obvious from the patch what device was already being created, > but never mind... The patch has: - device = device_create(hidg_class, NULL, dev, NULL, - "%s%d", "hidg", hidg->minor); but this device was not previously associated with the cdev (apart from indirectly via dev_t). > > -- >8 -- > > Subject: [PATCH] usb: gadget: f_hid: fix f_hidg lifetime vs cdev > > > > The embedded struct cdev does not have its lifetime correctly tied to > > the enclosing struct f_hidg, so there is a use-after-free if /dev/hidgN > > is held open while the gadget is deleted. > > > > This can readily be replicated with libusbgx's example programs (for > > conciseness - operating directly via configfs is equivalent): > > > > gadget-hid > > exec 3<> /dev/hidg0 > > gadget-vid-pid-remove > > exec 3<&- > > > > Pull the existing device up in to struct f_hidg and make use of the > > cdev_device_{add,del}() helpers. This changes the lifetime of the > > device object to match struct f_hidg, but note that it is still added > > and deleted at the same time. > > > > [Also fix refcount leak on an error path.] > > > > Signed-off-by: John Keeping <john@metanate.com> > > --- > > drivers/usb/gadget/function/f_hid.c | 50 ++++++++++++++++------------- > > 1 file changed, 28 insertions(+), 22 deletions(-) > > > > diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c > > index ca0a7d9eaa34..0b94668a3812 100644 > > --- a/drivers/usb/gadget/function/f_hid.c > > +++ b/drivers/usb/gadget/function/f_hid.c > > > @@ -999,21 +1005,12 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f) > > > > /* create char device */ > > cdev_init(&hidg->cdev, &f_hidg_fops); > > - dev = MKDEV(major, hidg->minor); > > - status = cdev_add(&hidg->cdev, dev, 1); > > + cdev_set_parent(&hidg->cdev, &hidg->dev.kobj); > > This line isn't needed; cdev_device_add() does it for you because > hidg->dev.devt has been set. Thanks, I'll drop this line.
On Tue, Nov 22, 2022 at 08:31:08AM +0000, Lee Jones wrote: > On Mon, 21 Nov 2022, John Keeping wrote: > > Subject: [PATCH] usb: gadget: f_hid: fix f_hidg lifetime vs cdev > > > > The embedded struct cdev does not have its lifetime correctly tied to > > the enclosing struct f_hidg, so there is a use-after-free if /dev/hidgN > > is held open while the gadget is deleted. > > > > This can readily be replicated with libusbgx's example programs (for > > conciseness - operating directly via configfs is equivalent): > > > > gadget-hid > > exec 3<> /dev/hidg0 > > gadget-vid-pid-remove > > exec 3<&- > > > > Pull the existing device up in to struct f_hidg and make use of the > > cdev_device_{add,del}() helpers. This changes the lifetime of the > > device object to match struct f_hidg, but note that it is still added > > and deleted at the same time. > > This is much better, thanks for re-spinning. > > > [Also fix refcount leak on an error path.] > > > > Signed-off-by: John Keeping <john@metanate.com> > > --- > > drivers/usb/gadget/function/f_hid.c | 50 ++++++++++++++++------------- > > 1 file changed, 28 insertions(+), 22 deletions(-) > > > > diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c > > index ca0a7d9eaa34..0b94668a3812 100644 > > --- a/drivers/usb/gadget/function/f_hid.c > > +++ b/drivers/usb/gadget/function/f_hid.c > > @@ -71,7 +71,7 @@ struct f_hidg { > > wait_queue_head_t write_queue; > > struct usb_request *req; > > > > - int minor; > > + struct device dev; > > struct cdev cdev; > > struct usb_function func; > > > > @@ -84,6 +84,14 @@ static inline struct f_hidg *func_to_hidg(struct usb_function *f) > > return container_of(f, struct f_hidg, func); > > } > > > > +static void hidg_release(struct device *dev) > > +{ > > + struct f_hidg *hidg = container_of(dev, struct f_hidg, dev); > > Could we store/fetch this with dev_{set,get}_drvdata(), and make > hidg->dev a pointer reducing the size of the struct f_hidg. That will reduce the size of struct f_hidg, but we'll have an extra allocation for the device object, so I don't think that's a real benefit. It seems simpler to keep a single allocation and embed the device. > > + kfree(hidg->set_report_buf); > > + kfree(hidg); > > +} > > + > > /*-------------------------------------------------------------------------*/ > > /* Static descriptors */ > > > > @@ -904,9 +912,7 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f) > > struct usb_ep *ep; > > struct f_hidg *hidg = func_to_hidg(f); > > struct usb_string *us; > > - struct device *device; > > int status; > > - dev_t dev; > > > > /* maybe allocate device-global string IDs, and patch descriptors */ > > us = usb_gstrings_attach(c->cdev, ct_func_strings, > > @@ -999,21 +1005,12 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f) > > > > /* create char device */ > > cdev_init(&hidg->cdev, &f_hidg_fops); > > - dev = MKDEV(major, hidg->minor); > > - status = cdev_add(&hidg->cdev, dev, 1); > > + cdev_set_parent(&hidg->cdev, &hidg->dev.kobj); > > cdev_device_add() should take care of this, so long as: > > if (dev->devt) > dev_set_parent(cdev, &dev->kobj); Thanks, I'll change this. > > + status = cdev_device_add(&hidg->cdev, &hidg->dev); > > if (status) > > goto fail_free_descs; > > > > - device = device_create(hidg_class, NULL, dev, NULL, > > - "%s%d", "hidg", hidg->minor); > > - if (IS_ERR(device)) { > > - status = PTR_ERR(device); > > - goto del; > > - } > > - > > return 0; > > -del: > > - cdev_del(&hidg->cdev); > > fail_free_descs: > > usb_free_all_descriptors(f); > > fail: > > @@ -1244,9 +1241,7 @@ static void hidg_free(struct usb_function *f) > > > > hidg = func_to_hidg(f); > > opts = container_of(f->fi, struct f_hid_opts, func_inst); > > - kfree(hidg->report_desc); > > - kfree(hidg->set_report_buf); > > - kfree(hidg); > > + put_device(&hidg->dev); > > mutex_lock(&opts->lock); > > --opts->refcnt; > > mutex_unlock(&opts->lock); > > @@ -1256,8 +1251,7 @@ static void hidg_unbind(struct usb_configuration *c, struct usb_function *f) > > { > > struct f_hidg *hidg = func_to_hidg(f); > > > > - device_destroy(hidg_class, MKDEV(major, hidg->minor)); > > - cdev_del(&hidg->cdev); > > + cdev_device_del(&hidg->cdev); > > > > usb_free_all_descriptors(f); > > } > > @@ -1266,6 +1260,7 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi) > > { > > struct f_hidg *hidg; > > struct f_hid_opts *opts; > > + int ret; > > > > /* allocate and initialize one new instance */ > > hidg = kzalloc(sizeof(*hidg), GFP_KERNEL); > > @@ -1277,17 +1272,28 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi) > > mutex_lock(&opts->lock); > > ++opts->refcnt; > > > > - hidg->minor = opts->minor; > > + device_initialize(&hidg->dev); > > + hidg->dev.release = hidg_release; > > + hidg->dev.class = hidg_class; > > + hidg->dev.devt = MKDEV(major, opts->minor); > > + ret = dev_set_name(&hidg->dev, "hidg%d", opts->minor); > > + if (ret) { > > + --opts->refcnt; > > Since we're holding the opts lock at this point, is there anything > preventing us from incrementing the refcnt at the end, just before > giving up the lock, thus saving 2 decrements in the error paths? This makes sense. I'll respin this as a series and include a patch tidying up the error handling as a final step. > > + mutex_unlock(&opts->lock); > > + return ERR_PTR(ret); > > + } > > + > > hidg->bInterfaceSubClass = opts->subclass; > > hidg->bInterfaceProtocol = opts->protocol; > > hidg->report_length = opts->report_length; > > hidg->report_desc_length = opts->report_desc_length; > > if (opts->report_desc) { > > - hidg->report_desc = kmemdup(opts->report_desc, > > + hidg->report_desc = devm_kmemdup(&hidg->dev, opts->report_desc, > > Nice. > > > opts->report_desc_length, > > GFP_KERNEL); > > if (!hidg->report_desc) { > > - kfree(hidg); > > + put_device(&hidg->dev); > > + --opts->refcnt; > > mutex_unlock(&opts->lock); > > return ERR_PTR(-ENOMEM); > > } > > Thanks for doing this John, your work is appreciated.
diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c index ca0a7d9eaa34e..5da8f44d47d9d 100644 --- a/drivers/usb/gadget/function/f_hid.c +++ b/drivers/usb/gadget/function/f_hid.c @@ -77,6 +77,8 @@ struct f_hidg { struct usb_ep *in_ep; struct usb_ep *out_ep; + + bool gc; }; static inline struct f_hidg *func_to_hidg(struct usb_function *f) @@ -84,6 +86,11 @@ static inline struct f_hidg *func_to_hidg(struct usb_function *f) return container_of(f, struct f_hidg, func); } +static inline bool f_hidg_is_open(struct f_hidg *hidg) +{ + return !!kref_read(&hidg->cdev.kobj.kref); +} + /*-------------------------------------------------------------------------*/ /* Static descriptors */ @@ -273,6 +280,18 @@ static struct usb_gadget_strings *ct_func_strings[] = { NULL, }; +static void hidg_free_resources(struct f_hidg *hidg) +{ + struct f_hid_opts *opts = container_of(hidg->func.fi, struct f_hid_opts, func_inst); + + mutex_lock(&opts->lock); + kfree(hidg->report_desc); + kfree(hidg->set_report_buf); + kfree(hidg); + --opts->refcnt; + mutex_unlock(&opts->lock); +} + /*-------------------------------------------------------------------------*/ /* Char Device */ @@ -539,7 +558,16 @@ static __poll_t f_hidg_poll(struct file *file, poll_table *wait) static int f_hidg_release(struct inode *inode, struct file *fd) { + struct f_hidg *hidg = fd->private_data; + + if (hidg->gc) { + /* Gadget has already been disconnected and we are the last f_hidg user */ + cdev_del(&hidg->cdev); + hidg_free_resources(hidg); + } + fd->private_data = NULL; + return 0; } @@ -1239,17 +1267,16 @@ static struct usb_function_instance *hidg_alloc_inst(void) static void hidg_free(struct usb_function *f) { - struct f_hidg *hidg; - struct f_hid_opts *opts; + struct f_hidg *hidg = func_to_hidg(f); - hidg = func_to_hidg(f); - opts = container_of(f->fi, struct f_hid_opts, func_inst); - kfree(hidg->report_desc); - kfree(hidg->set_report_buf); - kfree(hidg); - mutex_lock(&opts->lock); - --opts->refcnt; - mutex_unlock(&opts->lock); + if (f_hidg_is_open(hidg)) + /* + * Gadget disconnected whilst an open dev node exists. + * Delay freeing resources until it closes. + */ + hidg->gc = true; + else + hidg_free_resources(hidg); } static void hidg_unbind(struct usb_configuration *c, struct usb_function *f)
A reference to struct f_hidg is shared with this driver's associated character device handling component without provision for life-time handling. In some circumstances, this can lead to unwanted behaviour depending on the order in which things are torn down. Utilise, the reference counting functionality already provided by the implanted character device structure to ensure the struct f_hidg's memory is only freed once the character device handling has finished with it. Signed-off-by: Lee Jones <lee@kernel.org> --- drivers/usb/gadget/function/f_hid.c | 47 +++++++++++++++++++++++------ 1 file changed, 37 insertions(+), 10 deletions(-)