[v8,06/12] kernel/module: add documentation for try_module_get()

Message ID 20210927163805.808907-7-mcgrof@kernel.org (mailing list archive)
State New
Series syfs: generic deadlock fix with module removal

Commit Message

Luis Chamberlain Sept. 27, 2021, 4:37 p.m. UTC
There is quite a bit of tribal knowledge around proper use of
try_module_get() and that it must be used only in a context which
can ensure the module won't be gone during the operation. Document
this little bit of tribal knowledge.

I'm extending this tribal knowledge with new developments which it
seems some folks do not yet believe to be true: we can be sure a
module will exist during the lifetime of a sysfs file operation.
For proof, refer to test_sysfs test #32:

./tools/testing/selftests/sysfs/sysfs.sh -t 0032

Without this being true, the write would fail or worse,
a crash would happen, in this test. It does not.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 include/linux/module.h | 34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

Comments

Kees Cook Oct. 5, 2021, 7:58 p.m. UTC | #1
On Mon, Sep 27, 2021 at 09:37:59AM -0700, Luis Chamberlain wrote:
> There is quite a bit of tribal knowledge around proper use of
> try_module_get() and that it must be used only in a context which
> can ensure the module won't be gone during the operation. Document
> this little bit of tribal knowledge.
> 
> I'm extending this tribal knowledge with new developments which it
> seems some folks do not yet believe to be true: we can be sure a
> module will exist during the lifetime of a sysfs file operation.
> For proof, refer to test_sysfs test #32:
> 
> ./tools/testing/selftests/sysfs/sysfs.sh -t 0032
> 
> Without this being true, the write would fail or worse,
> a crash would happen, in this test. It does not.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  include/linux/module.h | 34 ++++++++++++++++++++++++++++++++--
>  1 file changed, 32 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/module.h b/include/linux/module.h
> index c9f1200b2312..22eacd5e1e85 100644
> --- a/include/linux/module.h
> +++ b/include/linux/module.h
> @@ -609,10 +609,40 @@ void symbol_put_addr(void *addr);
>     to handle the error case (which only happens with rmmod --wait). */
>  extern void __module_get(struct module *module);
>  
> -/* This is the Right Way to get a module: if it fails, it's being removed,
> - * so pretend it's not there. */
> +/**
> + * try_module_get() - yields to module removal and bumps refcnt otherwise

I find this hard to parse. How about:
	"Take module refcount unless module is being removed"

> + * @module: the module we should check for
> + *
> + * This can be used to try to bump the reference count of a module, so as to
> + * prevent module removal. The reference count of a module is not allowed
> + * to be incremented if the module is already being removed.

This I understand.

> + *
> + * Care must be taken to ensure the module cannot be removed during the call to
> + * try_module_get(). This can be done by having another entity other than the
> + * module itself increment the module reference count, or through some other
> + * means which guarantees the module could not be removed during an operation.
> + * An example of this latter case is using try_module_get() in a sysfs file
> + * which the module created. The sysfs store / read file operations are
> + * guaranteed to exist through the use of kernfs's active reference (see
> + * kernfs_active()). If a sysfs file operation is being run, the module which
> + * created it must still exist as the module is in charge of removing the same
> + * sysfs file being read. Also, a sysfs / kernfs file removal cannot happen
> + * unless the same file is not active.

I can't understand this paragraph at all. "Care must be taken ..."? Why?
Shouldn't callers of try_module_get() be satisfied with the results? I
don't follow the example at all. It seems to just say "sysfs store/read
functions don't need try_module_get() because whatever opened the sysfs
file is already keeping the module referenced." ?

> + *
> + * One of the real values to try_module_get() is the module_is_live() check
> + * which ensures the caller of try_module_get() can yield to userspace
> + * module removal requests and fail whatever it was about to process.

Please document the return value explicitly.

> + */
>  extern bool try_module_get(struct module *module);
>  
> +/**
> + * module_put() - release a reference count to a module
> + * @module: the module we should release a reference count for
> + *
> + * If you successfully bump a reference count to a module with try_module_get(),
> + * when you are finished you must call module_put() to release that reference
> + * count.
> + */
>  extern void module_put(struct module *module);
>  
>  #else /*!CONFIG_MODULE_UNLOAD*/
> -- 
> 2.30.2
>
Luis Chamberlain Oct. 11, 2021, 9:16 p.m. UTC | #2
On Tue, Oct 05, 2021 at 12:58:47PM -0700, Kees Cook wrote:
> On Mon, Sep 27, 2021 at 09:37:59AM -0700, Luis Chamberlain wrote:
> > diff --git a/include/linux/module.h b/include/linux/module.h
> > index c9f1200b2312..22eacd5e1e85 100644
> > --- a/include/linux/module.h
> > +++ b/include/linux/module.h
> > @@ -609,10 +609,40 @@ void symbol_put_addr(void *addr);
> >     to handle the error case (which only happens with rmmod --wait). */
> >  extern void __module_get(struct module *module);
> >  
> > -/* This is the Right Way to get a module: if it fails, it's being removed,
> > - * so pretend it's not there. */
> > +/**
> > + * try_module_get() - yields to module removal and bumps refcnt otherwise
> 
> I find this hard to parse. How about:
> 	"Take module refcount unless module is being removed"

Sure.

> > + * @module: the module we should check for
> > + *
> > + * This can be used to try to bump the reference count of a module, so as to
> > + * prevent module removal. The reference count of a module is not allowed
> > + * to be incremented if the module is already being removed.
> 
> This I understand.
> 
> > + *
> > + * Care must be taken to ensure the module cannot be removed during the call to
> > + * try_module_get(). This can be done by having another entity other than the
> > + * module itself increment the module reference count, or through some other
> > + * means which guarantees the module could not be removed during an operation.
> > + * An example of this latter case is using try_module_get() in a sysfs file
> > + * which the module created. The sysfs store / read file operations are
> > + * guaranteed to exist through the use of kernfs's active reference (see
> > + * kernfs_active()). If a sysfs file operation is being run, the module which
> > + * created it must still exist as the module is in charge of removing the same
> > + * sysfs file being read. Also, a sysfs / kernfs file removal cannot happen
> > + * unless the same file is not active.
> 
> I can't understand this paragraph at all. "Care must be taken ..."? Why?

Because the routine try_module_get() assumes the struct module pointer
is valid for the entire call. That can only be true if at least one
reference is held prior to this call.
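
To make the pointer-validity point concrete, here is a minimal userspace
sketch, not the kernel implementation. struct fake_module,
fake_try_module_get() and fake_module_put() are hypothetical stand-ins
invented for illustration; the NULL handling and the refuse-at-zero check
reflect my reading of the kernel routine and should be treated as
assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct module; not the kernel type. */
enum fake_state { FAKE_LIVE, FAKE_GOING };

struct fake_module {
	enum fake_state state;	/* FAKE_GOING once removal has started */
	int refcnt;		/* plain int: this model is single-threaded */
};

/*
 * Model of the documented contract: returns true if a reference was taken
 * (or mod is NULL, assumed to mean "nothing to pin"), false if the module
 * is being removed and the caller should back off.
 */
static bool fake_try_module_get(struct fake_module *mod)
{
	if (!mod)
		return true;

	/* mod is dereferenced here, so it must still point at valid memory. */
	if (mod->state == FAKE_GOING || mod->refcnt == 0)
		return false;

	mod->refcnt++;
	return true;
}

/* Drops a reference previously taken with fake_try_module_get(). */
static void fake_module_put(struct fake_module *mod)
{
	if (mod)
		mod->refcnt--;
}
```

The first thing the model does with a non-NULL mod is read mod->state. If
the memory behind that pointer had already been freed, that read alone
would be a use-after-free, which is why something else must pin the module
before the call is made.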

> Shouldn't callers of try_module_get() be satisfied with the results?

Yes but only with the above care addressed.

> I don't follow the example at all. It seems to just say "sysfs store/read
> functions don't need try_module_get() because whatever opened the sysfs
> file is already keeping the module referenced." ?

That is exactly what I intended to clarify with that example. Yes, a
reference is held, but it is held implicitly. *If* a kernfs op is
active, module removal waits for that active reference to drop. So
while a kernfs file is in use it is simply not possible for the
module to disappear underneath us. The reason is that the module
that created the sysfs file must also destroy that same sysfs file,
and kernfs ensures a sysfs file cannot be removed while it is in
use. Together these amount to an implicit module reference.

Let me know if you can think of a better way to phrase this.
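
One way to picture the implicit reference is the following userspace
sketch. It is only a model under stated assumptions: struct
fake_kernfs_file and the fake_*() helpers are hypothetical names made up
for illustration, and the real kernfs code is concurrent and blocks
rather than returning a flag.

```c
#include <stdbool.h>

/* Hypothetical model of one sysfs file; not a kernfs structure. */
struct fake_kernfs_file {
	int active;		/* file operations currently running */
	bool deactivated;	/* removal has started, no new operations */
};

/* A file operation takes an active reference before it runs. */
static bool fake_get_active(struct fake_kernfs_file *f)
{
	if (f->deactivated)
		return false;
	f->active++;
	return true;
}

/* The file operation drops its active reference when it finishes. */
static void fake_put_active(struct fake_kernfs_file *f)
{
	f->active--;
}

/*
 * Removal, as issued from the owning module's exit path: new operations
 * are refused, and removal can only complete once no operation is active.
 * Returns true when removal may complete. The real code would wait here
 * instead of returning false.
 */
static bool fake_remove_try(struct fake_kernfs_file *f)
{
	f->deactivated = true;
	return f->active == 0;
}
```

In this model, module exit cannot get past fake_remove_try() while an
operation holds an active reference, so the module is still present for
the whole duration of that operation.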

> > + *
> > + * One of the real values to try_module_get() is the module_is_live() check
> > + * which ensures the caller of try_module_get() can yield to userspace
> > + * module removal requests and fail whatever it was about to process.
> 
> Please document the return value explicitly.

Sure thing.

  Luis
Patch

diff --git a/include/linux/module.h b/include/linux/module.h
index c9f1200b2312..22eacd5e1e85 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -609,10 +609,40 @@  void symbol_put_addr(void *addr);
    to handle the error case (which only happens with rmmod --wait). */
 extern void __module_get(struct module *module);
 
-/* This is the Right Way to get a module: if it fails, it's being removed,
- * so pretend it's not there. */
+/**
+ * try_module_get() - yields to module removal and bumps refcnt otherwise
+ * @module: the module we should check for
+ *
+ * This can be used to try to bump the reference count of a module, so as to
+ * prevent module removal. The reference count of a module is not allowed
+ * to be incremented if the module is already being removed.
+ *
+ * Care must be taken to ensure the module cannot be removed during the call to
+ * try_module_get(). This can be done by having another entity other than the
+ * module itself increment the module reference count, or through some other
+ * means which guarantees the module could not be removed during an operation.
+ * An example of this latter case is using try_module_get() in a sysfs file
+ * which the module created. The sysfs store / read file operations are
+ * guaranteed to exist through the use of kernfs's active reference (see
+ * kernfs_active()). If a sysfs file operation is being run, the module which
+ * created it must still exist as the module is in charge of removing the same
+ * sysfs file being read. Also, a sysfs / kernfs file removal cannot happen
+ * unless the same file is not active.
+ *
+ * One of the real values to try_module_get() is the module_is_live() check
+ * which ensures the caller of try_module_get() can yield to userspace
+ * module removal requests and fail whatever it was about to process.
+ */
 extern bool try_module_get(struct module *module);
 
+/**
+ * module_put() - release a reference count to a module
+ * @module: the module we should release a reference count for
+ *
+ * If you successfully bump a reference count to a module with try_module_get(),
+ * when you are finished you must call module_put() to release that reference
+ * count.
+ */
 extern void module_put(struct module *module);
 
 #else /*!CONFIG_MODULE_UNLOAD*/