diff mbox

module: add kset_obj_exists() and use it

Message ID 20130411095537.GC21320@redhat.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Veaceslav Falico April 11, 2013, 9:55 a.m. UTC
On Wed, Apr 10, 2013 at 04:47:34PM +0930, Rusty Russell wrote:
>Veaceslav Falico <vfalico@redhat.com> writes:
>> Add a new function, kset_obj_exists(), which is identical to
>> kset_find_obj() but doesn't take a reference to the kobject
>> found and only returns bool if found/not found.
>>
>> The main purpose would be to avoid the possible race scenario,
>> when we could get the reference in between the kref_put() and
>> kobject_release() functions (i.e. kref_put() already ran,
>> refcount became 0, but the kobject_release() function still
>> didn't run, and we try to get via kobject_get() and thus ending
>> up with a released kobject).
>
>What?  If refcount is zero, there should be no more references to it.
>That's what 0 means.
>
>If you want 0-and-then-cleanup, you need atomic_dec_and_lock().

Yep, you're right, I was trying to fix consequences instead of fixing the
cause. What I've encountered was:

rmmod:
  #5 [ffff88005a073d78] __delay at ffffffff813a940f
  #6 [ffff88005a073d80] do_raw_spin_lock at ffffffff813b1d2e
  #7 [ffff88005a073db0] _raw_spin_lock at ffffffff816f8e76
  #8 [ffff88005a073de0] kobj_kset_leave at ffffffff8139ee5a
  #9 [ffff88005a073e00] kobject_del at ffffffff8139eeb2
#10 [ffff88005a073e20] kobject_cleanup at ffffffff8139ef32
#11 [ffff88005a073e50] kobject_release at ffffffff8139f08d
#12 [ffff88005a073e60] kobject_put at ffffffff8139eddc
#13 [ffff88005a073e80] mod_sysfs_teardown at ffffffff810ed9cc
#14 [ffff88005a073eb0] free_module at ffffffff810edf57
#15 [ffff88005a073ed0] sys_delete_module at ffffffff810ee508
#16 [ffff88005a073f80] system_call_fastpath at ffffffff81703819

insmod:
[exception RIP: kobject_get+63]
  #7 [ffff88005a0f3db0] kset_find_obj at ffffffff8139f139
  #8 [ffff88005a0f3de0] mod_sysfs_init at ffffffff810ed5f3
  #9 [ffff88005a0f3e20] mod_sysfs_setup at ffffffff810f1872
#10 [ffff88005a0f3e80] load_module at ffffffff810f2003
#11 [ffff88005a0f3ef0] sys_init_module at ffffffff810f22e4
#12 [ffff88005a0f3f80] system_call_fastpath at ffffffff81703819

So I've thought it was a race in kset_find_obj(), and I've been wrong (I've
replaced the WARN_ON(refcount) with BUG_ON - we'd anyway BUG somewhere
further in kobject_put() ).

However, I think my patch still adds something good, cause now we have 2
cases where we basically do:

k = kset_find_obj();
if (!k)
	return;
kobject_put(k);

which adds useless overhead (by using kobject_get()/kobject_put(), and
kobject_release() - which is called from kobject_put()) - where we should
only verify if there exists a kobject with the specified name.

Should I resend it with a properly fixed commit message, or it's really not
needed?

>
>> It can be triggered, for example,
>> by running insmod/rmmod bonding in parallel, which ends up in
>> a race between kset_obj_find() in mod_sysfs_init() and rmmod's
>> mod_sysfs_fini()/kobject_put().
>
>That's a bug.  We should be cleaning up sysfs before we unlike the
>removed module from the list.
>
>Because the same thing applies to ddebug info, which is also keyed by
>module name.
>
>Something like this (untested!):

Sorry for the late response - I wanted to test it for a longer time.

Your patch works flawlessly and fixes this race, with just a small
addition, cause otherwise we could BUG() in show_initstate().

Can you apply this patch or should I (re-)send it somehow?

Thank you!


>
>diff --git a/kernel/module.c b/kernel/module.c
>index 3c2c72d..8d4de82 100644
>--- a/kernel/module.c
>+++ b/kernel/module.c
>@@ -1862,10 +1862,10 @@ static void free_module(struct module *mod)
> {
> 	trace_module_free(mod);
>
>-	/* Delete from various lists */
>-	mutex_lock(&module_mutex);
>-	stop_machine(__unlink_module, mod, NULL);
>-	mutex_unlock(&module_mutex);
>+	/* We leave it in list to prevent duplicate loads while we clean
>+	 * up sysfs, ddebug and any other external exposure. */
>+	mod->state = MODULE_STATE_UNFORMED;
>+
> 	mod_sysfs_teardown(mod);
>
> 	/* Remove dynamic debug info */
>@@ -1880,6 +1880,11 @@ static void free_module(struct module *mod)
> 	/* Free any allocated parameters. */
> 	destroy_params(mod->kp, mod->num_kp);
>
>+	/* Now we can delete it from the lists */
>+	mutex_lock(&module_mutex);
>+	stop_machine(__unlink_module, mod, NULL);
>+	mutex_unlock(&module_mutex);
>+
> 	/* This may be NULL, but that's OK */
> 	unset_module_init_ro_nx(mod);
> 	module_free(mod, mod->module_init);
>
>> It also saves us some CPU time in several places where we don't
>> need the kobject itself and only need to verify if it actually
>> exists, by not taking the kref (and thus we don't need to
>> kobject_put() afterwards).
>>
>> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
>> ---
>>  drivers/pci/slot.c      |    5 +----
>>  include/linux/kobject.h |    1 +
>>  kernel/module.c         |    5 +----
>>  lib/kobject.c           |   28 ++++++++++++++++++++++++++++
>>  4 files changed, 31 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c
>> index ac6412f..3988d75 100644
>> --- a/drivers/pci/slot.c
>> +++ b/drivers/pci/slot.c
>> @@ -154,11 +154,8 @@ static char *make_slot_name(const char *name)
>>  	dup = 1;
>>
>>  	for (;;) {
>> -		struct kobject *dup_slot;
>> -		dup_slot = kset_find_obj(pci_slots_kset, new_name);
>> -		if (!dup_slot)
>> +		if (!kset_obj_exists(pci_slots_kset, new_name))
>>  			break;
>> -		kobject_put(dup_slot);
>>  		if (dup == max) {
>>  			len++;
>>  			max *= 10;
>> diff --git a/include/linux/kobject.h b/include/linux/kobject.h
>> index 939b112..7cde72c 100644
>> --- a/include/linux/kobject.h
>> +++ b/include/linux/kobject.h
>> @@ -191,6 +191,7 @@ static inline struct kobj_type *get_ktype(struct kobject *kobj)
>>  }
>>
>>  extern struct kobject *kset_find_obj(struct kset *, const char *);
>> +extern bool kset_obj_exists(struct kset *, const char *);
>>
>>  /* The global /sys/kernel/ kobject for people to chain off of */
>>  extern struct kobject *kernel_kobj;
>> diff --git a/kernel/module.c b/kernel/module.c
>> index 0925c9a..1df13a3 100644
>> --- a/kernel/module.c
>> +++ b/kernel/module.c
>> @@ -1606,7 +1606,6 @@ static void module_remove_modinfo_attrs(struct module *mod)
>>  static int mod_sysfs_init(struct module *mod)
>>  {
>>  	int err;
>> -	struct kobject *kobj;
>>
>>  	if (!module_sysfs_initialized) {
>>  		printk(KERN_ERR "%s: module sysfs not initialized\n",
>> @@ -1615,10 +1614,8 @@ static int mod_sysfs_init(struct module *mod)
>>  		goto out;
>>  	}
>>
>> -	kobj = kset_find_obj(module_kset, mod->name);
>> -	if (kobj) {
>> +	if (kset_obj_exists(module_kset, mod->name)) {
>>  		printk(KERN_ERR "%s: module is already loaded\n", mod->name);
>> -		kobject_put(kobj);
>>  		err = -EINVAL;
>>  		goto out;
>>  	}
>> diff --git a/lib/kobject.c b/lib/kobject.c
>> index e07ee1f..b82633f 100644
>> --- a/lib/kobject.c
>> +++ b/lib/kobject.c
>> @@ -760,6 +760,34 @@ struct kobject *kset_find_obj(struct kset *kset, const char *name)
>>  	return ret;
>>  }
>>
>> +/**
>> + * kset_obj_exists - verify if an object exists in kset
>> + * @kset: kset we're looking in.
>> + * @name: object's name.
>> + *
>> + * Lock kset via @kset->subsys, and iterate over @kset->list,
>> + * looking for a matching kobject. Returns bool, found/not found.
>> + * This function does not take a reference and thus doesn't
>> + * guarantee that the object won't go away any time after.
>> + */
>> +
>> +bool kset_obj_exists(struct kset *kset, const char *name)
>> +{
>> +	struct kobject *k;
>> +
>> +	spin_lock(&kset->list_lock);
>> +
>> +	list_for_each_entry(k, &kset->list, entry) {
>> +		if (kobject_name(k) && !strcmp(kobject_name(k), name)) {
>> +			spin_unlock(&kset->list_lock);
>> +			return true;
>> +		}
>> +	}
>> +
>> +	spin_unlock(&kset->list_lock);
>> +	return false;
>> +}
>> +
>>  static void kset_release(struct kobject *kobj)
>>  {
>>  	struct kset *kset = container_of(kobj, struct kset, kobj);
>> --
>> 1.7.1
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Greg KH April 11, 2013, 1:28 p.m. UTC | #1
On Thu, Apr 11, 2013 at 11:55:37AM +0200, Veaceslav Falico wrote:
> However, I think my patch still adds something good, cause now we have 2
> cases where we basically do:
> 
> k = kset_find_obj();
> if (!k)
> 	return;
> kobject_put(k);
> 
> which adds useless overhead (by using kobject_get()/kobject_put(), and
> kobject_release() - which is called from kobject_put()) - where we should
> only verify if there exists a kobject with the specified name.
> 
> Should I resend it with a properly fixed commit message, or it's really not
> needed?

I don't think it's really needed, there is no speed/overhead issue here
and you need to do the kobject_get/put stuff anyway if you are trying to
look at a kobject.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Veaceslav Falico April 11, 2013, 1:53 p.m. UTC | #2
On Thu, Apr 11, 2013 at 06:28:31AM -0700, Greg KH wrote:
>On Thu, Apr 11, 2013 at 11:55:37AM +0200, Veaceslav Falico wrote:
>> However, I think my patch still adds something good, cause now we have 2
>> cases where we basically do:
>>
>> k = kset_find_obj();
>> if (!k)
>> 	return;
>> kobject_put(k);
>>
>> which adds useless overhead (by using kobject_get()/kobject_put(), and
>> kobject_release() - which is called from kobject_put()) - where we should
>> only verify if there exists a kobject with the specified name.
>>
>> Should I resend it with a properly fixed commit message, or it's really not
>> needed?
>
>I don't think it's really needed, there is no speed/overhead issue here
>and you need to do the kobject_get/put stuff anyway if you are trying to
>look at a kobject.

This is the point, actually, that we don't need to look at a kobject. We
only need to know if it existed that time or not, here are those two
examples of code:

static int mod_sysfs_init(struct module *mod)
{
         int err;
         struct kobject *kobj;

	...

         kobj = kset_find_obj(module_kset, mod->name);
         if (kobj) {
                 printk(KERN_ERR "%s: module is already loaded\n", mod->name);
                 kobject_put(kobj);
                 err = -EINVAL;
                 goto out;
         }

	...

So we just verify if there's a kobject with mod->name, and if it exists -
_put() it back and return, otherwise do nothing (with it).

Same here:

static char *make_slot_name(const char *name)
{
	...

         for (;;) {
                 struct kobject *dup_slot;
                 dup_slot = kset_find_obj(pci_slots_kset, new_name);
                 if (!dup_slot)
                         break;
                 kobject_put(dup_slot);

	...

We look if there exists a kobject named new_name in pci_slots_kset, if yes
- free it and try another name, if not - then we're good to go.

In both examples we don't look at that kobject, and only uselessly
_get()/_put() it. And it looks a bit ugly. After the patch, in both cases,
it takes only one call to kset_obj_exists() to find out if the object
exists at that time.

However, I have absolutely no knowledge/experience in this domain and might
for sure be missing something. Sorry if it's the case.

>
>thanks,
>
>greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Greg KH April 11, 2013, 3:20 p.m. UTC | #3
On Thu, Apr 11, 2013 at 03:53:40PM +0200, Veaceslav Falico wrote:
> On Thu, Apr 11, 2013 at 06:28:31AM -0700, Greg KH wrote:
> >On Thu, Apr 11, 2013 at 11:55:37AM +0200, Veaceslav Falico wrote:
> >>However, I think my patch still adds something good, cause now we have 2
> >>cases where we basically do:
> >>
> >>k = kset_find_obj();
> >>if (!k)
> >>	return;
> >>kobject_put(k);
> >>
> >>which adds useless overhead (by using kobject_get()/kobject_put(), and
> >>kobject_release() - which is called from kobject_put()) - where we should
> >>only verify if there exists a kobject with the specified name.
> >>
> >>Should I resend it with a properly fixed commit message, or it's really not
> >>needed?
> >
> >I don't think it's really needed, there is no speed/overhead issue here
> >and you need to do the kobject_get/put stuff anyway if you are trying to
> >look at a kobject.
> 
> This is the point, actually, that we don't need to look at a kobject. We
> only need to know if it existed that time or not, here are those two
> examples of code:
> 
> static int mod_sysfs_init(struct module *mod)
> {
>         int err;
>         struct kobject *kobj;
> 
> 	...
> 
>         kobj = kset_find_obj(module_kset, mod->name);
>         if (kobj) {
>                 printk(KERN_ERR "%s: module is already loaded\n", mod->name);
>                 kobject_put(kobj);
>                 err = -EINVAL;
>                 goto out;
>         }
> 
> 	...
> 
> So we just verify if there's a kobject with mod->name, and if it exists -
> _put() it back and return, otherwise do nothing (with it).
> 
> Same here:
> 
> static char *make_slot_name(const char *name)
> {
> 	...
> 
>         for (;;) {
>                 struct kobject *dup_slot;
>                 dup_slot = kset_find_obj(pci_slots_kset, new_name);
>                 if (!dup_slot)
>                         break;
>                 kobject_put(dup_slot);
> 
> 	...
> 
> We look if there exists a kobject named new_name in pci_slots_kset, if yes
> - free it and try another name, if not - then we're good to go.
> 
> In both examples we don't look at that kobject, and only uselessly
> _get()/_put() it. And it looks a bit ugly. After the patch, in both cases,
> it takes only one call to kset_obj_exists() to find out if the object
> exists at that time.

But as your function does the same thing, logically it's the same code
path :)

Anyway, yes, I understand your point here, and in some new code I'm
writing right now, we had to do much the same check as well.  But as
there are only 2 in-kernel users of this "pattern", I don't think it's
justified to add a new api call for it, especially if it were to be
misused as you were attempting to use it, which would only mask the real
problem you were trying to solve.

So, thanks for the idea, but for now, I'll pass.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Veaceslav Falico April 11, 2013, 3:39 p.m. UTC | #4
On Thu, Apr 11, 2013 at 08:20:03AM -0700, Greg KH wrote:
>On Thu, Apr 11, 2013 at 03:53:40PM +0200, Veaceslav Falico wrote:
>> On Thu, Apr 11, 2013 at 06:28:31AM -0700, Greg KH wrote:
>> >On Thu, Apr 11, 2013 at 11:55:37AM +0200, Veaceslav Falico wrote:
...
>>
>> In both examples we don't look at that kobject, and only uselessly
>> _get()/_put() it. And it looks a bit ugly. After the patch, in both cases,
>> it takes only one call to kset_obj_exists() to find out if the object
>> exists at that time.
>
>But as your function does the same thing, logically it's the same code
>path :)
>
>Anyway, yes, I understand your point here, and in some new code I'm
>writing right now, we had to do much the same check as well.  But as
>there are only 2 in-kernel users of this "pattern", I don't think it's
>justified to add a new api call for it, especially if it were to be
>misused as you were attempting to use it, which would only mask the real
>problem you were trying to solve.

Good point, it really might mask the real problem, as it would actually do
for the initial race.

>
>So, thanks for the idea, but for now, I'll pass.

Fair enough. Thank you for explaining :)

>
>thanks,
>
>greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/kernel/module.c b/kernel/module.c
index d0afe23..8be6e97 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -1063,6 +1063,9 @@  static ssize_t show_initstate(struct module_attribute *mattr,
  	case MODULE_STATE_GOING:
  		state = "going";
  		break;
+	case MODULE_STATE_UNFORMED:
+		state = "unformed";
+		break;
  	default:
  		BUG();
  	}