[v2,4/4] test_sysfs: demonstrate deadlock fix

Message ID 20210703004632.621662-5-mcgrof@kernel.org
State New
Series selftests: add a new test driver for sysfs

Commit Message

Luis Chamberlain July 3, 2021, 12:46 a.m. UTC
Two mechanisms have been proposed to fix the sysfs deadlock issue.
The first lets drivers optionally specify a module and augments
attributes with module information [0]. The second, meant as an interim
measure, is to use macros in the drivers which need this. This patch
takes the second approach, in lieu of agreement on a generic solution.
That should leave enough room for experimentation and for demonstrating
the issue.
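
The wrapper pattern behind the second approach can be sketched in
userspace. This is a toy model under stated assumptions, not kernel code:
toy_module, toy_try_module_get(), toy_module_put(), toy_rmmod() and
TOY_MODULE_ATTR_FUNC_STORE() are hypothetical stand-ins for struct module,
try_module_get(), module_put(), module removal and the patch's
MODULE_DEVICE_ATTR_FUNC_STORE(), and the handler signature is simplified.
It only shows the ordering the macro relies on: a handler either pins the
module for its whole run, or bails out early once removal has started.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/* Toy stand-in for struct module: a reference count and a "going" flag. */
struct toy_module {
	int refcnt;
	bool going;	/* set once removal has started */
};

static struct toy_module this_module;	/* plays the role of THIS_MODULE */

/* Refuse new references once removal has begun. */
static bool toy_try_module_get(struct toy_module *mod)
{
	if (mod->going)
		return false;
	mod->refcnt++;
	return true;
}

static void toy_module_put(struct toy_module *mod)
{
	mod->refcnt--;
}

/* Removal is refused while references are held (error value is the model's choice). */
static int toy_rmmod(struct toy_module *mod)
{
	if (mod->refcnt > 0)
		return -EBUSY;
	mod->going = true;
	return 0;
}

/* Generate module_<name>_store(), which pins the module around <name>_store(). */
#define TOY_MODULE_ATTR_FUNC_STORE(_name) \
static ssize_t module_ ## _name ## _store(const char *buf, size_t len) \
{ \
	ssize_t ret; \
	if (!toy_try_module_get(&this_module)) \
		return -ENODEV; \
	ret = _name ## _store(buf, len); \
	toy_module_put(&this_module); \
	return ret; \
}

static int x_value;

/* The unwrapped handler: stores the first digit of buf. */
static ssize_t x_store(const char *buf, size_t len)
{
	x_value = buf[0] - '0';
	return (ssize_t)len;
}

TOY_MODULE_ATTR_FUNC_STORE(x)
```

In this model, module_x_store("7", 1) returns 1 while the toy module is
live and -ENODEV once toy_rmmod() has succeeded, and toy_rmmod() returns
-EBUSY for as long as a reference is held. Whether the real
try_module_get(THIS_MODULE) call is safe at that point is exactly what the
review comments dispute.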

This also enables the two test cases which had been disabled, as they
would otherwise deadlock your system.

./tools/testing/selftests/sysfs/sysfs.sh -t 0027
Running test: sysfs_test_0027 - run #0
Test for possible rmmod deadlock while writing x ... ok

./tools/testing/selftests/sysfs/sysfs.sh -t 0028
Running test: sysfs_test_0028 - run #0
Test for possible rmmod deadlock using rtnl_lock while writing x ... ok

[0] https://lkml.kernel.org/r/20210401235925.GR4332@42.do-not-panic.com

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 lib/test_sysfs.c                       | 71 ++++++++++++++++++++++----
 tools/testing/selftests/sysfs/sysfs.sh |  4 +-
 2 files changed, 64 insertions(+), 11 deletions(-)

Comments

Greg Kroah-Hartman July 3, 2021, 4:49 a.m. UTC | #1
On Fri, Jul 02, 2021 at 05:46:32PM -0700, Luis Chamberlain wrote:
> +#define MODULE_DEVICE_ATTR_FUNC_STORE(_name) \
> +static ssize_t module_ ## _name ## _store(struct device *dev, \
> +				   struct device_attribute *attr, \
> +				   const char *buf, size_t len) \
> +{ \
> +	ssize_t __ret; \
> +	if (!try_module_get(THIS_MODULE)) \
> +		return -ENODEV; \
> +	__ret = _name ## _store(dev, attr, buf, len); \
> +	module_put(THIS_MODULE); \
> +	return __ret; \
> +}

As I have pointed out before, doing try_module_get(THIS_MODULE) is racy
and should not be added back to the kernel tree.  We got rid of many
instances of this "bad pattern" over the years, please do not encourage
it to be added back as others will somehow think that it is correct code.

I'll go over the rest of this after 5.14-rc1 is out, am busy until then.

thanks,

greg k-h
Luis Chamberlain July 3, 2021, 5:28 p.m. UTC | #2
On Sat, Jul 03, 2021 at 06:49:46AM +0200, Greg KH wrote:
> On Fri, Jul 02, 2021 at 05:46:32PM -0700, Luis Chamberlain wrote:
> > +#define MODULE_DEVICE_ATTR_FUNC_STORE(_name) \
> > +static ssize_t module_ ## _name ## _store(struct device *dev, \
> > +				   struct device_attribute *attr, \
> > +				   const char *buf, size_t len) \
> > +{ \
> > +	ssize_t __ret; \
> > +	if (!try_module_get(THIS_MODULE)) \
> > +		return -ENODEV; \
> > +	__ret = _name ## _store(dev, attr, buf, len); \
> > +	module_put(THIS_MODULE); \
> > +	return __ret; \
> > +}
> 
> As I have pointed out before, doing try_module_get(THIS_MODULE) is racy
> and should not be added back to the kernel tree.  We got rid of many
> instances of this "bad pattern" over the years, please do not encourage
> it to be added back as others will somehow think that it is correct code.

It is noted this is used in lieu of any agreed upon solution to
*demonstrate* how this at least does fix it. In this case (and in the
generic solution I also had suggested for kernfs a while ago), if the
try fails, we give up. If it succeeds, we now know we can rely on the
device pointer. If the refcount succeeds, can the module still not
be present? Is try_module_get() racy in that way? In what way is it
racy and where is this documented? Do we have a selftest to prove the
race?

  Luis
Greg Kroah-Hartman July 21, 2021, 11:33 a.m. UTC | #3
On Sat, Jul 03, 2021 at 10:28:28AM -0700, Luis Chamberlain wrote:
> On Sat, Jul 03, 2021 at 06:49:46AM +0200, Greg KH wrote:
> > On Fri, Jul 02, 2021 at 05:46:32PM -0700, Luis Chamberlain wrote:
> > > +#define MODULE_DEVICE_ATTR_FUNC_STORE(_name) \
> > > +static ssize_t module_ ## _name ## _store(struct device *dev, \
> > > +				   struct device_attribute *attr, \
> > > +				   const char *buf, size_t len) \
> > > +{ \
> > > +	ssize_t __ret; \
> > > +	if (!try_module_get(THIS_MODULE)) \
> > > +		return -ENODEV; \
> > > +	__ret = _name ## _store(dev, attr, buf, len); \
> > > +	module_put(THIS_MODULE); \
> > > +	return __ret; \
> > > +}
> > 
> > As I have pointed out before, doing try_module_get(THIS_MODULE) is racy
> > and should not be added back to the kernel tree.  We got rid of many
> > instances of this "bad pattern" over the years, please do not encourage
> > it to be added back as others will somehow think that it is correct code.
> 
> It is noted this is used in lieu of any agreed upon solution to
> *demonstrate* how this at least does fix it. In this case (and in the
> generic solution I also had suggested for kernfs a while ago), if the
> try fails, we give up. If it succeeds, we now know we can rely on the
> device pointer. If the refcount succeeds, can the module still not
> be present? Is try_module_get() racy in that way? In what way is it
> racy and where is this documented? Do we have a selftest to prove the
> race?

As I say in the other email where you tried to add this, think about
what happens if the module is removed _right before_ you make this call.

Or a few instructions before that.  The race is still there, this fixes
nothing except make the window smaller.

thanks,

greg k-h
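
One reading of the objection above, sketched as a userspace toy model.
This is an interpretation under stated assumptions, not kernel code and
not something the thread spells out: toy_module, toy_state, toy_outcome,
toy_try_module_get() and toy_call_wrapper() are all hypothetical names.
The point it tries to capture is that the guard lives inside the module's
own code, so it can only help if that code still exists when it is
entered.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_state { TOY_LIVE, TOY_GOING, TOY_GONE };

struct toy_module {
	enum toy_state state;
	int refcnt;
};

enum toy_outcome {
	TOY_RAN,		/* handler body ran with the module pinned */
	TOY_REFUSED,		/* guard ran and refused: module is going away */
	TOY_USE_AFTER_FREE,	/* entered module code that no longer exists */
};

/* The guard. In the pattern under review it is called from module code. */
static bool toy_try_module_get(struct toy_module *mod)
{
	if (mod->state != TOY_LIVE)
		return false;
	mod->refcnt++;
	return true;
}

/*
 * Models a caller outside the module invoking the wrapper. The guard can
 * only execute if the wrapper's own code is still present.
 */
static enum toy_outcome toy_call_wrapper(struct toy_module *mod)
{
	if (mod->state == TOY_GONE)
		return TOY_USE_AFTER_FREE;	/* the guard never gets to run */
	if (!toy_try_module_get(mod))
		return TOY_REFUSED;
	/* the handler body would run here, followed by the put */
	mod->refcnt--;
	return TOY_RAN;
}
```

With the state set to TOY_LIVE or TOY_GOING the guard behaves as intended.
With TOY_GONE, which stands for removal completing before the call, the
model reports TOY_USE_AFTER_FREE without ever reaching the guard. Whether
that state is reachable for a sysfs handler is the open question in this
thread.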
Luis Chamberlain July 22, 2021, 10:36 p.m. UTC | #4
On Wed, Jul 21, 2021 at 01:33:54PM +0200, Greg KH wrote:
> On Sat, Jul 03, 2021 at 10:28:28AM -0700, Luis Chamberlain wrote:
> > On Sat, Jul 03, 2021 at 06:49:46AM +0200, Greg KH wrote:
> > > On Fri, Jul 02, 2021 at 05:46:32PM -0700, Luis Chamberlain wrote:
> > > > +#define MODULE_DEVICE_ATTR_FUNC_STORE(_name) \
> > > > +static ssize_t module_ ## _name ## _store(struct device *dev, \
> > > > +				   struct device_attribute *attr, \
> > > > +				   const char *buf, size_t len) \
> > > > +{ \
> > > > +	ssize_t __ret; \
> > > > +	if (!try_module_get(THIS_MODULE)) \
> > > > +		return -ENODEV; \
> > > > +	__ret = _name ## _store(dev, attr, buf, len); \
> > > > +	module_put(THIS_MODULE); \
> > > > +	return __ret; \
> > > > +}
> > > 
> > > As I have pointed out before, doing try_module_get(THIS_MODULE) is racy
> > > and should not be added back to the kernel tree.  We got rid of many
> > > instances of this "bad pattern" over the years, please do not encourage
> > > it to be added back as others will somehow think that it is correct code.
> > 
> > It is noted this is used in lieu of any agreed upon solution to
> > *demonstrate* how this at least does fix it. In this case (and in the
> > generic solution I also had suggested for kernfs a while ago), if the
> > try fails, we give up. If it succeeds, we now know we can rely on the
> > device pointer. If the refcount succeeds, can the module still not
> > be present? Is try_module_get() racy in that way? In what way is it
> > racy and where is this documented? Do we have a selftest to prove the
> > race?
> 
> As I say in the other email where you tried to add this, think about
> what happens if the module is removed _right before_ you make this call.
> 
> Or a few instructions before that.  The race is still there, this fixes
> nothing except make the window smaller.

The kernfs active reference ensures that if the file is open the module
must still exist. As such, the use within sysfs files should be safe
as the module is the one in charge of removing the files.

  Luis
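
The active-reference argument above can be sketched as a userspace toy
model. Everything here is an assumption made for illustration: toy_node,
toy_get_active(), toy_put_active() and toy_remove() are hypothetical names
loosely inspired by the description of kernfs active references, and they
do not correspond to kernel functions. The model captures two claims: a
handler only runs while it holds an active reference, and file removal
cannot finish until all active references are dropped.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_node {
	int active;		/* handlers currently running on this file */
	bool deactivated;	/* removal has started: no new handlers */
	bool removed;		/* removal has finished */
};

/* Taken before a handler runs; fails once removal has started. */
static bool toy_get_active(struct toy_node *kn)
{
	if (kn->deactivated)
		return false;
	kn->active++;
	return true;
}

/* Dropped when the handler returns. */
static void toy_put_active(struct toy_node *kn)
{
	kn->active--;
}

/*
 * Start (or retry) removal. Returns true once the file is fully removed.
 * A real implementation would sleep until the count drains; the model
 * reports "not yet" instead so it can run single-threaded.
 */
static bool toy_remove(struct toy_node *kn)
{
	kn->deactivated = true;
	if (kn->active > 0)
		return false;
	kn->removed = true;
	return true;
}
```

In this model, a removal that starts while a handler is in flight returns
false until toy_put_active() has run, and any later toy_get_active() fails.
If the module's exit path is what removes the file, exit cannot complete
while a handler is running. The same wait is what can turn into a deadlock
when the handler needs a lock that the removing side already holds.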

Patch

diff --git a/lib/test_sysfs.c b/lib/test_sysfs.c
index f27ec0eab747..af14e992e1b8 100644
--- a/lib/test_sysfs.c
+++ b/lib/test_sysfs.c
@@ -94,6 +94,59 @@  MODULE_PARM_DESC(enable_completion_on_rmmod,
 		 "enable sending a kernfs completion on rmmod");
 #endif
 
+#undef __ATTR_RO
+#undef __ATTR_RW
+#undef __ATTR_WO
+
+#define __ATTR_RO(_name) {						\
+	.attr	= { .name = __stringify(_name), .mode = 0444 },		\
+	.show	= module_##_name##_show,						\
+}
+#define __ATTR_RW(_name) __ATTR(_name, 0644, module_##_name##_show, module_##_name##_store)
+#define __ATTR_WO(_name) {						\
+	.attr	= { .name = __stringify(_name), .mode = 0200 },		\
+	.store	= module_##_name##_store,				\
+}
+
+#define MODULE_DEVICE_ATTR_FUNC_STORE(_name) \
+static ssize_t module_ ## _name ## _store(struct device *dev, \
+				   struct device_attribute *attr, \
+				   const char *buf, size_t len) \
+{ \
+	ssize_t __ret; \
+	if (!try_module_get(THIS_MODULE)) \
+		return -ENODEV; \
+	__ret = _name ## _store(dev, attr, buf, len); \
+	module_put(THIS_MODULE); \
+	return __ret; \
+}
+
+#define MODULE_DEVICE_ATTR_FUNC_SHOW(_name) \
+static ssize_t module_ ## _name ## _show(struct device *dev, \
+					 struct device_attribute *attr, \
+					 char *buf) \
+{ \
+	ssize_t __ret; \
+	if (!try_module_get(THIS_MODULE)) \
+		return -ENODEV; \
+	__ret = _name ## _show(dev, attr, buf); \
+	module_put(THIS_MODULE); \
+	return __ret; \
+}
+
+#define MODULE_DEVICE_ATTR_WO(_name) \
+MODULE_DEVICE_ATTR_FUNC_STORE(_name); \
+static DEVICE_ATTR_WO(_name)
+
+#define MODULE_DEVICE_ATTR_RW(_name) \
+MODULE_DEVICE_ATTR_FUNC_STORE(_name); \
+MODULE_DEVICE_ATTR_FUNC_SHOW(_name); \
+static DEVICE_ATTR_RW(_name)
+
+#define MODULE_DEVICE_ATTR_RO(_name) \
+MODULE_DEVICE_ATTR_FUNC_SHOW(_name); \
+static DEVICE_ATTR_RO(_name)
+
 static int sysfs_test_major;
 
 /**
@@ -311,7 +364,7 @@  static ssize_t config_show(struct device *dev,
 
 	return len;
 }
-static DEVICE_ATTR_RO(config);
+MODULE_DEVICE_ATTR_RO(config);
 
 static ssize_t reset_store(struct device *dev,
 			   struct device_attribute *attr,
@@ -336,7 +389,7 @@  static ssize_t reset_store(struct device *dev,
 
 	return count;
 }
-static DEVICE_ATTR_WO(reset);
+MODULE_DEVICE_ATTR_WO(reset);
 
 static void test_dev_busy_alloc(struct sysfs_test_device *test_dev)
 {
@@ -388,7 +441,7 @@  static ssize_t test_dev_x_show(struct device *dev,
 
 	return ret;
 }
-static DEVICE_ATTR_RW(test_dev_x);
+MODULE_DEVICE_ATTR_RW(test_dev_x);
 
 static ssize_t test_dev_y_store(struct device *dev,
 				struct device_attribute *attr,
@@ -432,7 +485,7 @@  static ssize_t test_dev_y_show(struct device *dev,
 
 	return ret;
 }
-static DEVICE_ATTR_RW(test_dev_y);
+MODULE_DEVICE_ATTR_RW(test_dev_y);
 
 static ssize_t config_enable_lock_store(struct device *dev,
 					struct device_attribute *attr,
@@ -477,7 +530,7 @@  static ssize_t config_enable_lock_show(struct device *dev,
 
 	return ret;
 }
-static DEVICE_ATTR_RW(config_enable_lock);
+MODULE_DEVICE_ATTR_RW(config_enable_lock);
 
 static ssize_t config_enable_lock_on_rmmod_store(struct device *dev,
 						 struct device_attribute *attr,
@@ -519,7 +572,7 @@  static ssize_t config_enable_lock_on_rmmod_show(struct device *dev,
 
 	return ret;
 }
-static DEVICE_ATTR_RW(config_enable_lock_on_rmmod);
+MODULE_DEVICE_ATTR_RW(config_enable_lock_on_rmmod);
 
 static ssize_t config_use_rtnl_lock_store(struct device *dev,
 					  struct device_attribute *attr,
@@ -558,7 +611,7 @@  static ssize_t config_use_rtnl_lock_show(struct device *dev,
 
 	return snprintf(buf, PAGE_SIZE, "%d\n", config->use_rtnl_lock);
 }
-static DEVICE_ATTR_RW(config_use_rtnl_lock);
+MODULE_DEVICE_ATTR_RW(config_use_rtnl_lock);
 
 static ssize_t config_write_delay_msec_y_store(struct device *dev,
 					       struct device_attribute *attr,
@@ -592,7 +645,7 @@  static ssize_t config_write_delay_msec_y_show(struct device *dev,
 
 	return snprintf(buf, PAGE_SIZE, "%d\n", config->write_delay_msec_y);
 }
-static DEVICE_ATTR_RW(config_write_delay_msec_y);
+MODULE_DEVICE_ATTR_RW(config_write_delay_msec_y);
 
 static ssize_t config_enable_busy_alloc_store(struct device *dev,
 					      struct device_attribute *attr,
@@ -626,7 +679,7 @@  static ssize_t config_enable_busy_alloc_show(struct device *dev,
 
 	return snprintf(buf, PAGE_SIZE, "%d\n", config->enable_busy_alloc);
 }
-static DEVICE_ATTR_RW(config_enable_busy_alloc);
+MODULE_DEVICE_ATTR_RW(config_enable_busy_alloc);
 
 #define TEST_SYSFS_DEV_ATTR(name)		(&dev_attr_##name.attr)
 
diff --git a/tools/testing/selftests/sysfs/sysfs.sh b/tools/testing/selftests/sysfs/sysfs.sh
index f27ea61e0e95..2de9f37cb00b 100755
--- a/tools/testing/selftests/sysfs/sysfs.sh
+++ b/tools/testing/selftests/sysfs/sysfs.sh
@@ -60,8 +60,8 @@  ALL_TESTS="$ALL_TESTS 0023:1:1:test_dev_y:block"
 ALL_TESTS="$ALL_TESTS 0024:1:1:test_dev_x:block"
 ALL_TESTS="$ALL_TESTS 0025:1:1:test_dev_y:block"
 ALL_TESTS="$ALL_TESTS 0026:1:1:test_dev_y:block"
-ALL_TESTS="$ALL_TESTS 0027:1:0:test_dev_x:block" # deadlock test
-ALL_TESTS="$ALL_TESTS 0028:1:0:test_dev_x:block" # deadlock test with rntl_lock
+ALL_TESTS="$ALL_TESTS 0027:1:1:test_dev_x:block" # deadlock test
+ALL_TESTS="$ALL_TESTS 0028:1:1:test_dev_x:block" # deadlock test with rtnl_lock
 ALL_TESTS="$ALL_TESTS 0029:1:1:test_dev_x:block" # kernfs race removal of store
 ALL_TESTS="$ALL_TESTS 0030:1:1:test_dev_x:block" # kernfs race removal before mutex
 ALL_TESTS="$ALL_TESTS 0031:1:1:test_dev_x:block" # kernfs race removal after mutex