| Message ID | 161117153776.2853729.6944617921517514510.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | cdev: Generic shutdown handling |
The subject doesn't make any sense to me.

But then again "queued" sounds really weird. You just have a managed
API with a refcount and synchronization, right?

procfs and debugfs already support these kind of managed ops, kinda sad
to duplicate this concept yet another time.

> +static long cdev_queued_ioctl(struct file *file, unsigned int cmd, unsigned long arg)

Overly long line.

> +__must_check int __cdev_register_queued(struct cdev *cdev, struct module *owner,
> +					dev_t dev, unsigned count,
> +					const struct cdev_operations *qops)
> +{
> +	int rc;
> +
> +	if (!qops->ioctl || !owner)
> +		return -EINVAL;

Why is the ioctl method mandatory?
On 2021-01-20 12:38 p.m., Dan Williams wrote:
> ...common reference count handling scenarios were addressed, but the
> shutdown-synchronization problem was only mentioned as something driver
> developers need to be aware of in the following note:
>
>     NOTE: This guarantees that associated sysfs callbacks are not running
>     or runnable, however any cdevs already open will remain and their fops
>     will still be callable even after this function returns.
>
> Remove that responsibility from driver developers with the concept of a
> 'queued' mode for cdevs.

I find the queued name confusing. What's being queued?

> +static const struct file_operations cdev_queued_fops = {
> +	.owner = THIS_MODULE,
> +	.open = cdev_queued_open,
> +	.unlocked_ioctl = cdev_queued_ioctl,
> +	.compat_ioctl = compat_ptr_ioctl,
> +	.llseek = noop_llseek,
> +};

Why do we only protect these fops? I'd find it a bit confusing to have
ioctl protected from use after del, but not write/read/etc.

Logan
On Wed, Jan 20, 2021 at 11:46 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> The subject doesn't make any sense to me.
>
> But then again "queued" sounds really weird. You just have a managed
> API with a refcount and synchronization, right?

Correct.

"queue" was in reference to the way q_usage_count behaves, but yes,
just refcount + synchronization is all this is.

>
> procfs and debugfs already support these kind of managed ops, kinda sad
> to duplicate this concept yet another time.

Oh, I didn't realize there were managed ops there, I'll go take a look
and see if cdev can adopt that scheme.

> > +static long cdev_queued_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
>
> Overly long line.
>
> > +__must_check int __cdev_register_queued(struct cdev *cdev, struct module *owner,
> > +					dev_t dev, unsigned count,
> > +					const struct cdev_operations *qops)
> > +{
> > +	int rc;
> > +
> > +	if (!qops->ioctl || !owner)
> > +		return -EINVAL;
>
> Why is the ioctl method mandatory?

Yeah, that can drop. It was there more to ask the question about
whether cdev should be mandating ioctls with pointer arguments, and
taking the need to specify the compat fallback away from being a
driver responsibility.
On Wed, Jan 20, 2021 at 11:51 AM Logan Gunthorpe <logang@deltatee.com> wrote:
>
> On 2021-01-20 12:38 p.m., Dan Williams wrote:
> > ...common reference count handling scenarios were addressed, but the
> > shutdown-synchronization problem was only mentioned as something driver
> > developers need to be aware of in the following note:
> >
> >     NOTE: This guarantees that associated sysfs callbacks are not running
> >     or runnable, however any cdevs already open will remain and their fops
> >     will still be callable even after this function returns.
> >
> > Remove that responsibility from driver developers with the concept of a
> > 'queued' mode for cdevs.
>
> I find the queued name confusing. What's being queued?

Yeah, as I mentioned to Christoph, a bit too much inspiration from
q_usage_count. Perhaps "managed" makes more sense.

> > +static const struct file_operations cdev_queued_fops = {
> > +	.owner = THIS_MODULE,
> > +	.open = cdev_queued_open,
> > +	.unlocked_ioctl = cdev_queued_ioctl,
> > +	.compat_ioctl = compat_ptr_ioctl,
> > +	.llseek = noop_llseek,
> > +};
>
> Why do we only protect these fops? I'd find it a bit confusing to have
> ioctl protected from use after del, but not write/read/etc.

More ops can certainly be added over time; I didn't want to go do the
work to wrap all file_operations before getting consensus on the idea
that the cdev core should provide managed ops at all.

The other question I'm posing with cdev_operations is whether the cdev
core should take away some of the flexibility from end drivers in favor
of adding more type safety. For example, mandate that all ioctls take a
pointer argument, not an integer argument?

The question of whether wrapping cdev file_operations around a new
cdev_operations is a good idea can be deferred until after finalizing a
mechanism for managed cdev file_operations.
On Wed, Jan 20, 2021 at 12:20 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Wed, Jan 20, 2021 at 11:46 AM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > The subject doesn't make any sense to me.
> >
> > But then again "queued" sounds really weird. You just have a managed
> > API with a refcount and synchronization, right?
>
> Correct.
>
> "queue" was in reference to the way q_usage_count behaves, but yes,
> just refcount + synchronization is all this is.
>
> >
> > procfs and debugfs already support these kind of managed ops, kinda sad
> > to duplicate this concept yet another time.
>
> Oh, I didn't realize there were managed ops there, I'll go take a look
> and see if cdev can adopt that scheme.

So in both cases they are leveraging the fact that they are the
filesystems that allocated the inode, and they stash the private
reference counting somewhere reachable via container_of() from the
vfs_inode. I don't immediately see how that scheme can be applied to
special device files, which can come from an inode on any filesystem.

I do see how debugfs and procfs could be unified, but I don't think
this percpu_ref scheme for cdev is compatible with that approach.
Other ideas?
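Dan's point can be illustrated with a userspace sketch of the kernel's container_of() pattern (all struct names here are hypothetical, pared down for illustration): a filesystem that allocates its own inodes can embed the VFS inode inside a private structure and recover its bookkeeping from any inode pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the VFS inode. */
struct inode {
	unsigned long i_ino;
};

/*
 * What a debugfs/procfs-style filesystem might allocate per file: the
 * private refcount lives next to the embedded vfs_inode, invisible to
 * the VFS itself.
 */
struct fs_inode {
	long ref;		/* private managed-ops refcount */
	struct inode vfs_inode;	/* what the VFS actually sees */
};

/* Kernel-style container_of(): recover the wrapper from a member pointer. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct fs_inode *FS_I(struct inode *inode)
{
	return container_of(inode, struct fs_inode, vfs_inode);
}
```

Because an ordinary character device node lives on whatever filesystem it was created on, the cdev core never allocates that inode and has no such wrapper to recover, which is why the patch embeds its percpu_ref in struct cdev instead.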
On Wed, Jan 20, 2021 at 11:38:57AM -0800, Dan Williams wrote:
> -void cdev_del(struct cdev *p)
> +void cdev_del(struct cdev *cdev)
>  {
> -	cdev_unmap(p->dev, p->count);
> -	kobject_put(&p->kobj);
> +	cdev_unmap(cdev->dev, cdev->count);
> +	kobject_put(&cdev->kobj);

After Christoph's patch series, the kobject in struct cdev does nothing,
so I will be removing it. So I don't think this patch set is going to
do what you want :(

thanks,

greg k-h
On Thu, Jan 21, 2021 at 12:13 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Wed, Jan 20, 2021 at 11:38:57AM -0800, Dan Williams wrote:
> > -void cdev_del(struct cdev *p)
> > +void cdev_del(struct cdev *cdev)
> >  {
> > -	cdev_unmap(p->dev, p->count);
> > -	kobject_put(&p->kobj);
> > +	cdev_unmap(cdev->dev, cdev->count);
> > +	kobject_put(&cdev->kobj);
>
> After Christoph's patch series, the kobject in struct cdev does nothing,
> so I will be removing it. So I don't think this patch set is going to
> do what you want :(

The proposal in this series has nothing to do with kobject lifetime.
Please take another look at the file_operations shutdown coordination
problem, and note that cdev is not the only subsystem with a use case
for managing file_operations this way.

I believe Christoph's only objection, correct me if I'm wrong, is that
this proposal invents a new third way to do this relative to procfs and
debugfs. Perhaps there's a way to generalize this for all three, and
I'm going to put some thought there, but at this point I think I'm
failing to describe the problem clearly.
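The shutdown-coordination problem discussed in this thread (fops that remain callable after cdev_del() returns) and the refcount-plus-drain fix can be sketched in plain userspace C. This is an illustrative model, not kernel code: all names are hypothetical, and single-threaded atomics stand in for the percpu_ref and completion used by the actual patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct qdev {
	atomic_long refs;	/* stand-in for the qactive percpu_ref */
	atomic_bool dying;	/* set by del(), like percpu_ref_kill() */
};

/* Like percpu_ref_tryget_live(): refuse once teardown has started. */
static bool qdev_tryget(struct qdev *d)
{
	if (atomic_load(&d->dying))
		return false;
	atomic_fetch_add(&d->refs, 1);
	return true;
}

static void qdev_put(struct qdev *d)
{
	atomic_fetch_sub(&d->refs, 1);
}

/* Wrapped op: hold a reference across the driver callback. */
static long qdev_ioctl(struct qdev *d, long arg)
{
	long rc;

	if (!qdev_tryget(d))
		return -1;	/* models returning -ENXIO after del */
	rc = arg * 2;		/* models the driver's qops->ioctl() work */
	qdev_put(d);
	return rc;
}

/* Like cdev_del() in queued mode: block new calls, drain in-flight ones. */
static void qdev_del(struct qdev *d)
{
	atomic_store(&d->dying, true);
	while (atomic_load(&d->refs) > 0)
		;	/* spin: models wait_for_completion(&qdead) */
}
```

With a live device the wrapped op runs; after qdev_del() returns, every new call fails immediately and no call is still in flight, which is the guarantee the queued mode adds over plain cdev_del().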
diff --git a/fs/char_dev.c b/fs/char_dev.c
index ba0ded7842a7..c7239fbea0ff 100644
--- a/fs/char_dev.c
+++ b/fs/char_dev.c
@@ -367,6 +367,45 @@ void cdev_put(struct cdev *p)
 	}
 }
 
+static long cdev_queued_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	struct inode *inode = file_inode(file);
+	struct cdev *cdev = inode->i_cdev;
+	long rc;
+
+	if (!percpu_ref_tryget_live(&cdev->qactive))
+		return -ENXIO;
+
+	rc = cdev->qops->ioctl(cdev, file, cmd, (void __user *) arg);
+
+	percpu_ref_put(&cdev->qactive);
+
+	return rc;
+}
+
+static int cdev_queued_open(struct inode *inode, struct file *file)
+{
+	struct cdev *cdev = inode->i_cdev;
+	int rc;
+
+	if (!percpu_ref_tryget_live(&cdev->qactive))
+		return -ENXIO;
+
+	rc = cdev->qops->open(cdev, file);
+
+	percpu_ref_put(&cdev->qactive);
+
+	return rc;
+}
+
+static const struct file_operations cdev_queued_fops = {
+	.owner = THIS_MODULE,
+	.open = cdev_queued_open,
+	.unlocked_ioctl = cdev_queued_ioctl,
+	.compat_ioctl = compat_ptr_ioctl,
+	.llseek = noop_llseek,
+};
+
 /*
  * Called every time a character special file is opened
  */
@@ -405,7 +444,10 @@ static int chrdev_open(struct inode *inode, struct file *filp)
 		return ret;
 
 	ret = -ENXIO;
-	fops = fops_get(p->ops);
+	if (p->qregistered)
+		fops = &cdev_queued_fops;
+	else
+		fops = fops_get(p->ops);
 	if (!fops)
 		goto out_cdev_put;
 
@@ -582,7 +624,7 @@ static void cdev_unmap(dev_t dev, unsigned count)
 
 /**
  * cdev_del() - remove a cdev from the system
- * @p: the cdev structure to be removed
+ * @cdev: the cdev structure to be removed
  *
  * cdev_del() removes @p from the system, possibly freeing the structure
  * itself.
@@ -590,14 +632,22 @@ static void cdev_unmap(dev_t dev, unsigned count)
  * NOTE: This guarantees that cdev device will no longer be able to be
  * opened, however any cdevs already open will remain and their fops will
  * still be callable even after cdev_del returns.
+ *
+ * That is unless the cdev was initialized in queued mode. In queued
+ * mode new invocations of the fops are blocked and in-flight calls to
+ * the fops are drained before this returns.
  */
-void cdev_del(struct cdev *p)
+void cdev_del(struct cdev *cdev)
 {
-	cdev_unmap(p->dev, p->count);
-	kobject_put(&p->kobj);
+	cdev_unmap(cdev->dev, cdev->count);
+	kobject_put(&cdev->kobj);
+	if (!cdev->qregistered)
+		return;
+	percpu_ref_kill(&cdev->qactive);
+	wait_for_completion(&cdev->qdead);
+	percpu_ref_exit(&cdev->qactive);
 }
 
-
 static void cdev_default_release(struct kobject *kobj)
 {
 	struct cdev *p = container_of(kobj, struct cdev, kobj);
@@ -656,6 +706,52 @@ void cdev_init(struct cdev *cdev, const struct file_operations *fops)
 	cdev->ops = fops;
 }
 
+static void cdev_queued_release(struct percpu_ref *ref)
+{
+	struct cdev *cdev = container_of(ref, struct cdev, qactive);
+
+	complete(&cdev->qdead);
+}
+
+/**
+ * cdev_register_queued() - register a cdev structure with queued ops
+ * @cdev: the structure to init and add
+ * @owner: host module for the device + ops
+ * @dev: the first device number for which this device is responsible
+ * @count: the number of consecutive minor numbers corresponding to this
+ *	   device
+ * @ops: the cdev_operations for this device
+ *
+ * In addition to base cdev_init() allocate and initialize a reference
+ * counter to track in-flight operations. With cdev_register_queued()
+ * cdev_del() guarantees no in-flight operations in addition to no new
+ * opens.
+ */
+__must_check int __cdev_register_queued(struct cdev *cdev, struct module *owner,
+					dev_t dev, unsigned count,
+					const struct cdev_operations *qops)
+{
+	int rc;
+
+	if (!qops->ioctl || !owner)
+		return -EINVAL;
+
+	cdev_init(cdev, NULL);
+	cdev->qops = qops;
+	cdev->owner = owner;
+	cdev->qregistered = true;
+	init_completion(&cdev->qdead);
+	rc = percpu_ref_init(&cdev->qactive, cdev_queued_release, 0, GFP_KERNEL);
+	if (rc)
+		return rc;
+
+	rc = cdev_add(cdev, dev, count);
+	if (rc)
+		percpu_ref_exit(&cdev->qactive);
+	return rc;
+}
+EXPORT_SYMBOL_GPL(__cdev_register_queued);
+
 static struct kobject *base_probe(dev_t dev, int *part, void *data)
 {
 	if (request_module("char-major-%d-%d", MAJOR(dev), MINOR(dev)) > 0)
diff --git a/include/linux/cdev.h b/include/linux/cdev.h
index 0e8cd6293deb..5ef1bfb3b495 100644
--- a/include/linux/cdev.h
+++ b/include/linux/cdev.h
@@ -2,6 +2,7 @@
 #ifndef _LINUX_CDEV_H
 #define _LINUX_CDEV_H
 
+#include <linux/percpu-refcount.h>
 #include <linux/kobject.h>
 #include <linux/kdev_t.h>
 #include <linux/list.h>
@@ -10,17 +11,35 @@
 struct file_operations;
 struct inode;
 struct module;
+struct cdev;
+
+struct cdev_operations {
+	int (*open)(struct cdev *cdev, struct file *);
+	long (*ioctl)(struct cdev *cdev, struct file *file, unsigned int cmd,
+		      void __user *arg);
+};
 
 struct cdev {
 	struct kobject kobj;
 	struct module *owner;
-	const struct file_operations *ops;
+	union {
+		const struct file_operations *ops;
+		const struct cdev_operations *qops;
+	};
 	struct list_head list;
 	dev_t dev;
 	unsigned int count;
+	struct percpu_ref qactive;
+	struct completion qdead;
+	bool qregistered;
 } __randomize_layout;
 
 void cdev_init(struct cdev *, const struct file_operations *);
 
+#define cdev_register_queued(cdev, dev, count, ops)		\
+	__cdev_register_queued(cdev, THIS_MODULE, dev, count, ops)
+__must_check int __cdev_register_queued(struct cdev *cdev, struct module *owner,
+					dev_t dev, unsigned count,
+					const struct cdev_operations *qops);
+
 struct cdev *cdev_alloc(void);