Message ID | 20220827171037.30297-2-kevin.tian@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Tidy up vfio_device life cycle | expand |
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> I have a couple of review comments, but nothing critical that would prevent my r-b. On 8/27/22 1:10 PM, Kevin Tian wrote: > The idea is to let vfio core manage the vfio_device life cycle instead > of duplicating the logic cross drivers. This is also a preparatory > step for adding struct device into vfio_device. > > New pair of helpers together with a kref in vfio_device: > > - vfio_alloc_device() > - vfio_put_device() > > Drivers can register @init/@release callbacks to manage any private > state wrapping the vfio_device. > > However vfio-ccw doesn't fit this model due to a life cycle mess > that its private structure mixes both parent and mdev info hence must > be allocated/free'ed outside of the life cycle of vfio device. > > Per prior discussions this won't be fixed in short term by IBM folks. > > Instead of waiting introduce another helper vfio_init_device() so ccw > can call it to initialize a pre-allocated vfio_device. > > Further implication of the ccw trick is that vfio_device cannot be > free'ed uniformly in vfio core. Instead, require *EVERY* driver to > implement @release and free vfio_device inside. Then ccw can choose > to delay the free at its own discretion. > > Another trick down the road is that kvzalloc() is used to accommodate > the need of gvt which uses vzalloc() while all others use kzalloc(). > So drivers should call a helper vfio_free_device() to free the > vfio_device instead of assuming that kfree() or vfree() is appliable. > > Later once the ccw mess is fixed we can remove those tricks and > fully handle structure alloc/free in vfio core. > > Existing vfio_{un}init_group_dev() will be deprecated after all > existing usages are converted to the new model. > > Suggested-by: Jason Gunthorpe <jgg@nvidia.com> > Co-developed-by: Yi Liu <yi.l.liu@intel.com> > Signed-off-by: Yi Liu <yi.l.liu@intel.com> > Signed-off-by: Kevin Tian <kevin.tian@intel.com> > --- > drivers/vfio/vfio_main.c | 92 ++++++++++++++++++++++++++++++++++++++++ > include/linux/vfio.h | 25 ++++++++++- > 2 files changed, 116 insertions(+), 1 deletion(-) > > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c > index 7cb56c382c97..af8aad116f2b 100644 > --- a/drivers/vfio/vfio_main.c > +++ b/drivers/vfio/vfio_main.c > @@ -496,6 +496,98 @@ void vfio_uninit_group_dev(struct vfio_device *device) > } > EXPORT_SYMBOL_GPL(vfio_uninit_group_dev); > > +/* > + * Alloc and initialize vfio_device so it can be registered to vfio > + * core. > + * > + * Drivers should use the wrapper vfio_alloc_device() for allocation. > + * @size is the size of the structure to be allocated, including any > + * private data used by the driver. It seems the purpose of the wrapper is to ensure that the object being allocated has as its first field a struct vfio_device object and to return its container. Why not just make that a requirement for this function - which I would rename vfio_alloc_device - and document it in the prologue? The caller can then cast the return pointer or use container_of. > + * > + * Driver may provide an @init callback to cover device private data. > + * > + * Use vfio_put_device() to release the structure after success return. > + */ > +struct vfio_device *_vfio_alloc_device(size_t size, struct device *dev, > + const struct vfio_device_ops *ops) > +{ > + struct vfio_device *device; > + int ret; > + > + if (WARN_ON(size < sizeof(struct vfio_device))) > + return ERR_PTR(-EINVAL); > + > + device = kvzalloc(size, GFP_KERNEL); > + if (!device) > + return ERR_PTR(-ENOMEM); > + > + ret = vfio_init_device(device, dev, ops); > + if (ret) > + goto out_free; > + return device; > + > +out_free: > + kvfree(device); > + return ERR_PTR(ret); > +} > +EXPORT_SYMBOL_GPL(_vfio_alloc_device); > + > +/* > + * Initialize a vfio_device so it can be registered to vfio core. > + * > + * Only vfio-ccw driver should call this interface. > + */ > +int vfio_init_device(struct vfio_device *device, struct device *dev, > + const struct vfio_device_ops *ops) > +{ > + int ret; > + > + vfio_init_group_dev(device, dev, ops); > + > + if (ops->init) { > + ret = ops->init(device); > + if (ret) > + goto out_uninit; > + } > + > + kref_init(&device->kref); > + return 0; > + > +out_uninit: > + vfio_uninit_group_dev(device); > + return ret; > +} > +EXPORT_SYMBOL_GPL(vfio_init_device); > + > +/* > + * The helper called by driver @release callback to free the device > + * structure. Drivers which don't have private data to clean can > + * simply use this helper as its @release. > + */ > +void vfio_free_device(struct vfio_device *device) > +{ > + kvfree(device); > +} > +EXPORT_SYMBOL_GPL(vfio_free_device); > + > +/* Release helper called by vfio_put_device() */ > +void vfio_device_release(struct kref *kref) > +{ > + struct vfio_device *device = > + container_of(kref, struct vfio_device, kref); > + > + vfio_uninit_group_dev(device); > + > + /* > + * kvfree() cannot be done here due to a life cycle mess in > + * vfio-ccw. Before the ccw part is fixed all drivers are > + * required to support @release and call vfio_free_device() > + * from there. > + */ > + device->ops->release(device); > +} > +EXPORT_SYMBOL_GPL(vfio_device_release); > + > static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev, > enum vfio_group_type type) > { > diff --git a/include/linux/vfio.h b/include/linux/vfio.h > index e05ddc6fe6a5..e1e9e8352903 100644 > --- a/include/linux/vfio.h > +++ b/include/linux/vfio.h > @@ -45,7 +45,8 @@ struct vfio_device { > struct kvm *kvm; > > /* Members below here are private, not for driver use */ > - refcount_t refcount; > + struct kref kref; /* object life cycle */ > + refcount_t refcount; /* user count on registered device*/ > unsigned int open_count; > struct completion comp; > struct list_head group_next; > @@ -55,6 +56,8 @@ struct vfio_device { > /** > * struct vfio_device_ops - VFIO bus driver device callbacks > * > + * @init: initialize private fields in device structure > + * @release: Reclaim private fields in device structure > * @open_device: Called when the first file descriptor is opened for this device > * @close_device: Opposite of open_device > * @read: Perform read(2) on device file descriptor > @@ -72,6 +75,8 @@ struct vfio_device { > */ > struct vfio_device_ops { > char *name; > + int (*init)(struct vfio_device *vdev); > + void (*release)(struct vfio_device *vdev); > int (*open_device)(struct vfio_device *vdev); > void (*close_device)(struct vfio_device *vdev); > ssize_t (*read)(struct vfio_device *vdev, char __user *buf, > @@ -137,6 +142,24 @@ static inline int vfio_check_feature(u32 flags, size_t argsz, u32 supported_ops, > return 1; > } > > +struct vfio_device *_vfio_alloc_device(size_t size, struct device *dev, > + const struct vfio_device_ops *ops); > +#define vfio_alloc_device(dev_struct, member, dev, ops) \ > + container_of(_vfio_alloc_device(sizeof(struct dev_struct) + \ > + BUILD_BUG_ON_ZERO(offsetof( \ > + struct dev_struct, member)), \ > + dev, ops), \ > + struct dev_struct, member) I found the use of this macro confusing and unnecessary. I'd prefer vfio_alloc_device be a function (see my comments above). > + > +int vfio_init_device(struct vfio_device *device, struct device *dev, > + const struct vfio_device_ops *ops); > +void vfio_free_device(struct vfio_device *device); > +void vfio_device_release(struct kref *kref); > +static inline void vfio_put_device(struct vfio_device *device) > +{ > + kref_put(&device->kref, vfio_device_release); > +} > + > void vfio_init_group_dev(struct vfio_device *device, struct device *dev, > const struct vfio_device_ops *ops); > void vfio_uninit_group_dev(struct vfio_device *device);
On 8/30/22 11:10 AM, Jason Gunthorpe wrote: > On Tue, Aug 30, 2022 at 09:42:42AM -0400, Anthony Krowiak wrote: > >>> +/* >>> + * Alloc and initialize vfio_device so it can be registered to vfio >>> + * core. >>> + * >>> + * Drivers should use the wrapper vfio_alloc_device() for allocation. >>> + * @size is the size of the structure to be allocated, including any >>> + * private data used by the driver. >> >> It seems the purpose of the wrapper is to ensure that the object being >> allocated has as its first field a struct vfio_device object and to return >> its container. Why not just make that a requirement for this function - >> which I would rename vfio_alloc_device - and document it in the prologue? >> The caller can then cast the return pointer or use container_of. > There are three fairly common patterns for this kind of thing > > 1) The caller open codes everything: > > driver_struct = kzalloc() > core_init(&driver_struct->core) > > 2) Some 'get priv' / 'get data' is used instead of container_of(): > > core_struct = core_alloc(sizeof(*driver_struct)) > driver_struct = core_get_priv(core_struct) > > 3) The allocations and initialization are consolidated in the core, > but we continue to use container_of() > > driver_struct = core_alloc(typeof(*driver_struct)) > > #1 has a general drawback that people routinely mess up the lifecycle > model and get really confused about when to do kfree() vs put(), > creating bugs. > > #2 has a general drawback of not using container_of() at all, and being > a bit confusing in some cases > > #3 has the general drawback of being a bit magical, but solves 1 and > 2's problems. > > I would not fix the struct layout without the BUILD_BUG_ON because > someone will accidently change the order and that becomes a subtle > runtime error - so at a minimum the wrapper macro has to exist to > check that. > > If you want to allow a dynamic struct layout and avoid the pitfall of > exposing the user to kalloc/kfree, then you still need the macro, and > it does some more complicated offset stuff. > > Having the wrapper macro be entirely type safe is appealing and > reduces code in the drivers, IMHO. Tell it what type you are initing > and get back init'd memory for that type that you always, always free > with a put operation. Sounds reasonable, okay I'm buying. > > Jason
> From: Jason Gunthorpe <jgg@ziepe.ca> > Sent: Tuesday, August 30, 2022 11:10 PM > > On Tue, Aug 30, 2022 at 09:42:42AM -0400, Anthony Krowiak wrote: > > > > +/* > > > + * Alloc and initialize vfio_device so it can be registered to vfio > > > + * core. > > > + * > > > + * Drivers should use the wrapper vfio_alloc_device() for allocation. > > > + * @size is the size of the structure to be allocated, including any > > > + * private data used by the driver. > > > > > > It seems the purpose of the wrapper is to ensure that the object being > > allocated has as its first field a struct vfio_device object and to return > > its container. Why not just make that a requirement for this function - > > which I would rename vfio_alloc_device - and document it in the prologue? > > The caller can then cast the return pointer or use container_of. > > There are three fairly common patterns for this kind of thing > > 1) The caller open codes everything: > > driver_struct = kzalloc() > core_init(&driver_struct->core) > > 2) Some 'get priv' / 'get data' is used instead of container_of(): > > core_struct = core_alloc(sizeof(*driver_struct)) > driver_struct = core_get_priv(core_struct) > > 3) The allocations and initialization are consolidated in the core, > but we continue to use container_of() > > driver_struct = core_alloc(typeof(*driver_struct)) > > #1 has a general drawback that people routinely mess up the lifecycle > model and get really confused about when to do kfree() vs put(), > creating bugs. > > #2 has a general drawback of not using container_of() at all, and being > a bit confusing in some cases > > #3 has the general drawback of being a bit magical, but solves 1 and > 2's problems. > > I would not fix the struct layout without the BUILD_BUG_ON because > someone will accidently change the order and that becomes a subtle > runtime error - so at a minimum the wrapper macro has to exist to > check that. Agree. And gvt happened to hit this BUILD_BUG_ON when this series was being worked on. > > If you want to allow a dynamic struct layout and avoid the pitfall of > exposing the user to kalloc/kfree, then you still need the macro, and > it does some more complicated offset stuff. > > Having the wrapper macro be entirely type safe is appealing and > reduces code in the drivers, IMHO. Tell it what type you are initing > and get back init'd memory for that type that you always, always free > with a put operation. > > Jason
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 7cb56c382c97..af8aad116f2b 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -496,6 +496,98 @@ void vfio_uninit_group_dev(struct vfio_device *device) } EXPORT_SYMBOL_GPL(vfio_uninit_group_dev); +/* + * Alloc and initialize vfio_device so it can be registered to vfio + * core. + * + * Drivers should use the wrapper vfio_alloc_device() for allocation. + * @size is the size of the structure to be allocated, including any + * private data used by the driver. + * + * Driver may provide an @init callback to cover device private data. + * + * Use vfio_put_device() to release the structure after success return. + */ +struct vfio_device *_vfio_alloc_device(size_t size, struct device *dev, + const struct vfio_device_ops *ops) +{ + struct vfio_device *device; + int ret; + + if (WARN_ON(size < sizeof(struct vfio_device))) + return ERR_PTR(-EINVAL); + + device = kvzalloc(size, GFP_KERNEL); + if (!device) + return ERR_PTR(-ENOMEM); + + ret = vfio_init_device(device, dev, ops); + if (ret) + goto out_free; + return device; + +out_free: + kvfree(device); + return ERR_PTR(ret); +} +EXPORT_SYMBOL_GPL(_vfio_alloc_device); + +/* + * Initialize a vfio_device so it can be registered to vfio core. + * + * Only vfio-ccw driver should call this interface. + */ +int vfio_init_device(struct vfio_device *device, struct device *dev, + const struct vfio_device_ops *ops) +{ + int ret; + + vfio_init_group_dev(device, dev, ops); + + if (ops->init) { + ret = ops->init(device); + if (ret) + goto out_uninit; + } + + kref_init(&device->kref); + return 0; + +out_uninit: + vfio_uninit_group_dev(device); + return ret; +} +EXPORT_SYMBOL_GPL(vfio_init_device); + +/* + * The helper called by driver @release callback to free the device + * structure. Drivers which don't have private data to clean can + * simply use this helper as its @release. + */ +void vfio_free_device(struct vfio_device *device) +{ + kvfree(device); +} +EXPORT_SYMBOL_GPL(vfio_free_device); + +/* Release helper called by vfio_put_device() */ +void vfio_device_release(struct kref *kref) +{ + struct vfio_device *device = + container_of(kref, struct vfio_device, kref); + + vfio_uninit_group_dev(device); + + /* + * kvfree() cannot be done here due to a life cycle mess in + * vfio-ccw. Before the ccw part is fixed all drivers are + * required to support @release and call vfio_free_device() + * from there. + */ + device->ops->release(device); +} +EXPORT_SYMBOL_GPL(vfio_device_release); + static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev, enum vfio_group_type type) { diff --git a/include/linux/vfio.h b/include/linux/vfio.h index e05ddc6fe6a5..e1e9e8352903 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -45,7 +45,8 @@ struct vfio_device { struct kvm *kvm; /* Members below here are private, not for driver use */ - refcount_t refcount; + struct kref kref; /* object life cycle */ + refcount_t refcount; /* user count on registered device*/ unsigned int open_count; struct completion comp; struct list_head group_next; @@ -55,6 +56,8 @@ struct vfio_device { /** * struct vfio_device_ops - VFIO bus driver device callbacks * + * @init: initialize private fields in device structure + * @release: Reclaim private fields in device structure * @open_device: Called when the first file descriptor is opened for this device * @close_device: Opposite of open_device * @read: Perform read(2) on device file descriptor @@ -72,6 +75,8 @@ struct vfio_device { */ struct vfio_device_ops { char *name; + int (*init)(struct vfio_device *vdev); + void (*release)(struct vfio_device *vdev); int (*open_device)(struct vfio_device *vdev); void (*close_device)(struct vfio_device *vdev); ssize_t (*read)(struct vfio_device *vdev, char __user *buf, @@ -137,6 +142,24 @@ static inline int vfio_check_feature(u32 flags, size_t argsz, u32 supported_ops, return 1; } +struct vfio_device *_vfio_alloc_device(size_t size, struct device *dev, + const struct vfio_device_ops *ops); +#define vfio_alloc_device(dev_struct, member, dev, ops) \ + container_of(_vfio_alloc_device(sizeof(struct dev_struct) + \ + BUILD_BUG_ON_ZERO(offsetof( \ + struct dev_struct, member)), \ + dev, ops), \ + struct dev_struct, member) + +int vfio_init_device(struct vfio_device *device, struct device *dev, + const struct vfio_device_ops *ops); +void vfio_free_device(struct vfio_device *device); +void vfio_device_release(struct kref *kref); +static inline void vfio_put_device(struct vfio_device *device) +{ + kref_put(&device->kref, vfio_device_release); +} + void vfio_init_group_dev(struct vfio_device *device, struct device *dev, const struct vfio_device_ops *ops); void vfio_uninit_group_dev(struct vfio_device *device);