Message ID | 20220614122943.1406-1-yekai13@huawei.com (mailing list archive) |
---|---|
Series | crypto: hisilicon - supports device isolation feature |

On Tue, Jun 14, 2022 at 08:29:39PM +0800, Kai Ye wrote:
> Update documentation describing DebugFS that could help to
> configure hard error frequency for users in th user space.
>
> Signed-off-by: Kai Ye <yekai13@huawei.com>
> ---
>  Documentation/ABI/testing/sysfs-driver-uacce | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce
> index 08f2591138af..0c4226364182 100644
> --- a/Documentation/ABI/testing/sysfs-driver-uacce
> +++ b/Documentation/ABI/testing/sysfs-driver-uacce
> @@ -19,6 +19,23 @@ Contact: linux-accelerators@lists.ozlabs.org
>  Description: Available instances left of the device
>  Return -ENODEV if uacce_ops get_available_instances is not provided
>
> +What: /sys/class/uacce/<dev_name>/isolate_strategy
> +Date: Jun 2022
> +KernelVersion: 5.19
> +Contact: linux-accelerators@lists.ozlabs.org
> +Description: A vfs node that used to configures the hardware

What is a "vfs node"?

> + error frequency. This frequency is abstract. Like once an hour
> + or once a day. The specific isolation strategy can be defined in
> + each driver module.

No, you need to be specific here and describe the units and the format.
Otherwise it is no description at all :(

> +
> +What: /sys/class/uacce/<dev_name>/isolate
> +Date: Jun 2022
> +KernelVersion: 5.19

5.19 will not have this change.

> +Contact: linux-accelerators@lists.ozlabs.org
> +Description: A vfs node that show the device isolated state. The value 0
> + means that the device is working. The value 1 means that the
> + device has been isolated.

What does "working" or "isolated" mean?

thanks,

greg k-h

On Tue, Jun 14, 2022 at 08:29:38PM +0800, Kai Ye wrote:
> UACCE add the hardware error isolation API. Users can configure
> the error frequency threshold by this vfs node. This API interface
> certainly supports the configuration of user protocol strategy. Then
> parse it inside the device driver. UACCE only reports the device
> isolate state. When the error frequency is exceeded, the device
> will be isolated. The isolation strategy should be defined in each
> driver module.
>
> Signed-off-by: Kai Ye <yekai13@huawei.com>
> Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
> ---
>  drivers/misc/uacce/uacce.c | 37 +++++++++++++++++++++++++++++++++++++
>  include/linux/uacce.h | 16 +++++++++++++---
>  2 files changed, 50 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
> index b6219c6bfb48..525623215132 100644
> --- a/drivers/misc/uacce/uacce.c
> +++ b/drivers/misc/uacce/uacce.c
> @@ -346,12 +346,47 @@ static ssize_t region_dus_size_show(struct device *dev,
>  		uacce->qf_pg_num[UACCE_QFRT_DUS] << PAGE_SHIFT);
>  }
>
> +static ssize_t isolate_show(struct device *dev,
> +			struct device_attribute *attr, char *buf)
> +{
> +	struct uacce_device *uacce = to_uacce_device(dev);
> +
> +	return sysfs_emit(buf, "%d\n", uacce->ops->get_isolate_state(uacce));
> +}
> +
> +static ssize_t isolate_strategy_show(struct device *dev,
> +			struct device_attribute *attr, char *buf)
> +{
> +	struct uacce_device *uacce = to_uacce_device(dev);
> +
> +	return sysfs_emit(buf, "%s\n", uacce->isolate_strategy);
> +}
> +
> +static ssize_t isolate_strategy_store(struct device *dev,
> +			struct device_attribute *attr,
> +			const char *buf, size_t count)
> +{
> +	struct uacce_device *uacce = to_uacce_device(dev);
> +	int ret;
> +
> +	if (!buf || sizeof(buf) > UACCE_MAX_ISOLATE_STRATEGY_LEN)
> +		return -EINVAL;
> +
> +	memcpy(uacce->isolate_strategy, buf, strlen(buf));
> +
> +	ret = uacce->ops->isolate_strategy_write(uacce, buf);
> +
> +	return ret ? ret : count;
> +}
> +
>  static DEVICE_ATTR_RO(api);
>  static DEVICE_ATTR_RO(flags);
>  static DEVICE_ATTR_RO(available_instances);
>  static DEVICE_ATTR_RO(algorithms);
>  static DEVICE_ATTR_RO(region_mmio_size);
>  static DEVICE_ATTR_RO(region_dus_size);
> +static DEVICE_ATTR_RO(isolate);
> +static DEVICE_ATTR_RW(isolate_strategy);
>
>  static struct attribute *uacce_dev_attrs[] = {
>  	&dev_attr_api.attr,
> @@ -360,6 +395,8 @@ static struct attribute *uacce_dev_attrs[] = {
>  	&dev_attr_algorithms.attr,
>  	&dev_attr_region_mmio_size.attr,
>  	&dev_attr_region_dus_size.attr,
> +	&dev_attr_isolate.attr,
> +	&dev_attr_isolate_strategy.attr,
>  	NULL,
>  };
>
> diff --git a/include/linux/uacce.h b/include/linux/uacce.h
> index 48e319f40275..0f7668bfa645 100644
> --- a/include/linux/uacce.h
> +++ b/include/linux/uacce.h
> @@ -8,6 +8,7 @@
>  #define UACCE_NAME "uacce"
>  #define UACCE_MAX_REGION 2
>  #define UACCE_MAX_NAME_SIZE 64
> +#define UACCE_MAX_ISOLATE_STRATEGY_LEN 256

So it's a random string of characters? What format?

>
>  struct uacce_queue;
>  struct uacce_device;
> @@ -30,6 +31,8 @@ struct uacce_qfile_region {
>   * @is_q_updated: check whether the task is finished
>   * @mmap: mmap addresses of queue to user space
>   * @ioctl: ioctl for user space users of the queue
> + * @get_isolate_state: get the device state after set the isolate strategy
> + * @isolate_strategy_store: stored the isolate strategy to the device
>   */
>  struct uacce_ops {
>  	int (*get_available_instances)(struct uacce_device *uacce);
> @@ -43,6 +46,8 @@ struct uacce_ops {
>  		struct uacce_qfile_region *qfr);
>  	long (*ioctl)(struct uacce_queue *q, unsigned int cmd,
>  		unsigned long arg);
> +	enum uacce_dev_state (*get_isolate_state)(struct uacce_device *uacce);
> +	int (*isolate_strategy_write)(struct uacce_device *uacce, const char *buf);

Length of the buffer?

>  };
>
>  /**
> @@ -57,6 +62,12 @@ struct uacce_interface {
>  	const struct uacce_ops *ops;
>  };
>
> +enum uacce_dev_state {
> +	UACCE_DEV_ERR = -1,
> +	UACCE_DEV_NORMAL,
> +	UACCE_DEV_ISOLATE,
> +};
> +
>  enum uacce_q_state {
>  	UACCE_Q_ZOMBIE = 0,
>  	UACCE_Q_INIT,
> @@ -117,6 +128,7 @@ struct uacce_device {
>  	struct list_head queues;
>  	struct mutex queues_lock;
>  	struct inode *inode;
> +	char isolate_strategy[UACCE_MAX_ISOLATE_STRATEGY_LEN];
>  };
>
>  #if IS_ENABLED(CONFIG_UACCE)
> @@ -125,7 +137,7 @@ struct uacce_device *uacce_alloc(struct device *parent,
>  		struct uacce_interface *interface);
>  int uacce_register(struct uacce_device *uacce);
>  void uacce_remove(struct uacce_device *uacce);
> -
> +struct uacce_device *dev_to_uacce(struct device *dev);

Why is this moved to the .h file yet the function is not exported?

thanks,

greg k-h

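
To make the "Length of the buffer?" question concrete: in the quoted isolate_strategy_store(), buf is a const char *, so sizeof(buf) is the size of a pointer, not the length of the input, and the memcpy() is bounded only by strlen(buf). The following is a minimal userspace sketch of a length-checked store, not the driver's code. struct demo_dev, demo_strategy_store() and MAX_STRATEGY_LEN are hypothetical names for illustration; only the 256-byte limit is taken from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the 256-byte limit from the patch; the name is hypothetical. */
#define MAX_STRATEGY_LEN 256

/* Hypothetical stand-in for the device state, not struct uacce_device. */
struct demo_dev {
	char isolate_strategy[MAX_STRATEGY_LEN];
};

/*
 * Userspace model of a store callback. 'count' is the number of bytes
 * the caller wrote. Returns count on success or -EINVAL on bad input.
 */
long demo_strategy_store(struct demo_dev *dev, const char *buf, size_t count)
{
	/* Bound the copy by the destination size, leaving room for the NUL. */
	if (!buf || count == 0 || count >= sizeof(dev->isolate_strategy))
		return -EINVAL;

	memcpy(dev->isolate_strategy, buf, count);
	dev->isolate_strategy[count] = '\0';

	/* Drop a single trailing newline, as written by `echo value > attr`. */
	if (dev->isolate_strategy[count - 1] == '\n')
		dev->isolate_strategy[count - 1] = '\0';

	return (long)count;
}
```

The check uses the caller-supplied count and the size of the destination array, so an over-long write is rejected before anything is copied.
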
On Tue, 14 Jun 2022 14:41:52 +0200
Greg KH <gregkh@linuxfoundation.org> wrote:

> On Tue, Jun 14, 2022 at 08:29:39PM +0800, Kai Ye wrote:
> > Update documentation describing DebugFS that could help to
> > configure hard error frequency for users in th user space.
> >
> > Signed-off-by: Kai Ye <yekai13@huawei.com>
> > ---
> >  Documentation/ABI/testing/sysfs-driver-uacce | 17 +++++++++++++++++
> >  1 file changed, 17 insertions(+)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce
> > index 08f2591138af..0c4226364182 100644
> > --- a/Documentation/ABI/testing/sysfs-driver-uacce
> > +++ b/Documentation/ABI/testing/sysfs-driver-uacce
> > @@ -19,6 +19,23 @@ Contact: linux-accelerators@lists.ozlabs.org
> >  Description: Available instances left of the device
> >  Return -ENODEV if uacce_ops get_available_instances is not provided
> >
> > +What: /sys/class/uacce/<dev_name>/isolate_strategy
> > +Date: Jun 2022
> > +KernelVersion: 5.19
> > +Contact: linux-accelerators@lists.ozlabs.org
> > +Description: A vfs node that used to configures the hardware
>
> What is a "vfs node"?
>
> > + error frequency. This frequency is abstract. Like once an hour
> > + or once a day. The specific isolation strategy can be defined in
> > + each driver module.
>
> No, you need to be specific here and describe the units and the format.
> Otherwise it is no description at all :(

Also, rename it. A frequency isn't a strategy. Strategy would be something
like:

* First fault
* Faults in moving time window.
* Faults in fixed time window.

some of which would then need separate controls for the threshold and the
time window - those should be in separate sysfs attributes.

>
> > +
> > +What: /sys/class/uacce/<dev_name>/isolate
> > +Date: Jun 2022
> > +KernelVersion: 5.19
>
> 5.19 will not have this change.
>
> > +Contact: linux-accelerators@lists.ozlabs.org
> > +Description: A vfs node that show the device isolated state. The value 0
> > + means that the device is working. The value 1 means that the
> > + device has been isolated.
>
> What does "working" or "isolated" mean?
>
> thanks,
>
> greg k-h

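
As a rough illustration of the "faults in moving time window" strategy suggested above, with the threshold and the window length kept as separate controls, one possible sketch in plain C follows. It is not taken from the patch or from any driver. struct fault_window, fault_window_record() and MAX_TRACKED_FAULTS are hypothetical names, and timestamps are assumed to be non-decreasing seconds.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap on remembered faults; threshold must not exceed it. */
#define MAX_TRACKED_FAULTS 32

/* Hypothetical policy state; threshold and window are separate controls. */
struct fault_window {
	uint32_t threshold;                  /* faults in window that trigger isolation */
	uint64_t window_secs;                /* length of the moving window in seconds */
	uint64_t stamps[MAX_TRACKED_FAULTS]; /* ring buffer of fault timestamps */
	size_t head;                         /* next slot to overwrite */
	size_t count;                        /* number of valid entries */
	bool isolated;                       /* sticky once set */
};

/*
 * Record one fault at time 'now' (seconds, assumed non-decreasing) and
 * return true if the device should be considered isolated.
 */
bool fault_window_record(struct fault_window *fw, uint64_t now)
{
	size_t i, recent = 0;

	fw->stamps[fw->head] = now;
	fw->head = (fw->head + 1) % MAX_TRACKED_FAULTS;
	if (fw->count < MAX_TRACKED_FAULTS)
		fw->count++;

	/* Count remembered faults that lie within [now - window_secs, now]. */
	for (i = 0; i < fw->count; i++) {
		if (now - fw->stamps[i] <= fw->window_secs)
			recent++;
	}

	if (recent >= fw->threshold)
		fw->isolated = true;

	return fw->isolated;
}
```

With threshold set to 1 the same structure degenerates to the "first fault" strategy. A fixed-window variant would instead reset a counter at each window boundary.
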
On 2022/6/15 16:48, Jonathan Cameron wrote:
> On Tue, 14 Jun 2022 14:41:52 +0200
> Greg KH <gregkh@linuxfoundation.org> wrote:
>
>> On Tue, Jun 14, 2022 at 08:29:39PM +0800, Kai Ye wrote:
>>> Update documentation describing DebugFS that could help to
>>> configure hard error frequency for users in th user space.
>>>
>>> Signed-off-by: Kai Ye <yekai13@huawei.com>
>>> ---
>>>  Documentation/ABI/testing/sysfs-driver-uacce | 17 +++++++++++++++++
>>>  1 file changed, 17 insertions(+)
>>>
>>> diff --git a/Documentation/ABI/testing/sysfs-driver-uacce b/Documentation/ABI/testing/sysfs-driver-uacce
>>> index 08f2591138af..0c4226364182 100644
>>> --- a/Documentation/ABI/testing/sysfs-driver-uacce
>>> +++ b/Documentation/ABI/testing/sysfs-driver-uacce
>>> @@ -19,6 +19,23 @@ Contact: linux-accelerators@lists.ozlabs.org
>>>  Description: Available instances left of the device
>>>  Return -ENODEV if uacce_ops get_available_instances is not provided
>>>
>>> +What: /sys/class/uacce/<dev_name>/isolate_strategy
>>> +Date: Jun 2022
>>> +KernelVersion: 5.19
>>> +Contact: linux-accelerators@lists.ozlabs.org
>>> +Description: A vfs node that used to configures the hardware
>>
>> What is a "vfs node"?
>>
>>> + error frequency. This frequency is abstract. Like once an hour
>>> + or once a day. The specific isolation strategy can be defined in
>>> + each driver module.
>>
>> No, you need to be specific here and describe the units and the format.
>> Otherwise it is no description at all :(
>
> Also, rename it. A frequency isn't a strategy. Strategy would be something
> like:
>
> * First fault
> * Faults in moving time window.
> * Faults in fixed time window.
>
> some of which would then need separate controls for the threshold and the
> time window - those should be in separate sysfs attributes.
>

I will describe the units and the format in here.

Thanks
Kai

>>
>>> +
>>> +What: /sys/class/uacce/<dev_name>/isolate
>>> +Date: Jun 2022
>>> +KernelVersion: 5.19
>>
>> 5.19 will not have this change.
>>
>>> +Contact: linux-accelerators@lists.ozlabs.org
>>> +Description: A vfs node that show the device isolated state. The value 0
>>> + means that the device is working. The value 1 means that the
>>> + device has been isolated.
>>
>> What does "working" or "isolated" mean?
>>
>> thanks,
>>
>> greg k-h
>
> .
>