Message ID | cover.1606075915.git.vilhelm.gray@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Introduce the Counter character device interface | expand |
On Sun, Nov 22, 2020 at 03:29:52PM -0500, William Breathitt Gray wrote: > @@ -117,62 +112,95 @@ static int ti_eqep_count_write(struct counter_device *counter, > return regmap_write(priv->regmap32, QPOSCNT, val); > } > > -static int ti_eqep_function_get(struct counter_device *counter, > - struct counter_count *count, size_t *function) > +static const u8 ti_qep_t2c_functions_map[] = { > +}; Just a heads-up: this ti_qep_t2c_functions_map array is left over from some code I was testing. It's not used at all -- I simply forgot to remove it -- so I'll make sure to take it out in the next patchset. I'll give this v6 patchset some more time for people to review and comment before I submit the v7 revision. William Breathitt Gray
On 11/22/20 2:29 PM, William Breathitt Gray wrote: > 14 files changed, 1806 insertions(+), 2546 deletions(-) It would be really nice if we could break this down into smaller pieces and start getting it merged. It is really tough to keep reviewing this much code in one patch over and over again. Here are some initial findings from testing: > +static void counter_device_release(struct device *dev) > +{ > + struct counter_device *const counter = dev_get_drvdata(dev); > + > + counter_chrdev_remove(counter); > + ida_simple_remove(&counter_ida, counter->id); > +} I got the following error after `modprobe -r ti-eqep`: [ 1186.045766] ------------[ cut here ]------------ [ 1186.050647] WARNING: CPU: 0 PID: 2625 at lib/refcount.c:28 counter_device_release+0x10/0x24 [counter] [ 1186.059976] refcount_t: underflow; use-after-free. [ 1186.064790] Modules linked in: aes_arm_bs(+) crypto_simd cryptd ccm usb_f_mass_storage usb_f_acm u_serial usb_f_ecm rfcomm usb_f_rndis u_ether libcomposite aes_arm aes_generic cmac bnep wl18xx wlcore mac80211 libarc4 sha256_generic libsha256 sha256_arm cfg80211 ti_am335x_adc kfifo_buf omap_aes_driver omap_crypto omap_sham crypto_engine pm33xx ti_emif_sram hci_uart omap_rng btbcm rng_core ti_eqep(-) counter bluetooth c_can_platform c_can ecdh_generic bmp280_spi ecc can_dev libaes bmp280_i2c bmp280 industrialio omap_mailbox musb_dsps wlcore_sdio musb_hdrc udc_core usbcore wkup_m3_ipc at24 omap_wdt phy_am335x watchdog phy_am335x_control ti_am335x_tscadc phy_generic wkup_m3_rproc usb_common cppi41 rtc_omap leds_gpio led_class cpufreq_dt pwm_tiehrpwm autofs4 [ 1186.132376] CPU: 0 PID: 2625 Comm: modprobe Not tainted 5.10.0-rc7bone-counter+ #23 [ 1186.140070] Hardware name: Generic AM33XX (Flattened Device Tree) [ 1186.146225] [<c0110d70>] (unwind_backtrace) from [<c010b640>] (show_stack+0x10/0x14) [ 1186.154017] [<c010b640>] (show_stack) from [<c09a0c98>] (dump_stack+0xc4/0xe4) [ 1186.161285] [<c09a0c98>] (dump_stack) from [<c0137ba0>] (__warn+0xd8/0x100) [ 1186.168284] [<c0137ba0>] (__warn) from [<c099c8e4>] (warn_slowpath_fmt+0x94/0xbc) [ 1186.175814] [<c099c8e4>] (warn_slowpath_fmt) from [<bf10b0e8>] (counter_device_release+0x10/0x24 [counter]) [ 1186.185632] [<bf10b0e8>] (counter_device_release [counter]) from [<c0667118>] (device_release+0x30/0xa4) [ 1186.195163] [<c0667118>] (device_release) from [<c057f73c>] (kobject_put+0x94/0x104) [ 1186.202944] [<c057f73c>] (kobject_put) from [<c057f73c>] (kobject_put+0x94/0x104) [ 1186.210472] [<c057f73c>] (kobject_put) from [<bf19004c>] (ti_eqep_remove+0x10/0x30 [ti_eqep]) [ 1186.219047] [<bf19004c>] (ti_eqep_remove [ti_eqep]) from [<c066f390>] (platform_drv_remove+0x24/0x3c) [ 1186.228313] [<c066f390>] (platform_drv_remove) from [<c066d934>] (device_release_driver_internal+0xfc/0x1d0) [ 1186.238187] [<c066d934>] (device_release_driver_internal) from [<c066da78>] (driver_detach+0x58/0xa8) [ 1186.247456] [<c066da78>] (driver_detach) from [<c066c5ec>] (bus_remove_driver+0x4c/0xa0) [ 1186.255594] [<c066c5ec>] (bus_remove_driver) from [<c01dd150>] (sys_delete_module+0x180/0x264) [ 1186.264250] [<c01dd150>] (sys_delete_module) from [<c0100080>] (ret_fast_syscall+0x0/0x54) [ 1186.272551] Exception stack(0xd247ffa8 to 0xd247fff0) [ 1186.277629] ffa0: 004fb478 004fb478 004fb4b4 00000800 b3bfcf00 00000000 [ 1186.285847] ffc0: 004fb478 004fb478 004fb478 00000081 00000000 be974900 be974a55 004fb478 [ 1186.294062] ffe0: 004f8f5c be97352c 004ddd97 b6d11d68 [ 1186.299253] ---[ end trace e1c61dea091f1078 ]--- > +static ssize_t counter_comp_u8_store(struct device *dev, > + struct device_attribute *attr, > + const char *buf, size_t len) > +{ > + const struct counter_attribute *const a = to_counter_attribute(attr); > + struct counter_device *const counter = dev_get_drvdata(dev); > + struct counter_count *const count = a->parent; > + struct counter_synapse *const synapse = a->comp.priv; > + const struct counter_available *const avail = a->comp.priv; > + int err; > + bool bool_data; > + int idx; > + u8 data; > + > + switch (a->comp.type) { > + case COUNTER_COMP_BOOL: > + err = kstrtobool(buf, &bool_data); > + data = bool_data; > + break; > + case COUNTER_COMP_FUNCTION: > + err = find_in_string_array(&data, count->functions_list, > + count->num_functions, buf, > + counter_function_str); > + break; > + case COUNTER_COMP_SYNAPSE_ACTION: > + err = find_in_string_array(&data, synapse->actions_list, > + synapse->num_actions, buf, > + counter_synapse_action_str); > + break; > + case COUNTER_COMP_ENUM: > + idx = __sysfs_match_string(avail->strs, avail->num_items, buf); > + if (idx < 0) > + return idx; > + data = idx; > + break; > + case COUNTER_COMP_COUNT_MODE: > + err = find_in_string_array(&data, avail->enums, > + avail->num_items, buf, > + counter_count_mode_str); > + break; > + default: > + err = kstrtou8(buf, 0, &data); > + break; > + } > + if (err) This needs to be `if (err < 0)`. There are cases where the functions above return positive values. (And to be overly safe, it probably wouldn't hurt to use err < 0 everywhere - not just in this function.) > + return err; > + > + switch (a->scope) { > + case COUNTER_SCOPE_DEVICE: > + err = a->comp.device_u8_write(counter, data); > + break; > + case COUNTER_SCOPE_SIGNAL: > + err = a->comp.signal_u8_write(counter, a->parent, data); > + break; > + case COUNTER_SCOPE_COUNT: > + if (a->comp.type == COUNTER_COMP_SYNAPSE_ACTION) > + err = a->comp.action_write(counter, count, synapse, > + data); > + else > + err = a->comp.count_u8_write(counter, count, data); > + break; > + } > + if (err) > + return err; > + > + return len; > +}
On 11/22/20 2:29 PM, William Breathitt Gray wrote: > > 1. Should standard Counter component data types be defined as u8 or u32? > > Many standard Counter component types such COUNTER_COMP_SIGNAL_LEVEL > have standard values defined (e.g. COUNTER_SIGNAL_LEVEL_LOW and > COUNTER_SIGNAL_LEVEL_HIGH). These values are currently handled by the > Counter subsystem code as u8 data types. > > If u32 is used for these values instead, C enum structures could be > used by driver authors to implicitly cast these values via the driver > callback parameters. > > This question is primarily addressed to David Lechner. I'm somewhat > confused about how this setup would look in device drivers. I've gone > ahead and refactored the code to support u32 enums, and pushed it to > a separate branch on my repository called counter_chrdev_v6_u32_enum: > https://gitlab.com/vilhelmgray/iio/-/tree/counter_chrdev_v6_u32_enum > > Please check it out and let me know what you think. Is this the > support you had in mind? I'm curious to see an example of how would > your driver callback functions would look in this case. If everything > works out fine, then I'll submit this branch as v7 of this patchset. I haven't had time to look at this in depth, but just superficially looking at it, it is mostly there. The driver callback would just use the enum type in place of u32. For example: static int ti_eqep_function_write(struct counter_device *counter, struct counter_count *count, enum counter_function function) and the COUNTER_FUNCTION_* constants would be defined as: enum counter_function { COUNTER_FUNCTION_INCREASE, ... }; instead of using #define macros. One advantage I see to using u8, at least in the user API data structures, is that it increases the number of events that fit in the kfifo buffer by a significant factor. And that is not to say that we couldn't do both: have the user API structs use u8 for enum values and still use u32/strong enum types internally in the callback functions. > > 2. How should we handle "raw" timestamps? > > Ahmad Fatoum brought up the possibility of returning "raw" timestamps > similar to what the network stack offers (see the network stack > SOF_TIMESTAMPING_{RAW,SYS}_HARDWARE support). > > I'm not very familiar with the networking stack code, but if I > understand correctly the SOF_TIMESTAMPING_RAW_HARDWARE timestamps are > values returned from the device. If so, I suspect we would be able to > support these "raw" timestamps by defining them as Counter Extensions > and returning them in struct counter_event elements similar to the > other Extension values. Is nanosecond resolution good enough? In the TI eQEP driver I considered returning the raw timer value, but quickly realized that it would not be very nice to expect the user code to know the clock rate of the timer. It was very easy to get the clock rate in the kernel and just convert the timer value to nanoseconds before returning it to userspace. So if there is some specialized case where it can be solved no other way besides using raw timestamps, then sure, include it. Otherwise I think we should stick with nanoseconds for time values when possible.
On Sun, Dec 13, 2020 at 05:15:14PM -0600, David Lechner wrote: > On 11/22/20 2:29 PM, William Breathitt Gray wrote: > > > > 1. Should standard Counter component data types be defined as u8 or u32? > > > > Many standard Counter component types such COUNTER_COMP_SIGNAL_LEVEL > > have standard values defined (e.g. COUNTER_SIGNAL_LEVEL_LOW and > > COUNTER_SIGNAL_LEVEL_HIGH). These values are currently handled by the > > Counter subsystem code as u8 data types. > > > > If u32 is used for these values instead, C enum structures could be > > used by driver authors to implicitly cast these values via the driver > > callback parameters. > > > > This question is primarily addressed to David Lechner. I'm somewhat > > confused about how this setup would look in device drivers. I've gone > > ahead and refactored the code to support u32 enums, and pushed it to > > a separate branch on my repository called counter_chrdev_v6_u32_enum: > > https://gitlab.com/vilhelmgray/iio/-/tree/counter_chrdev_v6_u32_enum > > > > Please check it out and let me know what you think. Is this the > > support you had in mind? I'm curious to see an example of how would > > your driver callback functions would look in this case. If everything > > works out fine, then I'll submit this branch as v7 of this patchset. > > I haven't had time to look at this in depth, but just superficially looking > at it, it is mostly there. The driver callback would just use the enum type > in place of u32. For example: > > static int ti_eqep_function_write(struct counter_device *counter, > struct counter_count *count, > enum counter_function function) > > and the COUNTER_FUNCTION_* constants would be defined as: > > enum counter_function { > COUNTER_FUNCTION_INCREASE, > ... > }; > > instead of using #define macros. > > One advantage I see to using u8, at least in the user API data structures, > is that it increases the number of events that fit in the kfifo buffer by > a significant factor. > > And that is not to say that we couldn't do both: have the user API structs > use u8 for enum values and still use u32/strong enum types internally in > the callback functions. I'm including David Laight because he initially opposed enums in favor of fixed size types when we discussed this in an earlier revision: https://lkml.org/lkml/2020/5/3/159 However, there have been significant changes to this patchset so the context now is different than those earlier discussions (i.e. we're no longer discussing ioctl calls). I think reimplementing these constants as enums as described could work. If we do so, should the enum constants be given specific values? For example: enum counter_function { COUNTER_FUNCTION_INCREASE = 0, COUNTER_FUNCTION_DECREASE = 1, ... }; > > > > > 2. How should we handle "raw" timestamps? > > > > Ahmad Fatoum brought up the possibility of returning "raw" timestamps > > similar to what the network stack offers (see the network stack > > SOF_TIMESTAMPING_{RAW,SYS}_HARDWARE support). > > > > I'm not very familiar with the networking stack code, but if I > > understand correctly the SOF_TIMESTAMPING_RAW_HARDWARE timestamps are > > values returned from the device. If so, I suspect we would be able to > > support these "raw" timestamps by defining them as Counter Extensions > > and returning them in struct counter_event elements similar to the > > other Extension values. > > Is nanosecond resolution good enough? In the TI eQEP driver I considered > returning the raw timer value, but quickly realized that it would not be > very nice to expect the user code to know the clock rate of the timer. It > was very easy to get the clock rate in the kernel and just convert the > timer value to nanoseconds before returning it to userspace. > > So if there is some specialized case where it can be solved no other way > besides using raw timestamps, then sure, include it. Otherwise I think we > should stick with nanoseconds for time values when possible. Given that the struct counter_event 'timestamp' member serves as the identification vessel for correlating component values to a single event (i.e. component values of a given event will share the same unique timestamp), I believe it's prudent to standardize this timestamp format on the kernel monotonic time as we have currently done so via our ktime_get_ns() call. There are cases where it is understandably better to use a timestamp provided directly by the hardware (e.g. keeping timestamping close to data collection). For these cases, we can retrieve these "raw" timestamps via a Counter Extension: users would get their "raw" timestamp via the struct counter_event 'value' member, and just treat the 'timestamp' member as a unique event identification number. William Breathitt Gray
On Sun, Dec 13, 2020 at 05:15:00PM -0600, David Lechner wrote: > On 11/22/20 2:29 PM, William Breathitt Gray wrote: > > > 14 files changed, 1806 insertions(+), 2546 deletions(-) > > It would be really nice if we could break this down into smaller > pieces and start getting it merged. It is really tough to keep > reviewing this much code in one patch over and over again. Yes, this is a pretty massive patch. I could break this across the individual files affected to make it simpler to review, but in the end all those patches would need to end up squashed together before merge again (for the sake of git bisect), so the effort feels somewhat moot. Luckily, I don't think there will be much change in the next revision since it's looking like it'll mainly be a few bug fixes; hopefully this coming version 7 will be the final revision before merge. > Here are some initial findings from testing: > > > > +static void counter_device_release(struct device *dev) > > +{ > > + struct counter_device *const counter = dev_get_drvdata(dev); > > + > > + counter_chrdev_remove(counter); > > + ida_simple_remove(&counter_ida, counter->id); > > +} > > > I got the following error after `modprobe -r ti-eqep`: > > [ 1186.045766] ------------[ cut here ]------------ > [ 1186.050647] WARNING: CPU: 0 PID: 2625 at lib/refcount.c:28 counter_device_release+0x10/0x24 [counter] > [ 1186.059976] refcount_t: underflow; use-after-free. > [ 1186.064790] Modules linked in: aes_arm_bs(+) crypto_simd cryptd ccm usb_f_mass_storage usb_f_acm u_serial usb_f_ecm rfcomm usb_f_rndis u_ether libcomposite aes_arm aes_generic cmac bnep wl18xx wlcore mac80211 libarc4 sha256_generic libsha256 sha256_arm cfg80211 ti_am335x_adc kfifo_buf omap_aes_driver omap_crypto omap_sham crypto_engine pm33xx ti_emif_sram hci_uart omap_rng btbcm rng_core ti_eqep(-) counter bluetooth c_can_platform c_can ecdh_generic bmp280_spi ecc can_dev libaes bmp280_i2c bmp280 industrialio omap_mailbox musb_dsps wlcore_sdio musb_hdrc udc_core usbcore wkup_m3_ipc at24 omap_wdt phy_am335x watchdog phy_am335x_control ti_am335x_tscadc phy_generic wkup_m3_rproc usb_common cppi41 rtc_omap leds_gpio led_class cpufreq_dt pwm_tiehrpwm autofs4 > [ 1186.132376] CPU: 0 PID: 2625 Comm: modprobe Not tainted 5.10.0-rc7bone-counter+ #23 > [ 1186.140070] Hardware name: Generic AM33XX (Flattened Device Tree) > [ 1186.146225] [<c0110d70>] (unwind_backtrace) from [<c010b640>] (show_stack+0x10/0x14) > [ 1186.154017] [<c010b640>] (show_stack) from [<c09a0c98>] (dump_stack+0xc4/0xe4) > [ 1186.161285] [<c09a0c98>] (dump_stack) from [<c0137ba0>] (__warn+0xd8/0x100) > [ 1186.168284] [<c0137ba0>] (__warn) from [<c099c8e4>] (warn_slowpath_fmt+0x94/0xbc) > [ 1186.175814] [<c099c8e4>] (warn_slowpath_fmt) from [<bf10b0e8>] (counter_device_release+0x10/0x24 [counter]) > [ 1186.185632] [<bf10b0e8>] (counter_device_release [counter]) from [<c0667118>] (device_release+0x30/0xa4) > [ 1186.195163] [<c0667118>] (device_release) from [<c057f73c>] (kobject_put+0x94/0x104) > [ 1186.202944] [<c057f73c>] (kobject_put) from [<c057f73c>] (kobject_put+0x94/0x104) > [ 1186.210472] [<c057f73c>] (kobject_put) from [<bf19004c>] (ti_eqep_remove+0x10/0x30 [ti_eqep]) > [ 1186.219047] [<bf19004c>] (ti_eqep_remove [ti_eqep]) from [<c066f390>] (platform_drv_remove+0x24/0x3c) > [ 1186.228313] [<c066f390>] (platform_drv_remove) from [<c066d934>] (device_release_driver_internal+0xfc/0x1d0) > [ 1186.238187] [<c066d934>] (device_release_driver_internal) from [<c066da78>] (driver_detach+0x58/0xa8) > [ 1186.247456] [<c066da78>] (driver_detach) from [<c066c5ec>] (bus_remove_driver+0x4c/0xa0) > [ 1186.255594] [<c066c5ec>] (bus_remove_driver) from [<c01dd150>] (sys_delete_module+0x180/0x264) > [ 1186.264250] [<c01dd150>] (sys_delete_module) from [<c0100080>] (ret_fast_syscall+0x0/0x54) > [ 1186.272551] Exception stack(0xd247ffa8 to 0xd247fff0) > [ 1186.277629] ffa0: 004fb478 004fb478 004fb4b4 00000800 b3bfcf00 00000000 > [ 1186.285847] ffc0: 004fb478 004fb478 004fb478 00000081 00000000 be974900 be974a55 004fb478 > [ 1186.294062] ffe0: 004f8f5c be97352c 004ddd97 b6d11d68 > [ 1186.299253] ---[ end trace e1c61dea091f1078 ]--- I noticed that I'm calling counter_chrdev_remove() twice: once in counter_unregister(), and again in counter_device_release(). I suspect this is what's causing the refcount to underflow. I'll test and verify that this is the culprit. In fact, I don't think I need to define a counter_device_release() callback at all, would I? These cleanup function calls could be moved to counter_unregister() instead. > > +static ssize_t counter_comp_u8_store(struct device *dev, > > + struct device_attribute *attr, > > + const char *buf, size_t len) > > +{ > > + const struct counter_attribute *const a = to_counter_attribute(attr); > > + struct counter_device *const counter = dev_get_drvdata(dev); > > + struct counter_count *const count = a->parent; > > + struct counter_synapse *const synapse = a->comp.priv; > > + const struct counter_available *const avail = a->comp.priv; > > + int err; > > + bool bool_data; > > + int idx; > > + u8 data; > > + > > + switch (a->comp.type) { > > + case COUNTER_COMP_BOOL: > > + err = kstrtobool(buf, &bool_data); > > + data = bool_data; > > + break; > > + case COUNTER_COMP_FUNCTION: > > + err = find_in_string_array(&data, count->functions_list, > > + count->num_functions, buf, > > + counter_function_str); > > + break; > > + case COUNTER_COMP_SYNAPSE_ACTION: > > + err = find_in_string_array(&data, synapse->actions_list, > > + synapse->num_actions, buf, > > + counter_synapse_action_str); > > + break; > > + case COUNTER_COMP_ENUM: > > + idx = __sysfs_match_string(avail->strs, avail->num_items, buf); > > + if (idx < 0) > > + return idx; > > + data = idx; > > + break; > > + case COUNTER_COMP_COUNT_MODE: > > + err = find_in_string_array(&data, avail->enums, > > + avail->num_items, buf, > > + counter_count_mode_str); > > + break; > > + default: > > + err = kstrtou8(buf, 0, &data); > > + break; > > + } > > + if (err) > > This needs to be `if (err < 0)`. There are cases where the functions > above return positive values. (And to be overly safe, it probably wouldn't > hurt to use err < 0 everywhere - not just in this function.) Ack. William Breathitt Gray
On 12/20/20 3:44 PM, William Breathitt Gray wrote: > On Sun, Dec 13, 2020 at 05:15:14PM -0600, David Lechner wrote: >> On 11/22/20 2:29 PM, William Breathitt Gray wrote: >>> >>> 1. Should standard Counter component data types be defined as u8 or u32? >>> >>> Many standard Counter component types such COUNTER_COMP_SIGNAL_LEVEL >>> have standard values defined (e.g. COUNTER_SIGNAL_LEVEL_LOW and >>> COUNTER_SIGNAL_LEVEL_HIGH). These values are currently handled by the >>> Counter subsystem code as u8 data types. >>> >>> If u32 is used for these values instead, C enum structures could be >>> used by driver authors to implicitly cast these values via the driver >>> callback parameters. >>> >>> This question is primarily addressed to David Lechner. I'm somewhat >>> confused about how this setup would look in device drivers. I've gone >>> ahead and refactored the code to support u32 enums, and pushed it to >>> a separate branch on my repository called counter_chrdev_v6_u32_enum: >>> https://gitlab.com/vilhelmgray/iio/-/tree/counter_chrdev_v6_u32_enum >>> >>> Please check it out and let me know what you think. Is this the >>> support you had in mind? I'm curious to see an example of how would >>> your driver callback functions would look in this case. If everything >>> works out fine, then I'll submit this branch as v7 of this patchset. >> >> I haven't had time to look at this in depth, but just superficially looking >> at it, it is mostly there. The driver callback would just use the enum type >> in place of u32. For example: >> >> static int ti_eqep_function_write(struct counter_device *counter, >> struct counter_count *count, >> enum counter_function function) >> >> and the COUNTER_FUNCTION_* constants would be defined as: >> >> enum counter_function { >> COUNTER_FUNCTION_INCREASE, >> ... >> }; >> >> instead of using #define macros. >> >> One advantage I see to using u8, at least in the user API data structures, >> is that it increases the number of events that fit in the kfifo buffer by >> a significant factor. >> >> And that is not to say that we couldn't do both: have the user API structs >> use u8 for enum values and still use u32/strong enum types internally in >> the callback functions. > > I'm including David Laight because he initially opposed enums in favor > of fixed size types when we discussed this in an earlier revision: > https://lkml.org/lkml/2020/5/3/159 > > However, there have been significant changes to this patchset so the > context now is different than those earlier discussions (i.e. we're no > longer discussing ioctl calls). > > I think reimplementing these constants as enums as described could work. > If we do so, should the enum constants be given specific values? For > example: > > enum counter_function { > COUNTER_FUNCTION_INCREASE = 0, > COUNTER_FUNCTION_DECREASE = 1, > ... > }; I would say no on the explicit values since they don't have any significant meaning.
On 12/20/20 4:11 PM, William Breathitt Gray wrote: > On Sun, Dec 13, 2020 at 05:15:00PM -0600, David Lechner wrote: >> On 11/22/20 2:29 PM, William Breathitt Gray wrote: >> >>> 14 files changed, 1806 insertions(+), 2546 deletions(-) >> >> It would be really nice if we could break this down into smaller >> pieces and start getting it merged. It is really tough to keep >> reviewing this much code in one patch over and over again. > > Yes, this is a pretty massive patch. I could break this across the > individual files affected to make it simpler to review, but in the end > all those patches would need to end up squashed together before merge > again (for the sake of git bisect), so the effort feels somewhat moot. > > Luckily, I don't think there will be much change in the next revision > since it's looking like it'll mainly be a few bug fixes; hopefully this > coming version 7 will be the final revision before merge. > >> Here are some initial findings from testing: >> >> >>> +static void counter_device_release(struct device *dev) >>> +{ >>> + struct counter_device *const counter = dev_get_drvdata(dev); >>> + >>> + counter_chrdev_remove(counter); >>> + ida_simple_remove(&counter_ida, counter->id); >>> +} >> >> >> I got the following error after `modprobe -r ti-eqep`: >> >> [ 1186.045766] ------------[ cut here ]------------ >> [ 1186.050647] WARNING: CPU: 0 PID: 2625 at lib/refcount.c:28 counter_device_release+0x10/0x24 [counter] >> [ 1186.059976] refcount_t: underflow; use-after-free. >> [ 1186.064790] Modules linked in: aes_arm_bs(+) crypto_simd cryptd ccm usb_f_mass_storage usb_f_acm u_serial usb_f_ecm rfcomm usb_f_rndis u_ether libcomposite aes_arm aes_generic cmac bnep wl18xx wlcore mac80211 libarc4 sha256_generic libsha256 sha256_arm cfg80211 ti_am335x_adc kfifo_buf omap_aes_driver omap_crypto omap_sham crypto_engine pm33xx ti_emif_sram hci_uart omap_rng btbcm rng_core ti_eqep(-) counter bluetooth c_can_platform c_can ecdh_generic bmp280_spi ecc can_dev libaes bmp280_i2c bmp280 industrialio omap_mailbox musb_dsps wlcore_sdio musb_hdrc udc_core usbcore wkup_m3_ipc at24 omap_wdt phy_am335x watchdog phy_am335x_control ti_am335x_tscadc phy_generic wkup_m3_rproc usb_common cppi41 rtc_omap leds_gpio led_class cpufreq_dt pwm_tiehrpwm autofs4 >> [ 1186.132376] CPU: 0 PID: 2625 Comm: modprobe Not tainted 5.10.0-rc7bone-counter+ #23 >> [ 1186.140070] Hardware name: Generic AM33XX (Flattened Device Tree) >> [ 1186.146225] [<c0110d70>] (unwind_backtrace) from [<c010b640>] (show_stack+0x10/0x14) >> [ 1186.154017] [<c010b640>] (show_stack) from [<c09a0c98>] (dump_stack+0xc4/0xe4) >> [ 1186.161285] [<c09a0c98>] (dump_stack) from [<c0137ba0>] (__warn+0xd8/0x100) >> [ 1186.168284] [<c0137ba0>] (__warn) from [<c099c8e4>] (warn_slowpath_fmt+0x94/0xbc) >> [ 1186.175814] [<c099c8e4>] (warn_slowpath_fmt) from [<bf10b0e8>] (counter_device_release+0x10/0x24 [counter]) >> [ 1186.185632] [<bf10b0e8>] (counter_device_release [counter]) from [<c0667118>] (device_release+0x30/0xa4) >> [ 1186.195163] [<c0667118>] (device_release) from [<c057f73c>] (kobject_put+0x94/0x104) >> [ 1186.202944] [<c057f73c>] (kobject_put) from [<c057f73c>] (kobject_put+0x94/0x104) >> [ 1186.210472] [<c057f73c>] (kobject_put) from [<bf19004c>] (ti_eqep_remove+0x10/0x30 [ti_eqep]) >> [ 1186.219047] [<bf19004c>] (ti_eqep_remove [ti_eqep]) from [<c066f390>] (platform_drv_remove+0x24/0x3c) >> [ 1186.228313] [<c066f390>] (platform_drv_remove) from [<c066d934>] (device_release_driver_internal+0xfc/0x1d0) >> [ 1186.238187] [<c066d934>] (device_release_driver_internal) from [<c066da78>] (driver_detach+0x58/0xa8) >> [ 1186.247456] [<c066da78>] (driver_detach) from [<c066c5ec>] (bus_remove_driver+0x4c/0xa0) >> [ 1186.255594] [<c066c5ec>] (bus_remove_driver) from [<c01dd150>] (sys_delete_module+0x180/0x264) >> [ 1186.264250] [<c01dd150>] (sys_delete_module) from [<c0100080>] (ret_fast_syscall+0x0/0x54) >> [ 1186.272551] Exception stack(0xd247ffa8 to 0xd247fff0) >> [ 1186.277629] ffa0: 004fb478 004fb478 004fb4b4 00000800 b3bfcf00 00000000 >> [ 1186.285847] ffc0: 004fb478 004fb478 004fb478 00000081 00000000 be974900 be974a55 004fb478 >> [ 1186.294062] ffe0: 004f8f5c be97352c 004ddd97 b6d11d68 >> [ 1186.299253] ---[ end trace e1c61dea091f1078 ]--- > > I noticed that I'm calling counter_chrdev_remove() twice: once in > counter_unregister(), and again in counter_device_release(). I suspect > this is what's causing the refcount to underflow. I'll test and verify > that this is the culprit. > > In fact, I don't think I need to define a counter_device_release() > callback at all, would I? These cleanup function calls could be moved to > counter_unregister() instead. As long as a user program keeps a chrdev open, it holds a reference to the device, so I think it needs to be the other way around. (Unless it is impossible to call counter_unregister() before all references have been released - but I don't think that is the case - not 100% sure.)