Message ID | 20241031063707.795842-1-aardelean@baylibre.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | [1/2] util_macros.h: fix/rework find_closest() macros | expand |
On Thu, 31 Oct 2024 08:37:06 +0200 Alexandru Ardelean <aardelean@baylibre.com> wrote: > A bug was found in the find_closest() (find_closest_descending() is also > affected after some testing), where for certain values with small > progressions, the rounding (done by averaging 2 values) causes an incorrect > index to be returned. Please help us understand the userspace-visible effects of this bug. Do you believe the bug is sufficiently serious to justify backporting these fixes into earlier kernel versions? If so, are you able to help us identify a suitable Fixes: target? Thanks.
On Fri, Nov 1, 2024 at 3:59 AM Andrew Morton <akpm@linux-foundation.org> wrote: > > On Thu, 31 Oct 2024 08:37:06 +0200 Alexandru Ardelean <aardelean@baylibre.com> wrote: > > > A bug was found in the find_closest() (find_closest_descending() is also > > affected after some testing), where for certain values with small > > progressions, the rounding (done by averaging 2 values) causes an incorrect > > index to be returned. > > Please help us understand the userspace-visible effects of this bug. > > Do you believe the bug is sufficiently serious to justify backporting > these fixes into earlier kernel versions? If so, are you able to help > us identify a suitable Fixes: target? Oh right. Apologies. I keep forgetting the Fixes tag. Added below. I can also do a V2. Please advise on what's preferred. I'll also admit that my attempt at explaining the bug (in a general way) looks a bit wonky. I'll try to re-formulate the bug description for a V2. ------------------------------------------------------------------------------------------- For this reply, maybe I'll try a more "timeline" approach (for explaining the bug). I'll apologize for the amount of text I posted here, but this bug is one-of-those-LOTR-anthologies-trying-to-explain The bug was found while testing the 'drivers/iio/adc/ad7606.c' driver; particularly the oversampling setting (of the driver). Taking as reference the oversampling table (from the driver): static const unsigned int ad7606_oversampling_avail[7] = { 1, 2, 4, 8, 16, 32, 64, }; When doing: $ echo 1 > /sys/bus/iio/devices/iio\:device0/oversampling_ratio $ cat /sys/bus/iio/devices/iio\:device0/oversampling_ratio 1 # this is fine [1] $ echo 2 > /sys/bus/iio/devices/iio\:device0/oversampling_ratio $ cat /sys/bus/iio/devices/iio\:device0/oversampling_ratio 1 # this is wrong; 2 should be returned here $ echo 3 > /sys/bus/iio/devices/iio\:device0/oversampling_ratio $ cat /sys/bus/iio/devices/iio\:device0/oversampling_ratio 2 # this is fine $ echo 4 > /sys/bus/iio/devices/iio\:device0/oversampling_ratio $ cat /sys/bus/iio/devices/iio\:device0/oversampling_ratio 4 # this is fine $ echo 5 > /sys/bus/iio/devices/iio\:device0/oversampling_ratio $ cat /sys/bus/iio/devices/iio\:device0/oversampling_ratio 4 # this is fine And from here-on, the bug goes away. i.e. one gets the values as one would expect. The bug no longer happens as in the case of [1], as the difference between 2 values (in the array) increases enough, such that the averaging (of 2 values) no longer causes issues. Initially, the assumption was that this bug happens only for searching exact values when an array is monotonic with a progression of 1. (One could argue that find_closest() is not intended for arrays of 1,2,3,4,...N). While trying to create a fix for this issue, I started writing a kunit test. That produced some other quirks, but not as bad as [1]. For example, this array from 'drivers/hwmon/ina2xx.c' & 'drivers/iio/adc/ina2xx-adc.c' drivers. const int ina226_avg_tab[] = { 1, 4, 16, 64, 128, 256, 512, 1024 }; While running the kunit test (for 'ina226_avg_tab'): * idx = find_closest([-1 to 2], ina226_avg_tab, ARRAY_SIZE(ina226_avg_tab)); This returns idx == 0, so value 1 * idx = find_closest(3, ina226_avg_tab, ARRAY_SIZE(ina226_avg_tab)); This returns idx == 0, value 1; and now one could argue whether 3 is closer to 4 or to 1. This quirk only appears for value '3' in this array. And from here-on the current find_closest() works fine (one gets what one would expect). ------------------------------------------------------------------------------------------- The above is one way of trying to explain the bug. The other way I am trying to explain this bug is through the kunit in the 2nd patch of this series. But to answer one question above: this bug exists since the first introduction of "find_closest()". Circa kernel version 4.1 And it only affects several corner cases (of arrays). Fixes: 95d119528b0b ("util_macros.h: add find_closest() macro") > > Thanks.
On Fri, 1 Nov 2024 11:07:04 +0200 Alexandru Ardelean <aardelean@baylibre.com> wrote:
> I can also do a V2.
That would be best, thanks.
On Fri, Nov 1, 2024 at 10:05 PM Andrew Morton <akpm@linux-foundation.org> wrote: > > On Fri, 1 Nov 2024 11:07:04 +0200 Alexandru Ardelean <aardelean@baylibre.com> wrote: > > > I can also do a V2. > > That would be best, thanks. Still working on the V2. I ended up down a rabbit hole of signed-vs-unsigned arrays and X (value to search for). Will try to resolve this as elegantly as possible also with a kunit test.
diff --git a/include/linux/util_macros.h b/include/linux/util_macros.h index 6bb460c3e818..60c74770b703 100644 --- a/include/linux/util_macros.h +++ b/include/linux/util_macros.h @@ -7,12 +7,18 @@ #define __find_closest(x, a, as, op) \ ({ \ typeof(as) __fc_i, __fc_as = (as) - 1; \ - typeof(x) __fc_x = (x); \ + typeof(x) __fc_mid_x, __fc_x = (x); \ + typeof(x) __fc_left, __fc_right; \ typeof(*a) const *__fc_a = (a); \ for (__fc_i = 0; __fc_i < __fc_as; __fc_i++) { \ - if (__fc_x op DIV_ROUND_CLOSEST(__fc_a[__fc_i] + \ - __fc_a[__fc_i + 1], 2)) \ + __fc_mid_x = (__fc_a[__fc_i] + __fc_a[__fc_i + 1]) / 2; \ + if (__fc_x op __fc_mid_x) { \ + __fc_left = __fc_mid_x - __fc_a[__fc_i]; \ + __fc_right = __fc_a[__fc_i + 1] - __fc_mid_x; \ + if (__fc_right < __fc_left) \ + __fc_i++; \ break; \ + } \ } \ (__fc_i); \ }) @@ -38,7 +44,7 @@ * Similar to find_closest() but 'a' is expected to be sorted in descending * order. */ -#define find_closest_descending(x, a, as) __find_closest(x, a, as, >=) +#define find_closest_descending(x, a, as) __find_closest(x, a, as, >) /** * is_insidevar - check if the @ptr points inside the @var memory range.
A bug was found in the find_closest() (find_closest_descending() is also affected after some testing), where for certain values with small progressions, the rounding (done by averaging 2 values) causes an incorrect index to be returned. The rounding issues occur for progressions of 1, 2 and 3. It goes away when the progression/interval between two values is 4 or larger. It's particularly bad for progressions of 1. For example if there's an array of 'a = { 1, 2, 3 }', using 'find_closest(2, a ...)' would return 0 (the index of '1'), rather than returning 1 (the index of '2'). This means that for exact values (with a progression of 1), find_closest() will misbehave and return the index of the value smaller than the one we're searching for. For progressions of 2 and 3, the exact values are obtained correctly; but values aren't approximated correctly (as one would expect). Starting with progressions of 4, all seems to be good. This change reworks the find_closest(x,) macros to also check the difference between the left and right elements when 'x'. If the distance to the right is smaller (than the distance to the left), the index is incremented by 1. This also makes redundant the need for using the DIV_ROUND_CLOSEST() macro. For find_closest_descending(), the operator was changed from '>=' to '>'. Since the iteration is happening from the highest-to-lowest values, the '>=' comparison would (for small progressions) prefer higher values (as closer to the given values). For example: Given array 'a[] = { 10, 7, 4, 1 };' find_closest_descending(2, a,...) returns the index[2] for 4 find_closest_descending(5, a,...) returns the index[1] for 7 find_closest_descending(8, a,...) returns the index[0] for 10 Signed-off-by: Alexandru Ardelean <aardelean@baylibre.com> --- include/linux/util_macros.h | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)