diff mbox series

[1/2] util_macros.h: fix/rework find_closest() macros

Message ID 20241031063707.795842-1-aardelean@baylibre.com (mailing list archive)
State Changes Requested
Headers show
Series [1/2] util_macros.h: fix/rework find_closest() macros | expand

Commit Message

Alexandru Ardelean Oct. 31, 2024, 6:37 a.m. UTC
A bug was found in the find_closest() (find_closest_descending() is also
affected after some testing), where for certain values with small
progressions, the rounding (done by averaging 2 values) causes an incorrect
index to be returned.
The rounding issues occur for progressions of 1, 2 and 3. It goes away when
the progression/interval between two values is 4 or larger.

It's particularly bad for progressions of 1. For example if there's an
array of 'a = { 1, 2, 3 }', using 'find_closest(2, a ...)' would return 0
(the index of '1'), rather than returning 1 (the index of '2').
This means that for exact values (with a progression of 1), find_closest()
will misbehave and return the index of the value smaller than the one we're
searching for.
For progressions of 2 and 3, the exact values are obtained correctly; but
values aren't approximated correctly (as one would expect). Starting with
progressions of 4, all seems to be good.

This change reworks the find_closest(x,) macros to also check the
difference between the left and right elements when 'x'. If the distance to
the right is smaller (than the distance to the left), the index is
incremented by 1. This also makes redundant the need for using the
DIV_ROUND_CLOSEST() macro.

For find_closest_descending(), the operator was changed from '>=' to '>'.
Since the iteration is happening from the highest-to-lowest values, the
'>=' comparison would (for small progressions) prefer higher values (as
closer to the given values).
For example:
  Given array 'a[] = { 10, 7, 4, 1 };'
     find_closest_descending(2, a,...) returns the index[2] for 4
     find_closest_descending(5, a,...) returns the index[1] for 7
     find_closest_descending(8, a,...) returns the index[0] for 10

Signed-off-by: Alexandru Ardelean <aardelean@baylibre.com>
---
 include/linux/util_macros.h | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

Comments

Andrew Morton Nov. 1, 2024, 1:59 a.m. UTC | #1
On Thu, 31 Oct 2024 08:37:06 +0200 Alexandru Ardelean <aardelean@baylibre.com> wrote:

> A bug was found in the find_closest() (find_closest_descending() is also
> affected after some testing), where for certain values with small
> progressions, the rounding (done by averaging 2 values) causes an incorrect
> index to be returned.

Please help us understand the userspace-visible effects of this bug.

Do you believe the bug is sufficiently serious to justify backporting
these fixes into earlier kernel versions?  If so, are you able to help
us identify a suitable Fixes: target?

Thanks.
Alexandru Ardelean Nov. 1, 2024, 9:07 a.m. UTC | #2
On Fri, Nov 1, 2024 at 3:59 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu, 31 Oct 2024 08:37:06 +0200 Alexandru Ardelean <aardelean@baylibre.com> wrote:
>
> > A bug was found in the find_closest() (find_closest_descending() is also
> > affected after some testing), where for certain values with small
> > progressions, the rounding (done by averaging 2 values) causes an incorrect
> > index to be returned.
>
> Please help us understand the userspace-visible effects of this bug.
>
> Do you believe the bug is sufficiently serious to justify backporting
> these fixes into earlier kernel versions?  If so, are you able to help
> us identify a suitable Fixes: target?

Oh right. Apologies.
I keep forgetting the Fixes tag.
Added below.
I can also do a V2.
Please advise on what's preferred.
I'll also admit that my attempt at explaining the bug (in a general
way) looks a bit wonky.
I'll try to re-formulate the bug description for a V2.

-------------------------------------------------------------------------------------------
For this reply, maybe I'll try a more "timeline" approach (for
explaining the bug).
I'll apologize for the amount of text I posted here, but this bug is
one-of-those-LOTR-anthologies-trying-to-explain

The bug was found while testing the 'drivers/iio/adc/ad7606.c' driver;
particularly the oversampling setting (of the driver).
Taking as reference the oversampling table (from the driver):

static const unsigned int ad7606_oversampling_avail[7] = {
        1, 2, 4, 8, 16, 32, 64,
};

When doing:
$ echo 1 > /sys/bus/iio/devices/iio\:device0/oversampling_ratio
$ cat /sys/bus/iio/devices/iio\:device0/oversampling_ratio
1  # this is fine

[1]
$ echo 2 > /sys/bus/iio/devices/iio\:device0/oversampling_ratio
$ cat /sys/bus/iio/devices/iio\:device0/oversampling_ratio
1  # this is wrong; 2 should be returned here

$ echo 3 > /sys/bus/iio/devices/iio\:device0/oversampling_ratio
$ cat /sys/bus/iio/devices/iio\:device0/oversampling_ratio
2  # this is fine

$ echo 4 > /sys/bus/iio/devices/iio\:device0/oversampling_ratio
$ cat /sys/bus/iio/devices/iio\:device0/oversampling_ratio
4  # this is fine

$ echo 5 > /sys/bus/iio/devices/iio\:device0/oversampling_ratio
$ cat /sys/bus/iio/devices/iio\:device0/oversampling_ratio
4  # this is fine

And from here-on, the bug goes away. i.e. one gets the values as one
would expect.
The bug no longer happens as in the case of [1], as the difference
between 2 values (in the array) increases enough, such that the
averaging (of 2 values) no longer causes issues.

Initially, the assumption was that this bug happens only for searching
exact values when an array is monotonic with a progression of 1.
(One could argue that find_closest() is not intended for arrays of
1,2,3,4,...N).
While trying to create a fix for this issue, I started writing a kunit test.
That produced some other quirks, but not as bad as [1].

For example, this array from 'drivers/hwmon/ina2xx.c' &
'drivers/iio/adc/ina2xx-adc.c' drivers.

const int ina226_avg_tab[] = { 1, 4, 16, 64, 128, 256, 512, 1024 };

While running the kunit test (for 'ina226_avg_tab'):
*  idx = find_closest([-1 to 2], ina226_avg_tab, ARRAY_SIZE(ina226_avg_tab));
   This returns idx == 0, so value 1
* idx = find_closest(3, ina226_avg_tab, ARRAY_SIZE(ina226_avg_tab));
   This returns idx == 0, value 1; and now one could argue whether 3
is closer to 4 or to 1.
   This quirk only appears for value '3' in this array.
   And from here-on the current find_closest() works fine (one gets
what one would expect).

-------------------------------------------------------------------------------------------

The above is one way of trying to explain the bug.
The other way I am trying to explain this bug is through the kunit in
the 2nd patch of this series.

But to answer one question above: this bug exists since the first
introduction of "find_closest()".
Circa kernel version 4.1
And it only affects several corner cases (of arrays).

Fixes: 95d119528b0b ("util_macros.h: add find_closest() macro")

>
> Thanks.
Andrew Morton Nov. 1, 2024, 8:04 p.m. UTC | #3
On Fri, 1 Nov 2024 11:07:04 +0200 Alexandru Ardelean <aardelean@baylibre.com> wrote:

> I can also do a V2.

That would be best, thanks.
Alexandru Ardelean Nov. 5, 2024, 7:15 a.m. UTC | #4
On Fri, Nov 1, 2024 at 10:05 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Fri, 1 Nov 2024 11:07:04 +0200 Alexandru Ardelean <aardelean@baylibre.com> wrote:
>
> > I can also do a V2.
>
> That would be best, thanks.

Still working on the V2.
I ended up down a rabbit hole of signed-vs-unsigned arrays and X
(value to search for).
Will try to resolve this as elegantly as possible also with a kunit test.
diff mbox series

Patch

diff --git a/include/linux/util_macros.h b/include/linux/util_macros.h
index 6bb460c3e818..60c74770b703 100644
--- a/include/linux/util_macros.h
+++ b/include/linux/util_macros.h
@@ -7,12 +7,18 @@ 
 #define __find_closest(x, a, as, op)					\
 ({									\
 	typeof(as) __fc_i, __fc_as = (as) - 1;				\
-	typeof(x) __fc_x = (x);						\
+	typeof(x) __fc_mid_x, __fc_x = (x);				\
+	typeof(x) __fc_left, __fc_right;				\
 	typeof(*a) const *__fc_a = (a);					\
 	for (__fc_i = 0; __fc_i < __fc_as; __fc_i++) {			\
-		if (__fc_x op DIV_ROUND_CLOSEST(__fc_a[__fc_i] +	\
-						__fc_a[__fc_i + 1], 2))	\
+		__fc_mid_x = (__fc_a[__fc_i] + __fc_a[__fc_i + 1]) / 2;	\
+		if (__fc_x op __fc_mid_x) {				\
+			__fc_left = __fc_mid_x - __fc_a[__fc_i];	\
+			__fc_right = __fc_a[__fc_i + 1] - __fc_mid_x;	\
+			if (__fc_right < __fc_left)			\
+				__fc_i++;				\
 			break;						\
+		}							\
 	}								\
 	(__fc_i);							\
 })
@@ -38,7 +44,7 @@ 
  * Similar to find_closest() but 'a' is expected to be sorted in descending
  * order.
  */
-#define find_closest_descending(x, a, as) __find_closest(x, a, as, >=)
+#define find_closest_descending(x, a, as) __find_closest(x, a, as, >)
 
 /**
  * is_insidevar - check if the @ptr points inside the @var memory range.