diff mbox series

[3/5] lib/bitmap: add test for bitmap_{from,to}_arr64

Message ID 20220428205116.861003-4-yury.norov@gmail.com (mailing list archive)
State New, archived
Headers show
Series bitmap: fix conversion from/to fix-sized arrays | expand

Commit Message

Yury Norov April 28, 2022, 8:51 p.m. UTC
Test newly added bitmap_{from,to}_arr64() functions similarly to
already existing bitmap_{from,to}_arr32() tests.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 lib/test_bitmap.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

Comments

Guenter Roeck May 19, 2022, 3:09 p.m. UTC | #1
On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
> Test newly added bitmap_{from,to}_arr64() functions similarly to
> already existing bitmap_{from,to}_arr32() tests.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

With this patch in linux-next (including next-20220519), I see lots of
bitmap test errors when booting 32-bit ppc images in qemu. Examples:

test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0", got "0,65"
...
test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128"
test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128-129"
test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128-130"
...
test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65,128-143", got "0,65,128-143,208-209"
test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65,128-143", got "0,65,128-143,208-210"

and so on. It only  gets worse from there, and ends with:

test_bitmap: parselist: 14: input is '0-2047:128/256' OK, Time: 4274
test_bitmap: bitmap_print_to_pagebuf: input is '0-32767
', Time: 127267
test_bitmap: failed 337 out of 3801 tests

Other architectures and 64-bit ppc builds seem to be fine. 

Guenter
Yury Norov May 19, 2022, 4:01 p.m. UTC | #2
On Thu, May 19, 2022 at 8:09 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
> > Test newly added bitmap_{from,to}_arr64() functions similarly to
> > already existing bitmap_{from,to}_arr32() tests.
> >
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
>
> With this patch in linux-next (including next-20220519), I see lots of
> bitmap test errors when booting 32-bit ppc images in qemu. Examples:
>
> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0", got "0,65"
> ...
> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128"
> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128-129"
> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128-130"
> ...
> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65,128-143", got "0,65,128-143,208-209"
> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65,128-143", got "0,65,128-143,208-210"
>
> and so on. It only  gets worse from there, and ends with:
>
> test_bitmap: parselist: 14: input is '0-2047:128/256' OK, Time: 4274
> test_bitmap: bitmap_print_to_pagebuf: input is '0-32767
> ', Time: 127267
> test_bitmap: failed 337 out of 3801 tests
>
> Other architectures and 64-bit ppc builds seem to be fine.

Hi Guenter,

Thanks for letting me know. It's really weird because it has already
been for 2 weeks
in next with no issues. But I tested it on mips32, not powerpc. I'll
check what happens
there.

Can you please share your config and qemu image if possible?

Thanks,
Yury
Guenter Roeck May 19, 2022, 6:04 p.m. UTC | #3
On 5/19/22 09:01, Yury Norov wrote:
> On Thu, May 19, 2022 at 8:09 AM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
>>> Test newly added bitmap_{from,to}_arr64() functions similarly to
>>> already existing bitmap_{from,to}_arr32() tests.
>>>
>>> Signed-off-by: Yury Norov <yury.norov@gmail.com>
>>
>> With this patch in linux-next (including next-20220519), I see lots of
>> bitmap test errors when booting 32-bit ppc images in qemu. Examples:
>>
>> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0", got "0,65"
>> ...
>> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128"
>> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128-129"
>> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128-130"
>> ...
>> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65,128-143", got "0,65,128-143,208-209"
>> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65,128-143", got "0,65,128-143,208-210"
>>
>> and so on. It only  gets worse from there, and ends with:
>>
>> test_bitmap: parselist: 14: input is '0-2047:128/256' OK, Time: 4274
>> test_bitmap: bitmap_print_to_pagebuf: input is '0-32767
>> ', Time: 127267
>> test_bitmap: failed 337 out of 3801 tests
>>
>> Other architectures and 64-bit ppc builds seem to be fine.
> 
> Hi Guenter,
> 
> Thanks for letting me know. It's really weird because it has already
> been for 2 weeks
> in next with no issues. But I tested it on mips32, not powerpc. I'll
> check what happens
> there.
> 
Oh, I have seen the problem for a while, it is just that -next is in
such a bad shape that it is difficult to bisect individual problems.

> Can you please share your config and qemu image if possible?
> 

First, you have to revert commit b033767848c411
("powerpc/code-patching: Use jump_label for testing freed initmem")
to avoid a crash. After that, a recent version of qemu should work
with the following command line.

qemu-system-ppc -kernel arch/powerpc/boot/uImage -M mpc8544ds \
	-m 256 -no-reboot -initrd rootfs.cpio \
	--append "rdinit=/sbin/init coherent_pool=512k mem=256M console=ttyS0" \
	-monitor none -nographic

Configuration is mpc85xx_defconfig with CONFIG_TEST_BITMAP enabled.
I used the root file system (initrd) from
https://github.com/groeck/linux-build-test/blob/master/rootfs/ppc/rootfs.cpio.gz

Hope this helps,
Guenter
Yury Norov May 20, 2022, 4:18 p.m. UTC | #4
On Thu, May 19, 2022 at 11:04 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 5/19/22 09:01, Yury Norov wrote:
> > On Thu, May 19, 2022 at 8:09 AM Guenter Roeck <linux@roeck-us.net> wrote:
> >>
> >> On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
> >>> Test newly added bitmap_{from,to}_arr64() functions similarly to
> >>> already existing bitmap_{from,to}_arr32() tests.
> >>>
> >>> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> >>
> >> With this patch in linux-next (including next-20220519), I see lots of
> >> bitmap test errors when booting 32-bit ppc images in qemu. Examples:
> >>
> >> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0", got "0,65"
> >> ...
> >> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128"
> >> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128-129"
> >> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128-130"
> >> ...
> >> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65,128-143", got "0,65,128-143,208-209"
> >> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65,128-143", got "0,65,128-143,208-210"
> >>
> >> and so on. It only  gets worse from there, and ends with:
> >>
> >> test_bitmap: parselist: 14: input is '0-2047:128/256' OK, Time: 4274
> >> test_bitmap: bitmap_print_to_pagebuf: input is '0-32767
> >> ', Time: 127267
> >> test_bitmap: failed 337 out of 3801 tests
> >>
> >> Other architectures and 64-bit ppc builds seem to be fine.
> >
> > Hi Guenter,
> >
> > Thanks for letting me know. It's really weird because it has already
> > been for 2 weeks
> > in next with no issues. But I tested it on mips32, not powerpc. I'll
> > check what happens
> > there.
> >
> Oh, I have seen the problem for a while, it is just that -next is in
> such a bad shape that it is difficult to bisect individual problems.
>
> > Can you please share your config and qemu image if possible?
> >
>
> First, you have to revert commit b033767848c411
> ("powerpc/code-patching: Use jump_label for testing freed initmem")
> to avoid a crash. After that, a recent version of qemu should work
> with the following command line.
>
> qemu-system-ppc -kernel arch/powerpc/boot/uImage -M mpc8544ds \
>         -m 256 -no-reboot -initrd rootfs.cpio \
>         --append "rdinit=/sbin/init coherent_pool=512k mem=256M console=ttyS0" \
>         -monitor none -nographic
>
> Configuration is mpc85xx_defconfig with CONFIG_TEST_BITMAP enabled.
> I used the root file system (initrd) from
> https://github.com/groeck/linux-build-test/blob/master/rootfs/ppc/rootfs.cpio.gz

Yes, that helped a lot. Thanks, I was able to reproduce it. I'll take
a look shortly.

Thanks,
Yury
Yury Norov May 21, 2022, 7:38 a.m. UTC | #5
On Thu, May 19, 2022 at 08:09:29AM -0700, Guenter Roeck wrote:
> On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
> > Test newly added bitmap_{from,to}_arr64() functions similarly to
> > already existing bitmap_{from,to}_arr32() tests.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> 
> With this patch in linux-next (including next-20220519), I see lots of
> bitmap test errors when booting 32-bit ppc images in qemu. Examples:
> 
> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0", got "0,65"
> ...
> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128"
> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128-129"
> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65", got "0,65,128-130"
> ...
> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65,128-143", got "0,65,128-143,208-209"
> test_bitmap: [lib/test_bitmap.c:600] bitmaps contents differ: expected "0,65,128-143", got "0,65,128-143,208-210"
> 
> and so on. It only  gets worse from there, and ends with:
> 
> test_bitmap: parselist: 14: input is '0-2047:128/256' OK, Time: 4274
> test_bitmap: bitmap_print_to_pagebuf: input is '0-32767
> ', Time: 127267
> test_bitmap: failed 337 out of 3801 tests
> 
> Other architectures and 64-bit ppc builds seem to be fine. 

So, the problem is in previous patch. I'll fold-in the following fix
into that if no objections.

diff --git a/lib/bitmap.c b/lib/bitmap.c
index 52b9912a71ea..97c14845f452 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -1585,7 +1585,7 @@ void bitmap_to_arr64(u64 *buf, const unsigned long *bitmap, unsigned int nbits)

        /* Clear tail bits in the last element of array beyond nbits. */
        if (nbits % 64)
-               buf[-1] &= GENMASK_ULL(nbits, 0);
+               buf[-1] &= GENMASK_ULL(nbits % 64, 0);
 }
 EXPORT_SYMBOL(bitmap_to_arr64);
 #endif
Guenter Roeck Feb. 25, 2023, 6:47 p.m. UTC | #6
Hi,

On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
> Test newly added bitmap_{from,to}_arr64() functions similarly to
> already existing bitmap_{from,to}_arr32() tests.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Ever since this test is in the tree, several of my boot tests show
lots of messages such as

test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
test_bitmap: bitmap_to_arr64(nbits == 2): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000003)
test_bitmap: bitmap_to_arr64(nbits == 3): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000007)
...
test_bitmap: bitmap_to_arr64(nbits == 927): tail is not safely cleared: 0xa5a5a5a500000000 (must be 0x000000007fffffff)
test_bitmap: bitmap_to_arr64(nbits == 928): tail is not safely cleared: 0xa5a5a5a580000000 (must be 0x00000000ffffffff)

but then:

test_bitmap: all 6550 tests passed

The message suggests an error, given that it is displayed with pr_err,
but the summary suggests otherwise.

Is the message just noise, or is there a problem ?

Thanks,
Guenter

> ---
>  lib/test_bitmap.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
> index 0c82f07f74fc..d5923a640457 100644
> --- a/lib/test_bitmap.c
> +++ b/lib/test_bitmap.c
> @@ -585,6 +585,30 @@ static void __init test_bitmap_arr32(void)
>  	}
>  }
>  
> +static void __init test_bitmap_arr64(void)
> +{
> +	unsigned int nbits, next_bit;
> +	u64 arr[EXP1_IN_BITS / 64];
> +	DECLARE_BITMAP(bmap2, EXP1_IN_BITS);
> +
> +	memset(arr, 0xa5, sizeof(arr));
> +
> +	for (nbits = 0; nbits < EXP1_IN_BITS; ++nbits) {
> +		memset(bmap2, 0xff, sizeof(arr));
> +		bitmap_to_arr64(arr, exp1, nbits);
> +		bitmap_from_arr64(bmap2, arr, nbits);
> +		expect_eq_bitmap(bmap2, exp1, nbits);
> +
> +		next_bit = find_next_bit(bmap2, round_up(nbits, BITS_PER_LONG), nbits);
> +		if (next_bit < round_up(nbits, BITS_PER_LONG))
> +			pr_err("bitmap_copy_arr64(nbits == %d:"
> +				" tail is not safely cleared: %d\n", nbits, next_bit);
> +
> +		if (nbits < EXP1_IN_BITS - 64)
> +			expect_eq_uint(arr[DIV_ROUND_UP(nbits, 64)], 0xa5a5a5a5);
> +	}
> +}
> +
>  static void noinline __init test_mem_optimisations(void)
>  {
>  	DECLARE_BITMAP(bmap1, 1024);
> @@ -852,6 +876,7 @@ static void __init selftest(void)
>  	test_copy();
>  	test_replace();
>  	test_bitmap_arr32();
> +	test_bitmap_arr64();
>  	test_bitmap_parse();
>  	test_bitmap_parselist();
>  	test_bitmap_printlist();
> -- 
> 2.32.0
>
Yury Norov Feb. 26, 2023, 12:04 a.m. UTC | #7
On Sat, Feb 25, 2023 at 10:47:02AM -0800, Guenter Roeck wrote:
> Hi,
> 
> On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
> > Test newly added bitmap_{from,to}_arr64() functions similarly to
> > already existing bitmap_{from,to}_arr32() tests.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> 
> Ever since this test is in the tree, several of my boot tests show
> lots of messages such as
> 
> test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
> test_bitmap: bitmap_to_arr64(nbits == 2): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000003)
> test_bitmap: bitmap_to_arr64(nbits == 3): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000007)
> ...
> test_bitmap: bitmap_to_arr64(nbits == 927): tail is not safely cleared: 0xa5a5a5a500000000 (must be 0x000000007fffffff)
> test_bitmap: bitmap_to_arr64(nbits == 928): tail is not safely cleared: 0xa5a5a5a580000000 (must be 0x00000000ffffffff)

This may be a real problem. Can you share what's the system is? What's
endianness and register length?

+ Alexander Lobakin, the author of the exact subtest.
 
> but then:
> 
> test_bitmap: all 6550 tests passed

It's because corresponding error path doesn't increment failed_tests
counter. I'll send a fix shortly.

> 
> The message suggests an error, given that it is displayed with pr_err,
> but the summary suggests otherwise.
> 
> Is the message just noise, or is there a problem ?
> 
> Thanks,
> Guenter
> 
> > ---
> >  lib/test_bitmap.c | 25 +++++++++++++++++++++++++
> >  1 file changed, 25 insertions(+)
> > 
> > diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
> > index 0c82f07f74fc..d5923a640457 100644
> > --- a/lib/test_bitmap.c
> > +++ b/lib/test_bitmap.c
> > @@ -585,6 +585,30 @@ static void __init test_bitmap_arr32(void)
> >  	}
> >  }
> >  
> > +static void __init test_bitmap_arr64(void)
> > +{
> > +	unsigned int nbits, next_bit;
> > +	u64 arr[EXP1_IN_BITS / 64];
> > +	DECLARE_BITMAP(bmap2, EXP1_IN_BITS);
> > +
> > +	memset(arr, 0xa5, sizeof(arr));
> > +
> > +	for (nbits = 0; nbits < EXP1_IN_BITS; ++nbits) {
> > +		memset(bmap2, 0xff, sizeof(arr));
> > +		bitmap_to_arr64(arr, exp1, nbits);
> > +		bitmap_from_arr64(bmap2, arr, nbits);
> > +		expect_eq_bitmap(bmap2, exp1, nbits);
> > +
> > +		next_bit = find_next_bit(bmap2, round_up(nbits, BITS_PER_LONG), nbits);
> > +		if (next_bit < round_up(nbits, BITS_PER_LONG))
> > +			pr_err("bitmap_copy_arr64(nbits == %d:"
> > +				" tail is not safely cleared: %d\n", nbits, next_bit);
> > +
> > +		if (nbits < EXP1_IN_BITS - 64)
> > +			expect_eq_uint(arr[DIV_ROUND_UP(nbits, 64)], 0xa5a5a5a5);
> > +	}
> > +}
> > +
> >  static void noinline __init test_mem_optimisations(void)
> >  {
> >  	DECLARE_BITMAP(bmap1, 1024);
> > @@ -852,6 +876,7 @@ static void __init selftest(void)
> >  	test_copy();
> >  	test_replace();
> >  	test_bitmap_arr32();
> > +	test_bitmap_arr64();
> >  	test_bitmap_parse();
> >  	test_bitmap_parselist();
> >  	test_bitmap_printlist();
> > -- 
> > 2.32.0
> >
Yury Norov Feb. 26, 2023, 12:06 a.m. UTC | #8
On Sat, Feb 25, 2023 at 04:05:02PM -0800, Yury Norov wrote:
> On Sat, Feb 25, 2023 at 10:47:02AM -0800, Guenter Roeck wrote:
> > Hi,
> > 
> > On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
> > > Test newly added bitmap_{from,to}_arr64() functions similarly to
> > > already existing bitmap_{from,to}_arr32() tests.
> > > 
> > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > 
> > Ever since this test is in the tree, several of my boot tests show
> > lots of messages such as
> > 
> > test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
> > test_bitmap: bitmap_to_arr64(nbits == 2): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000003)
> > test_bitmap: bitmap_to_arr64(nbits == 3): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000007)
> > ...
> > test_bitmap: bitmap_to_arr64(nbits == 927): tail is not safely cleared: 0xa5a5a5a500000000 (must be 0x000000007fffffff)
> > test_bitmap: bitmap_to_arr64(nbits == 928): tail is not safely cleared: 0xa5a5a5a580000000 (must be 0x00000000ffffffff)
> 
> This may be a real problem. Can you share what's the system is? What's
> endianness and register length?
> 
> + Alexander Lobakin, the author of the exact subtest.

Forgot to add
  
> > but then:
> > 
> > test_bitmap: all 6550 tests passed
> 
> It's because corresponding error path doesn't increment failed_tests
> counter. I'll send a fix shortly.
> 
> > 
> > The message suggests an error, given that it is displayed with pr_err,
> > but the summary suggests otherwise.
> > 
> > Is the message just noise, or is there a problem ?
> > 
> > Thanks,
> > Guenter
> > 
> > > ---
> > >  lib/test_bitmap.c | 25 +++++++++++++++++++++++++
> > >  1 file changed, 25 insertions(+)
> > > 
> > > diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
> > > index 0c82f07f74fc..d5923a640457 100644
> > > --- a/lib/test_bitmap.c
> > > +++ b/lib/test_bitmap.c
> > > @@ -585,6 +585,30 @@ static void __init test_bitmap_arr32(void)
> > >  	}
> > >  }
> > >  
> > > +static void __init test_bitmap_arr64(void)
> > > +{
> > > +	unsigned int nbits, next_bit;
> > > +	u64 arr[EXP1_IN_BITS / 64];
> > > +	DECLARE_BITMAP(bmap2, EXP1_IN_BITS);
> > > +
> > > +	memset(arr, 0xa5, sizeof(arr));
> > > +
> > > +	for (nbits = 0; nbits < EXP1_IN_BITS; ++nbits) {
> > > +		memset(bmap2, 0xff, sizeof(arr));
> > > +		bitmap_to_arr64(arr, exp1, nbits);
> > > +		bitmap_from_arr64(bmap2, arr, nbits);
> > > +		expect_eq_bitmap(bmap2, exp1, nbits);
> > > +
> > > +		next_bit = find_next_bit(bmap2, round_up(nbits, BITS_PER_LONG), nbits);
> > > +		if (next_bit < round_up(nbits, BITS_PER_LONG))
> > > +			pr_err("bitmap_copy_arr64(nbits == %d:"
> > > +				" tail is not safely cleared: %d\n", nbits, next_bit);
> > > +
> > > +		if (nbits < EXP1_IN_BITS - 64)
> > > +			expect_eq_uint(arr[DIV_ROUND_UP(nbits, 64)], 0xa5a5a5a5);
> > > +	}
> > > +}
> > > +
> > >  static void noinline __init test_mem_optimisations(void)
> > >  {
> > >  	DECLARE_BITMAP(bmap1, 1024);
> > > @@ -852,6 +876,7 @@ static void __init selftest(void)
> > >  	test_copy();
> > >  	test_replace();
> > >  	test_bitmap_arr32();
> > > +	test_bitmap_arr64();
> > >  	test_bitmap_parse();
> > >  	test_bitmap_parselist();
> > >  	test_bitmap_printlist();
> > > -- 
> > > 2.32.0
> > >
Guenter Roeck Feb. 26, 2023, 12:42 a.m. UTC | #9
On 2/25/23 16:04, Yury Norov wrote:
> On Sat, Feb 25, 2023 at 10:47:02AM -0800, Guenter Roeck wrote:
>> Hi,
>>
>> On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
>>> Test newly added bitmap_{from,to}_arr64() functions similarly to
>>> already existing bitmap_{from,to}_arr32() tests.
>>>
>>> Signed-off-by: Yury Norov <yury.norov@gmail.com>
>>
>> Ever since this test is in the tree, several of my boot tests show
>> lots of messages such as
>>
>> test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
>> test_bitmap: bitmap_to_arr64(nbits == 2): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000003)
>> test_bitmap: bitmap_to_arr64(nbits == 3): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000007)
>> ...
>> test_bitmap: bitmap_to_arr64(nbits == 927): tail is not safely cleared: 0xa5a5a5a500000000 (must be 0x000000007fffffff)
>> test_bitmap: bitmap_to_arr64(nbits == 928): tail is not safely cleared: 0xa5a5a5a580000000 (must be 0x00000000ffffffff)
> 
> This may be a real problem. Can you share what's the system is? What's
> endianness and register length?
> 

https://kerneltests.org/builders/qemu-arm-v7-master/builds/532/steps/qemubuildcommand/logs/stdio
https://kerneltests.org/builders/qemu-arm-next/builds/2086/steps/qemubuildcommand/logs/stdio

are examples from different arm machines.

https://kerneltests.org/builders/qemu-xtensa-master/builds/2178/steps/qemubuildcommand/logs/stdio

is an example from xtensa.

https://kerneltests.org/builders/qemu-x86-master/builds/2248/steps/qemubuildcommand/logs/stdio

is an example from a 32-bit x86 test.

I am sure there are others; I only notice the messages if there is some other problem,
and came to believe that they must be noise because it happens across several
architectures and because it claimed at the end that all tests passed.

Just to confirm, I ran a test with big and little endian mips machines.
The problem is only seen with the little endian 32-bit machine. It is not
seen with the 32-bit big endian machine, and it is not seen with any of
the 64-bit machines (big and little endian).

Guenter

> + Alexander Lobakin, the author of the exact subtest.
>   
>> but then:
>>
>> test_bitmap: all 6550 tests passed
> 
> It's because corresponding error path doesn't increment failed_tests
> counter. I'll send a fix shortly.
> 
>>
>> The message suggests an error, given that it is displayed with pr_err,
>> but the summary suggests otherwise.
>>
>> Is the message just noise, or is there a problem ?
>>
>> Thanks,
>> Guenter
>>
>>> ---
>>>   lib/test_bitmap.c | 25 +++++++++++++++++++++++++
>>>   1 file changed, 25 insertions(+)
>>>
>>> diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
>>> index 0c82f07f74fc..d5923a640457 100644
>>> --- a/lib/test_bitmap.c
>>> +++ b/lib/test_bitmap.c
>>> @@ -585,6 +585,30 @@ static void __init test_bitmap_arr32(void)
>>>   	}
>>>   }
>>>   
>>> +static void __init test_bitmap_arr64(void)
>>> +{
>>> +	unsigned int nbits, next_bit;
>>> +	u64 arr[EXP1_IN_BITS / 64];
>>> +	DECLARE_BITMAP(bmap2, EXP1_IN_BITS);
>>> +
>>> +	memset(arr, 0xa5, sizeof(arr));
>>> +
>>> +	for (nbits = 0; nbits < EXP1_IN_BITS; ++nbits) {
>>> +		memset(bmap2, 0xff, sizeof(arr));
>>> +		bitmap_to_arr64(arr, exp1, nbits);
>>> +		bitmap_from_arr64(bmap2, arr, nbits);
>>> +		expect_eq_bitmap(bmap2, exp1, nbits);
>>> +
>>> +		next_bit = find_next_bit(bmap2, round_up(nbits, BITS_PER_LONG), nbits);
>>> +		if (next_bit < round_up(nbits, BITS_PER_LONG))
>>> +			pr_err("bitmap_copy_arr64(nbits == %d:"
>>> +				" tail is not safely cleared: %d\n", nbits, next_bit);
>>> +
>>> +		if (nbits < EXP1_IN_BITS - 64)
>>> +			expect_eq_uint(arr[DIV_ROUND_UP(nbits, 64)], 0xa5a5a5a5);
>>> +	}
>>> +}
>>> +
>>>   static void noinline __init test_mem_optimisations(void)
>>>   {
>>>   	DECLARE_BITMAP(bmap1, 1024);
>>> @@ -852,6 +876,7 @@ static void __init selftest(void)
>>>   	test_copy();
>>>   	test_replace();
>>>   	test_bitmap_arr32();
>>> +	test_bitmap_arr64();
>>>   	test_bitmap_parse();
>>>   	test_bitmap_parselist();
>>>   	test_bitmap_printlist();
>>> -- 
>>> 2.32.0
>>>
Alexander Lobakin Feb. 27, 2023, 2:46 p.m. UTC | #10
From: Yury Norov <yury.norov@gmail.com>
Date: Sat, 25 Feb 2023 16:06:45 -0800

> On Sat, Feb 25, 2023 at 04:05:02PM -0800, Yury Norov wrote:
>> On Sat, Feb 25, 2023 at 10:47:02AM -0800, Guenter Roeck wrote:
>>> Hi,
>>>
>>> On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
>>>> Test newly added bitmap_{from,to}_arr64() functions similarly to
>>>> already existing bitmap_{from,to}_arr32() tests.
>>>>
>>>> Signed-off-by: Yury Norov <yury.norov@gmail.com>
>>>
>>> Ever since this test is in the tree, several of my boot tests show
>>> lots of messages such as
>>>
>>> test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)

Hmmm, the whole 4 bytes weren't touched.

>>> test_bitmap: bitmap_to_arr64(nbits == 2): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000003)
>>> test_bitmap: bitmap_to_arr64(nbits == 3): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000007)

This is where it gets worse...

>>> ...
>>> test_bitmap: bitmap_to_arr64(nbits == 927): tail is not safely cleared: 0xa5a5a5a500000000 (must be 0x000000007fffffff)
>>> test_bitmap: bitmap_to_arr64(nbits == 928): tail is not safely cleared: 0xa5a5a5a580000000 (must be 0x00000000ffffffff)

I don't see the pattern how the actual result gets generated. But the
problem is in the bitmap code rather than in the subtest -- "must be"s
are fully correct.

Given that the 0xa5s are present in the upper 32 bits, it is Big Endian
I guess? Maybe even 32-bit Big Endian? Otherwise I'd start concerning
how comes it doesn't reproduce on x86_64s :D

>>
>> This may be a real problem. Can you share what's the system is? What's
>> endianness and register length?
>>
>> + Alexander Lobakin, the author of the exact subtest.
> 
> Forgot to add

Oh, thanks for letting me know!

>   
>>> but then:
>>>
>>> test_bitmap: all 6550 tests passed
>>
>> It's because corresponding error path doesn't increment failed_tests
>> counter. I'll send a fix shortly.

[...]

Thanks,
Olek
Guenter Roeck Feb. 27, 2023, 2:59 p.m. UTC | #11
On 2/27/23 06:46, Alexander Lobakin wrote:
> From: Yury Norov <yury.norov@gmail.com>
> Date: Sat, 25 Feb 2023 16:06:45 -0800
> 
>> On Sat, Feb 25, 2023 at 04:05:02PM -0800, Yury Norov wrote:
>>> On Sat, Feb 25, 2023 at 10:47:02AM -0800, Guenter Roeck wrote:
>>>> Hi,
>>>>
>>>> On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
>>>>> Test newly added bitmap_{from,to}_arr64() functions similarly to
>>>>> already existing bitmap_{from,to}_arr32() tests.
>>>>>
>>>>> Signed-off-by: Yury Norov <yury.norov@gmail.com>
>>>>
>>>> Ever since this test is in the tree, several of my boot tests show
>>>> lots of messages such as
>>>>
>>>> test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
> 
> Hmmm, the whole 4 bytes weren't touched.
> 
>>>> test_bitmap: bitmap_to_arr64(nbits == 2): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000003)
>>>> test_bitmap: bitmap_to_arr64(nbits == 3): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000007)
> 
> This is where it gets worse...
> 
>>>> ...
>>>> test_bitmap: bitmap_to_arr64(nbits == 927): tail is not safely cleared: 0xa5a5a5a500000000 (must be 0x000000007fffffff)
>>>> test_bitmap: bitmap_to_arr64(nbits == 928): tail is not safely cleared: 0xa5a5a5a580000000 (must be 0x00000000ffffffff)
> 
> I don't see the pattern how the actual result gets generated. But the
> problem is in the bitmap code rather than in the subtest -- "must be"s
> are fully correct.
> 
> Given that the 0xa5s are present in the upper 32 bits, it is Big Endian
> I guess? Maybe even 32-bit Big Endian? Otherwise I'd start concerning
> how comes it doesn't reproduce on x86_64s :D
> 

It does reproduce on 32-bit x86 builds, and as far as I can see
it is only seen with 32-bit little endian systems.

Guenter

>>>
>>> This may be a real problem. Can you share what's the system is? What's
>>> endianness and register length?
>>>
>>> + Alexander Lobakin, the author of the exact subtest.
>>
>> Forgot to add
> 
> Oh, thanks for letting me know!
> 
>>    
>>>> but then:
>>>>
>>>> test_bitmap: all 6550 tests passed
>>>
>>> It's because corresponding error path doesn't increment failed_tests
>>> counter. I'll send a fix shortly.
> 
> [...]
> 
> Thanks,
> Olek
Yury Norov Feb. 27, 2023, 7:24 p.m. UTC | #12
On Mon, Feb 27, 2023 at 06:59:12AM -0800, Guenter Roeck wrote:
> On 2/27/23 06:46, Alexander Lobakin wrote:
> > From: Yury Norov <yury.norov@gmail.com>
> > Date: Sat, 25 Feb 2023 16:06:45 -0800
> > 
> > > On Sat, Feb 25, 2023 at 04:05:02PM -0800, Yury Norov wrote:
> > > > On Sat, Feb 25, 2023 at 10:47:02AM -0800, Guenter Roeck wrote:
> > > > > Hi,
> > > > > 
> > > > > On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
> > > > > > Test newly added bitmap_{from,to}_arr64() functions similarly to
> > > > > > already existing bitmap_{from,to}_arr32() tests.
> > > > > > 
> > > > > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > > > 
> > > > > Ever since this test is in the tree, several of my boot tests show
> > > > > lots of messages such as
> > > > > 
> > > > > test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
> > 
> > Hmmm, the whole 4 bytes weren't touched.
> > 
> > > > > test_bitmap: bitmap_to_arr64(nbits == 2): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000003)
> > > > > test_bitmap: bitmap_to_arr64(nbits == 3): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000007)
> > 
> > This is where it gets worse...
> > 
> > > > > ...
> > > > > test_bitmap: bitmap_to_arr64(nbits == 927): tail is not safely cleared: 0xa5a5a5a500000000 (must be 0x000000007fffffff)
> > > > > test_bitmap: bitmap_to_arr64(nbits == 928): tail is not safely cleared: 0xa5a5a5a580000000 (must be 0x00000000ffffffff)
> > 
> > I don't see the pattern how the actual result gets generated. But the
> > problem is in the bitmap code rather than in the subtest -- "must be"s
> > are fully correct.
> > 
> > Given that the 0xa5s are present in the upper 32 bits, it is Big Endian
> > I guess? Maybe even 32-bit Big Endian? Otherwise I'd start concerning
> > how comes it doesn't reproduce on x86_64s :D
> > 
> 
> It does reproduce on 32-bit x86 builds, and as far as I can see
> it is only seen with 32-bit little endian systems.

Hi Guenter, Alexander,

I think that the reason for the failures like this:

> test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)

is that bitmap_to_arr64 is overly optimized for 32-bit LE architectures.

Regarding this:

> test_bitmap: bitmap_to_arr64(nbits == 927): tail is not safely cleared: 0xa5a5a5a500000000 (must be 0x000000007fffffff)

I am not sure what happens, but because this again happens on 32-bit
LE only, I hope the following fix would help too.

Can you please check if the patch works for you? I don't have a 32-bit LE
machine in hand, and all my 32-bit VMs (arm and i386) refuse to load the
latest kernels for some weird reason, so it's only build-tested.

I'll give it a full-run when restore my 32-bit setups.

Thanks,
Yury

From 2881714db497aed103e310865da075e7b0ce7e1a Mon Sep 17 00:00:00 2001
From: Yury Norov <yury.norov@gmail.com>
Date: Mon, 27 Feb 2023 09:21:59 -0800
Subject: [PATCH] lib/bitmap: drop optimization of bitmap_{from,to}_arr64

bitmap_{from,to}_arr64() optimization is overly optimistic on 32-bit LE
architectures when it's wired to bitmap_copy_clear_tail().

bitmap_copy_clear_tail() takes care of unused bits in the bitmap up to
the next word boundary. But on 32-bit machines when copying bits from
bitmap to array of 64-bit words, it's expected that the unused part of
a recipient array must be cleared up to 64-bit boundary, so the last 4
bytes may stay untouched.

While the copying part of the optimization works correct, that clear-tail
trick makes corresponding tests reasonably fail when nbits % 64 <= 32:

test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)

Fix it by removing bitmap_{from,to}_arr64() optimization for 32-bit LE
arches.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: 0a97953fd2210 ("lib: add bitmap_{from,to}_arr64")
Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/bitmap.h | 8 +++-----
 lib/bitmap.c           | 2 +-
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 40e53a2ecc0d..5abc993903fb 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -302,12 +302,10 @@ void bitmap_to_arr32(u32 *buf, const unsigned long *bitmap,
 #endif
 
 /*
- * On 64-bit systems bitmaps are represented as u64 arrays internally. On LE32
- * machines the order of hi and lo parts of numbers match the bitmap structure.
- * In both cases conversion is not needed when copying data from/to arrays of
- * u64.
+ * On 64-bit systems bitmaps are represented as u64 arrays internally. So,
+ * conversion is not needed when copying data from/to arrays of u64.
  */
-#if (BITS_PER_LONG == 32) && defined(__BIG_ENDIAN)
+#if BITS_PER_LONG == 32
 void bitmap_from_arr64(unsigned long *bitmap, const u64 *buf, unsigned int nbits);
 void bitmap_to_arr64(u64 *buf, const unsigned long *bitmap, unsigned int nbits);
 #else
diff --git a/lib/bitmap.c b/lib/bitmap.c
index 1c81413c51f8..ddb31015e38a 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -1495,7 +1495,7 @@ void bitmap_to_arr32(u32 *buf, const unsigned long *bitmap, unsigned int nbits)
 EXPORT_SYMBOL(bitmap_to_arr32);
 #endif
 
-#if (BITS_PER_LONG == 32) && defined(__BIG_ENDIAN)
+#if BITS_PER_LONG == 32
 /**
  * bitmap_from_arr64 - copy the contents of u64 array of bits to bitmap
  *	@bitmap: array of unsigned longs, the destination bitmap
Guenter Roeck Feb. 27, 2023, 8:12 p.m. UTC | #13
On 2/27/23 11:24, Yury Norov wrote:
> On Mon, Feb 27, 2023 at 06:59:12AM -0800, Guenter Roeck wrote:
>> On 2/27/23 06:46, Alexander Lobakin wrote:
>>> From: Yury Norov <yury.norov@gmail.com>
>>> Date: Sat, 25 Feb 2023 16:06:45 -0800
>>>
>>>> On Sat, Feb 25, 2023 at 04:05:02PM -0800, Yury Norov wrote:
>>>>> On Sat, Feb 25, 2023 at 10:47:02AM -0800, Guenter Roeck wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
>>>>>>> Test newly added bitmap_{from,to}_arr64() functions similarly to
>>>>>>> already existing bitmap_{from,to}_arr32() tests.
>>>>>>>
>>>>>>> Signed-off-by: Yury Norov <yury.norov@gmail.com>
>>>>>>
>>>>>> Ever since this test is in the tree, several of my boot tests show
>>>>>> lots of messages such as
>>>>>>
>>>>>> test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
>>>
>>> Hmmm, the whole 4 bytes weren't touched.
>>>
>>>>>> test_bitmap: bitmap_to_arr64(nbits == 2): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000003)
>>>>>> test_bitmap: bitmap_to_arr64(nbits == 3): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000007)
>>>
>>> This is where it gets worse...
>>>
>>>>>> ...
>>>>>> test_bitmap: bitmap_to_arr64(nbits == 927): tail is not safely cleared: 0xa5a5a5a500000000 (must be 0x000000007fffffff)
>>>>>> test_bitmap: bitmap_to_arr64(nbits == 928): tail is not safely cleared: 0xa5a5a5a580000000 (must be 0x00000000ffffffff)
>>>
>>> I don't see the pattern how the actual result gets generated. But the
>>> problem is in the bitmap code rather than in the subtest -- "must be"s
>>> are fully correct.
>>>
>>> Given that the 0xa5s are present in the upper 32 bits, it is Big Endian
>>> I guess? Maybe even 32-bit Big Endian? Otherwise I'd start concerning
>>> how comes it doesn't reproduce on x86_64s :D
>>>
>>
>> It does reproduce on 32-bit x86 builds, and as far as I can see
>> it is only seen with 32-bit little endian systems.
> 
> Hi Guenter, Alexander,
> 
> I think that the reason for the failures like this:
> 
>> test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
> 
> is that bitmap_to_arr64 is overly optimized for 32-bit LE architectures.
> 
> Regarding this:
> 
>> test_bitmap: bitmap_to_arr64(nbits == 927): tail is not safely cleared: 0xa5a5a5a500000000 (must be 0x000000007fffffff)
> 
> I am not sure what happens, but because this again happens on 32-bit
> LE only, I hope the following fix would help too.
> 
> Can you please check if the patch works for you? I don't have a 32-bit LE
> machine in hand, and all my 32-bit VMs (arm and i386) refuse to load the
> latest kernels for some weird reason, so it's only build-tested.
> 
> I'll give it a full-run when restore my 32-bit setups.
> 
> Thanks,
> Yury
> 
>>From 2881714db497aed103e310865da075e7b0ce7e1a Mon Sep 17 00:00:00 2001
> From: Yury Norov <yury.norov@gmail.com>
> Date: Mon, 27 Feb 2023 09:21:59 -0800
> Subject: [PATCH] lib/bitmap: drop optimization of bitmap_{from,to}_arr64
> 
> bitmap_{from,to}_arr64() optimization is overly optimistic on 32-bit LE
> architectures when it's wired to bitmap_copy_clear_tail().
> 
> bitmap_copy_clear_tail() takes care of unused bits in the bitmap up to
> the next word boundary. But on 32-bit machines when copying bits from
> bitmap to array of 64-bit words, it's expected that the unused part of
> a recipient array must be cleared up to 64-bit boundary, so the last 4
> bytes may stay untouched.
> 
> While the copying part of the optimization works correct, that clear-tail
> trick makes corresponding tests reasonably fail when nbits % 64 <= 32:
> 
> test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
> 
> Fix it by removing bitmap_{from,to}_arr64() optimization for 32-bit LE
> arches.
> 
> Reported-by: Guenter Roeck <linux@roeck-us.net>
> Fixes: 0a97953fd2210 ("lib: add bitmap_{from,to}_arr64")
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Tested with 32-bit i386 image. With this patch on top of
v6.2-12765-g982818426a0f, the log messages are gone. Without this patch,
they are still seen.

Tested-by: Guenter Roeck <linux@roeck-us.net>

Guenter

> ---
>   include/linux/bitmap.h | 8 +++-----
>   lib/bitmap.c           | 2 +-
>   2 files changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> index 40e53a2ecc0d..5abc993903fb 100644
> --- a/include/linux/bitmap.h
> +++ b/include/linux/bitmap.h
> @@ -302,12 +302,10 @@ void bitmap_to_arr32(u32 *buf, const unsigned long *bitmap,
>   #endif
>   
>   /*
> - * On 64-bit systems bitmaps are represented as u64 arrays internally. On LE32
> - * machines the order of hi and lo parts of numbers match the bitmap structure.
> - * In both cases conversion is not needed when copying data from/to arrays of
> - * u64.
> + * On 64-bit systems bitmaps are represented as u64 arrays internally. So,
> + * conversion is not needed when copying data from/to arrays of u64.
>    */
> -#if (BITS_PER_LONG == 32) && defined(__BIG_ENDIAN)
> +#if BITS_PER_LONG == 32
>   void bitmap_from_arr64(unsigned long *bitmap, const u64 *buf, unsigned int nbits);
>   void bitmap_to_arr64(u64 *buf, const unsigned long *bitmap, unsigned int nbits);
>   #else
> diff --git a/lib/bitmap.c b/lib/bitmap.c
> index 1c81413c51f8..ddb31015e38a 100644
> --- a/lib/bitmap.c
> +++ b/lib/bitmap.c
> @@ -1495,7 +1495,7 @@ void bitmap_to_arr32(u32 *buf, const unsigned long *bitmap, unsigned int nbits)
>   EXPORT_SYMBOL(bitmap_to_arr32);
>   #endif
>   
> -#if (BITS_PER_LONG == 32) && defined(__BIG_ENDIAN)
> +#if BITS_PER_LONG == 32
>   /**
>    * bitmap_from_arr64 - copy the contents of u64 array of bits to bitmap
>    *	@bitmap: array of unsigned longs, the destination bitmap
Yury Norov Feb. 27, 2023, 8:23 p.m. UTC | #14
On Mon, Feb 27, 2023 at 12:12:01PM -0800, Guenter Roeck wrote:
> On 2/27/23 11:24, Yury Norov wrote:
> > On Mon, Feb 27, 2023 at 06:59:12AM -0800, Guenter Roeck wrote:
> > > On 2/27/23 06:46, Alexander Lobakin wrote:
> > > > From: Yury Norov <yury.norov@gmail.com>
> > > > Date: Sat, 25 Feb 2023 16:06:45 -0800
> > > > 
> > > > > On Sat, Feb 25, 2023 at 04:05:02PM -0800, Yury Norov wrote:
> > > > > > On Sat, Feb 25, 2023 at 10:47:02AM -0800, Guenter Roeck wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > On Thu, Apr 28, 2022 at 01:51:14PM -0700, Yury Norov wrote:
> > > > > > > > Test newly added bitmap_{from,to}_arr64() functions similarly to
> > > > > > > > already existing bitmap_{from,to}_arr32() tests.
> > > > > > > > 
> > > > > > > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > > > > > 
> > > > > > > Ever since this test is in the tree, several of my boot tests show
> > > > > > > lots of messages such as
> > > > > > > 
> > > > > > > test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
> > > > 
> > > > Hmmm, the whole 4 bytes weren't touched.
> > > > 
> > > > > > > test_bitmap: bitmap_to_arr64(nbits == 2): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000003)
> > > > > > > test_bitmap: bitmap_to_arr64(nbits == 3): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000007)
> > > > 
> > > > This is where it gets worse...
> > > > 
> > > > > > > ...
> > > > > > > test_bitmap: bitmap_to_arr64(nbits == 927): tail is not safely cleared: 0xa5a5a5a500000000 (must be 0x000000007fffffff)
> > > > > > > test_bitmap: bitmap_to_arr64(nbits == 928): tail is not safely cleared: 0xa5a5a5a580000000 (must be 0x00000000ffffffff)
> > > > 
> > > > I don't see the pattern how the actual result gets generated. But the
> > > > problem is in the bitmap code rather than in the subtest -- "must be"s
> > > > are fully correct.
> > > > 
> > > > Given that the 0xa5s are present in the upper 32 bits, it is Big Endian
> > > > I guess? Maybe even 32-bit Big Endian? Otherwise I'd start concerning
> > > > how comes it doesn't reproduce on x86_64s :D
> > > > 
> > > 
> > > It does reproduce on 32-bit x86 builds, and as far as I can see
> > > it is only seen with 32-bit little endian systems.
> > 
> > Hi Guenter, Alexander,
> > 
> > I think that the reason for the failures like this:
> > 
> > > test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
> > 
> > is that bitmap_to_arr64 is overly optimized for 32-bit LE architectures.
> > 
> > Regarding this:
> > 
> > > test_bitmap: bitmap_to_arr64(nbits == 927): tail is not safely cleared: 0xa5a5a5a500000000 (must be 0x000000007fffffff)
> > 
> > I am not sure what happens, but because this again happens on 32-bit
> > LE only, I hope the following fix would help too.
> > 
> > Can you please check if the patch works for you? I don't have a 32-bit LE
> > machine in hand, and all my 32-bit VMs (arm and i386) refuse to load the
> > latest kernels for some weird reason, so it's only build-tested.
> > 
> > I'll give it a full-run when restore my 32-bit setups.
> > 
> > Thanks,
> > Yury
> > 
> > > From 2881714db497aed103e310865da075e7b0ce7e1a Mon Sep 17 00:00:00 2001
> > From: Yury Norov <yury.norov@gmail.com>
> > Date: Mon, 27 Feb 2023 09:21:59 -0800
> > Subject: [PATCH] lib/bitmap: drop optimization of bitmap_{from,to}_arr64
> > 
> > bitmap_{from,to}_arr64() optimization is overly optimistic on 32-bit LE
> > architectures when it's wired to bitmap_copy_clear_tail().
> > 
> > bitmap_copy_clear_tail() takes care of unused bits in the bitmap up to
> > the next word boundary. But on 32-bit machines when copying bits from
> > bitmap to array of 64-bit words, it's expected that the unused part of
> > a recipient array must be cleared up to 64-bit boundary, so the last 4
> > bytes may stay untouched.
> > 
> > While the copying part of the optimization works correct, that clear-tail
> > trick makes corresponding tests reasonably fail when nbits % 64 <= 32:
> > 
> > test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001)
> > 
> > Fix it by removing bitmap_{from,to}_arr64() optimization for 32-bit LE
> > arches.
> > 
> > Reported-by: Guenter Roeck <linux@roeck-us.net>
> > Fixes: 0a97953fd2210 ("lib: add bitmap_{from,to}_arr64")
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> 
> Tested with 32-bit i386 image. With this patch on top of
> v6.2-12765-g982818426a0f, the log messages are gone. Without this patch,
> they are still seen.
> 
> Tested-by: Guenter Roeck <linux@roeck-us.net>

Thanks!

Then, I'll submit it properly together with a fix for fail_counter.

Thanks,
Yury
diff mbox series

Patch

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 0c82f07f74fc..d5923a640457 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -585,6 +585,30 @@  static void __init test_bitmap_arr32(void)
 	}
 }
 
+static void __init test_bitmap_arr64(void)
+{
+	unsigned int nbits, next_bit;
+	u64 arr[EXP1_IN_BITS / 64];
+	DECLARE_BITMAP(bmap2, EXP1_IN_BITS);
+
+	memset(arr, 0xa5, sizeof(arr));
+
+	for (nbits = 0; nbits < EXP1_IN_BITS; ++nbits) {
+		memset(bmap2, 0xff, sizeof(arr));
+		bitmap_to_arr64(arr, exp1, nbits);
+		bitmap_from_arr64(bmap2, arr, nbits);
+		expect_eq_bitmap(bmap2, exp1, nbits);
+
+		next_bit = find_next_bit(bmap2, round_up(nbits, BITS_PER_LONG), nbits);
+		if (next_bit < round_up(nbits, BITS_PER_LONG))
+			pr_err("bitmap_copy_arr64(nbits == %d:"
+				" tail is not safely cleared: %d\n", nbits, next_bit);
+
+		if (nbits < EXP1_IN_BITS - 64)
+			expect_eq_uint(arr[DIV_ROUND_UP(nbits, 64)], 0xa5a5a5a5);
+	}
+}
+
 static void noinline __init test_mem_optimisations(void)
 {
 	DECLARE_BITMAP(bmap1, 1024);
@@ -852,6 +876,7 @@  static void __init selftest(void)
 	test_copy();
 	test_replace();
 	test_bitmap_arr32();
+	test_bitmap_arr64();
 	test_bitmap_parse();
 	test_bitmap_parselist();
 	test_bitmap_printlist();