diff mbox series

[V2,blktests] block: add regression test for null-blk concurrently power/submit_queues test

Message ID 20240605010542.216971-1-yi.zhang@redhat.com (mailing list archive)
State New, archived
Headers show
Series [V2,blktests] block: add regression test for null-blk concurrently power/submit_queues test | expand

Commit Message

Yi Zhang June 5, 2024, 1:05 a.m. UTC
null-blk currently power/submit_queues operations which will lead kernel
null-ptr-dereference[1], add one regression test for it and the fix has
been merged to v6.10-rc1 by [2].
[1]
https://lore.kernel.org/linux-block/CAHj4cs9LgsHLnjg8z06LQ3Pr5cax-+Ps+xT7AP7TPnEjStuwZA@mail.gmail.com/
https://lore.kernel.org/linux-block/20240523153934.1937851-1-yukuai1@huaweicloud.com/
[2]
commit a2db328b0839 ("null_blk: fix null-ptr-dereference while configuring 'power' and 'submit_queues'")

Signed-off-by: Yi Zhang <yi.zhang@redhat.com>
---
 tests/block/038     | 61 +++++++++++++++++++++++++++++++++++++++++++++
 tests/block/038.out |  2 ++
 2 files changed, 63 insertions(+)
 create mode 100755 tests/block/038
 create mode 100644 tests/block/038.out

Comments

Chaitanya Kulkarni June 5, 2024, 5:30 a.m. UTC | #1
On 6/4/24 18:05, Yi Zhang wrote:
> null-blk currently power/submit_queues operations which will lead kernel
> null-ptr-dereference[1], add one regression test for it and the fix has
> been merged to v6.10-rc1 by [2].
> [1]
> https://lore.kernel.org/linux-block/CAHj4cs9LgsHLnjg8z06LQ3Pr5cax-+Ps+xT7AP7TPnEjStuwZA@mail.gmail.com/
> https://lore.kernel.org/linux-block/20240523153934.1937851-1-yukuai1@huaweicloud.com/
> [2]
> commit a2db328b0839 ("null_blk: fix null-ptr-dereference while configuring 'power' and 'submit_queues'")
>
> Signed-off-by: Yi Zhang<yi.zhang@redhat.com>

Thanks for the test, looks good, I wonder what other drivers
might need such test ...

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>

-ck
Shinichiro Kawasaki June 5, 2024, 11:52 a.m. UTC | #2
On Jun 04, 2024 / 21:05, Yi Zhang wrote:
> null-blk currently power/submit_queues operations which will lead kernel
> null-ptr-dereference[1], add one regression test for it and the fix has
> been merged to v6.10-rc1 by [2].
> [1]
> https://lore.kernel.org/linux-block/CAHj4cs9LgsHLnjg8z06LQ3Pr5cax-+Ps+xT7AP7TPnEjStuwZA@mail.gmail.com/
> https://lore.kernel.org/linux-block/20240523153934.1937851-1-yukuai1@huaweicloud.com/
> [2]
> commit a2db328b0839 ("null_blk: fix null-ptr-dereference while configuring 'power' and 'submit_queues'")

Thank you Yi. I ran the test case and it looks working good. It passes with
the kernel v6.10-rc2. It causes the hang with the kernel v6.9. To not confuse
blktests users with the hang, I will wait for the commit a2db328b0839 to land on
the stable kernels before I apply this patch.

One more thing I noticed is that your current patch requires loadable null_blk.
To allow it run with built-in null_blk, I would like to suggest additional
change below on top of your patch. It,

- calls _have_null_blk instead of _have_module null_blk,
- checks submit_queues parameter with _have_null_blk_feature instead of
  _have_module_param,
- does not call _init_null_blk, and
- uses nullb1 instead for nullb0.

Please let me know your thought about these changes. If you are ok with them, I
can fold them in the commit.

diff --git a/tests/block/038 b/tests/block/038
index fe3c7cd..56272be 100755
--- a/tests/block/038
+++ b/tests/block/038
@@ -12,9 +12,10 @@ DESCRIPTION="Test null-blk concurrent power/submit_queues operations"
 QUICK=1
 
 requires() {
-	_have_module null_blk
-	_have_module_param null_blk nr_devices
-	_have_module_param null_blk submit_queues
+	_have_null_blk
+	if ! _have_null_blk_feature submit_queues; then
+		SKIP_REASONS+=("null_blk does not support submit_queues")
+	fi
 }
 
 null_blk_power_loop() {
@@ -36,23 +37,15 @@ null_blk_submit_queues_loop() {
 test() {
 	echo "Running ${TEST_NAME}"
 
-	local nullb_params=(
-		nr_devices=0
-	)
-	if ! _init_null_blk "${nullb_params[@]}"; then
-		echo "Loading null_blk failed"
-		return 1
-	fi
-
-	if ! _configure_null_blk nullb0; then
-		echo "Configuring null_blk nullb0 failed"
+	if ! _configure_null_blk nullb1; then
+		echo "Configuring null_blk nullb1 failed"
 		return 1
 	fi
 
 	# fire off two null-blk power/submit_queues concurrently and wait
 	# for them to complete...
-	null_blk_power_loop nullb0 &
-	null_blk_submit_queues_loop nullb0 &
+	null_blk_power_loop nullb1 &
+	null_blk_submit_queues_loop nullb1 &
 	wait
 
 	_exit_null_blk
Yi Zhang June 5, 2024, 12:01 p.m. UTC | #3
On Wed, Jun 5, 2024 at 7:52 PM Shinichiro Kawasaki
<shinichiro.kawasaki@wdc.com> wrote:
>
> On Jun 04, 2024 / 21:05, Yi Zhang wrote:
> > null-blk currently power/submit_queues operations which will lead kernel
> > null-ptr-dereference[1], add one regression test for it and the fix has
> > been merged to v6.10-rc1 by [2].
> > [1]
> > https://lore.kernel.org/linux-block/CAHj4cs9LgsHLnjg8z06LQ3Pr5cax-+Ps+xT7AP7TPnEjStuwZA@mail.gmail.com/
> > https://lore.kernel.org/linux-block/20240523153934.1937851-1-yukuai1@huaweicloud.com/
> > [2]
> > commit a2db328b0839 ("null_blk: fix null-ptr-dereference while configuring 'power' and 'submit_queues'")
>
> Thank you Yi. I ran the test case and it looks working good. It passes with
> the kernel v6.10-rc2. It causes the hang with the kernel v6.9. To not confuse
> blktests users with the hang, I will wait for the commit a2db328b0839 to land on
> the stable kernels before I apply this patch.
>
> One more thing I noticed is that your current patch requires loadable null_blk.
> To allow it run with built-in null_blk, I would like to suggest additional
> change below on top of your patch. It,
>
> - calls _have_null_blk instead of _have_module null_blk,
> - checks submit_queues parameter with _have_null_blk_feature instead of
>   _have_module_param,
> - does not call _init_null_blk, and
> - uses nullb1 instead for nullb0.
>
> Please let me know your thought about these changes. If you are ok with them, I
> can fold them in the commit.

Sure, I'm OK with the change, thanks. :)

>
> diff --git a/tests/block/038 b/tests/block/038
> index fe3c7cd..56272be 100755
> --- a/tests/block/038
> +++ b/tests/block/038
> @@ -12,9 +12,10 @@ DESCRIPTION="Test null-blk concurrent power/submit_queues operations"
>  QUICK=1
>
>  requires() {
> -       _have_module null_blk
> -       _have_module_param null_blk nr_devices
> -       _have_module_param null_blk submit_queues
> +       _have_null_blk
> +       if ! _have_null_blk_feature submit_queues; then
> +               SKIP_REASONS+=("null_blk does not support submit_queues")
> +       fi
>  }
>
>  null_blk_power_loop() {
> @@ -36,23 +37,15 @@ null_blk_submit_queues_loop() {
>  test() {
>         echo "Running ${TEST_NAME}"
>
> -       local nullb_params=(
> -               nr_devices=0
> -       )
> -       if ! _init_null_blk "${nullb_params[@]}"; then
> -               echo "Loading null_blk failed"
> -               return 1
> -       fi
> -
> -       if ! _configure_null_blk nullb0; then
> -               echo "Configuring null_blk nullb0 failed"
> +       if ! _configure_null_blk nullb1; then
> +               echo "Configuring null_blk nullb1 failed"
>                 return 1
>         fi
>
>         # fire off two null-blk power/submit_queues concurrently and wait
>         # for them to complete...
> -       null_blk_power_loop nullb0 &
> -       null_blk_submit_queues_loop nullb0 &
> +       null_blk_power_loop nullb1 &
> +       null_blk_submit_queues_loop nullb1 &
>         wait
>
>         _exit_null_blk
>
Shinichiro Kawasaki June 18, 2024, 5:31 a.m. UTC | #4
On Jun 05, 2024 / 20:01, Yi Zhang wrote:
> On Wed, Jun 5, 2024 at 7:52 PM Shinichiro Kawasaki
> <shinichiro.kawasaki@wdc.com> wrote:
> >
> > On Jun 04, 2024 / 21:05, Yi Zhang wrote:
> > > null-blk currently power/submit_queues operations which will lead kernel
> > > null-ptr-dereference[1], add one regression test for it and the fix has
> > > been merged to v6.10-rc1 by [2].
> > > [1]
> > > https://lore.kernel.org/linux-block/CAHj4cs9LgsHLnjg8z06LQ3Pr5cax-+Ps+xT7AP7TPnEjStuwZA@mail.gmail.com/
> > > https://lore.kernel.org/linux-block/20240523153934.1937851-1-yukuai1@huaweicloud.com/
> > > [2]
> > > commit a2db328b0839 ("null_blk: fix null-ptr-dereference while configuring 'power' and 'submit_queues'")
> >
> > Thank you Yi. I ran the test case and it looks working good. It passes with
> > the kernel v6.10-rc2. It causes the hang with the kernel v6.9. To not confuse
> > blktests users with the hang, I will wait for the commit a2db328b0839 to land on
> > the stable kernels before I apply this patch.
> >
> > One more thing I noticed is that your current patch requires loadable null_blk.
> > To allow it run with built-in null_blk, I would like to suggest additional
> > change below on top of your patch. It,
> >
> > - calls _have_null_blk instead of _have_module null_blk,
> > - checks submit_queues parameter with _have_null_blk_feature instead of
> >   _have_module_param,
> > - does not call _init_null_blk, and
> > - uses nullb1 instead for nullb0.
> >
> > Please let me know your thought about these changes. If you are ok with them, I
> > can fold them in the commit.
> 
> Sure, I'm OK with the change, thanks. :)

Now the kernel side fix is in v6.9.4 kernel. I've applied this blktests patch.
Thanks! FYI, I added my modification as a following commit [*].

[*] https://github.com/osandov/blktests/commit/eba35aaf32ceb9516a7dbc856c9db4d0ae3268d0
diff mbox series

Patch

diff --git a/tests/block/038 b/tests/block/038
new file mode 100755
index 0000000..fe3c7cd
--- /dev/null
+++ b/tests/block/038
@@ -0,0 +1,61 @@ 
+#!/bin/bash
+# SPDX-License-Identifier: GPL-3.0+
+# Copyright (C) 2024 Yi Zhang <yi.zhang@redhat.com>
+#
+# Regression test for commit a2db328b0839 ("null_blk: fix null-ptr-dereference
+# while configuring 'power' and 'submit_queues'").
+
+. tests/block/rc
+. common/null_blk
+
+DESCRIPTION="Test null-blk concurrent power/submit_queues operations"
+QUICK=1
+
+requires() {
+	_have_module null_blk
+	_have_module_param null_blk nr_devices
+	_have_module_param null_blk submit_queues
+}
+
+null_blk_power_loop() {
+	local nullb="$1"
+	for ((i = 1; i <= 200; i++)); do
+		echo 1 > "/sys/kernel/config/nullb/${nullb}/power"
+		echo 0 > "/sys/kernel/config/nullb/${nullb}/power"
+	done
+}
+
+null_blk_submit_queues_loop() {
+	local nullb="$1"
+	for ((i = 1; i <= 200; i++)); do
+		echo 1 > "/sys/kernel/config/nullb/${nullb}/submit_queues"
+		echo 4 > "/sys/kernel/config/nullb/${nullb}/submit_queues"
+	done
+}
+
+test() {
+	echo "Running ${TEST_NAME}"
+
+	local nullb_params=(
+		nr_devices=0
+	)
+	if ! _init_null_blk "${nullb_params[@]}"; then
+		echo "Loading null_blk failed"
+		return 1
+	fi
+
+	if ! _configure_null_blk nullb0; then
+		echo "Configuring null_blk nullb0 failed"
+		return 1
+	fi
+
+	# fire off two null-blk power/submit_queues concurrently and wait
+	# for them to complete...
+	null_blk_power_loop nullb0 &
+	null_blk_submit_queues_loop nullb0 &
+	wait
+
+	_exit_null_blk
+
+	echo "Test complete"
+}
diff --git a/tests/block/038.out b/tests/block/038.out
new file mode 100644
index 0000000..aebde64
--- /dev/null
+++ b/tests/block/038.out
@@ -0,0 +1,2 @@ 
+Running block/038
+Test complete