block/035: add test to cover blk-cgroup vs. disk rebind

Message ID 20240407125717.4052964-1-ming.lei@redhat.com (mailing list archive)
State New, archived

Commit Message

Ming Lei April 7, 2024, 12:57 p.m. UTC
Recently it is observed that list corruption is triggered when running
scsi disk rebind in case of blk-cgroup.

Add one such test case for covering this unusual operation.

Cc: Changhui Zhong <czhong@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 tests/block/035     | 54 +++++++++++++++++++++++++++++++++++++++++++++
 tests/block/035.out |  2 ++
 2 files changed, 56 insertions(+)
 create mode 100755 tests/block/035
 create mode 100644 tests/block/035.out

Comments

Shinichiro Kawasaki April 9, 2024, 12:56 a.m. UTC | #1
On Apr 07, 2024 / 20:57, Ming Lei wrote:
> Recently it is observed that list corruption is triggered when running
> scsi disk rebind in case of blk-cgroup.
> 
> Add one such test case for covering this unusual operation.
> 
> Cc: Changhui Zhong <czhong@redhat.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Thanks for the patch. Overall it looks good to me. I confirmed that this test
case causes the system hang with v6.9-rc2 kernel and your fix patch [1] avoids
it.

[1] https://lore.kernel.org/linux-block/20240407125910.4053377-1-ming.lei@redhat.com/

As I commented in line, I will do an edit when I apply this patch. No need to
respin this patch unless someone makes other comments.

Before I apply this patch, I will wait until the kernel side fix gets
upstreamed and then downstreamed to the stable kernels, so that blktests users
won't be upset with the hang. Until then, I expect other new test cases will get
the test case number block/035. In that case, I will modify this test case
number to block/036 or 037.

> ---
>  tests/block/035     | 54 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/block/035.out |  2 ++
>  2 files changed, 56 insertions(+)
>  create mode 100755 tests/block/035
>  create mode 100644 tests/block/035.out
> 
> diff --git a/tests/block/035 b/tests/block/035
> new file mode 100755
> index 0000000..a1057a3
> --- /dev/null
> +++ b/tests/block/035
> @@ -0,0 +1,54 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-3.0+
> +# Copyright (C) 2024 Ming Lei
> +#
> +# blk-cgroup is usually initialized in disk allocation code, and
> +# de-initialized in disk release code. And scsi disk rebind needs
> +# to re-allocate/re-add disk, meantime request queue is kept as
> +# live during the whole cycle.
> +#
> +# Add this test for covering blk-cgroup & disk rebind.
> +
> +. tests/block/rc
> +. common/scsi_debug
> +. common/cgroup
> +
> +DESCRIPTION="test cgroup vs. scsi_debug rebind"
> +QUICK=1
> +
> +requires() {
> +	_have_cgroup2_controller io
> +	_have_scsi_debug
> +	_have_fio

Nit: this check for fio is not needed. I will remove it when I merge this patch.

Shinichiro Kawasaki April 18, 2024, 6:50 a.m. UTC | #2
On Apr 09, 2024 / 00:56, Shinichiro Kawasaki wrote:
> On Apr 07, 2024 / 20:57, Ming Lei wrote:
> > Recently it is observed that list corruption is triggered when running
> > scsi disk rebind in case of blk-cgroup.
> > 
> > Add one such test case for covering this unusual operation.
> > 
> > Cc: Changhui Zhong <czhong@redhat.com>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> 
> Thanks for the patch. Overall it looks good to me. I confirmed that this test
> case causes the system hang with v6.9-rc2 kernel and your fix patch [1] avoids
> it.
> 
> [1] https://lore.kernel.org/linux-block/20240407125910.4053377-1-ming.lei@redhat.com/
> 
> As I commented in line, I will do an edit when I apply this patch. No need to
> respin this patch unless someone makes other comments.
> 
> Before I apply this patch, I will wait until the kernel side fix gets
> upstreamed and then downstreamed to the stable kernels, so that blktests users
> won't be upset with the hang. Until then, I expect other new test cases will get
> the test case number block/035. In that case, I will modify this test case
> number to block/036 or 037.

The kernel side fix landed on v6.9-rc4, v6.8.7 and v6.6.28. I have applied this
blktests patch along with the edits I mentioned. The test case number was
modified from block/035 to block/037. Thanks!

Patch

diff --git a/tests/block/035 b/tests/block/035
new file mode 100755
index 0000000..a1057a3
--- /dev/null
+++ b/tests/block/035
@@ -0,0 +1,54 @@ 
+#!/bin/bash
+# SPDX-License-Identifier: GPL-3.0+
+# Copyright (C) 2024 Ming Lei
+#
+# blk-cgroup is usually initialized in disk allocation code, and
+# de-initialized in disk release code. And scsi disk rebind needs
+# to re-allocate/re-add disk, meantime request queue is kept as
+# live during the whole cycle.
+#
+# Add this test for covering blk-cgroup & disk rebind.
+
+. tests/block/rc
+. common/scsi_debug
+. common/cgroup
+
+DESCRIPTION="test cgroup vs. scsi_debug rebind"
+QUICK=1
+
+requires() {
+	_have_cgroup2_controller io
+	_have_scsi_debug
+	_have_fio
+}
+
+scsi_debug_rebind() {
+	if ! _configure_scsi_debug; then
+		return
+	fi
+
+	_init_cgroup2
+
+	echo "+io" > "/sys/fs/cgroup/cgroup.subtree_control"
+	echo "+io" > "$CGROUP2_DIR/cgroup.subtree_control"
+	mkdir -p "$CGROUP2_DIR/${TEST_NAME}"
+
+	local dev dev_path hctl
+	dev=${SCSI_DEBUG_DEVICES[0]}
+	dev_path="$(realpath "/sys/block/${dev}/device")"
+	hctl="$(basename "$dev_path")"
+
+	echo -n "${hctl}" > "/sys/bus/scsi/drivers/sd/unbind"
+	echo -n "${hctl}" > "/sys/bus/scsi/drivers/sd/bind"
+
+	_exit_cgroup2
+	_exit_scsi_debug
+}
+
+test() {
+	echo "Running ${TEST_NAME}"
+
+	scsi_debug_rebind
+
+	echo "Test complete"
+}
diff --git a/tests/block/035.out b/tests/block/035.out
new file mode 100644
index 0000000..6ffa504
--- /dev/null
+++ b/tests/block/035.out
@@ -0,0 +1,2 @@ 
+Running block/035
+Test complete
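
The rebind step in the patch above takes the SCSI H:C:T:L address from the disk's sysfs "device" symlink and writes it to the sd driver's unbind and bind files. Below is a minimal sketch of the address derivation only. It runs against a temporary directory standing in for sysfs, so it needs neither root nor scsi_debug. The helper name hctl_of, the device name sdz and the address 3:0:0:0 are hypothetical and chosen for illustration; they are not part of the patch or of blktests.

```shell
#!/bin/bash
# Hypothetical helper (not part of blktests): print the H:C:T:L address
# of a disk, given the directory that plays the role of /sys/block/<dev>.
hctl_of() {
	local dev_path
	# Resolve the "device" symlink, as the patch does with realpath.
	dev_path="$(realpath "$1/device")" || return 1
	# The last path component of the resolved path is the H:C:T:L address.
	basename "$dev_path"
}

# Build a fake sysfs layout in a temporary directory (illustration only).
fake_sys="$(mktemp -d)"
mkdir -p "$fake_sys/devices/host3/target3:0:0/3:0:0:0" "$fake_sys/block/sdz"
ln -s "../../devices/host3/target3:0:0/3:0:0:0" "$fake_sys/block/sdz/device"

hctl="$(hctl_of "$fake_sys/block/sdz")"
echo "hctl=${hctl}"
```

On a real system the same value is what the test writes to /sys/bus/scsi/drivers/sd/unbind and then to /sys/bus/scsi/drivers/sd/bind, which requires root and detaches the disk for the duration of the rebind.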