diff mbox series

[V3,blktests] loop: Detect a race condition between loop detach and open

Message ID 20240530062545.79400-1-gulam.mohamed@oracle.com (mailing list archive)
State New, archived
Headers show
Series [V3,blktests] loop: Detect a race condition between loop detach and open | expand

Commit Message

Gulam Mohamed May 30, 2024, 6:25 a.m. UTC
When one process opens a loop device partition and another process detaches
it, there will be a race condition due to which stale loop partitions are
created causing IO errors. This test will detect the race

Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
---
v3<-v2:
Resolved all the formatting issues

 tests/loop/010     | 77 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/loop/010.out |  2 ++
 2 files changed, 79 insertions(+)
 create mode 100755 tests/loop/010
 create mode 100644 tests/loop/010.out

Comments

Shinichiro Kawasaki June 3, 2024, 10:48 a.m. UTC | #1
Hi Gulam, thanks for the v3 patch. I think it's almost good. I found some more
minor points to improve. Please find my comments and consider if they make
sense.

On May 30, 2024 / 06:25, Gulam Mohamed wrote:
> When one process opens a loop device partition and another process detaches
> it, there will be a race condition due to which stale loop partitions are
> created causing IO errors. This test will detect the race

Missing last period '.'?

> 
> Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
> ---
> v3<-v2:
> Resolved all the formatting issues
> 
>  tests/loop/010     | 77 ++++++++++++++++++++++++++++++++++++++++++++++
>  tests/loop/010.out |  2 ++
>  2 files changed, 79 insertions(+)
>  create mode 100755 tests/loop/010
>  create mode 100644 tests/loop/010.out
> 
> diff --git a/tests/loop/010 b/tests/loop/010
> new file mode 100755
> index 000000000000..19ceb6ab69cf
> --- /dev/null
> +++ b/tests/loop/010
> @@ -0,0 +1,77 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-3.0+
> +# Copyright (C) 2024, Oracle and/or its affiliates.
> +#
> +# Test to detect a race between loop detach and loop open which creates
> +# stale loop partitions when one process opens the loop partition and
> +# another process detaches the loop device

Missing last period '.'?

> +#
> +. tests/loop/rc
> +DESCRIPTION="check stale loop partition"
> +TIMED=1
> +
> +requires() {
> +	_have_program parted
> +	_have_program mkfs.xfs
> +	_have_program blkid
> +	_have_program udevadm

I understand that Chaitanya suggested the checks for blkid and udevadm.
Actually, I don't think they are needed. blkid is included in util-linux, which
is documented in README as a requirement. Other util-linux tools' availability
is not checked: e.g., blockdev. As for udevadm, many test cases use it (ref:
common/null_blk, common/scsi_block), but no test case checks udevadm
availability. It doesn't make much sense to check blkid and udevadm only for
this test case. I plan to post a patch to document that udevadm (systemdev-udev)
requirement in README.

> +}
> +
> +image_file="$TMPDIR/loopImg"
> +
> +create_loop() {
> +	while true
> +	do
> +		loop_device="$(losetup -P -f --show "${image_file}")"

Recently, we had a couple of troubles due to behavior changes of short options.
It is more robust (and readable) to use long options than short options. I
suggest longer options like this:

		loop_device="$(losetup --partscan --find --show \
			"${image_file}")"

The same comment applies to losetup, truncate, parted and mkfs.xfs commands
below.

> +		blkid /dev/loop0p1 >> /dev/null 2>&1

The line above can be a bit shorter:

		blkid /dev/loop0p1 >& /dev/null

The same comment applies to some lines below.

> +	done
> +}
> +
> +detach_loop() {
> +	while true
> +	do
> +		if [ -e /dev/loop0 ]; then
> +			losetup -d /dev/loop0 > /dev/null 2>&1
> +		fi
> +	done
> +}
> +
> +test() {
> +	echo "Running ${TEST_NAME}"
> +	local loop_device
> +	local create_pid
> +	local detach_pid
> +
> +	truncate -s 1G "${image_file}"
> +	parted -a none -s "${image_file}" mklabel gpt
> +	loop_device="$(losetup -P -f --show "${image_file}")"
> +	parted -a none -s "${loop_device}" mkpart primary 64s 109051s
> +
> +	udevadm settle
> +
> +	if [ ! -e "${loop_device}" ]; then
> +		return 1
> +	fi
> +
> +	mkfs.xfs -f "${loop_device}p1" > /dev/null 2>&1
> +	losetup -d "${loop_device}" >  /dev/null 2>&1
> +
> +	create_loop &
> +	create_pid=$!
> +	detach_loop &
> +	detach_pid=$!
> +
> +	sleep "${TIMEOUT:-90}"
> +	{
> +		kill -9 $create_pid
> +		kill -9 $detach_pid
> +		wait
> +		sleep 1
> +	} 2>/dev/null
> +
> +	losetup -D > /dev/null 2>&1
> +	if _dmesg_since_test_start | grep -q "partition scan of loop0 failed (rc=-16)"; then
> +		echo "Fail"
> +	fi
> +	echo "Test complete"
> +}
> diff --git a/tests/loop/010.out b/tests/loop/010.out
> new file mode 100644
> index 000000000000..64a6aee00b8a
> --- /dev/null
> +++ b/tests/loop/010.out
> @@ -0,0 +1,2 @@
> +Running loop/010
> +Test complete
> -- 
> 2.39.3
>
Gulam Mohamed June 5, 2024, 11 a.m. UTC | #2
Hi Shinichiro,

> -----Original Message-----
> From: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> Sent: Monday, June 3, 2024 4:19 PM
> To: Gulam Mohamed <gulam.mohamed@oracle.com>
> Cc: linux-block@vger.kernel.org; chaitanyak@nvidia.com
> Subject: Re: [PATCH V3 blktests] loop: Detect a race condition between loop
> detach and open
> 
> Hi Gulam, thanks for the v3 patch. I think it's almost good. I found some more
> minor points to improve. Please find my comments and consider if they make
> sense.
> 
> On May 30, 2024 / 06:25, Gulam Mohamed wrote:
> > When one process opens a loop device partition and another process
> > detaches it, there will be a race condition due to which stale loop
> > partitions are created causing IO errors. This test will detect the
> > race
> 
> Missing last period '.'?
Resolved.
> 
> >
> > Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
> > ---
> > v3<-v2:
> > Resolved all the formatting issues
> >
> >  tests/loop/010     | 77
> ++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/loop/010.out |  2 ++
> >  2 files changed, 79 insertions(+)
> >  create mode 100755 tests/loop/010
> >  create mode 100644 tests/loop/010.out
> >
> > diff --git a/tests/loop/010 b/tests/loop/010 new file mode 100755
> > index 000000000000..19ceb6ab69cf
> > --- /dev/null
> > +++ b/tests/loop/010
> > @@ -0,0 +1,77 @@
> > +#!/bin/bash
> > +# SPDX-License-Identifier: GPL-3.0+
> > +# Copyright (C) 2024, Oracle and/or its affiliates.
> > +#
> > +# Test to detect a race between loop detach and loop open which
> > +creates # stale loop partitions when one process opens the loop
> > +partition and # another process detaches the loop device
> 
> Missing last period '.'?
Resolved
> 
> > +#
> > +. tests/loop/rc
> > +DESCRIPTION="check stale loop partition"
> > +TIMED=1
> > +
> > +requires() {
> > +	_have_program parted
> > +	_have_program mkfs.xfs
> > +	_have_program blkid
> > +	_have_program udevadm
> 
> I understand that Chaitanya suggested the checks for blkid and udevadm.
> Actually, I don't think they are needed. blkid is included in util-linux, which is
> documented in README as a requirement. Other util-linux tools' availability is
> not checked: e.g., blockdev. As for udevadm, many test cases use it (ref:
> common/null_blk, common/scsi_block), but no test case checks udevadm
> availability. It doesn't make much sense to check blkid and udevadm only for
> this test case. I plan to post a patch to document that udevadm (systemdev-
> udev) requirement in README.
> 
Resolved
> > +}
> > +
> > +image_file="$TMPDIR/loopImg"
> > +
> > +create_loop() {
> > +	while true
> > +	do
> > +		loop_device="$(losetup -P -f --show "${image_file}")"
> 
> Recently, we had a couple of troubles due to behavior changes of short
> options.
> It is more robust (and readable) to use long options than short options. I
> suggest longer options like this:
> 
> 		loop_device="$(losetup --partscan --find --show \
> 			"${image_file}")"
> 
> The same comment applies to losetup, truncate, parted and mkfs.xfs
> commands below.
Resolved. Using long options now
> 
> > +		blkid /dev/loop0p1 >> /dev/null 2>&1
> 
> The line above can be a bit shorter:
> 
> 		blkid /dev/loop0p1 >& /dev/null
> 
> The same comment applies to some lines below.
> 
Resolved.
Sending the V4 with all the above comments resolved.

Regards,
Gulam Mohamed.
> > +	done
> > +}
> > +
> > +detach_loop() {
> > +	while true
> > +	do
> > +		if [ -e /dev/loop0 ]; then
> > +			losetup -d /dev/loop0 > /dev/null 2>&1
> > +		fi
> > +	done
> > +}
> > +
> > +test() {
> > +	echo "Running ${TEST_NAME}"
> > +	local loop_device
> > +	local create_pid
> > +	local detach_pid
> > +
> > +	truncate -s 1G "${image_file}"
> > +	parted -a none -s "${image_file}" mklabel gpt
> > +	loop_device="$(losetup -P -f --show "${image_file}")"
> > +	parted -a none -s "${loop_device}" mkpart primary 64s 109051s
> > +
> > +	udevadm settle
> > +
> > +	if [ ! -e "${loop_device}" ]; then
> > +		return 1
> > +	fi
> > +
> > +	mkfs.xfs -f "${loop_device}p1" > /dev/null 2>&1
> > +	losetup -d "${loop_device}" >  /dev/null 2>&1
> > +
> > +	create_loop &
> > +	create_pid=$!
> > +	detach_loop &
> > +	detach_pid=$!
> > +
> > +	sleep "${TIMEOUT:-90}"
> > +	{
> > +		kill -9 $create_pid
> > +		kill -9 $detach_pid
> > +		wait
> > +		sleep 1
> > +	} 2>/dev/null
> > +
> > +	losetup -D > /dev/null 2>&1
> > +	if _dmesg_since_test_start | grep -q "partition scan of loop0 failed
> (rc=-16)"; then
> > +		echo "Fail"
> > +	fi
> > +	echo "Test complete"
> > +}
> > diff --git a/tests/loop/010.out b/tests/loop/010.out new file mode
> > 100644 index 000000000000..64a6aee00b8a
> > --- /dev/null
> > +++ b/tests/loop/010.out
> > @@ -0,0 +1,2 @@
> > +Running loop/010
> > +Test complete
> > --
> > 2.39.3
> >
diff mbox series

Patch

diff --git a/tests/loop/010 b/tests/loop/010
new file mode 100755
index 000000000000..19ceb6ab69cf
--- /dev/null
+++ b/tests/loop/010
@@ -0,0 +1,77 @@ 
+#!/bin/bash
+# SPDX-License-Identifier: GPL-3.0+
+# Copyright (C) 2024, Oracle and/or its affiliates.
+#
+# Test to detect a race between loop detach and loop open which creates
+# stale loop partitions when one process opens the loop partition and
+# another process detaches the loop device
+#
+. tests/loop/rc
+DESCRIPTION="check stale loop partition"
+TIMED=1
+
+requires() {
+	_have_program parted
+	_have_program mkfs.xfs
+	_have_program blkid
+	_have_program udevadm
+}
+
+image_file="$TMPDIR/loopImg"
+
+create_loop() {
+	while true
+	do
+		loop_device="$(losetup -P -f --show "${image_file}")"
+		blkid /dev/loop0p1 >> /dev/null 2>&1
+	done
+}
+
+detach_loop() {
+	while true
+	do
+		if [ -e /dev/loop0 ]; then
+			losetup -d /dev/loop0 > /dev/null 2>&1
+		fi
+	done
+}
+
+test() {
+	echo "Running ${TEST_NAME}"
+	local loop_device
+	local create_pid
+	local detach_pid
+
+	truncate -s 1G "${image_file}"
+	parted -a none -s "${image_file}" mklabel gpt
+	loop_device="$(losetup -P -f --show "${image_file}")"
+	parted -a none -s "${loop_device}" mkpart primary 64s 109051s
+
+	udevadm settle
+
+	if [ ! -e "${loop_device}" ]; then
+		return 1
+	fi
+
+	mkfs.xfs -f "${loop_device}p1" > /dev/null 2>&1
+	losetup -d "${loop_device}" >  /dev/null 2>&1
+
+	create_loop &
+	create_pid=$!
+	detach_loop &
+	detach_pid=$!
+
+	sleep "${TIMEOUT:-90}"
+	{
+		kill -9 $create_pid
+		kill -9 $detach_pid
+		wait
+		sleep 1
+	} 2>/dev/null
+
+	losetup -D > /dev/null 2>&1
+	if _dmesg_since_test_start | grep -q "partition scan of loop0 failed (rc=-16)"; then
+		echo "Fail"
+	fi
+	echo "Test complete"
+}
diff --git a/tests/loop/010.out b/tests/loop/010.out
new file mode 100644
index 000000000000..64a6aee00b8a
--- /dev/null
+++ b/tests/loop/010.out
@@ -0,0 +1,2 @@ 
+Running loop/010
+Test complete