
btrfs/011: handle finished replace properly

Message ID 20220110112848.37491-1-wqu@suse.com (mailing list archive)
State New, archived
Series: btrfs/011: handle finished replace properly

Commit Message

Qu Wenruo Jan. 10, 2022, 11:28 a.m. UTC
[BUG]
When running btrfs/011 inside a VM which has unsafe cache mode set for
its devices, and the host has enough memory to cache all the IO:

btrfs/011 98s ... [failed, exit status 1]- output mismatch
    --- tests/btrfs/011.out	2019-10-22 15:18:13.962298674 +0800
    +++ /xfstests-dev/results//btrfs/011.out.bad	2022-01-10 19:12:14.683333251 +0800
    @@ -1,3 +1,4 @@
     QA output created by 011
     *** test btrfs replace
    -*** done
    +failed: '/usr/bin/btrfs replace cancel /mnt/scratch'
    +(see /xfstests-dev/results//btrfs/011.full for details)
    ...
Ran: btrfs/011
Failures: btrfs/011
Failed 1 of 1 tests

[CAUSE]
Although commit fa85aa64 ("btrfs/011: Fill the fs to ensure we
have enough data for dev-replace") tries to address the problem by
filling the fs with extra content, there is still no guarantee that
the replace will need the full 2 seconds to finish.

Thus even though we tried our best to make the replace take 2 seconds,
it can still finish in less than 2 seconds.

For reference, after the fix the test takes around 90~100 seconds to
finish in this setup, while on real hardware it can take over 1000
seconds.

[FIX]
Instead of further enlarging the IO, just accept the fact that the
replace can finish faster than expected, and continue the test.

One thing to note is that, when the replace has finished, we need to
replace the device back, or the later fsck will run on a blank device
and cause a false alert.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 tests/btrfs/011 | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)
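The gist of the change can be condensed into a small sketch. The helper name below is made up for illustration; the real test greps `$tmp.tmp` directly and calls `_fail` on failure:

```shell
# Condensed sketch of the new status handling: branch on the
# "replace status" output instead of demanding a cancel.
handle_replace_status() {
	local status="$1"

	if echo "$status" | grep -q 'finished'; then
		# The replace won the race and finished: the test must
		# replace the device back, or the later fsck runs on a
		# blank $SCRATCH_DEV and raises a false alert.
		echo "replace-back"
	elif echo "$status" | grep -q 'canceled'; then
		# Expected case: the cancel landed while the replace
		# was still in progress.
		echo "canceled"
	else
		# Neither status is a real failure; the test would
		# call _fail here.
		echo "fail"
	fi
}
```

In the replace-back case the patch then runs `btrfs replace start -Bf $target_dev $source_dev $SCRATCH_MNT` to restore the original layout.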

Comments

Filipe Manana Jan. 10, 2022, 11:57 a.m. UTC | #1
On Mon, Jan 10, 2022 at 07:28:48PM +0800, Qu Wenruo wrote:
> [BUG]
> When running btrfs/011 inside VM which has unsafe cache set for its
> devices, and the host have enough memory to cache all the IO:

Btw, I use the same setup and the test never fails for me.
It's quite a stretch to assume that using unsafe caching and having
enough memory are the only reasons for the failure.

I use a debug kernel with plenty of heavy debug features enabled,
so that may be one reason why I can't trigger it.

> 
> btrfs/011 98s ... [failed, exit status 1]- output mismatch
>     --- tests/btrfs/011.out	2019-10-22 15:18:13.962298674 +0800
>     +++ /xfstests-dev/results//btrfs/011.out.bad	2022-01-10 19:12:14.683333251 +0800
>     @@ -1,3 +1,4 @@
>      QA output created by 011
>      *** test btrfs replace
>     -*** done
>     +failed: '/usr/bin/btrfs replace cancel /mnt/scratch'
>     +(see /xfstests-dev/results//btrfs/011.full for details)
>     ...
> Ran: btrfs/011
> Failures: btrfs/011
> Failed 1 of 1 tests
> 
> [CAUSE]
> Although commit fa85aa64 ("btrfs/011: Fill the fs to ensure we
> have enough data for dev-replace") tries to address the problem by
> filling the fs with extra content, there is still no guarantee that 2
> seconds of IO still needs 2 seconds to finish.
> 
> Thus even we tried our best to make sure the replace will take 2
> seconds, it can still finish faster than 2 seconds.
> 
> And just to mention how fast the test finishes, after the fix, the test
> takes around 90~100 seconds to finish.
> While on real-hardware it can take over 1000 seconds.
> 
> [FIX]
> Instead of further enlarging the IO, here we just accept the fact that
> replace can finish faster than our expectation, and continue the test.

If I'm reading this, and the code, correctly, this means we can end up
never exercising the replace cancel feature while a replace is in
progress. That's far from ideal, as it loses test coverage.

Josef sent a patch for this last month:

https://lore.kernel.org/linux-btrfs/01796d6bcec40ae80b5af3269e60a66cd4b89262.1638382763.git.josef@toxicpanda.com/

Don't know why it wasn't merged. I agree that changing timings in the
test is not ideal and always prone to still fail on some setups, but
it seems more acceptable to me than losing test coverage.

Thanks.

> 
> One thing to notice is, since the replace finished, we need to replace
> back the device, or later fsck will be executed on blank device, and
> cause false alert.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  tests/btrfs/011 | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/btrfs/011 b/tests/btrfs/011
> index b4673341..aae89696 100755
> --- a/tests/btrfs/011
> +++ b/tests/btrfs/011
> @@ -171,13 +171,24 @@ btrfs_replace_test()
>  		# background the replace operation (no '-B' option given)
>  		_run_btrfs_util_prog replace start -f $replace_options $source_dev $target_dev $SCRATCH_MNT
>  		sleep $wait_time
> -		_run_btrfs_util_prog replace cancel $SCRATCH_MNT
> +		$BTRFS_UTIL_PROG replace cancel $SCRATCH_MNT 2>&1 >> $seqres.full
>  
>  		# 'replace status' waits for the replace operation to finish
>  		# before the status is printed
>  		$BTRFS_UTIL_PROG replace status $SCRATCH_MNT > $tmp.tmp 2>&1
>  		cat $tmp.tmp >> $seqres.full
> -		grep -q canceled $tmp.tmp || _fail "btrfs replace status (canceled) failed"
> +
> +		# There is no guarantee we canceled the replace, it can finish
> +		if grep -q 'finished' $tmp.tmp ; then
> +			# The replace finished, we need to replace it back or
> +			# later fsck will report error as $SCRATCH_DEV is now
> +			# blank
> +			$BTRFS_UTIL_PROG replace start -Bf $target_dev \
> +				$source_dev $SCRATCH_MNT > /dev/null
> +		else
> +			grep -q 'canceled' $tmp.tmp || _fail \
> +				"btrfs replace status (canceled ) failed"
> +		fi
>  	else
>  		if [ "${quick}Q" = "thoroughQ" ]; then
>  			# The thorough test runs around 2 * $wait_time seconds.
> -- 
> 2.34.1
>
Qu Wenruo Jan. 10, 2022, 12:44 p.m. UTC | #2
On 2022/1/10 19:57, Filipe Manana wrote:
> On Mon, Jan 10, 2022 at 07:28:48PM +0800, Qu Wenruo wrote:
>> [BUG]
>> When running btrfs/011 inside VM which has unsafe cache set for its
>> devices, and the host have enough memory to cache all the IO:
>
> Btw, I use the same setup and the test never fails for me.
> It's quite of a strech assuming that using unsafe caching and having
> enough memory are the only reasons for the failure.
>
> I use a debug kernel with plenty of heavy debug features enabled,
> so that may be one reason why I can't trigger it.

I guess that's the reason.

For most cases I only enable lightweight debug features for regular runs.
The heavy ones like KASAN/kmemleak are reserved for trickier cases.

>
>>
>> btrfs/011 98s ... [failed, exit status 1]- output mismatch
>>      --- tests/btrfs/011.out	2019-10-22 15:18:13.962298674 +0800
>>      +++ /xfstests-dev/results//btrfs/011.out.bad	2022-01-10 19:12:14.683333251 +0800
>>      @@ -1,3 +1,4 @@
>>       QA output created by 011
>>       *** test btrfs replace
>>      -*** done
>>      +failed: '/usr/bin/btrfs replace cancel /mnt/scratch'
>>      +(see /xfstests-dev/results//btrfs/011.full for details)
>>      ...
>> Ran: btrfs/011
>> Failures: btrfs/011
>> Failed 1 of 1 tests
>>
>> [CAUSE]
>> Although commit fa85aa64 ("btrfs/011: Fill the fs to ensure we
>> have enough data for dev-replace") tries to address the problem by
>> filling the fs with extra content, there is still no guarantee that 2
>> seconds of IO still needs 2 seconds to finish.
>>
>> Thus even we tried our best to make sure the replace will take 2
>> seconds, it can still finish faster than 2 seconds.
>>
>> And just to mention how fast the test finishes, after the fix, the test
>> takes around 90~100 seconds to finish.
>> While on real-hardware it can take over 1000 seconds.
>>
>> [FIX]
>> Instead of further enlarging the IO, here we just accept the fact that
>> replace can finish faster than our expectation, and continue the test.
>
> If I'm reading this, and the code, correctly this means we end up never
> testing that the replace cancel feature ends up being exercised and
> while a replace is in progress. That's far from ideal, losing test
> coverage.

Yep, you're completely right.

In fact, in my environment only around 10% of the runs really exercise
the cancel path; the remaining 90% go through the finished routine.

But...

>
> Josef sent a patch for this last month:
>
> https://lore.kernel.org/linux-btrfs/01796d6bcec40ae80b5af3269e60a66cd4b89262.1638382763.git.josef@toxicpanda.com/
>
> Don't know why it wasn't merged. I agree that changing timings in the
> test is not ideal and always prone to still fail on some setups, but
> seems more acceptable to me rather than losing test coverage.

My concern is that it's just going to be another whack-a-mole game.

With more RAM in the host, and further optimizations, I'm not sure 10
seconds is enough.
Furthermore, 10 seconds may be too long for certain environments,
filling the whole test filesystem and causing other false alerts.

As a safety net, we have a dedicated cancel workload test in btrfs/212,
and IIRC most (if not all) bugs exposed by btrfs/011 were in the
regular replace path, not the cancel path.

Thus I want to end the whack-a-mole game once and for all, even if it
drops the coverage for super fast setups.

Thanks,
Qu

>
> Thanks.
>
>>
>> One thing to notice is, since the replace finished, we need to replace
>> back the device, or later fsck will be executed on blank device, and
>> cause false alert.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   tests/btrfs/011 | 15 +++++++++++++--
>>   1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/tests/btrfs/011 b/tests/btrfs/011
>> index b4673341..aae89696 100755
>> --- a/tests/btrfs/011
>> +++ b/tests/btrfs/011
>> @@ -171,13 +171,24 @@ btrfs_replace_test()
>>   		# background the replace operation (no '-B' option given)
>>   		_run_btrfs_util_prog replace start -f $replace_options $source_dev $target_dev $SCRATCH_MNT
>>   		sleep $wait_time
>> -		_run_btrfs_util_prog replace cancel $SCRATCH_MNT
>> +		$BTRFS_UTIL_PROG replace cancel $SCRATCH_MNT 2>&1 >> $seqres.full
>>
>>   		# 'replace status' waits for the replace operation to finish
>>   		# before the status is printed
>>   		$BTRFS_UTIL_PROG replace status $SCRATCH_MNT > $tmp.tmp 2>&1
>>   		cat $tmp.tmp >> $seqres.full
>> -		grep -q canceled $tmp.tmp || _fail "btrfs replace status (canceled) failed"
>> +
>> +		# There is no guarantee we canceled the replace, it can finish
>> +		if grep -q 'finished' $tmp.tmp ; then
>> +			# The replace finished, we need to replace it back or
>> +			# later fsck will report error as $SCRATCH_DEV is now
>> +			# blank
>> +			$BTRFS_UTIL_PROG replace start -Bf $target_dev \
>> +				$source_dev $SCRATCH_MNT > /dev/null
>> +		else
>> +			grep -q 'canceled' $tmp.tmp || _fail \
>> +				"btrfs replace status (canceled ) failed"
>> +		fi
>>   	else
>>   		if [ "${quick}Q" = "thoroughQ" ]; then
>>   			# The thorough test runs around 2 * $wait_time seconds.
>> --
>> 2.34.1
>>
Filipe Manana Jan. 10, 2022, 12:52 p.m. UTC | #3
On Mon, Jan 10, 2022 at 12:44 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> On 2022/1/10 19:57, Filipe Manana wrote:
> > On Mon, Jan 10, 2022 at 07:28:48PM +0800, Qu Wenruo wrote:
> >> [BUG]
> >> When running btrfs/011 inside VM which has unsafe cache set for its
> >> devices, and the host have enough memory to cache all the IO:
> >
> > Btw, I use the same setup and the test never fails for me.
> > It's quite of a strech assuming that using unsafe caching and having
> > enough memory are the only reasons for the failure.
> >
> > I use a debug kernel with plenty of heavy debug features enabled,
> > so that may be one reason why I can't trigger it.
>
> I guess that's the reason.
>
> For most cases I only enable light-weight debug features for regular runs.
> The heavy ones like KASAN/kmemleak are reserved for more tricky cases.
>
> >
> >>
> >> btrfs/011 98s ... [failed, exit status 1]- output mismatch
> >>      --- tests/btrfs/011.out 2019-10-22 15:18:13.962298674 +0800
> >>      +++ /xfstests-dev/results//btrfs/011.out.bad    2022-01-10 19:12:14.683333251 +0800
> >>      @@ -1,3 +1,4 @@
> >>       QA output created by 011
> >>       *** test btrfs replace
> >>      -*** done
> >>      +failed: '/usr/bin/btrfs replace cancel /mnt/scratch'
> >>      +(see /xfstests-dev/results//btrfs/011.full for details)
> >>      ...
> >> Ran: btrfs/011
> >> Failures: btrfs/011
> >> Failed 1 of 1 tests
> >>
> >> [CAUSE]
> >> Although commit fa85aa64 ("btrfs/011: Fill the fs to ensure we
> >> have enough data for dev-replace") tries to address the problem by
> >> filling the fs with extra content, there is still no guarantee that 2
> >> seconds of IO still needs 2 seconds to finish.
> >>
> >> Thus even we tried our best to make sure the replace will take 2
> >> seconds, it can still finish faster than 2 seconds.
> >>
> >> And just to mention how fast the test finishes, after the fix, the test
> >> takes around 90~100 seconds to finish.
> >> While on real-hardware it can take over 1000 seconds.
> >>
> >> [FIX]
> >> Instead of further enlarging the IO, here we just accept the fact that
> >> replace can finish faster than our expectation, and continue the test.
> >
> > If I'm reading this, and the code, correctly this means we end up never
> > testing that the replace cancel feature ends up being exercised and
> > while a replace is in progress. That's far from ideal, losing test
> > coverage.
>
> Yep, you're completely right.
>
> In fact for my environment, only around 10% runs really utilized the
> cancel path, the remaining 90% go finished routine.
>
> But...
>
> >
> > Josef sent a patch for this last month:
> >
> > https://lore.kernel.org/linux-btrfs/01796d6bcec40ae80b5af3269e60a66cd4b89262.1638382763.git.josef@toxicpanda.com/
> >
> > Don't know why it wasn't merged. I agree that changing timings in the
> > test is not ideal and always prone to still fail on some setups, but
> > seems more acceptable to me rather than losing test coverage.
>
> My concern is, it's just going to be another whac-a-mole game.
>
> With more RAM in the host, and further optimization, I'm not sure if 10
> seconds is enough.
> Furthermore 10 seconds may be too long for certain environment, to fill
> the whole test filesystem and cause other false alerts.
>
> As a safenet, we have dedicated cancel workload test in btrfs/212, and

btrfs/212 is about balance, not device replace.

> IIRC for most (if not all) bugs exposed in btrfs/011, it's in the
> regular replace path, not the cancel path.

Even if we have had fewer bugs in the replace cancel path than in the
regular replace path, that's not an excuse to remove or reduce replace
cancel coverage.

>
> Thus I want to end the whac-a-hole game once and for all, even it will
> drop the coverage for super fast setup.

If everyone starts having a faster setup, then we will end up not
getting replace cancel covered in the long run, and it increases the
chances of regressions being found only after kernels are released, by
users.

As I said earlier, I agree that the whack-a-mole style of change sent
by Josef is far from ideal, but at least it doesn't lose test
coverage.

Thanks.



>
> Thanks,
> Qu
>
> >
> > Thanks.
> >
> >>
> >> One thing to notice is, since the replace finished, we need to replace
> >> back the device, or later fsck will be executed on blank device, and
> >> cause false alert.
> >>
> >> Signed-off-by: Qu Wenruo <wqu@suse.com>
> >> ---
> >>   tests/btrfs/011 | 15 +++++++++++++--
> >>   1 file changed, 13 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/tests/btrfs/011 b/tests/btrfs/011
> >> index b4673341..aae89696 100755
> >> --- a/tests/btrfs/011
> >> +++ b/tests/btrfs/011
> >> @@ -171,13 +171,24 @@ btrfs_replace_test()
> >>              # background the replace operation (no '-B' option given)
> >>              _run_btrfs_util_prog replace start -f $replace_options $source_dev $target_dev $SCRATCH_MNT
> >>              sleep $wait_time
> >> -            _run_btrfs_util_prog replace cancel $SCRATCH_MNT
> >> +            $BTRFS_UTIL_PROG replace cancel $SCRATCH_MNT 2>&1 >> $seqres.full
> >>
> >>              # 'replace status' waits for the replace operation to finish
> >>              # before the status is printed
> >>              $BTRFS_UTIL_PROG replace status $SCRATCH_MNT > $tmp.tmp 2>&1
> >>              cat $tmp.tmp >> $seqres.full
> >> -            grep -q canceled $tmp.tmp || _fail "btrfs replace status (canceled) failed"
> >> +
> >> +            # There is no guarantee we canceled the replace, it can finish
> >> +            if grep -q 'finished' $tmp.tmp ; then
> >> +                    # The replace finished, we need to replace it back or
> >> +                    # later fsck will report error as $SCRATCH_DEV is now
> >> +                    # blank
> >> +                    $BTRFS_UTIL_PROG replace start -Bf $target_dev \
> >> +                            $source_dev $SCRATCH_MNT > /dev/null
> >> +            else
> >> +                    grep -q 'canceled' $tmp.tmp || _fail \
> >> +                            "btrfs replace status (canceled ) failed"
> >> +            fi
> >>      else
> >>              if [ "${quick}Q" = "thoroughQ" ]; then
> >>                      # The thorough test runs around 2 * $wait_time seconds.
> >> --
> >> 2.34.1
> >>
Qu Wenruo Jan. 10, 2022, 1:01 p.m. UTC | #4
On 2022/1/10 20:52, Filipe Manana wrote:
> On Mon, Jan 10, 2022 at 12:44 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>>
>> On 2022/1/10 19:57, Filipe Manana wrote:
>>> On Mon, Jan 10, 2022 at 07:28:48PM +0800, Qu Wenruo wrote:
>>>> [BUG]
>>>> When running btrfs/011 inside VM which has unsafe cache set for its
>>>> devices, and the host have enough memory to cache all the IO:
>>>
>>> Btw, I use the same setup and the test never fails for me.
>>> It's quite of a strech assuming that using unsafe caching and having
>>> enough memory are the only reasons for the failure.
>>>
>>> I use a debug kernel with plenty of heavy debug features enabled,
>>> so that may be one reason why I can't trigger it.
>>
>> I guess that's the reason.
>>
>> For most cases I only enable light-weight debug features for regular runs.
>> The heavy ones like KASAN/kmemleak are reserved for more tricky cases.
>>
>>>
>>>>
>>>> btrfs/011 98s ... [failed, exit status 1]- output mismatch
>>>>       --- tests/btrfs/011.out 2019-10-22 15:18:13.962298674 +0800
>>>>       +++ /xfstests-dev/results//btrfs/011.out.bad    2022-01-10 19:12:14.683333251 +0800
>>>>       @@ -1,3 +1,4 @@
>>>>        QA output created by 011
>>>>        *** test btrfs replace
>>>>       -*** done
>>>>       +failed: '/usr/bin/btrfs replace cancel /mnt/scratch'
>>>>       +(see /xfstests-dev/results//btrfs/011.full for details)
>>>>       ...
>>>> Ran: btrfs/011
>>>> Failures: btrfs/011
>>>> Failed 1 of 1 tests
>>>>
>>>> [CAUSE]
>>>> Although commit fa85aa64 ("btrfs/011: Fill the fs to ensure we
>>>> have enough data for dev-replace") tries to address the problem by
>>>> filling the fs with extra content, there is still no guarantee that 2
>>>> seconds of IO still needs 2 seconds to finish.
>>>>
>>>> Thus even we tried our best to make sure the replace will take 2
>>>> seconds, it can still finish faster than 2 seconds.
>>>>
>>>> And just to mention how fast the test finishes, after the fix, the test
>>>> takes around 90~100 seconds to finish.
>>>> While on real-hardware it can take over 1000 seconds.
>>>>
>>>> [FIX]
>>>> Instead of further enlarging the IO, here we just accept the fact that
>>>> replace can finish faster than our expectation, and continue the test.
>>>
>>> If I'm reading this, and the code, correctly this means we end up never
>>> testing that the replace cancel feature ends up being exercised and
>>> while a replace is in progress. That's far from ideal, losing test
>>> coverage.
>>
>> Yep, you're completely right.
>>
>> In fact for my environment, only around 10% runs really utilized the
>> cancel path, the remaining 90% go finished routine.
>>
>> But...
>>
>>>
>>> Josef sent a patch for this last month:
>>>
>>> https://lore.kernel.org/linux-btrfs/01796d6bcec40ae80b5af3269e60a66cd4b89262.1638382763.git.josef@toxicpanda.com/
>>>
>>> Don't know why it wasn't merged. I agree that changing timings in the
>>> test is not ideal and always prone to still fail on some setups, but
>>> seems more acceptable to me rather than losing test coverage.
>>
>> My concern is, it's just going to be another whac-a-mole game.
>>
>> With more RAM in the host, and further optimization, I'm not sure if 10
>> seconds is enough.
>> Furthermore 10 seconds may be too long for certain environment, to fill
>> the whole test filesystem and cause other false alerts.
>>
>> As a safenet, we have dedicated cancel workload test in btrfs/212, and
>
> btrfs/212 is about balance, not device replace.

Oh, my bad.

Then I guess it's time for us to have a dedicated replace cancel test
case.

>
>> IIRC for most (if not all) bugs exposed in btrfs/011, it's in the
>> regular replace path, not the cancel path.
>
> Even if we have had few bugs in the replace cancel patch, compared to
> the regular replace path,
> that's not an excuse to remove or reduce replace cancel coverage.
>
>>
>> Thus I want to end the whac-a-hole game once and for all, even it will
>> drop the coverage for super fast setup.
>
> If everyone starts having a setup that is faster, then we will end up
> not getting replace cancel covered in the long run,
> and increasing the chances of finding out regressions only after
> kernels are released, by users.
>
> As I said earlier, I agree that the whac-a-mole style of change sent
> by Josef is far from ideal, but at least it doesn't lose test
> coverage.

It still doesn't solve the possible false alerts; it's just masking
them in a different way.

And that masking will eventually hit ENOSPC and cause different false
alerts.


If we have a dedicated replace cancel test, can we remove the
false-alert-prone cancel test in btrfs/011?

Thanks,
Qu

>
> Thanks.
>
>
>
>>
>> Thanks,
>> Qu
>>
>>>
>>> Thanks.
>>>
>>>>
>>>> One thing to notice is, since the replace finished, we need to replace
>>>> back the device, or later fsck will be executed on blank device, and
>>>> cause false alert.
>>>>
>>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>>> ---
>>>>    tests/btrfs/011 | 15 +++++++++++++--
>>>>    1 file changed, 13 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tests/btrfs/011 b/tests/btrfs/011
>>>> index b4673341..aae89696 100755
>>>> --- a/tests/btrfs/011
>>>> +++ b/tests/btrfs/011
>>>> @@ -171,13 +171,24 @@ btrfs_replace_test()
>>>>               # background the replace operation (no '-B' option given)
>>>>               _run_btrfs_util_prog replace start -f $replace_options $source_dev $target_dev $SCRATCH_MNT
>>>>               sleep $wait_time
>>>> -            _run_btrfs_util_prog replace cancel $SCRATCH_MNT
>>>> +            $BTRFS_UTIL_PROG replace cancel $SCRATCH_MNT 2>&1 >> $seqres.full
>>>>
>>>>               # 'replace status' waits for the replace operation to finish
>>>>               # before the status is printed
>>>>               $BTRFS_UTIL_PROG replace status $SCRATCH_MNT > $tmp.tmp 2>&1
>>>>               cat $tmp.tmp >> $seqres.full
>>>> -            grep -q canceled $tmp.tmp || _fail "btrfs replace status (canceled) failed"
>>>> +
>>>> +            # There is no guarantee we canceled the replace, it can finish
>>>> +            if grep -q 'finished' $tmp.tmp ; then
>>>> +                    # The replace finished, we need to replace it back or
>>>> +                    # later fsck will report error as $SCRATCH_DEV is now
>>>> +                    # blank
>>>> +                    $BTRFS_UTIL_PROG replace start -Bf $target_dev \
>>>> +                            $source_dev $SCRATCH_MNT > /dev/null
>>>> +            else
>>>> +                    grep -q 'canceled' $tmp.tmp || _fail \
>>>> +                            "btrfs replace status (canceled ) failed"
>>>> +            fi
>>>>       else
>>>>               if [ "${quick}Q" = "thoroughQ" ]; then
>>>>                       # The thorough test runs around 2 * $wait_time seconds.
>>>> --
>>>> 2.34.1
>>>>
Filipe Manana Jan. 10, 2022, 1:54 p.m. UTC | #5
On Mon, Jan 10, 2022 at 1:02 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> On 2022/1/10 20:52, Filipe Manana wrote:
> > On Mon, Jan 10, 2022 at 12:44 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> >>
> >>
> >>
> >> On 2022/1/10 19:57, Filipe Manana wrote:
> >>> On Mon, Jan 10, 2022 at 07:28:48PM +0800, Qu Wenruo wrote:
> >>>> [BUG]
> >>>> When running btrfs/011 inside VM which has unsafe cache set for its
> >>>> devices, and the host have enough memory to cache all the IO:
> >>>
> >>> Btw, I use the same setup and the test never fails for me.
> >>> It's quite of a strech assuming that using unsafe caching and having
> >>> enough memory are the only reasons for the failure.
> >>>
> >>> I use a debug kernel with plenty of heavy debug features enabled,
> >>> so that may be one reason why I can't trigger it.
> >>
> >> I guess that's the reason.
> >>
> >> For most cases I only enable light-weight debug features for regular runs.
> >> The heavy ones like KASAN/kmemleak are reserved for more tricky cases.
> >>
> >>>
> >>>>
> >>>> btrfs/011 98s ... [failed, exit status 1]- output mismatch
> >>>>       --- tests/btrfs/011.out 2019-10-22 15:18:13.962298674 +0800
> >>>>       +++ /xfstests-dev/results//btrfs/011.out.bad    2022-01-10 19:12:14.683333251 +0800
> >>>>       @@ -1,3 +1,4 @@
> >>>>        QA output created by 011
> >>>>        *** test btrfs replace
> >>>>       -*** done
> >>>>       +failed: '/usr/bin/btrfs replace cancel /mnt/scratch'
> >>>>       +(see /xfstests-dev/results//btrfs/011.full for details)
> >>>>       ...
> >>>> Ran: btrfs/011
> >>>> Failures: btrfs/011
> >>>> Failed 1 of 1 tests
> >>>>
> >>>> [CAUSE]
> >>>> Although commit fa85aa64 ("btrfs/011: Fill the fs to ensure we
> >>>> have enough data for dev-replace") tries to address the problem by
> >>>> filling the fs with extra content, there is still no guarantee that 2
> >>>> seconds of IO still needs 2 seconds to finish.
> >>>>
> >>>> Thus even we tried our best to make sure the replace will take 2
> >>>> seconds, it can still finish faster than 2 seconds.
> >>>>
> >>>> And just to mention how fast the test finishes, after the fix, the test
> >>>> takes around 90~100 seconds to finish.
> >>>> While on real-hardware it can take over 1000 seconds.
> >>>>
> >>>> [FIX]
> >>>> Instead of further enlarging the IO, here we just accept the fact that
> >>>> replace can finish faster than our expectation, and continue the test.
> >>>
> >>> If I'm reading this, and the code, correctly this means we end up never
> >>> testing that the replace cancel feature ends up being exercised and
> >>> while a replace is in progress. That's far from ideal, losing test
> >>> coverage.
> >>
> >> Yep, you're completely right.
> >>
> >> In fact for my environment, only around 10% runs really utilized the
> >> cancel path, the remaining 90% go finished routine.
> >>
> >> But...
> >>
> >>>
> >>> Josef sent a patch for this last month:
> >>>
> >>> https://lore.kernel.org/linux-btrfs/01796d6bcec40ae80b5af3269e60a66cd4b89262.1638382763.git.josef@toxicpanda.com/
> >>>
> >>> Don't know why it wasn't merged. I agree that changing timings in the
> >>> test is not ideal and always prone to still fail on some setups, but
> >>> seems more acceptable to me rather than losing test coverage.
> >>
> >> My concern is, it's just going to be another whac-a-mole game.
> >>
> >> With more RAM in the host, and further optimization, I'm not sure if 10
> >> seconds is enough.
> >> Furthermore 10 seconds may be too long for certain environment, to fill
> >> the whole test filesystem and cause other false alerts.
> >>
> >> As a safenet, we have dedicated cancel workload test in btrfs/212, and
> >
> > btrfs/212 is about balance, not device replace.
>
> Oh, my bad.
>
> Then I guess it's time for us to have dedicated replace and cancel test
> case.
>
> >
> >> IIRC for most (if not all) bugs exposed in btrfs/011, it's in the
> >> regular replace path, not the cancel path.
> >
> > Even if we have had few bugs in the replace cancel patch, compared to
> > the regular replace path,
> > that's not an excuse to remove or reduce replace cancel coverage.
> >
> >>
> >> Thus I want to end the whac-a-hole game once and for all, even it will
> >> drop the coverage for super fast setup.
> >
> > If everyone starts having a setup that is faster, then we will end up
> > not getting replace cancel covered in the long run,
> > and increasing the chances of finding out regressions only after
> > kernels are released, by users.
> >
> > As I said earlier, I agree that the whac-a-mole style of change sent
> > by Josef is far from ideal, but at least it doesn't lose test
> > coverage.
>
> It still doesn't solve the possible false alerts, it's doing masking,
> just a different way.
>
> But that masking will eventually hit ENOSPC and cause a different false
> alerts.
>
>
> If we have dedicated replace cancel tests, can we remove the false-alert
> prone cancel test in btrfs/011?

In the end it's the same problem, however: making sure that by the time
the cancel is requested, the replace operation is still in progress.

I don't see how doing that in a separate test case will be more
reliable, unless you're considering something like dm delay (and in
that case, why can't it be done in btrfs/011?).
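For reference, the dm delay idea would wrap the replace source device in a device-mapper "delay" target so every read is artificially slowed, guaranteeing the replace is still running when the cancel lands. A rough sketch follows; the target name, device and the 500 ms figure are illustrative only, and a real xfstests case would go through the common/ dm helpers rather than raw dmsetup:

```shell
# The dm delay target table format is
# "0 <sectors> delay <device> <offset> <delay_ms>".
make_delay_table() {
	local dev=$1 sectors=$2 delay_ms=$3

	echo "0 $sectors delay $dev 0 $delay_ms"
}

setup_delayed_source() {
	local dev=$1 delay_ms=$2
	local sectors

	sectors=$(blockdev --getsz "$dev")
	# Reads from /dev/mapper/replace-delay now take at least
	# $delay_ms each, so a replace sourced from it cannot finish
	# before the cancel is issued.
	dmsetup create replace-delay \
		--table "$(make_delay_table "$dev" "$sectors" "$delay_ms")"
}
```

The test would use /dev/mapper/replace-delay as the replace source and tear it down with `dmsetup remove` afterwards.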

Thanks.

>
> Thanks,
> Qu
>
> >
> > Thanks.
> >
> >
> >
> >>
> >> Thanks,
> >> Qu
> >>
> >>>
> >>> Thanks.
> >>>
> >>>>
> >>>> One thing to notice is, since the replace finished, we need to replace
> >>>> back the device, or later fsck will be executed on blank device, and
> >>>> cause false alert.
> >>>>
> >>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
> >>>> ---
> >>>>    tests/btrfs/011 | 15 +++++++++++++--
> >>>>    1 file changed, 13 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/tests/btrfs/011 b/tests/btrfs/011
> >>>> index b4673341..aae89696 100755
> >>>> --- a/tests/btrfs/011
> >>>> +++ b/tests/btrfs/011
> >>>> @@ -171,13 +171,24 @@ btrfs_replace_test()
> >>>>               # background the replace operation (no '-B' option given)
> >>>>               _run_btrfs_util_prog replace start -f $replace_options $source_dev $target_dev $SCRATCH_MNT
> >>>>               sleep $wait_time
> >>>> -            _run_btrfs_util_prog replace cancel $SCRATCH_MNT
> >>>> +            $BTRFS_UTIL_PROG replace cancel $SCRATCH_MNT 2>&1 >> $seqres.full
> >>>>
> >>>>               # 'replace status' waits for the replace operation to finish
> >>>>               # before the status is printed
> >>>>               $BTRFS_UTIL_PROG replace status $SCRATCH_MNT > $tmp.tmp 2>&1
> >>>>               cat $tmp.tmp >> $seqres.full
> >>>> -            grep -q canceled $tmp.tmp || _fail "btrfs replace status (canceled) failed"
> >>>> +
> >>>> +            # There is no guarantee we canceled the replace, it can finish
> >>>> +            if grep -q 'finished' $tmp.tmp ; then
> >>>> +                    # The replace finished, we need to replace it back or
> >>>> +                    # later fsck will report error as $SCRATCH_DEV is now
> >>>> +                    # blank
> >>>> +                    $BTRFS_UTIL_PROG replace start -Bf $target_dev \
> >>>> +                            $source_dev $SCRATCH_MNT > /dev/null
> >>>> +            else
> >>>> +                    grep -q 'canceled' $tmp.tmp || _fail \
> >>>> +                            "btrfs replace status (canceled ) failed"
> >>>> +            fi
> >>>>       else
> >>>>               if [ "${quick}Q" = "thoroughQ" ]; then
> >>>>                       # The thorough test runs around 2 * $wait_time seconds.
> >>>> --
> >>>> 2.34.1
> >>>>
Qu Wenruo Jan. 10, 2022, 11:51 p.m. UTC | #6
On 2022/1/10 21:54, Filipe Manana wrote:
[...]
>>>>
>>>> Thus I want to end the whack-a-mole game once and for all, even if
>>>> it will drop the coverage for super fast setups.
>>>
>>> If everyone starts having a faster setup, then we will end up not
>>> getting replace cancel covered in the long run, increasing the
>>> chances that regressions are only found by users, after kernels are
>>> released.
>>>
>>> As I said earlier, I agree that the whack-a-mole style of change sent
>>> by Josef is far from ideal, but at least it doesn't lose test
>>> coverage.
>>
>> It still doesn't solve the possible false alerts; it's just masking
>> them in a different way.
>>
>> But that masking will eventually hit ENOSPC and cause a different
>> false alert.
>>
>>
>> If we have dedicated replace cancel tests, can we remove the
>> false-alert-prone cancel test in btrfs/011?
>
> In the end it's the same problem, however: making sure that, by the
> time the cancel is requested, the replace operation is still in
> progress.
>
> I don't see how doing that in a separate test case will be more
> reliable, unless you're considering something like dm-delay (and in
> that case, why can't it be done in btrfs/011?).

Isn't the method used in btrfs/212 more reliable?

Just run a foreground replace in a loop, with another process trying to
cancel the replace.
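
The btrfs/212-style loop suggested here could be sketched roughly as
below. This is only an illustration of the coordination, not the actual
test: `run_cancel_race` is a hypothetical helper, and the real commands
it would be given (shown in the usage note) assume the usual fstests
variables `$BTRFS_UTIL_PROG`, `$source_dev`, `$target_dev` and
`$SCRATCH_MNT`.

```shell
#!/bin/bash
# Sketch of a replace-vs-cancel stress loop in the style of btrfs/212:
# one background worker keeps a foreground replace running over and
# over, while a second worker repeatedly tries to cancel it.  The two
# commands and the duration are parameters, so the coordination logic
# itself can be exercised with stand-in commands.
run_cancel_race() {
	local replace_cmd=$1	# command that runs one foreground replace
	local cancel_cmd=$2	# command that attempts a cancel
	local runtime=$3	# how long to keep racing, in seconds

	# Keep an operation in flight for the canceller to race against.
	# $replace_cmd is intentionally unquoted so multi-word commands
	# undergo word splitting.
	( while true; do $replace_cmd &>/dev/null; done ) &
	local replace_pid=$!

	# A cancel that lands after the replace already finished is not
	# an error in this scheme -- the next loop iteration simply
	# starts a new replace for it to hit.
	( while true; do $cancel_cmd &>/dev/null; sleep 0.1; done ) &
	local cancel_pid=$!

	sleep "$runtime"
	kill "$replace_pid" "$cancel_pid" 2>/dev/null
	wait "$replace_pid" "$cancel_pid" 2>/dev/null
	return 0
}
```

In a real test the replace command would be something along the lines of
`"$BTRFS_UTIL_PROG replace start -Bf $source_dev $target_dev
$SCRATCH_MNT"` (alternating the two devices between iterations, as
btrfs/212 alternates its balance profiles) and the cancel command
`"$BTRFS_UTIL_PROG replace cancel $SCRATCH_MNT"`; since the replace runs
in the foreground (`-B`), there is always either an in-progress replace
to cancel or a completed one about to be restarted, with no fixed sleep
to race against.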

Thanks,
Qu

>
> Thanks.
>
>>
>> Thanks,
>> Qu
>>
>>>
>>> Thanks.
>>>
>>>
>>>
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>>
>>>>> Thanks.
>>>>>
>>>>>>
>>>>>> One thing to notice is, since the replace finished, we need to replace
>>>>>> back the device, or later fsck will be executed on blank device, and
>>>>>> cause false alert.
>>>>>>
>>>>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>>>>> [...]

Patch

diff --git a/tests/btrfs/011 b/tests/btrfs/011
index b4673341..aae89696 100755
--- a/tests/btrfs/011
+++ b/tests/btrfs/011
@@ -171,13 +171,24 @@  btrfs_replace_test()
 		# background the replace operation (no '-B' option given)
 		_run_btrfs_util_prog replace start -f $replace_options $source_dev $target_dev $SCRATCH_MNT
 		sleep $wait_time
-		_run_btrfs_util_prog replace cancel $SCRATCH_MNT
+		$BTRFS_UTIL_PROG replace cancel $SCRATCH_MNT >> $seqres.full 2>&1
 
 		# 'replace status' waits for the replace operation to finish
 		# before the status is printed
 		$BTRFS_UTIL_PROG replace status $SCRATCH_MNT > $tmp.tmp 2>&1
 		cat $tmp.tmp >> $seqres.full
-		grep -q canceled $tmp.tmp || _fail "btrfs replace status (canceled) failed"
+
+		# There is no guarantee we canceled the replace; it may have finished
+		if grep -q 'finished' $tmp.tmp ; then
+			# The replace finished, so we need to replace the device
+			# back, or the later fsck would run on the now-blank
+			# $SCRATCH_DEV and report a false alert
+			$BTRFS_UTIL_PROG replace start -Bf $target_dev \
+				$source_dev $SCRATCH_MNT > /dev/null
+		else
+			grep -q 'canceled' $tmp.tmp || _fail \
+				"btrfs replace status (canceled) failed"
+		fi
 	else
 		if [ "${quick}Q" = "thoroughQ" ]; then
 			# The thorough test runs around 2 * $wait_time seconds.