Message ID | 20220703180956.2922025-1-yi.zhang@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [blktests] block/008: fix cpu online restore | expand |
On Jul 04, 2022 / 02:09, Yi Zhang wrote: > The offline cpus cannot be restored during _cleanup when only _offline_cpu > executed, fix it by reset RESTORE_CPUS_ONLINE=1 during test. > > Fixes: bd6b882 ("block/008: check CPU offline failure due to many IRQs") > Signed-off-by: Yi Zhang <yi.zhang@redhat.com> Hello Yi, thank you for catching this. The commit bd6b882 put the _offline_cpu call into a sub-shell, then RESTORE_CPUS_ONLINE reset in the function no longer affects _cleanup function. When I made the commit, I overlooked that point. Your change should fix the issue but it makes the RESTORE_CPUS_ONLINE=1 in _offline_cpu meaningless. Instead, I suggest following patch. Could you confirm it fixes the issue in your environment? diff --git a/tests/block/008 b/tests/block/008 index 75aae65..cd09352 100755 --- a/tests/block/008 +++ b/tests/block/008 @@ -34,6 +34,7 @@ test_device() { local offline_cpus=() local offlining=1 local max_offline=${#HOTPLUGGABLE_CPUS[@]} + local o=$TMPDIR/offline_cpu_out if [[ ${#HOTPLUGGABLE_CPUS[@]} -eq ${#ALL_CPUS[@]} ]]; then (( max_offline-- )) fi @@ -60,18 +61,18 @@ test_device() { if (( offlining )); then idx=$((RANDOM % ${#online_cpus[@]})) - if err=$(_offline_cpu "${online_cpus[$idx]}" 2>&1); then + if _offline_cpu "${online_cpus[$idx]}" > "$o" 2>&1; then offline_cpus+=("${online_cpus[$idx]}") unset "online_cpus[$idx]" online_cpus=("${online_cpus[@]}") - elif [[ $err =~ "No space left on device" ]]; then + elif [[ $(<"$o") =~ "No space left on device" ]]; then # ENOSPC means CPU offline failure due to IRQ # vector shortage. Keep current number of # offline CPUs as maximum CPUs to offline. max_offline=${#offline_cpus[@]} offlining=0 else - echo "Failed to offline CPU: $err" + echo "Failed to offline CPU: $(<"$o")" break fi fi
On Mon, Jul 4, 2022 at 1:20 PM Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> wrote: > > On Jul 04, 2022 / 02:09, Yi Zhang wrote: > > The offline cpus cannot be restored during _cleanup when only _offline_cpu > > executed, fix it by reset RESTORE_CPUS_ONLINE=1 during test. > > > > Fixes: bd6b882 ("block/008: check CPU offline failure due to many IRQs") > > Signed-off-by: Yi Zhang <yi.zhang@redhat.com> > > Hello Yi, thank you for catching this. The commit bd6b882 put the _offline_cpu > call into a sub-shell, then RESTORE_CPUS_ONLINE reset in the function no longer > affects _cleanup function. When I made the commit, I overlooked that point. > > Your change should fix the issue but it makes the RESTORE_CPUS_ONLINE=1 in > _offline_cpu meaningless. Instead, I suggest following patch. Could you confirm > it fixes the issue in your environment? Yeah, your change works well, feel free to add Tested-by: Yi Zhang <yi.zhang@redhat.com> > > diff --git a/tests/block/008 b/tests/block/008 > index 75aae65..cd09352 100755 > --- a/tests/block/008 > +++ b/tests/block/008 > @@ -34,6 +34,7 @@ test_device() { > local offline_cpus=() > local offlining=1 > local max_offline=${#HOTPLUGGABLE_CPUS[@]} > + local o=$TMPDIR/offline_cpu_out > if [[ ${#HOTPLUGGABLE_CPUS[@]} -eq ${#ALL_CPUS[@]} ]]; then > (( max_offline-- )) > fi > @@ -60,18 +61,18 @@ test_device() { > > if (( offlining )); then > idx=$((RANDOM % ${#online_cpus[@]})) > - if err=$(_offline_cpu "${online_cpus[$idx]}" 2>&1); then > + if _offline_cpu "${online_cpus[$idx]}" > "$o" 2>&1; then > offline_cpus+=("${online_cpus[$idx]}") > unset "online_cpus[$idx]" > online_cpus=("${online_cpus[@]}") > - elif [[ $err =~ "No space left on device" ]]; then > + elif [[ $(<"$o") =~ "No space left on device" ]]; then > # ENOSPC means CPU offline failure due to IRQ > # vector shortage. Keep current number of > # offline CPUs as maximum CPUs to offline. > max_offline=${#offline_cpus[@]} > offlining=0 > else > - echo "Failed to offline CPU: $err" > + echo "Failed to offline CPU: $(<"$o")" > break > fi > fi > > > -- > Shin'ichiro Kawasaki > -- Best Regards, Yi Zhang
On Jul 04, 2022 / 14:47, Yi Zhang wrote: > On Mon, Jul 4, 2022 at 1:20 PM Shinichiro Kawasaki > <shinichiro.kawasaki@wdc.com> wrote: > > > > On Jul 04, 2022 / 02:09, Yi Zhang wrote: > > > The offline cpus cannot be restored during _cleanup when only _offline_cpu > > > executed, fix it by reset RESTORE_CPUS_ONLINE=1 during test. > > > > > > Fixes: bd6b882 ("block/008: check CPU offline failure due to many IRQs") > > > Signed-off-by: Yi Zhang <yi.zhang@redhat.com> > > > > Hello Yi, thank you for catching this. The commit bd6b882 put the _offline_cpu > > call into a sub-shell, then RESTORE_CPUS_ONLINE reset in the function no longer > > affects _cleanup function. When I made the commit, I overlooked that point. > > > > Your change should fix the issue but it makes the RESTORE_CPUS_ONLINE=1 in > > _offline_cpu meaningless. Instead, I suggest following patch. Could you confirm > > it fixes the issue in your environment? > > Yeah, your change works well, feel free to add > Tested-by: Yi Zhang <yi.zhang@redhat.com> Thanks! I've posted the change with your tag as a formal patch.
diff --git a/tests/block/008 b/tests/block/008 index 75aae65..f61f33a 100755 --- a/tests/block/008 +++ b/tests/block/008 @@ -91,6 +91,11 @@ test_device() { fi done + # The RESTORE_CPUS_ONLINE setting was ignored with previous + # err=$(_offline_cpu "${online_cpus[$idx]}" when only _offline_cpu + # executed, reset it here then the offline cpus can be retored + # shellcheck disable=SC2034 + RESTORE_CPUS_ONLINE=1 FIO_PERF_FIELDS=("read iops") _fio_perf_report
The offline cpus cannot be restored during _cleanup when only _offline_cpu executed, fix it by reset RESTORE_CPUS_ONLINE=1 during test. Fixes: bd6b882 ("block/008: check CPU offline failure due to many IRQs") Signed-off-by: Yi Zhang <yi.zhang@redhat.com> --- tests/block/008 | 5 +++++ 1 file changed, 5 insertions(+)