Message ID | 20250107115245.52755-1-thuth@redhat.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | tests/functional/test_x86_64_hotplug_cpu: Fix race condition during unplug | expand |
On Tue, Jan 07, 2025 at 12:52:45PM +0100, Thomas Huth wrote: > When unplugging the CPU, the test tries to check for a successful > unplug by changing to the /sys/devices/system/cpu/cpu1 directory > to see whether that fails. However, the "cd" could be faster than > the unplug operation in the kernel, so there is a race condition > and the test sometimes fails here. > Fix it by trying to change the directory in a loop until the the > CPU has really been unplugged. > > Reported-by: Stefan Hajnoczi <stefanha@gmail.com> > Signed-off-by: Thomas Huth <thuth@redhat.com> > --- > tests/functional/test_x86_64_hotplug_cpu.py | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/tests/functional/test_x86_64_hotplug_cpu.py b/tests/functional/test_x86_64_hotplug_cpu.py > index b1d5156c72..7b9200ac2e 100755 > --- a/tests/functional/test_x86_64_hotplug_cpu.py > +++ b/tests/functional/test_x86_64_hotplug_cpu.py > @@ -59,11 +59,13 @@ def test_hotplug(self): > 'cd /sys/devices/system/cpu/cpu1', > 'cpu1#') > > + exec_command_and_wait_for_pattern(self, 'cd ..', prompt) Is this actually needed ? Are we keeping the CPU from being unplugged by being in that dir ? If so, why isn't it also needed in the while loop below ? > self.vm.cmd('device_del', id='c1') > > exec_command_and_wait_for_pattern(self, > - 'cd /sys/devices/system/cpu/cpu1', > - 'No such file or directory') > + 'while cd /sys/devices/system/cpu/cpu1 ;' > + ' do sleep 0.2 ; done', > + 'No such file or directory') > > if __name__ == '__main__': > LinuxKernelTest.main() > -- > 2.47.1 > With regards, Daniel
On 07/01/2025 12.57, Daniel P. Berrangé wrote: > On Tue, Jan 07, 2025 at 12:52:45PM +0100, Thomas Huth wrote: >> When unplugging the CPU, the test tries to check for a successful >> unplug by changing to the /sys/devices/system/cpu/cpu1 directory >> to see whether that fails. However, the "cd" could be faster than >> the unplug operation in the kernel, so there is a race condition >> and the test sometimes fails here. >> Fix it by trying to change the directory in a loop until the the >> CPU has really been unplugged. >> >> Reported-by: Stefan Hajnoczi <stefanha@gmail.com> >> Signed-off-by: Thomas Huth <thuth@redhat.com> >> --- >> tests/functional/test_x86_64_hotplug_cpu.py | 6 ++++-- >> 1 file changed, 4 insertions(+), 2 deletions(-) >> >> diff --git a/tests/functional/test_x86_64_hotplug_cpu.py b/tests/functional/test_x86_64_hotplug_cpu.py >> index b1d5156c72..7b9200ac2e 100755 >> --- a/tests/functional/test_x86_64_hotplug_cpu.py >> +++ b/tests/functional/test_x86_64_hotplug_cpu.py >> @@ -59,11 +59,13 @@ def test_hotplug(self): >> 'cd /sys/devices/system/cpu/cpu1', >> 'cpu1#') >> >> + exec_command_and_wait_for_pattern(self, 'cd ..', prompt) > > Is this actually needed ? Are we keeping the CPU from being unplugged by > being in that dir ? If so, why isn't it also needed in the while loop > below ? I added it to make the console output less confusing (otherwise it's still shown in the prompt after the CPU has been unplugged)... but I can also remove this line again if you prefer it? Thomas
On Tue, Jan 07, 2025 at 01:03:52PM +0100, Thomas Huth wrote: > On 07/01/2025 12.57, Daniel P. Berrangé wrote: > > On Tue, Jan 07, 2025 at 12:52:45PM +0100, Thomas Huth wrote: > > > When unplugging the CPU, the test tries to check for a successful > > > unplug by changing to the /sys/devices/system/cpu/cpu1 directory > > > to see whether that fails. However, the "cd" could be faster than > > > the unplug operation in the kernel, so there is a race condition > > > and the test sometimes fails here. > > > Fix it by trying to change the directory in a loop until the the > > > CPU has really been unplugged. > > > > > > Reported-by: Stefan Hajnoczi <stefanha@gmail.com> > > > Signed-off-by: Thomas Huth <thuth@redhat.com> > > > --- > > > tests/functional/test_x86_64_hotplug_cpu.py | 6 ++++-- > > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > > > diff --git a/tests/functional/test_x86_64_hotplug_cpu.py b/tests/functional/test_x86_64_hotplug_cpu.py > > > index b1d5156c72..7b9200ac2e 100755 > > > --- a/tests/functional/test_x86_64_hotplug_cpu.py > > > +++ b/tests/functional/test_x86_64_hotplug_cpu.py > > > @@ -59,11 +59,13 @@ def test_hotplug(self): > > > 'cd /sys/devices/system/cpu/cpu1', > > > 'cpu1#') > > > + exec_command_and_wait_for_pattern(self, 'cd ..', prompt) > > > > Is this actually needed ? Are we keeping the CPU from being unplugged by > > being in that dir ? If so, why isn't it also needed in the while loop > > below ? > > I added it to make the console output less confusing (otherwise it's still > shown in the prompt after the CPU has been unplugged)... but I can also > remove this line again if you prefer it? No, i'm not that fussed. With regards, Daniel
On Tue, 7 Jan 2025 at 06:52, Thomas Huth <thuth@redhat.com> wrote: > > When unplugging the CPU, the test tries to check for a successful > unplug by changing to the /sys/devices/system/cpu/cpu1 directory > to see whether that fails. However, the "cd" could be faster than > the unplug operation in the kernel, so there is a race condition > and the test sometimes fails here. > Fix it by trying to change the directory in a loop until the the > CPU has really been unplugged. > > Reported-by: Stefan Hajnoczi <stefanha@gmail.com> > Signed-off-by: Thomas Huth <thuth@redhat.com> > --- > tests/functional/test_x86_64_hotplug_cpu.py | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) Thanks for fixing this! I'll keep an eye on the CI job status after merging your next pull request. Stefan
diff --git a/tests/functional/test_x86_64_hotplug_cpu.py b/tests/functional/test_x86_64_hotplug_cpu.py index b1d5156c72..7b9200ac2e 100755 --- a/tests/functional/test_x86_64_hotplug_cpu.py +++ b/tests/functional/test_x86_64_hotplug_cpu.py @@ -59,11 +59,13 @@ def test_hotplug(self): 'cd /sys/devices/system/cpu/cpu1', 'cpu1#') + exec_command_and_wait_for_pattern(self, 'cd ..', prompt) self.vm.cmd('device_del', id='c1') exec_command_and_wait_for_pattern(self, - 'cd /sys/devices/system/cpu/cpu1', - 'No such file or directory') + 'while cd /sys/devices/system/cpu/cpu1 ;' + ' do sleep 0.2 ; done', + 'No such file or directory') if __name__ == '__main__': LinuxKernelTest.main()
When unplugging the CPU, the test tries to check for a successful unplug by changing to the /sys/devices/system/cpu/cpu1 directory to see whether that fails. However, the "cd" could be faster than the unplug operation in the kernel, so there is a race condition and the test sometimes fails here. Fix it by trying to change the directory in a loop until the the CPU has really been unplugged. Reported-by: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Thomas Huth <thuth@redhat.com> --- tests/functional/test_x86_64_hotplug_cpu.py | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)