diff mbox series

tests/functional/test_x86_64_hotplug_cpu: Fix race condition during unplug

Message ID 20250107115245.52755-1-thuth@redhat.com (mailing list archive)
State New
Headers show
Series tests/functional/test_x86_64_hotplug_cpu: Fix race condition during unplug | expand

Commit Message

Thomas Huth Jan. 7, 2025, 11:52 a.m. UTC
When unplugging the CPU, the test tries to check for a successful
unplug by changing to the /sys/devices/system/cpu/cpu1 directory
to see whether that fails. However, the "cd" could be faster than
the unplug operation in the kernel, so there is a race condition
and the test sometimes fails here.
Fix it by trying to change the directory in a loop until the the
CPU has really been unplugged.

Reported-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 tests/functional/test_x86_64_hotplug_cpu.py | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Daniel P. Berrangé Jan. 7, 2025, 11:57 a.m. UTC | #1
On Tue, Jan 07, 2025 at 12:52:45PM +0100, Thomas Huth wrote:
> When unplugging the CPU, the test tries to check for a successful
> unplug by changing to the /sys/devices/system/cpu/cpu1 directory
> to see whether that fails. However, the "cd" could be faster than
> the unplug operation in the kernel, so there is a race condition
> and the test sometimes fails here.
> Fix it by trying to change the directory in a loop until the the
> CPU has really been unplugged.
> 
> Reported-by: Stefan Hajnoczi <stefanha@gmail.com>
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  tests/functional/test_x86_64_hotplug_cpu.py | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/functional/test_x86_64_hotplug_cpu.py b/tests/functional/test_x86_64_hotplug_cpu.py
> index b1d5156c72..7b9200ac2e 100755
> --- a/tests/functional/test_x86_64_hotplug_cpu.py
> +++ b/tests/functional/test_x86_64_hotplug_cpu.py
> @@ -59,11 +59,13 @@ def test_hotplug(self):
>                                            'cd /sys/devices/system/cpu/cpu1',
>                                            'cpu1#')
>  
> +        exec_command_and_wait_for_pattern(self, 'cd ..', prompt)

Is this actually needed ? Are we keeping the CPU from being unplugged by
being in that dir ? If so, why isn't it also needed in the while loop
below ?

>          self.vm.cmd('device_del', id='c1')
>  
>          exec_command_and_wait_for_pattern(self,
> -                                          'cd /sys/devices/system/cpu/cpu1',
> -                                          'No such file or directory')
> +                                    'while cd /sys/devices/system/cpu/cpu1 ;'
> +                                    ' do sleep 0.2 ; done',
> +                                    'No such file or directory')
>  
>  if __name__ == '__main__':
>      LinuxKernelTest.main()
> -- 
> 2.47.1
> 

With regards,
Daniel
Thomas Huth Jan. 7, 2025, 12:03 p.m. UTC | #2
On 07/01/2025 12.57, Daniel P. Berrangé wrote:
> On Tue, Jan 07, 2025 at 12:52:45PM +0100, Thomas Huth wrote:
>> When unplugging the CPU, the test tries to check for a successful
>> unplug by changing to the /sys/devices/system/cpu/cpu1 directory
>> to see whether that fails. However, the "cd" could be faster than
>> the unplug operation in the kernel, so there is a race condition
>> and the test sometimes fails here.
>> Fix it by trying to change the directory in a loop until the the
>> CPU has really been unplugged.
>>
>> Reported-by: Stefan Hajnoczi <stefanha@gmail.com>
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>   tests/functional/test_x86_64_hotplug_cpu.py | 6 ++++--
>>   1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/tests/functional/test_x86_64_hotplug_cpu.py b/tests/functional/test_x86_64_hotplug_cpu.py
>> index b1d5156c72..7b9200ac2e 100755
>> --- a/tests/functional/test_x86_64_hotplug_cpu.py
>> +++ b/tests/functional/test_x86_64_hotplug_cpu.py
>> @@ -59,11 +59,13 @@ def test_hotplug(self):
>>                                             'cd /sys/devices/system/cpu/cpu1',
>>                                             'cpu1#')
>>   
>> +        exec_command_and_wait_for_pattern(self, 'cd ..', prompt)
> 
> Is this actually needed ? Are we keeping the CPU from being unplugged by
> being in that dir ? If so, why isn't it also needed in the while loop
> below ?

I added it to make the console output less confusing (otherwise it's still 
shown in the prompt after the CPU has been unplugged)... but I can also 
remove this line again if you prefer it?

  Thomas
Daniel P. Berrangé Jan. 7, 2025, 12:42 p.m. UTC | #3
On Tue, Jan 07, 2025 at 01:03:52PM +0100, Thomas Huth wrote:
> On 07/01/2025 12.57, Daniel P. Berrangé wrote:
> > On Tue, Jan 07, 2025 at 12:52:45PM +0100, Thomas Huth wrote:
> > > When unplugging the CPU, the test tries to check for a successful
> > > unplug by changing to the /sys/devices/system/cpu/cpu1 directory
> > > to see whether that fails. However, the "cd" could be faster than
> > > the unplug operation in the kernel, so there is a race condition
> > > and the test sometimes fails here.
> > > Fix it by trying to change the directory in a loop until the the
> > > CPU has really been unplugged.
> > > 
> > > Reported-by: Stefan Hajnoczi <stefanha@gmail.com>
> > > Signed-off-by: Thomas Huth <thuth@redhat.com>
> > > ---
> > >   tests/functional/test_x86_64_hotplug_cpu.py | 6 ++++--
> > >   1 file changed, 4 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/tests/functional/test_x86_64_hotplug_cpu.py b/tests/functional/test_x86_64_hotplug_cpu.py
> > > index b1d5156c72..7b9200ac2e 100755
> > > --- a/tests/functional/test_x86_64_hotplug_cpu.py
> > > +++ b/tests/functional/test_x86_64_hotplug_cpu.py
> > > @@ -59,11 +59,13 @@ def test_hotplug(self):
> > >                                             'cd /sys/devices/system/cpu/cpu1',
> > >                                             'cpu1#')
> > > +        exec_command_and_wait_for_pattern(self, 'cd ..', prompt)
> > 
> > Is this actually needed ? Are we keeping the CPU from being unplugged by
> > being in that dir ? If so, why isn't it also needed in the while loop
> > below ?
> 
> I added it to make the console output less confusing (otherwise it's still
> shown in the prompt after the CPU has been unplugged)... but I can also
> remove this line again if you prefer it?

No, i'm not that fussed.


With regards,
Daniel
Stefan Hajnoczi Jan. 7, 2025, 12:50 p.m. UTC | #4
On Tue, 7 Jan 2025 at 06:52, Thomas Huth <thuth@redhat.com> wrote:
>
> When unplugging the CPU, the test tries to check for a successful
> unplug by changing to the /sys/devices/system/cpu/cpu1 directory
> to see whether that fails. However, the "cd" could be faster than
> the unplug operation in the kernel, so there is a race condition
> and the test sometimes fails here.
> Fix it by trying to change the directory in a loop until the the
> CPU has really been unplugged.
>
> Reported-by: Stefan Hajnoczi <stefanha@gmail.com>
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  tests/functional/test_x86_64_hotplug_cpu.py | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)

Thanks for fixing this! I'll keep an eye on the CI job status after
merging your next pull request.

Stefan
diff mbox series

Patch

diff --git a/tests/functional/test_x86_64_hotplug_cpu.py b/tests/functional/test_x86_64_hotplug_cpu.py
index b1d5156c72..7b9200ac2e 100755
--- a/tests/functional/test_x86_64_hotplug_cpu.py
+++ b/tests/functional/test_x86_64_hotplug_cpu.py
@@ -59,11 +59,13 @@  def test_hotplug(self):
                                           'cd /sys/devices/system/cpu/cpu1',
                                           'cpu1#')
 
+        exec_command_and_wait_for_pattern(self, 'cd ..', prompt)
         self.vm.cmd('device_del', id='c1')
 
         exec_command_and_wait_for_pattern(self,
-                                          'cd /sys/devices/system/cpu/cpu1',
-                                          'No such file or directory')
+                                    'while cd /sys/devices/system/cpu/cpu1 ;'
+                                    ' do sleep 0.2 ; done',
+                                    'No such file or directory')
 
 if __name__ == '__main__':
     LinuxKernelTest.main()