diff mbox series

tests/qtest/fuzz/virtio_net_fuzz.c: fix virtio_net_fuzz_multi

Message ID 20240523102813.396750-2-frolov@swemel.ru (mailing list archive)
State New, archived
Headers show
Series tests/qtest/fuzz/virtio_net_fuzz.c: fix virtio_net_fuzz_multi | expand

Commit Message

Дмитрий Фролов May 23, 2024, 10:28 a.m. UTC
If QTestState was already CLOSED due to error, calling qtest_clock_step()
afterwards makes no sense and only raises false-crash with message:
"assertion timer != NULL failed".

Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
---
 tests/qtest/fuzz/virtio_net_fuzz.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Дмитрий Фролов June 11, 2024, 12:31 p.m. UTC | #1
ping

https://patchew.org/QEMU/20240523102813.396750-2-frolov@swemel.ru/

On 23.05.2024 13:28, Dmitry Frolov wrote:
> If QTestState was already CLOSED due to error, calling qtest_clock_step()
> afterwards makes no sense and only raises false-crash with message:
> "assertion timer != NULL failed".
>
> Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
> ---
>   tests/qtest/fuzz/virtio_net_fuzz.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/tests/qtest/fuzz/virtio_net_fuzz.c b/tests/qtest/fuzz/virtio_net_fuzz.c
> index e239875e3b..2f57a8ddd8 100644
> --- a/tests/qtest/fuzz/virtio_net_fuzz.c
> +++ b/tests/qtest/fuzz/virtio_net_fuzz.c
> @@ -81,6 +81,9 @@ static void virtio_net_fuzz_multi(QTestState *s,
>           /* Run the main loop */
>           qtest_clock_step(s, 100);
>           flush_events(s);
> +        if (!qtest_probe_child(s)) {
> +            return;
> +        }
>   
>           /* Wait on used descriptors */
>           if (check_used && !vqa.rx) {
Thomas Huth June 13, 2024, 10:08 a.m. UTC | #2
On 23/05/2024 12.28, Dmitry Frolov wrote:
> If QTestState was already CLOSED due to error, calling qtest_clock_step()
> afterwards makes no sense and only raises false-crash with message:
> "assertion timer != NULL failed".
> 
> Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
> ---
>   tests/qtest/fuzz/virtio_net_fuzz.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/tests/qtest/fuzz/virtio_net_fuzz.c b/tests/qtest/fuzz/virtio_net_fuzz.c
> index e239875e3b..2f57a8ddd8 100644
> --- a/tests/qtest/fuzz/virtio_net_fuzz.c
> +++ b/tests/qtest/fuzz/virtio_net_fuzz.c
> @@ -81,6 +81,9 @@ static void virtio_net_fuzz_multi(QTestState *s,
>           /* Run the main loop */
>           qtest_clock_step(s, 100);
>           flush_events(s);
> +        if (!qtest_probe_child(s)) {
> +            return;
> +        }

According to your patch description, it rather sounds like the check should 
be done before the qtest_clock_step() ? ... or where does the QTestState get 
closed? During flush_events() ?

  Thomas
Дмитрий Фролов June 13, 2024, 11:59 a.m. UTC | #3
On 13.06.2024 13:08, Thomas Huth wrote:
> On 23/05/2024 12.28, Dmitry Frolov wrote:
>> If QTestState was already CLOSED due to error, calling 
>> qtest_clock_step()
>> afterwards makes no sense and only raises false-crash with message:
>> "assertion timer != NULL failed".
>>
>> Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
>> ---
>>   tests/qtest/fuzz/virtio_net_fuzz.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/tests/qtest/fuzz/virtio_net_fuzz.c 
>> b/tests/qtest/fuzz/virtio_net_fuzz.c
>> index e239875e3b..2f57a8ddd8 100644
>> --- a/tests/qtest/fuzz/virtio_net_fuzz.c
>> +++ b/tests/qtest/fuzz/virtio_net_fuzz.c
>> @@ -81,6 +81,9 @@ static void virtio_net_fuzz_multi(QTestState *s,
>>           /* Run the main loop */
>>           qtest_clock_step(s, 100);
>>           flush_events(s);
>> +        if (!qtest_probe_child(s)) {
>> +            return;
>> +        }
>
> According to your patch description, it rather sounds like the check 
> should be done before the qtest_clock_step() ? ... or where does the 
> QTestState get closed? During flush_events() ?
To my understanding, the main loop is executed during flush_events(), 
where an error may occur. This behavior is legit and should not produce 
any crash report.
Without the check, the test continues to wait on used descriptors, and 
finally fails with message: "assertion timer != NULL failed".
Thus, any invalid input data produces a meaningless crash report.
>  Thomas
>
Alexander Bulekov June 13, 2024, 3:54 p.m. UTC | #4
This fixes the almost-immediate timeout issue for me on the
virtio_net_fuzz target, but I'm not sure why this works or if it is
fixing the right problem:

qtest_probe_child is designed to run from a libqtest process which
uses waitpid on the PID of the child (qemu) process (stored in
QTestState->qemu_pid) . With qemu-fuzz we do not have a separate
libqtest and qemu process:

(gdb) p s->qemu_pid
$1 = 0

So we are calling waitpid with pid = 0. From the man-page:
"0 meaning wait for any child process whose process group ID is equal to
that of the calling process at the time of the call to waitpid()."

And we are calling it with WNOHANG. So I would expect that this almost
always returns 0 unless some adjacent thread has changed state
(libfuzzer uses extra threads to manage timeouts).

I'm happy that the fuzzer works again, and am happy to leave a review,
but I would like to first understand what the behavior of
qtest_probe_child here is, since it isn't really designed to work with
the fuzzer.

On 240523 1328, Dmitry Frolov wrote:
> If QTestState was already CLOSED due to error, calling qtest_clock_step()
> afterwards makes no sense and only raises false-crash with message:
> "assertion timer != NULL failed".
> 
> Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
> ---
>  tests/qtest/fuzz/virtio_net_fuzz.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/tests/qtest/fuzz/virtio_net_fuzz.c b/tests/qtest/fuzz/virtio_net_fuzz.c
> index e239875e3b..2f57a8ddd8 100644
> --- a/tests/qtest/fuzz/virtio_net_fuzz.c
> +++ b/tests/qtest/fuzz/virtio_net_fuzz.c
> @@ -81,6 +81,9 @@ static void virtio_net_fuzz_multi(QTestState *s,
>          /* Run the main loop */
>          qtest_clock_step(s, 100);
>          flush_events(s);
> +        if (!qtest_probe_child(s)) {
> +            return;
> +        }
>  
>          /* Wait on used descriptors */
>          if (check_used && !vqa.rx) {
> -- 
> 2.43.0
>
Thomas Huth June 13, 2024, 4:50 p.m. UTC | #5
On 13/06/2024 13.59, Дмитрий Фролов wrote:
> 
> 
> On 13.06.2024 13:08, Thomas Huth wrote:
>> On 23/05/2024 12.28, Dmitry Frolov wrote:
>>> If QTestState was already CLOSED due to error, calling qtest_clock_step()
>>> afterwards makes no sense and only raises false-crash with message:
>>> "assertion timer != NULL failed".
>>>
>>> Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
>>> ---
>>>   tests/qtest/fuzz/virtio_net_fuzz.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/tests/qtest/fuzz/virtio_net_fuzz.c 
>>> b/tests/qtest/fuzz/virtio_net_fuzz.c
>>> index e239875e3b..2f57a8ddd8 100644
>>> --- a/tests/qtest/fuzz/virtio_net_fuzz.c
>>> +++ b/tests/qtest/fuzz/virtio_net_fuzz.c
>>> @@ -81,6 +81,9 @@ static void virtio_net_fuzz_multi(QTestState *s,
>>>           /* Run the main loop */
>>>           qtest_clock_step(s, 100);
>>>           flush_events(s);
>>> +        if (!qtest_probe_child(s)) {
>>> +            return;
>>> +        }
>>
>> According to your patch description, it rather sounds like the check 
>> should be done before the qtest_clock_step() ? ... or where does the 
>> QTestState get closed? During flush_events() ?
> To my understanding, the main loop is executed during flush_events(), where 
> an error may occur. This behavior is legit and should not produce any crash 
> report.
> Without the check, the test continues to wait on used descriptors, and 
> finally fails with message: "assertion timer != NULL failed".
> Thus, any invalid input data produces a meaningless crash report.

Ok, makes sense now, thanks!

There seems to be another while loop with a flush_events() call later in 
this file, does it maybe need the same treatment, too?

  Thomas
Дмитрий Фролов June 14, 2024, 9:33 a.m. UTC | #6
On 13.06.2024 18:54, Alexander Bulekov wrote:
> This fixes the almost-immediate timeout issue for me on the
> virtio_net_fuzz target, but I'm not sure why this works or if it is
> fixing the right problem:
>
> qtest_probe_child is designed to run from a libqtest process which
> uses waitpid on the PID of the child (qemu) process (stored in
> QTestState->qemu_pid) . With qemu-fuzz we do not have a separate
> libqtest and qemu process:
>
> (gdb) p s->qemu_pid
> $1 = 0
>
> So we are calling waitpid with pid = 0. From the man-page:
> "0 meaning wait for any child process whose process group ID is equal to
> that of the calling process at the time of the call to waitpid()."
>
> And we are calling it with WNOHANG. So I would expect that this almost
> always returns 0 unless some adjacent thread has changed state
> (libfuzzer uses extra threads to manage timeouts).
According to 
https://www.redhat.com/en/blog/hardening-qemu-through-continuous-security-testing#:~:text=each%20input%20is%20executed%20within%20a%20forked%20child%20process
"each input is executed within a forked child process".
According to crash reports, an error occurs first (which may be different),
followed by the crash with message "assertion timer != NULL failed". To 
my opinion, waiting for an answer from dead children is the reason of 
crashes.
> I'm happy that the fuzzer works again, and am happy to leave a review,
> but I would like to first understand what the behavior of
> qtest_probe_child here is, since it isn't really designed to work with
> the fuzzer.
>
> On 240523 1328, Dmitry Frolov wrote:
>> If QTestState was already CLOSED due to error, calling qtest_clock_step()
>> afterwards makes no sense and only raises false-crash with message:
>> "assertion timer != NULL failed".
>>
>> Signed-off-by: Dmitry Frolov<frolov@swemel.ru>
>> ---
>>   tests/qtest/fuzz/virtio_net_fuzz.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/tests/qtest/fuzz/virtio_net_fuzz.c b/tests/qtest/fuzz/virtio_net_fuzz.c
>> index e239875e3b..2f57a8ddd8 100644
>> --- a/tests/qtest/fuzz/virtio_net_fuzz.c
>> +++ b/tests/qtest/fuzz/virtio_net_fuzz.c
>> @@ -81,6 +81,9 @@ static void virtio_net_fuzz_multi(QTestState *s,
>>           /* Run the main loop */
>>           qtest_clock_step(s, 100);
>>           flush_events(s);
>> +        if (!qtest_probe_child(s)) {
>> +            return;
>> +        }
>>   
>>           /* Wait on used descriptors */
>>           if (check_used && !vqa.rx) {
>> -- 
>> 2.43.0
>>
Дмитрий Фролов June 14, 2024, 9:41 a.m. UTC | #7
On 13.06.2024 19:50, Thomas Huth wrote:
> On 13/06/2024 13.59, Дмитрий Фролов wrote:
>>
>>
>> On 13.06.2024 13:08, Thomas Huth wrote:
>>> On 23/05/2024 12.28, Dmitry Frolov wrote:
>>>> If QTestState was already CLOSED due to error, calling 
>>>> qtest_clock_step()
>>>> afterwards makes no sense and only raises false-crash with message:
>>>> "assertion timer != NULL failed".
>>>>
>>>> Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
>>>> ---
>>>>   tests/qtest/fuzz/virtio_net_fuzz.c | 3 +++
>>>>   1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/tests/qtest/fuzz/virtio_net_fuzz.c 
>>>> b/tests/qtest/fuzz/virtio_net_fuzz.c
>>>> index e239875e3b..2f57a8ddd8 100644
>>>> --- a/tests/qtest/fuzz/virtio_net_fuzz.c
>>>> +++ b/tests/qtest/fuzz/virtio_net_fuzz.c
>>>> @@ -81,6 +81,9 @@ static void virtio_net_fuzz_multi(QTestState *s,
>>>>           /* Run the main loop */
>>>>           qtest_clock_step(s, 100);
>>>>           flush_events(s);
>>>> +        if (!qtest_probe_child(s)) {
>>>> +            return;
>>>> +        }
>>>
>>> According to your patch description, it rather sounds like the check 
>>> should be done before the qtest_clock_step() ? ... or where does the 
>>> QTestState get closed? During flush_events() ?
>> To my understanding, the main loop is executed during flush_events(), 
>> where an error may occur. This behavior is legit and should not 
>> produce any crash report.
>> Without the check, the test continues to wait on used descriptors, 
>> and finally fails with message: "assertion timer != NULL failed".
>> Thus, any invalid input data produces a meaningless crash report.
>
> Ok, makes sense now, thanks!
>
> There seems to be another while loop with a flush_events() call later 
> in this file, does it maybe need the same treatment, too?
With this fix, the number of crashes reduced significantly, but I guess, 
you are right...
If another similar crash will occur - I`ll make another patch.
Many thanks!
Dmitry
>  Thomas
>
Michael Tokarev Sept. 7, 2024, 7:06 a.m. UTC | #8
23.05.2024 13:28, Dmitry Frolov wrote:
> If QTestState was already CLOSED due to error, calling qtest_clock_step()
> afterwards makes no sense and only raises false-crash with message:
> "assertion timer != NULL failed".

So, can we have any Reviewed-by tags here? :)

Thanks,

/mjt
diff mbox series

Patch

diff --git a/tests/qtest/fuzz/virtio_net_fuzz.c b/tests/qtest/fuzz/virtio_net_fuzz.c
index e239875e3b..2f57a8ddd8 100644
--- a/tests/qtest/fuzz/virtio_net_fuzz.c
+++ b/tests/qtest/fuzz/virtio_net_fuzz.c
@@ -81,6 +81,9 @@  static void virtio_net_fuzz_multi(QTestState *s,
         /* Run the main loop */
         qtest_clock_step(s, 100);
         flush_events(s);
+        if (!qtest_probe_child(s)) {
+            return;
+        }
 
         /* Wait on used descriptors */
         if (check_used && !vqa.rx) {