[00/10] Retire Fork-Based Fuzzing

Message ID: 20230205042951.3570008-1-alxndr@bu.edu
From: Alexander Bulekov <alxndr@bu.edu>
To: qemu-devel@nongnu.org
Cc: Alexander Bulekov <alxndr@bu.edu>, Stefan Hajnoczi <stefanha@redhat.com>, Bandan Das <bsd@redhat.com>, Darren Kenny <darren.kenny@oracle.com>, Paolo Bonzini <pbonzini@redhat.com>
Subject: [PATCH 00/10] Retire Fork-Based Fuzzing
Date: Sat, 4 Feb 2023 23:29:41 -0500

Alexander Bulekov Feb. 5, 2023, 4:29 a.m. UTC

Hello,
This series removes fork-based fuzzing.
How does fork-based fuzzing work?
 * A single parent process initializes QEMU
 * We identify the devices we wish to fuzz (fuzzer-dependent)
 * Use QTest to PCI enumerate the devices
 * After that we start a fork-server which forks the process and executes
   fuzzer inputs inside the disposable children.

In a normal fuzzing process, everything happens in a single process.
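
To make the description above concrete, here is a minimal sketch of the fork-server idea in C. It is not the code from tests/qtest/fuzz/fork_fuzz.c; expensive_setup(), run_input() and run_in_child() are hypothetical names invented for illustration. The parent does its setup once, then runs every input in a disposable child, so a crash only kills the child.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the one-time setup (QEMU init, PCI
 * enumeration). The counter shows that it runs only in the parent. */
static int setup_runs;

static void expensive_setup(void)
{
    setup_runs++;
}

/* Hypothetical fuzz target: "crashes" when the input starts with 0xff. */
static void run_input(const uint8_t *data, size_t size)
{
    if (size > 0 && data[0] == 0xff) {
        abort();
    }
}

/*
 * Run one input inside a disposable child process.
 * Returns 0 if the child exited normally, 1 if it was killed by a
 * signal (a crash), and -1 if fork() or waitpid() failed.
 */
static int run_in_child(const uint8_t *data, size_t size)
{
    assert(data != NULL || size == 0);

    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: any state it corrupts is discarded when it exits. */
        run_input(data, size);
        _exit(0);
    }

    /* Parent: survives child crashes and can keep fuzzing. */
    int status;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFSIGNALED(status) ? 1 : 0;
}
```

In the single-process model, run_input() would instead be called directly from the fuzzing loop, which is why a crash there ends the whole process.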

Pros of fork-based fuzzing:
 * We only need to do common configuration once (e.g. PCI enumeration).
 * Fork provides a strong guarantee that fuzzer inputs will not interfere with
   each other
 * The fuzzing process can continue even after a child process crashes
 * We can apply our own timers to child processes to exit slow inputs early

Cons of fork-based fuzzing:
 * Fork-based fuzzing is not supported by libfuzzer. We had to build our own
   fork-server and rely on tricks using linker-scripts and shared-memory to
   support fuzzing. ( https://physics.bu.edu/~alxndr/libfuzzer-forkserver/ )
 * Fork-based fuzzing is currently the main blocker preventing us from enabling
   other fuzzers such as AFL++ on OSS-Fuzz
 * Fork-based fuzzing may be a reason why coverage-builds are failing on
   OSS-Fuzz. Coverage is an important fuzzing metric which would allow us to
   find parts of the code that are not well-covered.
 * Fork-based fuzzing has high overhead. fork() is an expensive system call,
   especially for processes running ASAN (with large/complex VMA layouts).
 * Fork prevents us from effectively fuzzing devices that rely on
   threads (e.g. qxl).

These patches remove fork-based fuzzing and replace it with reboot-based
fuzzing for most cases. Misc notes about this change:
 * libfuzzer appears to be no longer in active development. As such, the
   current implementation of fork-based fuzzing (while having some nice
   advantages) is likely to hold us back in the future. If these changes
   are approved and appear to run successfully on OSS-Fuzz, we should be
   able to easily experiment with other fuzzing engines (AFL++).
 * Some devices do not completely reset their state. This can lead to
   non-reproducible crashes. However, in my local tests, most crashes
   were reproducible. OSS-Fuzz shouldn't send us reports unless it can
   consistently reproduce a crash.
 * In theory, the corpus-format should not change, so the existing
   corpus-inputs on OSS-Fuzz will transfer to the new reset()-able
   fuzzers.
 * Each fuzzing process will now exit after a single crash is found. To
   continue the fuzzing process, use libfuzzer flags such as -jobs=-1
 * We no longer control input timeouts (those are handled by libfuzzer).
   Since timeouts on OSS-Fuzz can be many seconds long, I added a limit
   on the number of DMA bytes written.
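
The two ideas in these notes (resetting state in-process instead of forking, and capping DMA writes per input) can be sketched with a toy model. This is only an illustration under stated assumptions: struct toy_machine, toy_reset(), toy_dma_write(), toy_run_one() and the TOY_DMA_BUDGET value are all hypothetical and are not the fuzz_reboot API or the limit used in generic_fuzz.c. Resetting after each input is also just a choice made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_RAM_SIZE   64u
#define TOY_DMA_BUDGET 16u  /* hypothetical per-input cap, not QEMU's value */

/* Toy machine: some guest RAM plus a counter of DMA bytes written. */
struct toy_machine {
    uint8_t ram[TOY_RAM_SIZE];
    size_t dma_bytes_written;
};

/* Stand-in for a reboot: put all state back to power-on values. */
static void toy_reset(struct toy_machine *m)
{
    memset(m, 0, sizeof(*m));
}

/*
 * Fill guest memory with a pattern, but stop once the per-input budget
 * is spent. Returns the number of bytes actually written.
 */
static size_t toy_dma_write(struct toy_machine *m, size_t addr,
                            uint8_t val, size_t len)
{
    assert(m->dma_bytes_written <= TOY_DMA_BUDGET);
    size_t left = TOY_DMA_BUDGET - m->dma_bytes_written;

    if (addr >= TOY_RAM_SIZE) {
        return 0;
    }
    if (len > TOY_RAM_SIZE - addr) {
        len = TOY_RAM_SIZE - addr;   /* stay inside guest RAM */
    }
    if (len > left) {
        len = left;                  /* stay inside the budget */
    }
    memset(m->ram + addr, val, len);
    m->dma_bytes_written += len;
    return len;
}

/*
 * One fuzz iteration in a single process: interpret the input as
 * (address, length) byte pairs, perform the writes, then reset so the
 * next input starts from a clean machine. Returns total bytes written.
 */
static size_t toy_run_one(struct toy_machine *m,
                          const uint8_t *data, size_t size)
{
    size_t total = 0;

    for (size_t i = 0; i + 1 < size; i += 2) {
        total += toy_dma_write(m, data[i] % TOY_RAM_SIZE, 0xaa, data[i + 1]);
    }
    toy_reset(m);
    return total;
}
```

The budget bounds the work a single input can cause regardless of how long the fuzzing engine's own timeout is, which is the role the DMA limit plays in the notes above.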
 

Alexander Bulekov (10):
  hw/sparse-mem: clear memory on reset
  fuzz: add fuzz_reboot API
  fuzz/generic-fuzz: use reboots instead of forks to reset state
  fuzz/generic-fuzz: add a limit on DMA bytes written
  fuzz/virtio-scsi: remove fork-based fuzzer
  fuzz/virtio-net: remove fork-based fuzzer
  fuzz/virtio-blk: remove fork-based fuzzer
  fuzz/i440fx: remove fork-based fuzzer
  fuzz: remove fork-fuzzing scaffolding
  docs/fuzz: remove mentions of fork-based fuzzing

 docs/devel/fuzzing.rst              |  22 +-----
 hw/mem/sparse-mem.c                 |  13 +++-
 meson.build                         |   4 -
 tests/qtest/fuzz/fork_fuzz.c        |  41 ----------
 tests/qtest/fuzz/fork_fuzz.h        |  23 ------
 tests/qtest/fuzz/fork_fuzz.ld       |  56 --------------
 tests/qtest/fuzz/fuzz.c             |   6 ++
 tests/qtest/fuzz/fuzz.h             |   2 +-
 tests/qtest/fuzz/generic_fuzz.c     | 111 +++++++---------------------
 tests/qtest/fuzz/i440fx_fuzz.c      |  27 +------
 tests/qtest/fuzz/meson.build        |   6 +-
 tests/qtest/fuzz/virtio_blk_fuzz.c  |  51 ++-----------
 tests/qtest/fuzz/virtio_net_fuzz.c  |  54 ++------------
 tests/qtest/fuzz/virtio_scsi_fuzz.c |  51 ++-----------
 14 files changed, 72 insertions(+), 395 deletions(-)
 delete mode 100644 tests/qtest/fuzz/fork_fuzz.c
 delete mode 100644 tests/qtest/fuzz/fork_fuzz.h
 delete mode 100644 tests/qtest/fuzz/fork_fuzz.ld

Philippe Mathieu-Daudé Feb. 5, 2023, 10:39 a.m. UTC | #1

On 5/2/23 05:29, Alexander Bulekov wrote:

>   * Some devices do not completely reset their state. This can lead to
>     non-reproducible crashes. However, in my local tests, most crashes
>     were reproducible. OSS-Fuzz shouldn't send us reports unless it can
>     consistently reproduce a crash.

These devices are buggy, hard/cold reset should be reproducible.

>   * In theory, the corpus-format should not change, so the existing
>     corpus-inputs on OSS-Fuzz will transfer to the new reset()-able
>     fuzzers.

Alexander Bulekov Feb. 6, 2023, 2:09 p.m. UTC | #2

On 230205 1139, Philippe Mathieu-Daudé wrote:
> On 5/2/23 05:29, Alexander Bulekov wrote:
> 
> >   * Some devices do not completely reset their state. This can lead to
> >     non-reproducible crashes. However, in my local tests, most crashes
> >     were reproducible. OSS-Fuzz shouldn't send us reports unless it can
> >     consistently reproduce a crash.
> 
> These devices are buggy, hard/cold reset should be reproducible.

Agreed. However, I don't think the fuzzer is tailored to report these
types of bugs. OSS-Fuzz will just see that some crashes/inputs are not
reproducible. I have been thinking about ways to make the fuzzer report
incomplete VMStateDescriptions. Maybe something similar can be done for
reboots.
-Alex
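
One way such a check for reboots might look, as a rough sketch only: snapshot the device state after a first reset, drive the device, reset again, and compare. Everything here (struct toy_dev, the two reset functions, reset_is_complete()) is hypothetical and written for illustration; it does not use QEMU's device model or VMState code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy device state; three uint32_t fields, so no padding bytes. */
struct toy_dev {
    uint32_t ctrl;
    uint32_t status;
    uint32_t irq_count;
};

typedef void (*toy_reset_fn)(struct toy_dev *);

/* A reset handler that forgets one field, as a buggy device might. */
static void toy_dev_reset_buggy(struct toy_dev *d)
{
    d->ctrl = 0;
    d->status = 0;
    /* irq_count is not cleared: this is the bug being modelled. */
}

/* A reset handler that restores every field. */
static void toy_dev_reset_full(struct toy_dev *d)
{
    memset(d, 0, sizeof(*d));
}

/* Stand-in for running a fuzzer input against the device. */
static void toy_dev_poke(struct toy_dev *d)
{
    d->ctrl = 1;
    d->status |= 2;
    d->irq_count++;
}

/*
 * Returns true if the reset handler brings the device back to the
 * state it had after the first reset, false if some state leaked.
 */
static bool reset_is_complete(toy_reset_fn reset)
{
    assert(reset != NULL);

    struct toy_dev dev, baseline;

    memset(&dev, 0, sizeof(dev));
    reset(&dev);
    baseline = dev;          /* snapshot of the post-reset state */

    toy_dev_poke(&dev);
    reset(&dev);
    return memcmp(&dev, &baseline, sizeof(dev)) == 0;
}
```

A fuzzer could run a check like this after each input and report a distinct failure when it returns false, instead of leaving the leak to show up later as a non-reproducible crash.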

> 
> >   * In theory, the corpus-format should not change, so the existing
> >     corpus-inputs on OSS-Fuzz will transfer to the new reset()-able
> >     fuzzers.
>

Alexander Bulekov Feb. 13, 2023, 2:11 a.m. UTC | #3

ping


Stefan Hajnoczi Feb. 14, 2023, 3:38 p.m. UTC | #4


Whose tree should this go through? Laurent's qtest tree?

Acked-by: Stefan Hajnoczi <stefanha@redhat.com>

Philippe Mathieu-Daudé Feb. 14, 2023, 4:08 p.m. UTC | #5

On 14/2/23 16:38, Stefan Hajnoczi wrote:
> Whose tree should this go through? Laurent's qtest tree?

Do you mean Thomas?

$ git shortlog -cs tests/qtest/fuzz | sort -rn
     32  Thomas Huth
     26  Paolo Bonzini
     19  Stefan Hajnoczi
      6  Markus Armbruster
      5  Alexander Bulekov
      4  Marc-André Lureau
      3  Peter Maydell
      2  Laurent Vivier
      1  Michael S. Tsirkin
      1  Gerd Hoffmann

In doubt, cc'ing both :)

> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>

Laurent Vivier Feb. 14, 2023, 5:58 p.m. UTC | #6

On 2/14/23 17:08, Philippe Mathieu-Daudé wrote:
> On 14/2/23 16:38, Stefan Hajnoczi wrote:
>> Whose tree should this go through? Laurent's qtest tree?
> 
> Do you mean Thomas?
> 
> $ git shortlog -cs tests/qtest/fuzz | sort -rn
>      32  Thomas Huth
>      26  Paolo Bonzini
>      19  Stefan Hajnoczi
>       6  Markus Armbruster
>       5  Alexander Bulekov
>       4  Marc-André Lureau
>       3  Peter Maydell
>       2  Laurent Vivier
>       1  Michael S. Tsirkin
>       1  Gerd Hoffmann
> 
> In doubt, cc'ing both :)

Yes, Thomas is the real maintainer.

> 
>> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
>

Stefan Hajnoczi Feb. 14, 2023, 6:46 p.m. UTC | #7

On Tue, 14 Feb 2023 at 12:59, Laurent Vivier <lvivier@redhat.com> wrote:
>
> On 2/14/23 17:08, Philippe Mathieu-Daudé wrote:
> > On 14/2/23 16:38, Stefan Hajnoczi wrote:
> >> Whose tree should this go through? Laurent's qtest tree?
> >
> > Do you mean Thomas?
> >
> > $ git shortlog -cs tests/qtest/fuzz | sort -rn
> >      32  Thomas Huth
> >      26  Paolo Bonzini
> >      19  Stefan Hajnoczi
> >       6  Markus Armbruster
> >       5  Alexander Bulekov
> >       4  Marc-André Lureau
> >       3  Peter Maydell
> >       2  Laurent Vivier
> >       1  Michael S. Tsirkin
> >       1  Gerd Hoffmann
> >
> > When in doubt, cc'ing both :)
>
> Yes, Thomas is the real maintainer.

Want to update the ./MAINTAINERS file? That's where I found your name.
Thomas is only listed as a reviewer.

Stefan

Thomas Huth Feb. 14, 2023, 7:09 p.m. UTC | #8

On 14/02/2023 17.08, Philippe Mathieu-Daudé wrote:
> On 14/2/23 16:38, Stefan Hajnoczi wrote:
>> Whose tree should this go through? Laurent's qtest tree?
> 
> Do you mean Thomas?

I thought Alexander would be doing pull requests for fuzzing-related patches 
nowadays (since he's the listed maintainer for these files)? Or did I get 
that wrong?

  Thomas

Alexander Bulekov Feb. 14, 2023, 7:14 p.m. UTC | #9

On 230214 2009, Thomas Huth wrote:
> On 14/02/2023 17.08, Philippe Mathieu-Daudé wrote:
> > On 14/2/23 16:38, Stefan Hajnoczi wrote:
> > > Whose tree should this go through? Laurent's qtest tree?
> > 
> > Do you mean Thomas?
> 
> I thought Alexander would be doing pull requests for fuzzing-related patches
> nowadays (since he's the listed maintainer for these files)? Or did I get
> that wrong?

I have, though in the past I've been asked to send the PR to different
people. Who should I send this PR to?
-Alex

Thomas Huth Feb. 14, 2023, 9:08 p.m. UTC | #10

On 14/02/2023 20.14, Alexander Bulekov wrote:
> On 230214 2009, Thomas Huth wrote:
>> On 14/02/2023 17.08, Philippe Mathieu-Daudé wrote:
>>> On 14/2/23 16:38, Stefan Hajnoczi wrote:
>>>> Whose tree should this go through? Laurent's qtest tree?
>>>
>>> Do you mean Thomas?
>>
>> I thought Alexander would be doing pull requests for fuzzing-related patches
>> nowadays (since he's the listed maintainer for these files)? Or did I get
>> that wrong?
> 
> I have, though in the past I've been asked to send the PR to different
> people. Who should I send this PR to?

I assume you have enough experience with sending PRs by now, so if Peter
does not mind, I'd suggest sending PRs directly to Peter from now on. If that
does not work for some reason, feel free to send a “not for master” pull request
to me instead, and I'll take it along with my next qtest-related PR.

  Thomas
