[RFC,0/5] Add support for suppressing warning backtraces

Message ID 20240305184033.425294-1-linux@roeck-us.net (mailing list archive)

Message

Guenter Roeck March 5, 2024, 6:40 p.m. UTC
Some unit tests intentionally trigger warning backtraces by passing bad
parameters to kernel API functions. Such unit tests typically check the
return value from such calls, not the existence of the warning backtrace.

Such intentionally generated warning backtraces are neither desirable
nor useful for a number of reasons.
- They can result in overlooked real problems.
- A warning that suddenly starts to show up in unit tests needs to be
  investigated and has to be marked to be ignored, for example by
  adjusting filter scripts. Such filters are ad-hoc because there is
  no real standard format for warnings. On top of that, such filter
  scripts would require constant maintenance.

One option to address the problem would be to add messages such as "expected
warning backtraces start / end here" to the kernel log.  However, that
would again require filter scripts, it might result in missing real
problematic warning backtraces triggered while the test is running, and
the irrelevant backtrace(s) would still clog the kernel log.

Solve the problem by providing a means to identify and suppress specific
warning backtraces while executing test code. Support suppressing multiple
backtraces while at the same time limiting changes to generic code to the
absolute minimum. Architecture-specific changes are kept to a minimum by
retaining function names only if both CONFIG_DEBUG_BUGVERBOSE and
CONFIG_KUNIT are enabled.
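
A minimal userspace sketch of the idea, for illustration only: a test
registers the name of the function whose warning it expects, and the warning
path skips the backtrace (and counts the hit) when the name matches. All
names below (suppressed_warning, start_suppress_warning, model_warn,
MODEL_WARN_ON, model_calc_scale) are hypothetical and are not taken from the
patches; the real code lives in the kernel and hooks into the bug table.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model; these names are assumptions, not the series' API. */
struct suppressed_warning {
	struct suppressed_warning *next;
	const char *function;	/* function whose warning is expected */
	int counter;		/* warnings suppressed while registered */
};

static struct suppressed_warning *suppressed_list;
int reported_warnings;		/* warnings that were actually reported */

void start_suppress_warning(struct suppressed_warning *w)
{
	w->counter = 0;
	w->next = suppressed_list;
	suppressed_list = w;
}

void end_suppress_warning(struct suppressed_warning *w)
{
	struct suppressed_warning **p;

	/* Unlink w; entries registered by other tests stay in place. */
	for (p = &suppressed_list; *p; p = &(*p)->next) {
		if (*p == w) {
			*p = w->next;
			return;
		}
	}
}

bool is_suppressed_warning(const char *function)
{
	struct suppressed_warning *w;

	for (w = suppressed_list; w; w = w->next) {
		if (strcmp(w->function, function) == 0) {
			w->counter++;
			return true;
		}
	}
	return false;
}

/* Stand-in for the warning path: report unless the caller is suppressed. */
void model_warn(const char *function)
{
	if (is_suppressed_warning(function))
		return;
	reported_warnings++;
	fprintf(stderr, "WARNING: in %s (backtrace would follow)\n", function);
}

#define MODEL_WARN_ON(cond) ((cond) ? (model_warn(__func__), 1) : 0)

/* Model API function: warns and returns an error on bad parameters. */
int model_calc_scale(int src, int dst)
{
	if (MODEL_WARN_ON(src < 0 || dst < 0))
		return -1;
	if (dst == 0)
		return 0;
	return src / dst;
}
```

A test would wrap only the call that is expected to warn between
start_suppress_warning() and end_suppress_warning(), check the return value
as before, and could additionally check the counter to confirm that the
expected warning was in fact triggered. Warnings from any other function
are still reported while the suppression is active.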

The first patch of the series introduces the necessary infrastructure.
The second patch marks the warning message in drm_calc_scale() in the DRM
subsystem as intentional where warranted. This patch is intended to serve
as an example for the use of the functionality introduced with this series.
The last three patches in the series introduce the necessary
architecture-specific changes for x86, arm64, and loongarch.

This series is based on the RFC patch and subsequent discussion at
https://patchwork.kernel.org/project/linux-kselftest/patch/02546e59-1afe-4b08-ba81-d94f3b691c9a@moroto.mountain/
and offers a more comprehensive solution to the problem discussed there.

Checkpatch note:
  Remaining checkpatch errors and warnings were deliberately ignored.
  Some are triggered by matching coding style or by comments interpreted
  as code, others by assembler macros which are disliked by checkpatch.
  Suggestions for improvements are welcome.

Some questions:

- Is the general approach promising? If not, are there other possible
  solutions?
- Function pointers are only added to the __bug_table section if both
  CONFIG_KUNIT and CONFIG_DEBUG_BUGVERBOSE are enabled. This avoids image
  size increases if CONFIG_KUNIT=n. The downside is slightly more complex
  architecture-specific assembler code. If function pointers were always
  added to the __bug_table section, vmlinux image size would increase by
  approximately 0.6-0.7%. Is the increased complexity in assembler code
  worth the reduced image size? I think so, but I would like to hear
  other opinions.
- There are additional possibilities associated with storing the bug
  function name in the __bug_table section. It could be independent of
  KUNIT, it could be a configuration flag, and/or it could be used to
  display the name of the offending function in BUG/WARN messages.
  Are any of those of interest?
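
To make the size trade-off in the second question concrete, here is a
simplified C model of a bug table entry that carries a function pointer only
when both options are set. The MODEL_* macros and the struct layout are
assumptions made for this sketch; the real entries are emitted by
architecture-specific assembler macros and may use relative offsets rather
than pointers.

```c
#include <stddef.h>

/* Hypothetical stand-ins for CONFIG_KUNIT and CONFIG_DEBUG_BUGVERBOSE. */
#define MODEL_CONFIG_KUNIT 1
#define MODEL_CONFIG_DEBUG_BUGVERBOSE 1

#if MODEL_CONFIG_KUNIT && MODEL_CONFIG_DEBUG_BUGVERBOSE
#define MODEL_HAVE_BUG_FUNCTION 1
#else
#define MODEL_HAVE_BUG_FUNCTION 0
#endif

/* Entry without the function name, kept for size comparison. */
struct model_bug_entry_base {
	const char *file;
	unsigned short line;
	unsigned short flags;
};

/* Entry as built: the extra field exists only if both options are set. */
struct model_bug_entry {
	const char *file;
	unsigned short line;
	unsigned short flags;
#if MODEL_HAVE_BUG_FUNCTION
	const char *function;
#endif
};

struct model_bug_entry model_make_entry(const char *file, unsigned short line,
					const char *function)
{
	struct model_bug_entry bug = { .file = file, .line = line, .flags = 0 };

#if MODEL_HAVE_BUG_FUNCTION
	bug.function = function;
#else
	(void)function;		/* name is dropped, entry stays small */
#endif
	return bug;
}

/* Returns the recorded function name, or NULL if it was compiled out. */
const char *model_bug_function(const struct model_bug_entry *bug)
{
#if MODEL_HAVE_BUG_FUNCTION
	return bug->function;
#else
	(void)bug;
	return NULL;
#endif
}
```

With either macro set to 0, struct model_bug_entry has the same layout as
struct model_bug_entry_base, so the per-entry cost is paid only in
configurations that can use the name. The model says nothing about the
0.6-0.7% figure, which depends on the number of entries in a real image.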

----------------------------------------------------------------
Guenter Roeck (5):
      bug: Core support for suppressing warning backtraces
      drm: Suppress intentional warning backtraces in scaling unit tests
      x86: Add support for suppressing warning tracebacks
      arm64: Add support for suppressing warning tracebacks
      loongarch: Add support for suppressing warning tracebacks

 arch/arm64/include/asm/asm-bug.h      | 29 +++++++++++++-------
 arch/arm64/include/asm/bug.h          |  8 +++++-
 arch/loongarch/include/asm/bug.h      | 38 ++++++++++++++++++--------
 arch/x86/include/asm/bug.h            | 21 +++++++++++----
 drivers/gpu/drm/tests/drm_rect_test.c |  6 +++++
 include/asm-generic/bug.h             | 16 ++++++++---
 include/kunit/bug.h                   | 51 +++++++++++++++++++++++++++++++++++
 include/linux/bug.h                   | 13 +++++++++
 lib/bug.c                             | 51 ++++++++++++++++++++++++++++++++---
 lib/kunit/Makefile                    |  6 +++--
 lib/kunit/bug.c                       | 40 +++++++++++++++++++++++++++
 11 files changed, 243 insertions(+), 36 deletions(-)
 create mode 100644 include/kunit/bug.h
 create mode 100644 lib/kunit/bug.c

Comments

Daniel Díaz March 6, 2024, 6:24 p.m. UTC | #1
Hello!

On Tue, 5 Mar 2024 at 12:40, Guenter Roeck <linux@roeck-us.net> wrote:
[ ... ]

Thank you SO very much for this work! This is very much appreciated!
We run into these warnings at LKFT all the time, and making sure that
the noise doesn't drown the relevant signal is very important.

Greetings!

Daniel Díaz
daniel.diaz@linaro.org

Guenter Roeck March 6, 2024, 6:57 p.m. UTC | #2
Hi Daniel,

On 3/6/24 10:24, Daniel Díaz wrote:
[ ... ]
> 
> Thank you SO very much for this work! This is very much appreciated!

Thanks a lot for the feedback.

> We run into these warnings at LKFT all the time, and making sure that
> the noise doesn't drown the relevant signal is very important.
> 

Can you send me a list of all the warnings you are seeing? I do see
lots of warnings when running drm tests in qemu, but I am not sure if
those are caused by emulation problems or if they are expected.
A list of warnings seen on real hardware would help me prepare
additional patches to address (or, rather, suppress) those.

Thanks,
Guenter

Guenter Roeck March 11, 2024, 4:36 a.m. UTC | #3
On Tue, Mar 05, 2024 at 10:40:28AM -0800, Guenter Roeck wrote:
[ ... ]

I am ready to send a full version of this series with support for
all affected architectures. I am undecided if I should send it now,
based on v6.8, and send v2 after rebasing it to v6.9-rc1, or if I
should just wait for v6.9-rc1.

I understand that some maintainers dislike getting new patch series
while the commit window is open. On the other hand, I tested the
series thoroughly on top of v6.8-rc7, and initial v6.9 release candidates
may have their own problems. Given that, I am inclined to send the series now.

Any thoughts? Unless there is strong negative feedback, I'll likely
do that in a day or two.

Thanks,
Guenter