Message ID | 20240305184033.425294-1-linux@roeck-us.net (mailing list archive) |
---|---|
Headers | show |
Series | Add support for suppressing warning backtraces | expand |
Hello! On Tue, 5 Mar 2024 at 12:40, Guenter Roeck <linux@roeck-us.net> wrote: > Some unit tests intentionally trigger warning backtraces by passing bad > parameters to kernel API functions. Such unit tests typically check the > return value from such calls, not the existence of the warning backtrace. > > Such intentionally generated warning backtraces are neither desirable > nor useful for a number of reasons. > - They can result in overlooked real problems. > - A warning that suddenly starts to show up in unit tests needs to be > investigated and has to be marked to be ignored, for example by > adjusting filter scripts. Such filters are ad-hoc because there is > no real standard format for warnings. On top of that, such filter > scripts would require constant maintenance. > > One option to address problem would be to add messages such as "expected > warning backtraces start / end here" to the kernel log. However, that > would again require filter scripts, it might result in missing real > problematic warning backtraces triggered while the test is running, and > the irrelevant backtrace(s) would still clog the kernel log. > > Solve the problem by providing a means to identify and suppress specific > warning backtraces while executing test code. Support suppressing multiple > backtraces while at the same time limiting changes to generic code to the > absolute minimum. Architecture specific changes are kept at minimum by > retaining function names only if both CONFIG_DEBUG_BUGVERBOSE and > CONFIG_KUNIT are enabled. > > The first patch of the series introduces the necessary infrastructure. > The second patch marks the warning message in drm_calc_scale() in the DRM > subsystem as intentional where warranted. This patch is intended to serve > as an example for the use of the functionality introduced with this series. > The last three patches in the series introduce the necessary architecture > specific changes for x86, arm64, and loongarch. > > This series is based on the RFC patch and subsequent discussion at > https://patchwork.kernel.org/project/linux-kselftest/patch/02546e59-1afe-4b08-ba81-d94f3b691c9a@moroto.mountain/ > and offers a more comprehensive solution of the problem discussed there. > > Checkpatch note: > Remaining checkpatch errors and warnings were deliberately ignored. > Some are triggered by matching coding style or by comments interpreted > as code, others by assembler macros which are disliked by checkpatch. > Suggestions for improvements are welcome. > > Some questions: > > - Is the general approach promising ? If not, are there other possible > solutions ? > - Function pointers are only added to the __bug_table section if both > CONFIG_KUNIT and CONFIG_DEBUG_BUGVERBOSE are enabled. This avoids image > size increases if CONFIG_KUNIT=n. Downside is slightly more complex > architecture specific assembler code. If function pointers were always > added to the __bug_table section, vmlinux image size would increase by > approximately 0.6-0.7%. Is the increased complexity in assembler code > worth the reduced image size ? I think so, but I would like to hear > other opinions. > - There are additional possibilities associated with storing the bug > function name in the __bug_table section. It could be independent of > KUNIT, it could be a configuration flag, and/or it could be used to > display the name of the offending function in BUG/WARN messages. > Is any of those of interest ? Thank you SO very much for this work! This is very much appreciated! We run into these warnings at LKFT all the time, and making sure that the noise doesn't drown the relevant signal is very important. Greetings! Daniel Díaz daniel.diaz@linaro.org
Hi Daniel, On 3/6/24 10:24, Daniel Díaz wrote: [ ... ] > > Thank you SO very much for this work! This is very much appreciated! Thanks a lot for the feedback. > We run into these warnings at LKFT all the time, and making sure that > the noise doesn't drown the relevant signal is very important. > Can you send me a list of all the warnings you are seeing ? I do see lots of warnings when running drm tests in qemu, but I am not sure if those are caused by emulation problems or if they are expected. A list of warnings seen on real hardware would help me prepare additional patches to address (or, rather, suppress) those. Thanks, Guenter
On Tue, Mar 05, 2024 at 10:40:28AM -0800, Guenter Roeck wrote: > Some unit tests intentionally trigger warning backtraces by passing bad > parameters to kernel API functions. Such unit tests typically check the > return value from such calls, not the existence of the warning backtrace. > > Such intentionally generated warning backtraces are neither desirable > nor useful for a number of reasons. > - They can result in overlooked real problems. > - A warning that suddenly starts to show up in unit tests needs to be > investigated and has to be marked to be ignored, for example by > adjusting filter scripts. Such filters are ad-hoc because there is > no real standard format for warnings. On top of that, such filter > scripts would require constant maintenance. > > One option to address problem would be to add messages such as "expected > warning backtraces start / end here" to the kernel log. However, that > would again require filter scripts, it might result in missing real > problematic warning backtraces triggered while the test is running, and > the irrelevant backtrace(s) would still clog the kernel log. > > Solve the problem by providing a means to identify and suppress specific > warning backtraces while executing test code. Support suppressing multiple > backtraces while at the same time limiting changes to generic code to the > absolute minimum. Architecture specific changes are kept at minimum by > retaining function names only if both CONFIG_DEBUG_BUGVERBOSE and > CONFIG_KUNIT are enabled. > > The first patch of the series introduces the necessary infrastructure. > The second patch marks the warning message in drm_calc_scale() in the DRM > subsystem as intentional where warranted. This patch is intended to serve > as an example for the use of the functionality introduced with this series. > The last three patches in the series introduce the necessary architecture > specific changes for x86, arm64, and loongarch. > > This series is based on the RFC patch and subsequent discussion at > https://patchwork.kernel.org/project/linux-kselftest/patch/02546e59-1afe-4b08-ba81-d94f3b691c9a@moroto.mountain/ > and offers a more comprehensive solution of the problem discussed there. > > Checkpatch note: > Remaining checkpatch errors and warnings were deliberately ignored. > Some are triggered by matching coding style or by comments interpreted > as code, others by assembler macros which are disliked by checkpatch. > Suggestions for improvements are welcome. > > Some questions: > > - Is the general approach promising ? If not, are there other possible > solutions ? > - Function pointers are only added to the __bug_table section if both > CONFIG_KUNIT and CONFIG_DEBUG_BUGVERBOSE are enabled. This avoids image > size increases if CONFIG_KUNIT=n. Downside is slightly more complex > architecture specific assembler code. If function pointers were always > added to the __bug_table section, vmlinux image size would increase by > approximately 0.6-0.7%. Is the increased complexity in assembler code > worth the reduced image size ? I think so, but I would like to hear > other opinions. > - There are additional possibilities associated with storing the bug > function name in the __bug_table section. It could be independent of > KUNIT, it could be a configuration flag, and/or it could be used to > display the name of the offending function in BUG/WARN messages. > Is any of those of interest ? > I am ready to send a full version of this series with support for all affected architectures. I am undecided if I should send it now, based on v6.8, and send v2 after rebasing it to v6.9-rc1, or if I should just wait for v6.9-rc1. I understand that some maintainers dislike getting new patch series while the commit window is is open. On the ther side, I tested the series thoroughly on top of v6.8-rc7, and initial v6.9 release candidates may have their own problems. Given that, I tend to send the series now. Any thoughts ? Unless there is strong negative feedback, I'll likely do that in a day or two. Thanks, Guenter