Message ID | 20240905043245.1389509-3-wentaoz5@illinois.edu (mailing list archive) |
---|---|
State | New |
Series | [v2,1/4] llvm-cov: add Clang's Source-based Code Coverage support |
Hi Wentao,

On Wed, Sep 04, 2024 at 11:32:43PM -0500, Wentao Zhang wrote:
> Add infrastructure to enable Clang's Modified Condition/Decision Coverage
> (MC/DC) [1].
>
> Clang has added MC/DC support as of its 18.1.0 release. MC/DC is a
> fine-grained coverage metric required by many automotive and aviation
> industrial standards for certifying mission-critical software [2].
>
> In the following example from arch/x86/events/probe.c, llvm-cov gives the
> MC/DC measurement for the compound logic decision at line 43.
>
>    43|     12|    if (msr[bit].test && !msr[bit].test(bit, data))
>   ------------------
>   |---> MC/DC Decision Region (43:8) to (43:50)
>   |
>   |  Number of Conditions: 2
>   |     Condition C1 --> (43:8)
>   |     Condition C2 --> (43:25)
>   |
>   |  Executed MC/DC Test Vectors:
>   |
>   |     C1, C2    Result
>   |  1 { T,  F  = F      }
>   |  2 { T,  T  = T      }
>   |
>   |  C1-Pair: not covered
>   |  C2-Pair: covered: (1,2)
>   |  MC/DC Coverage for Decision: 50.00%
>   |
>   ------------------
>    44|      5|        continue;
>
> As the results suggest, during the span of measurement, only condition C2
> (!msr[bit].test(bit, data)) is covered. That means C2 was evaluated to both
> true and false, and in those test vectors C2 affected the decision outcome
> independently. Therefore MC/DC for this decision is 1 out of 2 (50.00%).

Thanks a lot for the detail in the commit message. Your first talk at
LPC in the Refereed Track was excellent as well. If the video for that
talk becomes available soon, it would be helpful to link that in the
commit message as well.

> As of Clang 19, users can determine the max number of conditions in a
> decision to measure via option LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS, which
> controls -fmcdc-max-conditions flag of Clang cc1 [3]. Since MC/DC
> implementation utilizes bitmaps to track the execution of test vectors,
> more memory is consumed if larger decisions are getting counted. The

Some of this could potentially be in the Kconfig text below as it seems
relevant for users to make a decision on modifying its value.

> maximum value supported by Clang is 32767. According to local experiments,
> the working maximum for Linux kernel is 46, with the largest decisions in
> kernel codebase (with 47 conditions, as of v6.11) excluded, otherwise the
> kernel image size limit will be exceeded. The largest decisions in kernel
> are contributed for example by macros checking CPUID.
>
> Code exceeding LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS will produce compiler
> warnings.
>
> As of LLVM 19, certain expressions are still not covered, and will produce
> build warnings when they are encountered:
>
> "[...] if a boolean expression is embedded in the nest of another boolean
> expression but separated by a non-logical operator, this is also not
> supported. For example, in x = (a && b && c && func(d && f)), the d && f
> case starts a new boolean expression that is separated from the other
> conditions by the operator func(). When this is encountered, a warning
> will be generated and the boolean expression will not be
> instrumented." [4]

These two sets of warnings appear to be pretty noisy in my build
testing... Is there any way to shut them up? Perhaps it is good for
users to see these limitations but it basically makes the build output
useless. If there were switches, then they could be disabled in the
default case with a Kconfig option to turn them on if the user is
concerned with seeing which parts of their code are not instrumented. I
could see developers wanting to run this for writing tests and they
might not care about this as much as someone else might.

I did leave LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS at its default value.
Perhaps there is a more reasonable default that would result in less
noisy build output but not run afoul of potential memory usage concerns?
I assume that mention means that memory usage may be a concern for the
type of deployments this technology would commonly be used with?

> Link: https://en.wikipedia.org/wiki/Modified_condition%2Fdecision_coverage [1]
> Link: https://digital-library.theiet.org/content/journals/10.1049/sej.1994.0025 [2]
> Link: https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798 [3]
> Link: https://clang.llvm.org/docs/SourceBasedCodeCoverage.html#mc-dc-instrumentation [4]

Thank you for using this link format :)

> Signed-off-by: Wentao Zhang <wentaoz5@illinois.edu>
> Reviewed-by: Chuck Wolber <chuck.wolber@boeing.com>
> Tested-by: Chuck Wolber <chuck.wolber@boeing.com>

From an actual code perspective, this looks good to me.

Reviewed-by: Nathan Chancellor <nathan@kernel.org>

> diff --git a/Makefile b/Makefile
> index 51498134c..1185b38d6 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -740,6 +740,12 @@ all: vmlinux
>  CFLAGS_LLVM_COV := -fprofile-instr-generate -fcoverage-mapping
>  export CFLAGS_LLVM_COV
>
> +CFLAGS_LLVM_COV_MCDC := -fcoverage-mcdc
> +ifdef CONFIG_LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS
> +CFLAGS_LLVM_COV_MCDC += -Xclang -fmcdc-max-conditions=$(CONFIG_LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS)

Why is -Xclang needed here? Is this not a full frontend flag?

> +endif
> +export CFLAGS_LLVM_COV_MCDC
> +
>  CFLAGS_GCOV := -fprofile-arcs -ftest-coverage
>  ifdef CONFIG_CC_IS_GCC
>  CFLAGS_GCOV += -fno-tree-loop-im
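
For readers following the llvm-cov report quoted above, here is a minimal Python sketch of how an independence pair can be derived from executed test vectors. It is an illustration only, not llvm-cov's implementation: the function names are hypothetical, and it assumes the simple rule that a pair covers a condition when the two vectors differ in that condition, agree (or are unevaluated) in every other condition, and produce different decision outcomes.

```python
from itertools import combinations

def shows_independence(vec_a, vec_b, cond):
    """True if the two vectors demonstrate that condition `cond`
    independently affects the decision outcome."""
    conds_a, result_a = vec_a
    conds_b, result_b = vec_b
    if result_a == result_b:
        return False
    # The condition under test must be evaluated in both vectors and differ.
    if conds_a[cond] is None or conds_b[cond] is None:
        return False
    if conds_a[cond] == conds_b[cond]:
        return False
    # Every other condition must match, or be unevaluated (None) in one vector.
    return all(
        x is None or y is None or x == y
        for k, (x, y) in enumerate(zip(conds_a, conds_b))
        if k != cond
    )

def independence_pairs(vectors, n_conditions):
    """Map each condition index to the first covering pair of 1-based
    vector numbers, or None if no pair was found."""
    pairs = {c: None for c in range(n_conditions)}
    for (i, a), (j, b) in combinations(enumerate(vectors, start=1), 2):
        for c in range(n_conditions):
            if pairs[c] is None and shows_independence(a, b, c):
                pairs[c] = (i, j)
    return pairs

def mcdc_coverage(pairs):
    """Fraction of conditions that have an independence pair."""
    return sum(p is not None for p in pairs.values()) / len(pairs)

# The two executed test vectors from the report: (C1, C2) -> result.
vectors = [((True, False), False), ((True, True), True)]
pairs = independence_pairs(vectors, n_conditions=2)
coverage = mcdc_coverage(pairs)
```

With the two vectors from the report, only C2 (index 1) gets a pair, which matches the 1 out of 2 figure in the commit message.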
Hi Nathan,

Thanks for your review! See some of my responses below inline. Other
comments, including those to [1/4] and [4/4], are acknowledged and will be
updated in v3.

On 2024-10-01 20:10, Nathan Chancellor wrote:
> ...
> > maximum value supported by Clang is 32767. According to local experiments,
> > the working maximum for Linux kernel is 46, with the largest decisions in
> > kernel codebase (with 47 conditions, as of v6.11) excluded, otherwise the
> > kernel image size limit will be exceeded. The largest decisions in kernel
> > are contributed for example by macros checking CPUID.
> >
> > Code exceeding LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS will produce compiler
> > warnings.
> >
> > As of LLVM 19, certain expressions are still not covered, and will produce
> > build warnings when they are encountered:
> >
> > "[...] if a boolean expression is embedded in the nest of another boolean
> > expression but separated by a non-logical operator, this is also not
> > supported. For example, in x = (a && b && c && func(d && f)), the d && f
> > case starts a new boolean expression that is separated from the other
> > conditions by the operator func(). When this is encountered, a warning
> > will be generated and the boolean expression will not be
> > instrumented." [4]
>
> These two sets of warnings appear to be pretty noisy in my build
> testing... Is there any way to shut them up? Perhaps it is good for

These two warnings are currently implemented as custom diagnostics in
clang/lib/CodeGen/CodeGenPGO.cpp:dataTraverseStmtPost, so I'm afraid there
is no corresponding "-W[no-]xxx" flag at the moment. I agree such switches
would be desirable, but we might have to nudge this in the LLVM community.

> users to see these limitations but it basically makes the build output
> useless. If there were switches, then they could be disabled in the
> default case with a Kconfig option to turn them on if the user is
> concerned with seeing which parts of their code are not instrumented. I
> could see developers wanting to run this for writing tests and they
> might not care about this as much as someone else might.
>
> I did leave LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS at its default value.
> Perhaps there is a more reasonable default that would result in less
> noisy build output but not run afoul of potential memory usage concerns?
> I assume that mention means that memory usage may be a concern for the
> type of deployments this technology would commonly be used with?

In my experience, raising this threshold won't really help with the
issue, because the other type of warning ("nested boolean") is even more
prevalent in the kernel codebase. I once built the kernel serially and
counted the number of instances in the gigantic log:

    unsupported number of conditions (>6): 837
    unsupported nested boolean: 8029

So again, we should probably improve this on the tool side. I can talk to
the developers there separately.

> ...
> > diff --git a/Makefile b/Makefile
> > index 51498134c..1185b38d6 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -740,6 +740,12 @@ all: vmlinux
> >  CFLAGS_LLVM_COV := -fprofile-instr-generate -fcoverage-mapping
> >  export CFLAGS_LLVM_COV
> >
> > +CFLAGS_LLVM_COV_MCDC := -fcoverage-mcdc
> > +ifdef CONFIG_LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS
> > +CFLAGS_LLVM_COV_MCDC += -Xclang -fmcdc-max-conditions=$(CONFIG_LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS)
>
> Why is -Xclang needed here? Is this not a full frontend flag?

"-fmcdc-max-conditions" is a cc1 option only, while "-fcoverage-mcdc" is
both a cc1 option and a clang option. See llvm/llvm-project#82448 and
their changes to clang/include/clang/Driver/Options.td.

Thanks,
Wentao

>
> > +endif
> > +export CFLAGS_LLVM_COV_MCDC
> > +
> >  CFLAGS_GCOV := -fprofile-arcs -ftest-coverage
> >  ifdef CONFIG_CC_IS_GCC
> >  CFLAGS_GCOV += -fno-tree-loop-im
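
The per-category count mentioned above could be reproduced with something like the following Python sketch. It is not the script that was actually used. The two substrings in `WARNING_PATTERNS` are assumptions about how the Clang diagnostics are worded and should be checked against a real build log before relying on the numbers; the function and key names are hypothetical.

```python
from collections import Counter

# ASSUMPTION: substrings expected to appear in the two MC/DC diagnostics.
# Verify them against actual compiler output; they are not taken from this thread.
WARNING_PATTERNS = {
    "too_many_conditions": "number of conditions",
    "nested_boolean": "nested boolean expression",
}

def tally_mcdc_warnings(log_lines, patterns=WARNING_PATTERNS):
    """Count warning lines per category in an iterable of build-log lines."""
    counts = Counter({name: 0 for name in patterns})
    for line in log_lines:
        if "warning:" not in line:
            continue
        for name, needle in patterns.items():
            if needle in line:
                counts[name] += 1
                break  # attribute each line to at most one category
    return counts
```

Since any iterable of strings works, an open log file can be passed directly, for example `tally_mcdc_warnings(open("build.log"))`, where `build.log` is a placeholder for a log captured from a serial build so that diagnostics are not interleaved.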
diff --git a/Makefile b/Makefile
index 51498134c..1185b38d6 100644
--- a/Makefile
+++ b/Makefile
@@ -740,6 +740,12 @@ all: vmlinux
 CFLAGS_LLVM_COV := -fprofile-instr-generate -fcoverage-mapping
 export CFLAGS_LLVM_COV
 
+CFLAGS_LLVM_COV_MCDC := -fcoverage-mcdc
+ifdef CONFIG_LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS
+CFLAGS_LLVM_COV_MCDC += -Xclang -fmcdc-max-conditions=$(CONFIG_LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS)
+endif
+export CFLAGS_LLVM_COV_MCDC
+
 CFLAGS_GCOV := -fprofile-arcs -ftest-coverage
 ifdef CONFIG_CC_IS_GCC
 CFLAGS_GCOV += -fno-tree-loop-im
diff --git a/kernel/llvm-cov/Kconfig b/kernel/llvm-cov/Kconfig
index 9241fdfb0..66259e1f2 100644
--- a/kernel/llvm-cov/Kconfig
+++ b/kernel/llvm-cov/Kconfig
@@ -61,4 +61,40 @@ config LLVM_COV_PROFILE_ALL
 	  Note that a kernel compiled with profiling flags will be
 	  significantly larger and run slower.
 
+config LLVM_COV_KERNEL_MCDC
+	bool "Enable measuring modified condition/decision coverage (MC/DC)"
+	depends on LLVM_COV_KERNEL
+	depends on CLANG_VERSION >= 180000
+	help
+	  This option enables modified condition/decision coverage (MC/DC)
+	  code coverage instrumentation.
+
+	  If unsure, say N.
+
+	  This will add Clang's Source-based Code Coverage MC/DC
+	  instrumentation to your kernel. As of LLVM 19, certain expressions
+	  are still not covered, and will produce build warnings when they are
+	  encountered.
+
+	  "[...] if a boolean expression is embedded in the nest of another
+	  boolean expression but separated by a non-logical operator, this is
+	  also not supported. For example, in
+	  x = (a && b && c && func(d && f)), the d && f case starts a new
+	  boolean expression that is separated from the other conditions by the
+	  operator func(). When this is encountered, a warning will be
+	  generated and the boolean expression will not be instrumented."
+
+	  https://clang.llvm.org/docs/SourceBasedCodeCoverage.html#mc-dc-instrumentation
+
+config LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS
+	int "Maximum number of conditions in a decision to instrument"
+	range 6 32767
+	depends on LLVM_COV_KERNEL_MCDC
+	depends on CLANG_VERSION >= 190000
+	default "6"
+	help
+	  This value is passed to "-fmcdc-max-conditions" flag of Clang cc1.
+	  Expressions whose number of conditions is greater than this value will
+	  produce warnings and will not be instrumented.
+
 endmenu
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index b468856b8..afc94e92d 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -169,6 +169,18 @@ _c_flags += $(if $(patsubst n%,, \
 		$(CFLAGS_LLVM_COV))
 endif
 
+#
+# Flag that turns on modified condition/decision coverage (MC/DC) measurement
+# with Clang's Source-based Code Coverage. Enable the flag for a file or
+# directory depending on variables LLVM_COV_PROFILE_obj.o, LLVM_COV_PROFILE and
+# CONFIG_LLVM_COV_PROFILE_ALL.
+#
+ifeq ($(CONFIG_LLVM_COV_KERNEL_MCDC),y)
+_c_flags += $(if $(patsubst n%,, \
+		$(LLVM_COV_PROFILE_$(target-stem).o)$(LLVM_COV_PROFILE)$(if $(is-kernel-object),$(CONFIG_LLVM_COV_PROFILE_ALL))), \
+		$(CFLAGS_LLVM_COV_MCDC))
+endif
+
 #
 # Enable address sanitizer flags for kernel except some files or directories
 # we don't want to check (depends on variables KASAN_SANITIZE_obj.o, KASAN_SANITIZE)
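
The flag selection in the Makefile and scripts/Makefile.lib hunks above can be read as the following Python model. This is only a sketch for reasoning about the make logic, with hypothetical function and parameter names; it assumes each variable is either empty, "y" or "n" and contains no whitespace, and it does not model the outer `ifeq ($(CONFIG_LLVM_COV_KERNEL_MCDC),y)` guard.

```python
def mcdc_cflags(max_conditions=None):
    """Model of CFLAGS_LLVM_COV_MCDC: the cc1-only option is appended
    only when CONFIG_LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS is defined."""
    flags = ["-fcoverage-mcdc"]
    if max_conditions is not None:
        flags += ["-Xclang", f"-fmcdc-max-conditions={max_conditions}"]
    return flags

def file_gets_mcdc_flags(per_file="", per_dir="", profile_all="",
                         is_kernel_object=True):
    """Model of $(if $(patsubst n%,, A B C), flags) where A, B, C are the
    concatenated values of LLVM_COV_PROFILE_obj.o, LLVM_COV_PROFILE and
    CONFIG_LLVM_COV_PROFILE_ALL (the last only for kernel objects)."""
    combined = per_file + per_dir + (profile_all if is_kernel_object else "")
    # patsubst n%,, turns a word starting with "n" into the empty string,
    # and $(if ...) treats an empty string as false.
    return combined != "" and not combined.startswith("n")
```

Under these assumptions the leftmost non-empty variable decides: `file_gets_mcdc_flags(per_file="n", profile_all="y")` is false because the concatenation "ny" starts with "n", while a file with no per-file or per-directory setting is instrumented only when CONFIG_LLVM_COV_PROFILE_ALL is "y".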