
[RFC,1/4] xen/Makefile: add analysis-coverity and analysis-eclair

Message ID 20221107104739.10404-2-luca.fancellu@arm.com (mailing list archive)
State Superseded
Series Static analyser finding deviation

Commit Message

Luca Fancellu Nov. 7, 2022, 10:47 a.m. UTC
Add new targets to the makefile, analysis-{coverity,eclair}, that will:
 - Create a tag database using a new tool called xenfusa-gen-tags.py
 - Find every file carrying a FuSa SAF- in-code comment, save a copy
   of it as <file>.safparse, and substitute the tags with proprietary
   tool syntax in-code comments using the database (see the example
   below).
 - Build Xen. Coverity and Eclair are capable of intercepting the
   compiler invocation on every built file, so the only action
   required from them is to run these new targets; the files they
   analyse will then automatically contain suppression in-code
   comments they understand.
 - Call analysis-clean to restore the original files.
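
For illustration, the substitution works roughly like this (the rule ID
and the Coverity proprietary ID below are hypothetical placeholders):

    /* SAF-0-safe */
    x = (a + b) * c;

becomes, in the file handed to the compiler (the pristine original is
kept as <file>.c.safparse):

    /* coverity[misra_c_2012_rule_20_7_violation] */
    x = (a + b) * c;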

In case of any error, the user needs to manually run the
analysis-clean target to restore the original files. Until that is
done, any subsequent run of analysis-{coverity,eclair} will stop and
won't overwrite the original files.

Add in docs/misra/ the files safe.json and
false-positive-{coverity,eclair}.json, JSON files containing the data
structures for the justifications; they are used by
xenfusa-gen-tags.py to create the substitution list.
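
For reference, a safe.json entry presumably mirrors the
false-positive-<tool>.json schema shown later on this page; a minimal
sketch with placeholder values:

    {
        "version": "1.0",
        "content": [
            {
                "id": "SAF-0-safe",
                "analyser": {
                    "coverity": "<proprietary-id>",
                    "eclair": "<proprietary-id>"
                },
                "name": "R20.7 [...]",
                "text": "[...]"
            }
        ]
    }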

Add docs/misra/documenting-violations.rst to explain how to add
justifications.

Add files to .gitignore and update clean rule content in Makefile.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
 .gitignore                              |   2 +
 docs/misra/documenting-violations.rst   | 172 ++++++++++++++++++++++++
 docs/misra/false-positive-coverity.json |  12 ++
 docs/misra/false-positive-eclair.json   |  12 ++
 docs/misra/safe.json                    |  11 ++
 xen/Makefile                            |  50 ++++++-
 xen/tools/xenfusa-gen-tags.py           |  81 +++++++++++
 7 files changed, 338 insertions(+), 2 deletions(-)
 create mode 100644 docs/misra/documenting-violations.rst
 create mode 100644 docs/misra/false-positive-coverity.json
 create mode 100644 docs/misra/false-positive-eclair.json
 create mode 100644 docs/misra/safe.json
 create mode 100755 xen/tools/xenfusa-gen-tags.py

Comments

Jan Beulich Nov. 7, 2022, 4:35 p.m. UTC | #1
On 07.11.2022 11:47, Luca Fancellu wrote:
> +Here is an example to add a new justification in false-positive-<tool>.json::

With <tool> already present in the name, ...

> +|{
> +|    "version": "1.0",
> +|    "content": [
> +|        {
> +|            "id": "SAF-0-false-positive-<tool>",
> +|            "analyser": {
> +|                "<tool>": "<proprietary-id>"

... can we avoid the redundancy here? Perhaps ...

> +|            },
> +|            "tool-version": "<version>",

... it could be

            "analyser": {
                "<version>": "<proprietary-id>"
            },

? It's not really clear to me though how a false positive which is
present over a range of versions would be correctly recorded.

> --- a/xen/Makefile
> +++ b/xen/Makefile
> @@ -457,7 +457,8 @@ endif # need-config
>  
>  __all: build
>  
> -main-targets := build install uninstall clean distclean MAP cppcheck cppcheck-html
> +main-targets := build install uninstall clean distclean MAP cppcheck \
> +    cppcheck-html analysis-coverity analysis-eclair
>  .PHONY: $(main-targets)
>  ifneq ($(XEN_TARGET_ARCH),x86_32)
>  $(main-targets): %: _% ;
> @@ -572,7 +573,7 @@ _clean:
>  	rm -f $(TARGET).efi $(TARGET).efi.map $(TARGET).efi.stripped
>  	rm -f asm-offsets.s arch/*/include/asm/asm-offsets.h
>  	rm -f .banner .allconfig.tmp include/xen/compile.h
> -	rm -f cppcheck-misra.* xen-cppcheck.xml
> +	rm -f cppcheck-misra.* xen-cppcheck.xml *.sed

Is *.sed perhaps a little too wide? But yes, we can of course deal with that
in case any *.sed file appears in the source tree.

> @@ -757,6 +758,51 @@ cppcheck-version:
>  $(objtree)/include/generated/compiler-def.h:
>  	$(Q)$(CC) -dM -E -o $@ - < /dev/null
>  
> +JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
> +                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json
> +
> +# The following command uses grep to find all files that contain a comment
> +# containing "SAF-<anything>" on a single line.
> +# %.safparse will be the original files saved from the build system, these files
> +# will be restored at the end of the analysis step
> +PARSE_FILE_LIST := $(addsuffix .safparse,$(filter-out %.safparse,\
> +$(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))

Please indent such line continuations. And then isn't this going to risk
matching non-source files as well? Perhaps you want to restrict this to
*.c and *.h?

> +.PRECIOUS: $(PARSE_FILE_LIST) $(objtree)/%.sed
> +
> +.SECONDEXPANSION:

I have to admit that I'm a little worried about this living relatively early in
the script.

> +$(objtree)/%.sed: $(JUSTIFICATION_FILES) $(srctree)/tools/xenfusa-gen-tags.py
> +	$(PYTHON) $(srctree)/tools/xenfusa-gen-tags.py \
> +		$(foreach file, $(filter %.json, $^), --input $(file)) --output $@ \
> +		--tool $*

To reduce redundancy, how about

$(objtree)/%.sed: $(srctree)/tools/xenfusa-gen-tags.py $(JUSTIFICATION_FILES)
	$(PYTHON) $< --output $@ --tool $* \
		$(foreach file, $(filter %.json, $^), --input $(file))

?

> +%.safparse: %

For this to not be overly widely matching, maybe better

$(PARSE_FILE_LIST): %.safparse: %

?

> +# Create a copy of the original file (-p preserves also timestamp)
> +	$(Q)if [ -f "$@" ]; then \
> +		echo "Found $@, please check the integrity of $*"; \
> +		exit 1; \
> +	fi
> +	$(Q)cp -p "$*" "$@"

While you use the full source name as the stem, I still think $< would be
more clear to use here.

To limit work done, could this be "mv" instead of "cp -p", and then ...

> +analysis-parse-tags-%: $(PARSE_FILE_LIST) $(objtree)/%.sed
> +	$(Q)for file in $(patsubst %.safparse,%,$(PARSE_FILE_LIST)); do \
> +		sed -i -f "$(objtree)/$*.sed" "$${file}"; \

... with then using

		sed -f "$(objtree)/$*.sed" "$${file}.safparse" >"$${file}"

here? This would then also have source consistent between prereqs and
rule.
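
For reference, the generated coverity.sed would then contain one line
per justification, instantiated from the tool_syntax templates visible
further down in this patch (the proprietary ID here is a hypothetical
placeholder):

    s,^.*/*[[:space:]]*SAF-0-safe.*$,/* coverity[misra_c_2012_rule_20_7_violation] */,g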

> +	done
> +
> +analysis-build-%: analysis-parse-tags-%
> +	$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build

This rule doesn't use the stem, so I'm struggling to understand what
this is about.

> +analysis-clean:
> +# Reverts the original file (-p preserves also timestamp)
> +	$(Q)find $(srctree) -type f -name "*.safparse" -print | \
> +	while IFS= read file; do \
> +		cp -p "$${file}" "$${file%.safparse}"; \
> +		rm -f "$${file}"; \

Why not "mv"?

> +	done
> +
> +_analysis-%: analysis-build-%
> +	$(Q)$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile analysis-clean

Again no use of the stem, plus here I wonder if this may not lead to
people invoking "analysis-clean" without having said anything about
cleaning on their command line.

> --- /dev/null
> +++ b/xen/tools/xenfusa-gen-tags.py
> @@ -0,0 +1,81 @@
> +#!/usr/bin/env python
> +
> +import sys, getopt, json
> +
> +def help():
> +    print('Usage: {} [OPTION] ...'.format(sys.argv[0]))
> +    print('')
> +    print('This script converts the justification file to a set of sed rules')
> +    print('that will replace generic tags from Xen codebase in-code comments')
> +    print('to in-code comments having the proprietary syntax for the selected')
> +    print('tool.')
> +    print('')
> +    print('Options:')
> +    print('  -i/--input   Json file containing the justifications, can be')
> +    print('               passed multiple times for multiple files')
> +    print('  -o/--output  Sed file containing the substitution rules')
> +    print('  -t/--tool    Tool that will use the in-code comments')
> +    print('')
> +
> +# This is the dictionary for the rules that translates to proprietary comments:
> +#  - cppcheck: /* cppcheck-suppress[id] */
> +#  - coverity: /* coverity[id] */
> +#  - eclair:   /* -E> hide id 1 "" */
> +# Add entries to support more analyzers
> +tool_syntax = {
> +    "cppcheck":"s,^.*/*[[:space:]]*TAG.*$,/* cppcheck-suppress[VID] */,g",
> +    "coverity":"s,^.*/*[[:space:]]*TAG.*$,/* coverity[VID] */,g",
> +    "eclair":"s,^.*/*[[:space:]]*TAG.*$,/* -E> hide VID 1 \"\" */,g"
> +}
> +
> +def main(argv):
> +    infiles = []
> +    justifications = []
> +    outfile = ''
> +    tool = ''
> +
> +    try:
> +        opts, args = getopt.getopt(argv,"hi:o:t:",["input=","output=","tool="])
> +    except getopt.GetoptError:
> +        help()
> +        sys.exit(2)
> +    for opt, arg in opts:
> +        if opt == '-h':
> +            help()
> +            sys.exit(0)
> +        elif opt in ("-i", "--input"):
> +            infiles.append(arg)
> +        elif opt in ("-o", "--output"):
> +            outfile = arg
> +        elif opt in ("-t", "--tool"):
> +            tool = arg
> +
> +    # Open all input files
> +    for file in infiles:
> +        try:
> +            handle = open(file, 'rt')
> +            content = json.load(handle)
> +            justifications = justifications + content['content']
> +            handle.close()
> +        except json.JSONDecodeError:
> +            print('JSON decoding error in file: ' + file)
> +        except:
> +            print('Error opening ' + file)
> +            sys.exit(1)
> +
> +    try:
> +        outstr = open(outfile, "w")
> +    except:
> +        print('Error creating ' + outfile)
> +        sys.exit(1)
> +
> +    for j in justifications:
> +        if tool in j['analyser']:
> +            comment=tool_syntax[tool].replace("TAG",j['id'])
> +            comment=comment.replace("VID",j['analyser'][tool])
> +            outstr.write('{}\n'.format(comment))
> +
> +    outstr.close()
> +
> +if __name__ == "__main__":
> +   main(sys.argv[1:])
> \ No newline at end of file

Nit: ^^^

Jan
Luca Fancellu Nov. 8, 2022, 10:59 a.m. UTC | #2
Hi Jan

> 
> On 07.11.2022 11:47, Luca Fancellu wrote:
>> +Here is an example to add a new justification in false-positive-<tool>.json::
> 
> With <tool> already present in the name, ...
> 
>> +|{
>> +|    "version": "1.0",
>> +|    "content": [
>> +|        {
>> +|            "id": "SAF-0-false-positive-<tool>",
>> +|            "analyser": {
>> +|                "<tool>": "<proprietary-id>"
> 
> ... can we avoid the redundancy here? Perhaps ...
> 
>> +|            },
>> +|            "tool-version": "<version>",
> 
> ... it could be
> 
>            "analyser": {
>                "<version>": "<proprietary-id>"
>            },

Yes, it's a bit redundant, but it helps re-use the same tool we use for safe.json

> 
> ? It's not really clear to me though how a false positive which is
> present over a range of versions would be correctly recorded.

We could put a range in "tool-version": "<version-old> - <version-new>"

> 
>> --- a/xen/Makefile
>> +++ b/xen/Makefile
>> @@ -457,7 +457,8 @@ endif # need-config
>> 
>> __all: build
>> 
>> -main-targets := build install uninstall clean distclean MAP cppcheck cppcheck-html
>> +main-targets := build install uninstall clean distclean MAP cppcheck \
>> +    cppcheck-html analysis-coverity analysis-eclair
>> .PHONY: $(main-targets)
>> ifneq ($(XEN_TARGET_ARCH),x86_32)
>> $(main-targets): %: _% ;
>> @@ -572,7 +573,7 @@ _clean:
>> 	rm -f $(TARGET).efi $(TARGET).efi.map $(TARGET).efi.stripped
>> 	rm -f asm-offsets.s arch/*/include/asm/asm-offsets.h
>> 	rm -f .banner .allconfig.tmp include/xen/compile.h
>> -	rm -f cppcheck-misra.* xen-cppcheck.xml
>> +	rm -f cppcheck-misra.* xen-cppcheck.xml *.sed
> 
> Is *.sed perhaps a little too wide? But yes, we can of course deal with that
> in case any *.sed file appears in the source tree.
> 
>> @@ -757,6 +758,51 @@ cppcheck-version:
>> $(objtree)/include/generated/compiler-def.h:
>> 	$(Q)$(CC) -dM -E -o $@ - < /dev/null
>> 
>> +JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>> +                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json
>> +
>> +# The following command uses grep to find all files that contain a comment
>> +# containing "SAF-<anything>" on a single line.
>> +# %.safparse will be the original files saved from the build system, these files
>> +# will be restored at the end of the analysis step
>> +PARSE_FILE_LIST := $(addsuffix .safparse,$(filter-out %.safparse,\
>> +$(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
> 
> Please indent such line continuations. And then isn't this going to risk
> matching non-source files as well? Perhaps you want to restrict this to
> *.c and *.h?

Yes, how about this? It will filter out *.safparse files while keeping only .h and .c files:

PARSE_FILE_LIST := $(addsuffix .safparse,$(filter %.c %.h,\
    $(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))


> 
>> +.PRECIOUS: $(PARSE_FILE_LIST) $(objtree)/%.sed
>> +
>> +.SECONDEXPANSION:
> 
> I have to admit that I'm a little worried about this living relatively early in
> the script.
> 
>> +$(objtree)/%.sed: $(JUSTIFICATION_FILES) $(srctree)/tools/xenfusa-gen-tags.py
>> +	$(PYTHON) $(srctree)/tools/xenfusa-gen-tags.py \
>> +		$(foreach file, $(filter %.json, $^), --input $(file)) --output $@ \
>> +		--tool $*
> 
> To reduce redundancy, how about
> 
> $(objtree)/%.sed: $(srctree)/tools/xenfusa-gen-tags.py $(JUSTIFICATION_FILES)
> 	$(PYTHON) $< --output $@ --tool $* \
> 		$(foreach file, $(filter %.json, $^), --input $(file))
> 
> ?

Yes it sounds better

> 
>> +%.safparse: %
> 
> For this to not be overly widely matching, maybe better
> 
> $(PARSE_FILE_LIST): %.safparse: %
> 
> ?

Yes very sensible

> 
>> +# Create a copy of the original file (-p preserves also timestamp)
>> +	$(Q)if [ -f "$@" ]; then \
>> +		echo "Found $@, please check the integrity of $*"; \
>> +		exit 1; \
>> +	fi
>> +	$(Q)cp -p "$*" "$@"
> 
> While you use the full source name as the stem, I still think $< would be
> more clear to use here.

Agree

> 
> To limit work done, could this be "mv" instead of "cp -p", and then ...
> 
>> +analysis-parse-tags-%: $(PARSE_FILE_LIST) $(objtree)/%.sed
>> +	$(Q)for file in $(patsubst %.safparse,%,$(PARSE_FILE_LIST)); do \
>> +		sed -i -f "$(objtree)/$*.sed" "$${file}"; \
> 
> ... with then using
> 
> 		sed -f "$(objtree)/$*.sed" "$${file}.safparse" >"$${file}"
> 
> here? This would then also have source consistent between prereqs and
> rule.

We saw that mv does not preserve the timestamp of the file, and we would like to
preserve it; for this reason we used cp -p

> 
>> +	done
>> +
>> +analysis-build-%: analysis-parse-tags-%
>> +	$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build
> 
> This rule doesn't use the stem, so I'm struggling to understand what
> this is about.

Yes, here my aim was to catch analysis-build-{eclair,coverity}. I see that if the user has a typo
the rule will run anyway, but it will be stopped by the dependency chain, because at the end we have:

JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json

That will give an error because $(XEN_ROOT)/docs/misra/false-positive-<typo>.json does not exist.

If you think it is not enough, what if I reduce the scope of the rule like this?

_analysis-coverity _analysis-eclair: _analysis-%: analysis-build-%

Or, if you are still worried about “analysis-build-%: analysis-parse-tags-%”, then I can do something
like this: 

analysis-supported-coverity analysis-supported-eclair:
    @echo > /dev/null

analysis-supported-%:
    $(error Unsupported analysis tool $*)

analysis-build-%: analysis-parse-tags-% | analysis-supported-%
    $(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build

[…]

_analysis-%: analysis-build-% | analysis-supported-%
    $(Q)$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile analysis-clean

> 
>> +analysis-clean:
>> +# Reverts the original file (-p preserves also timestamp)
>> +	$(Q)find $(srctree) -type f -name "*.safparse" -print | \
>> +	while IFS= read file; do \
>> +		cp -p "$${file}" "$${file%.safparse}"; \
>> +		rm -f "$${file}"; \
> 
> Why not "mv"?
> 
>> +	done
>> +
>> +_analysis-%: analysis-build-%
>> +	$(Q)$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile analysis-clean
> 
> Again no use of the stem, plus here I wonder if this may not lead to
> people invoking "analysis-clean" without having said anything about
> cleaning on their command line.

In any case, the cleaning process is very safe and does not clean anything that was not dirty before,
so in case of typos, it’s just like a nop.

> 
>> --- /dev/null
>> +++ b/xen/tools/xenfusa-gen-tags.py
>> @@ -0,0 +1,81 @@
>> +#!/usr/bin/env python
>> +
>> +import sys, getopt, json
>> +
>> +def help():
>> +    print('Usage: {} [OPTION] ...'.format(sys.argv[0]))
>> +    print('')
>> +    print('This script converts the justification file to a set of sed rules')
>> +    print('that will replace generic tags from Xen codebase in-code comments')
>> +    print('to in-code comments having the proprietary syntax for the selected')
>> +    print('tool.')
>> +    print('')
>> +    print('Options:')
>> +    print('  -i/--input   Json file containing the justifications, can be')
>> +    print('               passed multiple times for multiple files')
>> +    print('  -o/--output  Sed file containing the substitution rules')
>> +    print('  -t/--tool    Tool that will use the in-code comments')
>> +    print('')
>> +
>> +# This is the dictionary for the rules that translates to proprietary comments:
>> +#  - cppcheck: /* cppcheck-suppress[id] */
>> +#  - coverity: /* coverity[id] */
>> +#  - eclair:   /* -E> hide id 1 "" */
>> +# Add entries to support more analyzers
>> +tool_syntax = {
>> +    "cppcheck":"s,^.*/*[[:space:]]*TAG.*$,/* cppcheck-suppress[VID] */,g",
>> +    "coverity":"s,^.*/*[[:space:]]*TAG.*$,/* coverity[VID] */,g",
>> +    "eclair":"s,^.*/*[[:space:]]*TAG.*$,/* -E> hide VID 1 \"\" */,g"
>> +}
>> +
>> +def main(argv):
>> +    infiles = []
>> +    justifications = []
>> +    outfile = ''
>> +    tool = ''
>> +
>> +    try:
>> +        opts, args = getopt.getopt(argv,"hi:o:t:",["input=","output=","tool="])
>> +    except getopt.GetoptError:
>> +        help()
>> +        sys.exit(2)
>> +    for opt, arg in opts:
>> +        if opt == '-h':
>> +            help()
>> +            sys.exit(0)
>> +        elif opt in ("-i", "--input"):
>> +            infiles.append(arg)
>> +        elif opt in ("-o", "--output"):
>> +            outfile = arg
>> +        elif opt in ("-t", "--tool"):
>> +            tool = arg
>> +
>> +    # Open all input files
>> +    for file in infiles:
>> +        try:
>> +            handle = open(file, 'rt')
>> +            content = json.load(handle)
>> +            justifications = justifications + content['content']
>> +            handle.close()
>> +        except json.JSONDecodeError:
>> +            print('JSON decoding error in file: ' + file)
>> +        except:
>> +            print('Error opening ' + file)
>> +            sys.exit(1)
>> +
>> +    try:
>> +        outstr = open(outfile, "w")
>> +    except:
>> +        print('Error creating ' + outfile)
>> +        sys.exit(1)
>> +
>> +    for j in justifications:
>> +        if tool in j['analyser']:
>> +            comment=tool_syntax[tool].replace("TAG",j['id'])
>> +            comment=comment.replace("VID",j['analyser'][tool])
>> +            outstr.write('{}\n'.format(comment))
>> +
>> +    outstr.close()
>> +
>> +if __name__ == "__main__":
>> +   main(sys.argv[1:])
>> \ No newline at end of file
> 
> Nit: ^^^

Will fix

> 
> Jan
Jan Beulich Nov. 8, 2022, 11:48 a.m. UTC | #3
On 08.11.2022 11:59, Luca Fancellu wrote:
>> On 07.11.2022 11:47, Luca Fancellu wrote:
>>> +Here is an example to add a new justification in false-positive-<tool>.json::
>>
>> With <tool> already present in the name, ...
>>
>>> +|{
>>> +|    "version": "1.0",
>>> +|    "content": [
>>> +|        {
>>> +|            "id": "SAF-0-false-positive-<tool>",
>>> +|            "analyser": {
>>> +|                "<tool>": "<proprietary-id>"
>>
>> ... can we avoid the redundancy here? Perhaps ...
>>
>>> +|            },
>>> +|            "tool-version": "<version>",
>>
>> ... it could be
>>
>>            "analyser": {
>>                "<version>": "<proprietary-id>"
>>            },
> 
> Yes, it's a bit redundant, but it helps re-use the same tool we use for safe.json

I guess the tool could also be made to cope without much effort.

>>> @@ -757,6 +758,51 @@ cppcheck-version:
>>> $(objtree)/include/generated/compiler-def.h:
>>> 	$(Q)$(CC) -dM -E -o $@ - < /dev/null
>>>
>>> +JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>>> +                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json
>>> +
>>> +# The following command uses grep to find all files that contain a comment
>>> +# containing "SAF-<anything>" on a single line.
>>> +# %.safparse will be the original files saved from the build system, these files
>>> +# will be restored at the end of the analysis step
>>> +PARSE_FILE_LIST := $(addsuffix .safparse,$(filter-out %.safparse,\
>>> +$(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
>>
>> Please indent such line continuations. And then isn't this going to risk
>> matching non-source files as well? Perhaps you want to restrict this to
>> *.c and *.h?
> 
> Yes, how about this? It will filter out *.safparse files while keeping only .h and .c files:
> 
> PARSE_FILE_LIST := $(addsuffix .safparse,$(filter %.c %.h,\
>     $(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))

That's better, but still means grep touching all files even though now
only a subset is really looked for. If I were to use the new goals on a
more or less regular basis, I'd expect that this enumeration of files
doesn't read _much_ more stuff from disk than is actually necessary.

>> To limit work done, could this be "mv" instead of "cp -p", and then ...
>>
>>> +analysis-parse-tags-%: $(PARSE_FILE_LIST) $(objtree)/%.sed
>>> +	$(Q)for file in $(patsubst %.safparse,%,$(PARSE_FILE_LIST)); do \
>>> +		sed -i -f "$(objtree)/$*.sed" "$${file}"; \
>>
>> ... with then using
>>
>> 		sed -f "$(objtree)/$*.sed" "$${file}.safparse" >"$${file}"
>>
>> here? This would then also have source consistent between prereqs and
>> rule.
> 
> We saw that mv does not preserve the timestamp of the file, and we would like to
> preserve it; for this reason we used cp -p

Buggy mv? It certainly doesn't alter timestamps here, and I don't think
the spec allows for it doing so (at least when it doesn't need to resort
to copying to deal with cross-volume moves, but those can't happen here).
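
A quick way to check, assuming GNU coreutils:

    $ touch -d 2020-01-01 foo.c
    $ mv foo.c foo.c.safparse
    $ stat -c %y foo.c.safparse    # mtime still reports 2020-01-01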

>>> +	done
>>> +
>>> +analysis-build-%: analysis-parse-tags-%
>>> +	$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build
>>
>> This rule doesn't use the stem, so I'm struggling to understand what
>> this is about.
> 
> Yes, here my aim was to catch analysis-build-{eclair,coverity}. I see that if the user has a typo
> the rule will run anyway, but it will be stopped by the dependency chain, because at the end we have:
> 
> JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>                        $(XEN_ROOT)/docs/misra/false-positive-$$*.json
> 
> That will give an error because $(XEN_ROOT)/docs/misra/false-positive-<typo>.json does not exist.
> 
> If you think it is not enough, what if I reduce the scope of the rule like this?
> 
> _analysis-coverity _analysis-eclair: _analysis-%: analysis-build-%

But then, without using the stem, how does it know whether to do an
Eclair or a Coverity run?

> Or, if you are still worried about “analysis-build-%: analysis-parse-tags-%”, then I can do something
> like this: 
> 
> analysis-supported-coverity analysis-supported-eclair:
>     @echo > /dev/null
> 
> analysis-supported-%:
>     $(error Unsupported analysis tool $*)
> 
> analysis-build-%: analysis-parse-tags-% | analysis-supported-%
>     $(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build

If I'm not mistaken support for | doesn't exist in make 3.80 (the
minimum version we require to be used).

>>> +analysis-clean:
>>> +# Reverts the original file (-p preserves also timestamp)
>>> +	$(Q)find $(srctree) -type f -name "*.safparse" -print | \
>>> +	while IFS= read file; do \
>>> +		cp -p "$${file}" "$${file%.safparse}"; \
>>> +		rm -f "$${file}"; \
>>
>> Why not "mv"?
>>
>>> +	done
>>> +
>>> +_analysis-%: analysis-build-%
>>> +	$(Q)$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile analysis-clean
>>
>> Again no use of the stem, plus here I wonder if this may not lead to
>> people invoking "analysis-clean" without having said anything about
>> cleaning on their command line.
> 
> In any case, the cleaning process is very safe and does not clean anything that was not dirty before,
> so in case of typos, it’s just like a nop.

People may put transient files in their trees. Of course they need to be
aware that when they specify a "clean" target their files may be deleted.
But without any "clean" target specified nothing should be removed.

Jan
Luca Fancellu Nov. 8, 2022, 2 p.m. UTC | #4
> On 8 Nov 2022, at 11:48, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 08.11.2022 11:59, Luca Fancellu wrote:
>>> On 07.11.2022 11:47, Luca Fancellu wrote:
>>>> +Here is an example to add a new justification in false-positive-<tool>.json::
>>> 
>>> With <tool> already present in the name, ...
>>> 
>>>> +|{
>>>> +|    "version": "1.0",
>>>> +|    "content": [
>>>> +|        {
>>>> +|            "id": "SAF-0-false-positive-<tool>",
>>>> +|            "analyser": {
>>>> +|                "<tool>": "<proprietary-id>"
>>> 
>>> ... can we avoid the redundancy here? Perhaps ...
>>> 
>>>> +|            },
>>>> +|            "tool-version": "<version>",
>>> 
>>> ... it could be
>>> 
>>>           "analyser": {
>>>               "<version>": "<proprietary-id>"
>>>           },
>> 
>> Yes, it's a bit redundant, but it helps re-use the same tool we use for safe.json
> 
> I guess the tool could also be made to cope without much effort.

I can modify the script to take an additional parameter to distinguish between safe.json
and false-positive-*.json, then call the script twice and append the results to the .sed file.

> 
>>>> @@ -757,6 +758,51 @@ cppcheck-version:
>>>> $(objtree)/include/generated/compiler-def.h:
>>>> 	$(Q)$(CC) -dM -E -o $@ - < /dev/null
>>>> 
>>>> +JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>>>> +                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json
>>>> +
>>>> +# The following command is using grep to find all files that contains a comment
>>>> +# containing "SAF-<anything>" on a single line.
>>>> +# %.safparse will be the original files saved from the build system, these files
>>>> +# will be restored at the end of the analysis step
>>>> +PARSE_FILE_LIST := $(addsuffix .safparse,$(filter-out %.safparse,\
>>>> +$(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
>>> 
>>> Please indent such line continuations. And then isn't this going to risk
>>> matching non-source files as well? Perhaps you want to restrict this to
>>> *.c and *.h?
>> 
>> Yes, how about this? It will filter out *.safparse files while keeping only .h and .c files:
>> 
>> PARSE_FILE_LIST := $(addsuffix .safparse,$(filter %.c %.h,\
>>    $(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
> 
> That's better, but still means grep touching all files even though now
> only a subset is really looked for. If I were to use the new goals on a
> more or less regular basis, I'd expect that this enumeration of files
> doesn't read _much_ more stuff from disk than is actually necessary.

Ok, would this be ok?

PARSE_FILE_LIST := $(addsuffix .safparse,$(shell grep -ERl --include=\*.h \
    --include=\*.c '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree)))

> 
>>> To limit work done, could this be "mv" instead of "cp -p", and then ...
>>> 
>>>> +analysis-parse-tags-%: $(PARSE_FILE_LIST) $(objtree)/%.sed
>>>> +	$(Q)for file in $(patsubst %.safparse,%,$(PARSE_FILE_LIST)); do \
>>>> +		sed -i -f "$(objtree)/$*.sed" "$${file}"; \
>>> 
>>> ... with then using
>>> 
>>> 		sed -f "$(objtree)/$*.sed" "$${file}.safparse" >"$${file}"
>>> 
>>> here? This would then also have source consistent between prereqs and
>>> rule.
>> 
>> We saw that mv does not preserve the timestamp of the file, and we would like to
>> preserve it; for this reason we used cp -p
> 
> Buggy mv? It certainly doesn't alter timestamps here, and I don't think
> the spec allows for it doing so (at least when it doesn't need to resort
> to copying to deal with cross-volume moves, but those can't happen here).

Yes you are right, my assumption was wrong, I will change the code as you suggested.

> 
>>>> +	done
>>>> +
>>>> +analysis-build-%: analysis-parse-tags-%
>>>> +	$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build
>>> 
>>> This rule doesn't use the stem, so I'm struggling to understand what
>>> this is about.
>> 
>> Yes, here my aim was to catch analysis-build-{eclair,coverity}. I see that if the user has a typo
>> the rule will run anyway, but it will be stopped by the dependency chain, because at the end we have:
>> 
>> JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>>                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json
>> 
>> That will give an error because $(XEN_ROOT)/docs/misra/false-positive-<typo>.json does not exist.
>> 
>> If you think it is not enough, what if I reduce the scope of the rule like this?
>> 
>> _analysis-coverity _analysis-eclair: _analysis-%: analysis-build-%
> 
> But then, without using the stem, how does it know whether to do an
> Eclair or a Coverity run?

Sorry, I think I'm a bit lost here. The makefile works for both analysis-coverity and analysis-eclair
because the % resolves to coverity or eclair depending on which target make is given; it is not
complaining, so I guess it works.
Do you see something not working? If so, could you provide a piece of code to help me understand?

> 
>> Or, if you are still worried about “analysis-build-%: analysis-parse-tags-%”, then I can do something
>> like this: 
>> 
>> analysis-supported-coverity analysis-supported-eclair:
>>    @echo > /dev/null
>> 
>> analysis-supported-%:
>>    $(error Unsupported analysis tool $*)
>> 
>> analysis-build-%: analysis-parse-tags-% | analysis-supported-%
>>    $(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build
> 
> If I'm not mistaken support for | doesn't exist in make 3.80 (the
> minimum version we require to be used).

IDK, we already use order-only prerequisites in the Makefile.
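
For reference, the syntax in question - prerequisites after the pipe
are order-only, i.e. they are built if missing but never cause the
target to be considered out of date:

    target: normal-prereq | order-only-prereq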

> 
>>>> +analysis-clean:
>>>> +# Reverts the original file (-p preserves also timestamp)
>>>> +	$(Q)find $(srctree) -type f -name "*.safparse" -print | \
>>>> +	while IFS= read file; do \
>>>> +		cp -p "$${file}" "$${file%.safparse}"; \
>>>> +		rm -f "$${file}"; \
>>> 
>>> Why not "mv"?
>>> 
>>>> +	done
>>>> +
>>>> +_analysis-%: analysis-build-%
>>>> +	$(Q)$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile analysis-clean
>>> 
>>> Again no use of the stem, plus here I wonder if this may not lead to
>>> people invoking "analysis-clean" without having said anything about
>>> cleaning on their command line.
>> 
>> In any case, the cleaning process is very safe and does not clean anything that was not dirty before,
>> so in case of typos, it’s just like a nop.
> 
> People may put transient files in their trees. Of course they need to be
> aware that when they specify a "clean" target their files may be deleted.
> But without any "clean" target specified nothing should be removed.

*.safparse files are not supposed to be used freely by users in their trees; those
files will be removed only if the user calls the "analysis-clean" target or if
analysis-coverity or analysis-eclair reaches its end (the process that creates *.safparse).

There is no other way to trigger "analysis-clean" unintentionally, so I'm not sure about
the modification you would like to see there.

> 
> Jan
Jan Beulich Nov. 8, 2022, 3:49 p.m. UTC | #5
On 08.11.2022 15:00, Luca Fancellu wrote:
>> On 8 Nov 2022, at 11:48, Jan Beulich <jbeulich@suse.com> wrote:
>> On 08.11.2022 11:59, Luca Fancellu wrote:
>>>> On 07.11.2022 11:47, Luca Fancellu wrote:
>>>>> @@ -757,6 +758,51 @@ cppcheck-version:
>>>>> $(objtree)/include/generated/compiler-def.h:
>>>>> 	$(Q)$(CC) -dM -E -o $@ - < /dev/null
>>>>>
>>>>> +JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>>>>> +                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json
>>>>> +
>>>>> +# The following command uses grep to find all files that contain a comment
>>>>> +# containing "SAF-<anything>" on a single line.
>>>>> +# %.safparse will be the original files saved from the build system, these files
>>>>> +# will be restored at the end of the analysis step
>>>>> +PARSE_FILE_LIST := $(addsuffix .safparse,$(filter-out %.safparse,\
>>>>> +$(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
>>>>
>>>> Please indent such line continuations. And then isn't this going to risk
>>>> matching non-source files as well? Perhaps you want to restrict this to
>>>> *.c and *.h?
>>>
>>> Yes, how about this? It will filter out *.safparse files while keeping only .h and .c files:
>>>
>>> PARSE_FILE_LIST := $(addsuffix .safparse,$(filter %.c %.h,\
>>>    $(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
>>
>> That's better, but still means grep touching all files even though now
>> only a subset is really looked for. If I were to use the new goals on a
>> more or less regular basis, I'd expect that this enumeration of files
>> doesn't read _much_ more stuff from disk than is actually necessary.
> 
> Ok, would this be ok?
> 
> PARSE_FILE_LIST := $(addsuffix .safparse,$(shell grep -ERl --include=\*.h \
>     --include=\*.c '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree)))

Hmm, not sure: --include isn't a standard option to grep, and we
generally try to be portable. Actually -R (or -r) isn't either. It
may still be okay that way if properly documented where the involved
goals will work and where not.

And then - why do you escape slashes in the ERE?

Talking of escaping - personally I find backslash escapes harder to
read / grok than quotation, so I'd like to recommend using quotes
around each of the two --include (if they remain in the first place)
instead of the \* construct.

>>>>> +	done
>>>>> +
>>>>> +analysis-build-%: analysis-parse-tags-%
>>>>> +	$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build
>>>>
>>>> This rule doesn't use the stem, so I'm struggling to understand what
>>>> this is about.
>>>
>>> Yes, here my aim was to catch analysis-build-{eclair,coverity}. I see that if the user has a typo
>>> the rule will run anyway, but it will be stopped by the dependency chain, because at the end we have:
>>>
>>> JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>>>                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json
>>>
>>> That will give an error because $(XEN_ROOT)/docs/misra/false-positive-<typo>.json does not exist.
>>>
>>> If you think it is not enough, what if I reduce the scope of the rule like this?
>>>
>>> _analysis-coverity _analysis-eclair: _analysis-%: analysis-build-%
>>
>> But then, without using the stem, how does it know whether to do an
>> Eclair or a Coverity run?
> 
> Sorry, I think I'm a bit lost here. The makefile works for both analysis-coverity and analysis-eclair
> because the % resolves to coverity or eclair depending on which target make is given; it is not
> complaining, so I guess it works.
> Do you see something not working? If so, could you provide a piece of code to help me understand?

Well, my problem is that I don't see how the distinction is conveyed
without the stem being used. With what you say I understand I'm
overlooking something, so I'd appreciate some explanation or at least
a pointer.

>>> Or, if you are still worried about “analysis-build-%: analysis-parse-tags-%”, then I can do something
>>> like this: 
>>>
>>> analysis-supported-coverity analysis-supported-eclair:
>>>    @echo > /dev/null
>>>
>>> analysis-supported-%:
>>>    $(error Unsupported analysis tool $*)
>>>
>>> analysis-build-%: analysis-parse-tags-% | analysis-supported-%
>>>    $(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build
>>
>> If I'm not mistaken support for | doesn't exist in make 3.80 (the
>> minimum version we require to be used).
> 
> IDK, we already use order-only prerequisites in the Makefile.

Hmm, yes, for $(objtree)/%.c.cppcheck: . Question is whether this was
simply overlooked before. As said above such may be okay for these
special goals, but this needs properly documenting then.

>>>>> +analysis-clean:
>>>>> +# Reverts the original file (-p preserves also timestamp)
>>>>> +	$(Q)find $(srctree) -type f -name "*.safparse" -print | \
>>>>> +	while IFS= read file; do \
>>>>> +		cp -p "$${file}" "$${file%.safparse}"; \
>>>>> +		rm -f "$${file}"; \
>>>>
>>>> Why not "mv"?
>>>>
>>>>> +	done
>>>>> +
>>>>> +_analysis-%: analysis-build-%
>>>>> +	$(Q)$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile analysis-clean
>>>>
>>>> Again no use of the stem, plus here I wonder if this may not lead to
>>>> people invoking "analysis-clean" without having said anything about
>>>> cleaning on their command line.
>>>
>>> In any case, the cleaning process is very safe and does not clean anything that was not dirty before,
>>> so in case of typos, it’s just like a nop.
>>
>> People may put transient files in their trees. Of course they need to be
>> aware that when they specify a "clean" target their files may be deleted.
>> But without any "clean" target specified nothing should be removed.
> 
> *.safparse files are not supposed to be used freely by users in their trees; those
> files will be removed only if the user calls the "analysis-clean" target or if
> analysis-coverity or analysis-eclair reaches its end (the process that creates *.safparse).
> 
> There is no other way to trigger "analysis-clean" unintentionally, so I'm not sure about
> the modification you would like to see there.

I guess I don't understand: You have _analysis-% as the target, which I'd
assume will handle _analysis-clean just as much as _analysis-abc. This may
be connected to my lack of understanding as expressed further up. Or maybe
I'm simply not understanding what the _analysis-% target is about in the
first place, because with the analysis-build-% dependency I don't see how
_analysis-clean would actually work (with the scope restriction you
suggested earlier a rule for analysis-build-clean would not be found
afaict).

Jan
Luca Fancellu Nov. 8, 2022, 5:13 p.m. UTC | #6
> On 8 Nov 2022, at 15:49, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 08.11.2022 15:00, Luca Fancellu wrote:
>>> On 8 Nov 2022, at 11:48, Jan Beulich <jbeulich@suse.com> wrote:
>>> On 08.11.2022 11:59, Luca Fancellu wrote:
>>>>> On 07.11.2022 11:47, Luca Fancellu wrote:
>>>>>> @@ -757,6 +758,51 @@ cppcheck-version:
>>>>>> $(objtree)/include/generated/compiler-def.h:
>>>>>> 	$(Q)$(CC) -dM -E -o $@ - < /dev/null
>>>>>> 
>>>>>> +JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>>>>>> +                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json
>>>>>> +
>>>>>> +# The following command uses grep to find all files that contain a comment
>>>>>> +# containing "SAF-<anything>" on a single line.
>>>>>> +# %.safparse will be the original files saved from the build system, these files
>>>>>> +# will be restored at the end of the analysis step
>>>>>> +PARSE_FILE_LIST := $(addsuffix .safparse,$(filter-out %.safparse,\
>>>>>> +$(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
>>>>> 
>>>>> Please indent such line continuations. And then isn't this going to risk
>>>>> matching non-source files as well? Perhaps you want to restrict this to
>>>>> *.c and *.h?
>>>> 
>>>> Yes, how about this? It will filter out *.safparse files while keeping only .h and .c files:
>>>> 
>>>> PARSE_FILE_LIST := $(addsuffix .safparse,$(filter %.c %.h,\
>>>>   $(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
>>> 
>>> That's better, but still means grep touching all files even though now
>>> only a subset is really looked for. If I were to use the new goals on a
>>> more or less regular basis, I'd expect that this enumeration of files
>>> doesn't read _much_ more stuff from disk than is actually necessary.
>> 
>> Ok, would this be ok?
>> 
>> PARSE_FILE_LIST := $(addsuffix .safparse,$(shell grep -ERl --include=\*.h \
>>    --include=\*.c '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree)))
> 
> Hmm, not sure: --include isn't a standard option to grep, and we
> generally try to be portable. Actually -R (or -r) isn't either. It
> may still be okay that way if properly documented where the involved
> goals will work and where not.

Is a comment before the line ok as documentation? To state that --include and
-R are not standard options, so analysis-{coverity,eclair} will not work without a
grep that takes those parameters?

> 
> And then - why do you escape slashes in the ERE?
> 
> Talking of escaping - personally I find backslash escapes harder to
> read / grok than quotation, so I'd like to recommend using quotes
> around each of the two --include (if they remain in the first place)
> instead of the \* construct.

Ok, I've removed the escapes from the * and also from the slashes:

PARSE_FILE_LIST := $(addsuffix .safparse,$(shell grep -ERl --include='*.h' \
    --include='*.c' '^[[:blank:]]*/\*[[:space:]]+SAF-.*\*/$$' $(srctree)))

> 
>>>>>> +	done
>>>>>> +
>>>>>> +analysis-build-%: analysis-parse-tags-%
>>>>>> +	$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build
>>>>> 
>>>>> This rule doesn't use the stem, so I'm struggling to understand what
>>>>> this is about.
>>>> 
>>>> Yes, here my aim was to catch analysis-build-{eclair,coverity}. I see that if the user has a typo
>>>> the rule will run anyway, but it will be stopped by the dependency chain, because at the end we have:
>>>> 
>>>> JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>>>>                      $(XEN_ROOT)/docs/misra/false-positive-$$*.json
>>>> 
>>>> That will give an error because $(XEN_ROOT)/docs/misra/false-positive-<typo>.json does not exist.
>>>> 
>>>> If you think it is not enough, what if I reduce the scope of the rule like this?
>>>> 
>>>> _analysis-coverity _analysis-eclair: _analysis-%: analysis-build-%
>>> 
>>> But then, without using the stem, how does it know whether to do an
>>> Eclair or a Coverity run?
>> 
>> Sorry, I think I'm a bit lost here. The makefile works for both analysis-coverity and analysis-eclair
>> because the % resolves to coverity or eclair depending on which target make is given; it is not
>> complaining, so I guess it works.
>> Do you see something not working? If so, could you provide a piece of code to help me understand?
> 
> Well, my problem is that I don't see how the distinction is conveyed
> without the stem being used. With what you say I understand I'm
> overlooking something, so I'd appreciate some explanation or at least
> a pointer.

Ok, eclair and coverity share the same commands to be executed by the build system,
so instead of duplicating the targets and their recipes for coverity and eclair, I've used
pattern rules so that these rules:

JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json

[…]

.SECONDEXPANSION:
$(objtree)/%.sed: $(srctree)/tools/xenfusa-gen-tags.py $(JUSTIFICATION_FILES)
    […]

[…]

analysis-parse-tags-%: $(PARSE_FILE_LIST) $(objtree)/%.sed
    […]

analysis-build-%: analysis-parse-tags-%
    $(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build

analysis-clean:
   […]

_analysis-%: analysis-build-%
    $(Q)$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile analysis-clean

match the case where 'make analysis-coverity' or 'make analysis-eclair' is called.

Now, please correct me if my assumptions about the way make works are wrong. Here they are:

For example, when 'make analysis-coverity' is called, this rule is the best match for the
called target:

_analysis-%:

So anything after _analysis- will be captured by % and carried over to the target's
dependency, analysis-build-% -> analysis-build-coverity

Now analysis-build-coverity will be called, the best match is analysis-build-%, so again the dependency,
which is analysis-parse-tags-%, will be translated to analysis-parse-tags-coverity.

Now analysis-parse-tags-coverity will be called, the best match is analysis-parse-tags-%, so the % will
have the 'coverity' value and in the dependency we will have $(objtree)/%.sed -> $(objtree)/coverity.sed.

Looking for $(objtree)/coverity.sed, the best match is $(objtree)/%.sed, which has $(JUSTIFICATION_FILES)
and the python script as dependencies; here we use the second expansion to resolve
$(XEN_ROOT)/docs/misra/false-positive-$$*.json into $(XEN_ROOT)/docs/misra/false-positive-coverity.json

So now, after analysis-parse-tags-coverity has finished its dependencies, its recipe will run; after it
finishes, the recipe of analysis-build-coverity will start and will call make to actually build Xen.

After the build finishes, if the status is good, analysis-build-coverity has finished and the _analysis-coverity
recipe can now run; it will call make with the analysis-clean target, restoring any <file>.{c,h}.safparse to <file>.{c,h}.

We will have the same with 'make analysis-eclair'; if we make a typo, like 'make analysis-coveri', we
will have:

make: Entering directory ‘/path/to/xen/xen'
make: *** No rule to make target 'analysis-coveri'.  Stop.
make: Leaving directory '/path/to/xen/xen'



> 
>>>> Or, if you are still worried about “analysis-build-%: analysis-parse-tags-%”, then I can do something
>>>> like this: 
>>>> 
>>>> analysis-supported-coverity analysis-supported-eclair:
>>>>   @echo > /dev/null
>>>> 
>>>> analysis-supported-%:
>>>>   $(error Unsupported analysis tool $*)
>>>> 
>>>> analysis-build-%: analysis-parse-tags-% | analysis-supported-%
>>>>   $(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build
>>> 
>>> If I'm not mistaken support for | doesn't exist in make 3.80 (the
>>> minimum version we require to be used).
>> 
>> IDK, we already use order-only prerequisites in the Makefile.
> 
> Hmm, yes, for $(objtree)/%.c.cppcheck: . Question is whether this was
> simply overlooked before. As said above such may be okay for these
> special goals, but this needs properly documenting then.
> 
>>>>>> +analysis-clean:
>>>>>> +# Reverts the original file (-p preserves also timestamp)
>>>>>> +	$(Q)find $(srctree) -type f -name "*.safparse" -print | \
>>>>>> +	while IFS= read file; do \
>>>>>> +		cp -p "$${file}" "$${file%.safparse}"; \
>>>>>> +		rm -f "$${file}"; \
>>>>> 
>>>>> Why not "mv"?
>>>>> 
>>>>>> +	done
>>>>>> +
>>>>>> +_analysis-%: analysis-build-%
>>>>>> +	$(Q)$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile analysis-clean
>>>>> 
>>>>> Again no use of the stem, plus here I wonder if this may not lead to
>>>>> people invoking "analysis-clean" without having said anything about
>>>>> cleaning on their command line.
>>>> 
>>>> In any case, the cleaning process is very safe and does not clean anything that was not dirty before,
>>>> so in case of typos, it’s just like a nop.
>>> 
>>> People may put transient files in their trees. Of course they need to be
>>> aware that when they specify a "clean" target their files may be deleted.
>>> But without any "clean" target specified nothing should be removed.
>> 
>> *.safparse files are not supposed to be used freely by users in their trees; those
>> files will be removed only if the user calls the "analysis-clean" target or if
>> analysis-coverity or analysis-eclair reaches its end (the process that creates *.safparse).
>> 
>> There is no other way to trigger "analysis-clean" unintentionally, so I'm not sure about
>> the modification you would like to see there.
> 
> I guess I don't understand: You have _analysis-% as the target, which I'd
> assume will handle _analysis-clean just as much as _analysis-abc. This may
> be connected to my lack of understanding as expressed further up. Or maybe
> I'm simply not understanding what the _analysis-% target is about in the
> first place, because with the analysis-build-% dependency I don't see how
> _analysis-clean would actually work (with the scope restriction you
> suggested earlier a rule for analysis-build-clean would not be found
> afaict).

_analysis-clean will not work, and neither will _analysis-abc, because of what I wrote above.
analysis-clean instead is called from the recipe of _analysis-% if all its dependencies are
built correctly; otherwise it's the user that needs to call it directly via "make analysis-clean".



> 
> Jan
Jan Beulich Nov. 9, 2022, 8:31 a.m. UTC | #7
On 08.11.2022 18:13, Luca Fancellu wrote:
>> On 8 Nov 2022, at 15:49, Jan Beulich <jbeulich@suse.com> wrote:
>> On 08.11.2022 15:00, Luca Fancellu wrote:
>>>> On 8 Nov 2022, at 11:48, Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 08.11.2022 11:59, Luca Fancellu wrote:
>>>>>> On 07.11.2022 11:47, Luca Fancellu wrote:
>>>>>>> @@ -757,6 +758,51 @@ cppcheck-version:
>>>>>>> $(objtree)/include/generated/compiler-def.h:
>>>>>>> 	$(Q)$(CC) -dM -E -o $@ - < /dev/null
>>>>>>>
>>>>>>> +JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>>>>>>> +                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json
>>>>>>> +
>>>>>>> +# The following command uses grep to find all files that contain a comment
>>>>>>> +# containing "SAF-<anything>" on a single line.
>>>>>>> +# %.safparse will be the original files saved from the build system, these files
>>>>>>> +# will be restored at the end of the analysis step
>>>>>>> +PARSE_FILE_LIST := $(addsuffix .safparse,$(filter-out %.safparse,\
>>>>>>> +$(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
>>>>>>
>>>>>> Please indent such line continuations. And then isn't this going to risk
>>>>>> matching non-source files as well? Perhaps you want to restrict this to
>>>>>> *.c and *.h?
>>>>>
>>>>> Yes, how about this? It will filter out *.safparse files while keeping only .h and .c files:
>>>>>
>>>>> PARSE_FILE_LIST := $(addsuffix .safparse,$(filter %.c %.h,\
>>>>>   $(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
>>>>
>>>> That's better, but still means grep touching all files even though now
>>>> only a subset is really looked for. If I were to use the new goals on a
>>>> more or less regular basis, I'd expect that this enumeration of files
>>>> doesn't read _much_ more stuff from disk than is actually necessary.
>>>
>>> Ok, would this be ok?
>>>
>>> PARSE_FILE_LIST := $(addsuffix .safparse,$(shell grep -ERl --include=\*.h \
>>>    --include=\*.c '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree)))
>>
>> Hmm, not sure: --include isn't a standard option to grep, and we
>> generally try to be portable. Actually -R (or -r) isn't either. It
>> may still be okay that way if properly documented where the involved
>> goals will work and where not.
> 
> Is a comment before the line ok as documentation? To state that --include and
> -R are not standard options, so analysis-{coverity,eclair} will not work without a
> grep that takes those parameters?

A comment _might_ be okay. Is there no other documentation on how these
goals are to be used? The main question here is how much impact this might
have on the various environments we allow Xen to be built in: Would at
least modern versions of all Linux distros we care about allow using
these rules? What about non-Linux?

And could you at least bail when PARSE_FILE_LIST ends up empty, with a
clear error message augmenting the one grep would have issued?
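
A minimal sketch of such a bail-out, evaluated when the recipe of the
analysis goal is expanded (names as in the patch):

    analysis-parse-tags-%: $(PARSE_FILE_LIST) $(objtree)/%.sed
    	$(if $(PARSE_FILE_LIST),,$(error no SAF- tagged files found under $(srctree)))
    	...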

>> And then - why do you escape slashes in the ERE?
>>
>> Talking of escaping - personally I find backslash escapes harder to
>> read / grok than quotation, so I'd like to recommend using quotes
>> around each of the two --include (if they remain in the first place)
>> instead of the \* construct.
> 
> Ok, I've removed the escapes from the * and also from the slashes:
> 
> PARSE_FILE_LIST := $(addsuffix .safparse,$(shell grep -ERl --include='*.h' \
>     --include='*.c' '^[[:blank:]]*/\*[[:space:]]+SAF-.*\*/$$' $(srctree)))

Good - seeing things more clearly now, my next question is: Isn't
matching just "/* SAF-...*/" a little too lax? And is there really a
need to permit leading blanks?
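
For instance, all of these hypothetical lines would match the ERE,
including the last one, which is not a valid tag at all:

    /* SAF-1-safe */
        /* SAF-2-false-positive-eclair */
    /* SAF-tag to be decided later */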

>>>>>>> +	done
>>>>>>> +
>>>>>>> +analysis-build-%: analysis-parse-tags-%
>>>>>>> +	$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build
>>>>>>
>>>>>> This rule doesn't use the stem, so I'm struggling to understand what
>>>>>> this is about.
>>>>>
>>>>> Yes, here my aim was to catch analysis-build-{eclair,coverity}. I see that if the user has a typo
>>>>> the rule will run anyway, but it will be stopped by the dependency chain, because at the end we have:
>>>>>
>>>>> JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>>>>>                      $(XEN_ROOT)/docs/misra/false-positive-$$*.json
>>>>>
>>>>> That will give an error because $(XEN_ROOT)/docs/misra/false-positive-<typo>.json does not exist.
>>>>>
>>>>> If you think it is not enough, what if I reduce the scope of the rule like this?
>>>>>
>>>>> _analysis-coverity _analysis-eclair: _analysis-%: analysis-build-%
>>>>
>>>> But then, without using the stem, how does it know whether to do an
>>>> Eclair or a Coverity run?
>>>
>>> Sorry, I think I'm a bit lost here. The makefile works for both analysis-coverity and analysis-eclair
>>> because the % resolves to coverity or eclair depending on which target make is given; it is not
>>> complaining, so I guess it works.
>>> Do you see something not working? If so, could you provide a piece of code to help me understand?
>>
>> Well, my problem is that I don't see how the distinction is conveyed
>> without the stem being used. With what you say I understand I'm
>> overlooking something, so I'd appreciate some explanation or at least
>> a pointer.
> 
> Ok, eclair and coverity share the same commands to be executed by the build system,
> so instead of duplicating the targets and their recipes for coverity and eclair, I've used
> pattern rules so that these rules:
> 
> JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>                        $(XEN_ROOT)/docs/misra/false-positive-$$*.json
> 
> […]
> 
> .SECONDEXPANSION:
> $(objtree)/%.sed: $(srctree)/tools/xenfusa-gen-tags.py $(JUSTIFICATION_FILES)
>     […]
> 
> […]
> 
> analysis-parse-tags-%: $(PARSE_FILE_LIST) $(objtree)/%.sed
>     […]
> 
> analysis-build-%: analysis-parse-tags-%
>     $(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build
> 
> analysis-clean:
>    […]
> 
> _analysis-%: analysis-build-%
>     $(Q)$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile analysis-clean
> 
> match the case where 'make analysis-coverity' or 'make analysis-eclair' is called.
> 
> Now, please correct me if my assumptions about the way make works are wrong. Here they are:
> 
> For example, when 'make analysis-coverity' is called, this rule is the best match for the
> called target:
> 
> _analysis-%:

So my main oversight was your addition to main-targets, which makes the
connection with this underscore-prefixed goal.

As to you saying "best match" - I didn't think make had such a concept
when it comes to considering pattern rules. Aiui it is "first match", in
the order that rules were parsed from all involved makefiles.

> So anything after _analysis- will be captured by % and carried over to the target's
> dependency, analysis-build-% -> analysis-build-coverity
> 
> Now analysis-build-coverity will be called, the best match is analysis-build-%, so again the dependency,
> which is analysis-parse-tags-%, will be translated to analysis-parse-tags-coverity.
> 
> Now analysis-parse-tags-coverity will be called, the best match is analysis-parse-tags-%, so the % will
> have the 'coverity' value and in the dependency we will have $(objtree)/%.sed -> $(objtree)/coverity.sed.
> 
> Looking for $(objtree)/coverity.sed, the best match is $(objtree)/%.sed, which has $(JUSTIFICATION_FILES)
> and the python script as dependencies; here we use the second expansion to resolve
> $(XEN_ROOT)/docs/misra/false-positive-$$*.json into $(XEN_ROOT)/docs/misra/false-positive-coverity.json
> 
> So now, after analysis-parse-tags-coverity has finished its dependencies, its recipe will run; after it
> finishes, the recipe of analysis-build-coverity will start and will call make to actually build Xen.

Okay, I see now - this building of Xen really _is_ independent of the
checker chosen. I'm not sure though whether it is a good idea to
integrate all this, including ...

> After the build finishes, if the status is good, the analysis-build-coverity has finished and the _analysis-coverity
> recipe can now run, it will call make with the analysis-clean target, restoring any <file>.{c,h}.safparse to <file>.{c,h}.

... the subsequent cleaning. The state of the _source_ tree after a
build failure would be different from that after a successful build.
Personally I consider this at best surprising.

I wonder whether instead there could be a shell(?) script driving a
sequence of make invocations, leaving the new make goals all be self-
contained. Such a script could revert the source tree to its original
state even upon build failure by default, with an option allowing to
suppress this behavior.
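
To make it concrete, something along these lines (only a sketch; the goal
names are the ones from this series, and the --keep option is made up):

    #!/bin/sh
    # Substitute the tags, build, and restore the tree even when the
    # build fails - unless --keep was given to suppress the cleanup.
    set -e
    keep=false
    [ "$1" = --keep ] && { keep=true; shift; }
    tool=$1; shift
    $keep || trap 'make analysis-clean' EXIT
    make "analysis-parse-tags-$tool" "$@"
    make build "$@"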

Jan
Luca Fancellu Nov. 9, 2022, 10:08 a.m. UTC | #8
>> 
>> On 07.11.2022 11:47, Luca Fancellu wrote:
>>> +Here is an example to add a new justification in false-positive-<tool>.json::
>> 
>> With <tool> already present in the name, ...
>> 
>>> +|{
>>> +|    "version": "1.0",
>>> +|    "content": [
>>> +|        {
>>> +|            "id": "SAF-0-false-positive-<tool>",
>>> +|            "analyser": {
>>> +|                "<tool>": "<proprietary-id>"
>> 
>> ... can we avoid the redundancy here? Perhaps ...
>> 
>>> +|            },
>>> +|            "tool-version": "<version>",
>> 
>> ... it could be
>> 
>>           "analyser": {
>>               "<version>": "<proprietary-id>"
>>           },

About this, I’ve investigated a bit and I don’t think this is the right solution, it wouldn't make
much sense to have a schema where in one file the analyser dictionary key is the tool name
and in another it is a version (or range of versions).

However I can remove the analyser dictionary and use this schema for the false-positive, which is
more compact:

|{
|    "version": "1.0",
|    "content": [
|        {
|            "id": "SAF-0-false-positive-<tool>",
|            "tool-proprietary-id": "<proprietary-id>",
|            "tool-version": "<version>",
|            "name": "R20.7 [...]",
|            "text": "[...]"
|        },
|        {
|            "id": "SAF-1-false-positive-<tool>",
|            "tool-proprietary-id": "",
|            "tool-version": "",
|            "name": "Sentinel",
|            "text": "Next ID to be used"
|        }
|    ]
|}
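
For illustration, consuming this schema stays simple; e.g. looking up the
proprietary id for a given tag (just my sketch, assuming jq is available;
this is not part of the series):

    jq -r '.content[] | select(.id == "SAF-0-false-positive-coverity")
           | ."tool-proprietary-id"' docs/misra/false-positive-coverity.json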

Adopting this schema however needs a change in the initial design and more documentation on the different handling
of the safe.json schema and the false-positive-<tool>.json schema. Is it worth it?

> On 9 Nov 2022, at 08:31, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 08.11.2022 18:13, Luca Fancellu wrote:
>>> On 8 Nov 2022, at 15:49, Jan Beulich <jbeulich@suse.com> wrote:
>>> On 08.11.2022 15:00, Luca Fancellu wrote:
>>>>> On 8 Nov 2022, at 11:48, Jan Beulich <jbeulich@suse.com> wrote:
>>>>> On 08.11.2022 11:59, Luca Fancellu wrote:
>>>>>>> On 07.11.2022 11:47, Luca Fancellu wrote:
>>>>>>>> @@ -757,6 +758,51 @@ cppcheck-version:
>>>>>>>> $(objtree)/include/generated/compiler-def.h:
>>>>>>>> 	$(Q)$(CC) -dM -E -o $@ - < /dev/null
>>>>>>>> 
>>>>>>>> +JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>>>>>>>> +                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json
>>>>>>>> +
>>>>>>>> +# The following command is using grep to find all files that contains a comment
>>>>>>>> +# containing "SAF-<anything>" on a single line.
>>>>>>>> +# %.safparse will be the original files saved from the build system, these files
>>>>>>>> +# will be restored at the end of the analysis step
>>>>>>>> +PARSE_FILE_LIST := $(addsuffix .safparse,$(filter-out %.safparse,\
>>>>>>>> +$(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
>>>>>>> 
>>>>>>> Please indent such line continuations. And then isn't this going to risk
>>>>>>> matching non-source files as well? Perhaps you want to restrict this to
>>>>>>> *.c and *.h?
>>>>>> 
>>>>>> Yes, how about this, it will filter out *.safparse files while keeping in only .h and .c:
>>>>>> 
>>>>>> PARSE_FILE_LIST := $(addsuffix .safparse,$(filter %.c %.h,\
>>>>>>  $(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
>>>>> 
>>>>> That's better, but still means touching all files by grep despite now
>>>>> only a subset really looked for. If I was to use the new goals on a
>>>>> more or less regular basis, I'd expect that this enumeration of files
>>>>> doesn't read _much_ more stuff from disk than is actually necessary.
>>>> 
>>>> Ok would it be ok?
>>>> 
>>>> PARSE_FILE_LIST := $(addsuffix .safparse,$(shell grep -ERl --include=\*.h \
>>>>   --include=\*.c '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree)))
>>> 
>>> Hmm, not sure: --include isn't a standard option to grep, and we
>>> generally try to be portable. Actually -R (or -r) isn't either. It
>>> may still be okay that way if properly documented where the involved
>>> goals will work and where not.
>> 
>> Is a comment before the line ok as documentation? To state that --include and
>> -R are not standard options so analysis-{coverity,eclair} will not work without a
>> grep that takes those parameters?
> 
> A comment _might_ be okay. Is there no other documentation on how these
> goals are to be used? The main question here is how impacting this might
> be to the various environments we allow Xen to be built in: Would at
> least modern versions of all Linux distros we care about allow using
> these rules? What about non-Linux?
> 
> And could you at least bail when PARSE_FILE_LIST ends up empty, with a
> clear error message augmenting the one grep would have issued?

An empty PARSE_FILE_LIST should not generate an error, it just means there are no
justifications, but I see it can be problematic in case grep does not work.

What about this? They should be standard options right?

PARSE_FILE_LIST := $(addsuffix .safparse,$(shell find $(srctree) -type f \
    \( -name '*.c' -o -name '*.h' \) -exec \
    grep -El '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' {} + ))

> 
>>> And then - why do you escape slashes in the ERE?
>>> 
>>> Talking of escaping - personally I find backslash escapes harder to
>>> read / grok than quotation, so I'd like to recommend using quotes
>>> around each of the two --include (if they remain in the first place)
>>> instead of the \* construct.
>> 
>> Ok I’ve removed the escape from the * and also from slashes:
>> 
>> PARSE_FILE_LIST := $(addsuffix .safparse,$(shell grep -ERl --include='*.h' \
>>    --include='*.c' '^[[:blank:]]*/\*[[:space:]]+SAF-.*\*/$$' $(srctree)))
> 
> Good - seeing things more clearly now my next question is: Isn't
> matching just "/* SAF-...*/" a little too lax? And is there really a
> need to permit leading blanks?

I’m permitting blanks to allow spaces or tabs, zero or more times, before the start of
the comment; I think it should stay like that.
About matching, maybe I can also match the number after SAF-; this should be enough:

[…] grep -El '^[[:blank:]]*\/\*[[:space:]]+SAF-[0-9]+.*\*\/$$' […]

> 
>>>>>>>> +	done
>>>>>>>> +
>>>>>>>> +analysis-build-%: analysis-parse-tags-%
>>>>>>>> +	$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build
>>>>>>> 
>>>>>>> This rule doesn't use the stem, so I'm struggling to understand what
>>>>>>> this is about.
>>>>>> 
>>>>>> Yes, here my aim was to catch analysis-build-{eclair,coverity}, here I see that if the user has a typo
>>>>>> the rule will run anyway, but it will be stopped by the dependency chain because at the end we have:
>>>>>> 
>>>>>> JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>>>>>>                     $(XEN_ROOT)/docs/misra/false-positive-$$*.json
>>>>>> 
>>>>>> That will give an error because $(XEN_ROOT)/docs/misra/false-positive-<typo>.json does not exist.
>>>>>> 
>>>>>> If you think it is not enough, what if I reduce the scope of the rule like this?
>>>>>> 
>>>>>> _analysis-coverity _analysis-eclair: _analysis-%: analysis-build-%
>>>>> 
>>>>> But then, without using the stem, how does it know whether to do an
>>>>> Eclair or a Coverity run?
>>>> 
>>>> Sorry, I think I’m a bit lost here: the makefile is working on both analysis-coverity and analysis-eclair
>>>> because the % resolves to coverity or eclair depending on which target the makefile receives; it is not complaining,
>>>> so I guess it works.
>>>> Do you see something not working? If so, are you able to provide a piece of code for that to make me understand?
>>> 
>>> Well, my problem is that I don't see how the distinction is conveyed
>>> without the stem being used. With what you say I understand I'm
>>> overlooking something, so I'd appreciate some explanation or at least
>>> a pointer.
>> 
>> Ok, eclair and coverity share the same commands to be executed by the build system,
>> so instead of duplicating the targets for coverity and eclair and their recipes, I’ve used a pattern rule
>> so that these rules:
>> 
>> JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>>                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json
>> 
>> […]
>> 
>> .SECONDEXPANSION:
>> $(objtree)/%.sed: $(srctree)/tools/xenfusa-gen-tags.py $(JUSTIFICATION_FILES)
>>    […]
>> 
>> […]
>> 
>> analysis-parse-tags-%: $(PARSE_FILE_LIST) $(objtree)/%.sed
>>    […]
>> 
>> analysis-build-%: analysis-parse-tags-%
>>    $(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build
>> 
>> analysis-clean:
>>   […]
>> 
>> _analysis-%: analysis-build-%
>>    $(Q)$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile analysis-clean
>> 
>> match the case where 'make analysis-coverity’ or ‘make analysis-eclair’ is called.
>> 
>> Now, please correct me if my assumptions on the way make works are wrong; here are my assumptions:
>> 
>> For example when ‘make analysis-coverity’ is called we have that this rule is the best match for the
>> called target:
>> 
>> _analysis-%:
> 
> So my main oversight was your addition to main-targets, which makes the
> connection with this underscore-prefixed goal.
> 
> As to you saying "best match" - I didn't think make had such a concept
> when it comes to considering pattern rules. Aiui it is "first match", in
> the order that rules were parsed from all involved makefiles.

Yes, first match is the right term.

> 
>> So anything after _analysis- will be captured with % and this will be transferred to the dependency
>> of the target that is analysis-build-% -> analysis-build-coverity
>> 
>> Now analysis-build-coverity will be called, the best match is analysis-build-%, so again the dependency
>> which is analysis-parse-tags-%, will be translated to analysis-parse-tags-coverity.
>> 
>> Now analysis-parse-tags-coverity will be called, the best match is analysis-parse-tags-%, so the % will
>> Have the ‘coverity’ value and in the dependency we will have $(objtree)/%.sed -> $(objtree)/coverity.sed.
>> 
>> Looking for $(objtree)/coverity.sed the best match is $(objtree)/%.sed, which will have $(JUSTIFICATION_FILES)
>> and the python script in the dependency, here we will use the second expansion to solve
>> $(XEN_ROOT)/docs/misra/false-positive-$$*.json in $(XEN_ROOT)/docs/misra/false-positive-coverity.json
>> 
>> So now after analysis-parse-tags-coverity has ended its dependency it will start with its recipe, after it finishes,
>> the recipe of analysis-build-coverity will start and it will call make to actually build Xen.
> 
> Okay, I see now - this building of Xen really _is_ independent of the
> checker chosen. I'm not sure though whether it is a good idea to
> integrate all this, including ...
> 
>> After the build finishes, if the status is good, the analysis-build-coverity has finished and the _analysis-coverity
>> recipe can now run, it will call make with the analysis-clean target, restoring any <file>.{c,h}.safparse to <file>.{c,h}.
> 
> ... the subsequent cleaning. The state of the _source_ tree after a
> build failure would be different from that after a successful build.
> Personally I consider this at best surprising.
> 
> I wonder whether instead there could be a shell(?) script driving a
> sequence of make invocations, leaving the new make goals all be self-
> contained. Such a script could revert the source tree to its original
> state even upon build failure by default, with an option allowing to
> suppress this behavior.

Instead of adding another tool, so another layer to the overall system, I would be more willing to add documentation
about this process, explaining how to use the analysis-* build targets, what to expect after a successful run and what
to expect after a failure.

What do you think?

Cheers,
Luca

> 
> Jan
Jan Beulich Nov. 9, 2022, 10:36 a.m. UTC | #9
On 09.11.2022 11:08, Luca Fancellu wrote:
>>> On 07.11.2022 11:47, Luca Fancellu wrote:
>>>> +Here is an example to add a new justification in false-positive-<tool>.json::
>>>
>>> With <tool> already present in the name, ...
>>>
>>>> +|{
>>>> +|    "version": "1.0",
>>>> +|    "content": [
>>>> +|        {
>>>> +|            "id": "SAF-0-false-positive-<tool>",
>>>> +|            "analyser": {
>>>> +|                "<tool>": "<proprietary-id>"
>>>
>>> ... can we avoid the redundancy here? Perhaps ...
>>>
>>>> +|            },
>>>> +|            "tool-version": "<version>",
>>>
>>> ... it could be
>>>
>>>           "analyser": {
>>>               "<version>": "<proprietary-id>"
>>>           },
> 
> About this, I’ve investigated a bit and I don’t think this is the right solution, it wouldn't make
> much sense to have a schema where in one file the analyser dictionary key is the tool name
> and in another it is a version (or range of versions).
> 
> However I can remove the analyser dictionary and use this schema for the false-positive, which is
> more compact:
> 
> |{
> |    "version": "1.0",
> |    "content": [
> |        {
> |            "id": "SAF-0-false-positive-<tool>",
> |            "tool-proprietary-id": "<proprietary-id>",
> |            "tool-version": "<version>",
> |            "name": "R20.7 [...]",
> |            "text": "[...]"
> |        },
> |        {
> |            "id": "SAF-1-false-positive-<tool>",
> |            "tool-proprietary-id": "",
> |            "tool-version": "",
> |            "name": "Sentinel",
> |            "text": "Next ID to be used"
> |        }
> |    ]
> |}
> 
> Adopting this schema however needs a change in the initial design and more documentation on the different handling
> of the safe.json schema and the false-positive-<tool>.json schema. Is it worth it?

I think it is, but if others disagree, so be it.

>> On 9 Nov 2022, at 08:31, Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 08.11.2022 18:13, Luca Fancellu wrote:
>>>> On 8 Nov 2022, at 15:49, Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 08.11.2022 15:00, Luca Fancellu wrote:
>>>>>> On 8 Nov 2022, at 11:48, Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> On 08.11.2022 11:59, Luca Fancellu wrote:
>>>>>>>> On 07.11.2022 11:47, Luca Fancellu wrote:
>>>>>>>>> @@ -757,6 +758,51 @@ cppcheck-version:
>>>>>>>>> $(objtree)/include/generated/compiler-def.h:
>>>>>>>>> 	$(Q)$(CC) -dM -E -o $@ - < /dev/null
>>>>>>>>>
>>>>>>>>> +JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
>>>>>>>>> +                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json
>>>>>>>>> +
>>>>>>>>> +# The following command is using grep to find all files that contains a comment
>>>>>>>>> +# containing "SAF-<anything>" on a single line.
>>>>>>>>> +# %.safparse will be the original files saved from the build system, these files
>>>>>>>>> +# will be restored at the end of the analysis step
>>>>>>>>> +PARSE_FILE_LIST := $(addsuffix .safparse,$(filter-out %.safparse,\
>>>>>>>>> +$(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
>>>>>>>>
>>>>>>>> Please indent such line continuations. And then isn't this going to risk
>>>>>>>> matching non-source files as well? Perhaps you want to restrict this to
>>>>>>>> *.c and *.h?
>>>>>>>
>>>>>>> Yes, how about this, it will filter out *.safparse files while keeping in only .h and .c:
>>>>>>>
>>>>>>> PARSE_FILE_LIST := $(addsuffix .safparse,$(filter %.c %.h,\
>>>>>>>  $(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
>>>>>>
>>>>>> That's better, but still means touching all files by grep despite now
>>>>>> only a subset really looked for. If I was to use the new goals on a
>>>>>> more or less regular basis, I'd expect that this enumeration of files
>>>>>> doesn't read _much_ more stuff from disk than is actually necessary.
>>>>>
>>>>> Ok would it be ok?
>>>>>
>>>>> PARSE_FILE_LIST := $(addsuffix .safparse,$(shell grep -ERl --include=\*.h \
>>>>>   --include=\*.c '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree)))
>>>>
>>>> Hmm, not sure: --include isn't a standard option to grep, and we
>>>> generally try to be portable. Actually -R (or -r) isn't either. It
>>>> may still be okay that way if properly documented where the involved
>>>> goals will work and where not.
>>>
>>> Is a comment before the line ok as documentation? To state that --include and
>>> -R are not standard options so analysis-{coverity,eclair} will not work without a
>>> grep that takes those parameters?
>>
>> A comment _might_ be okay. Is there no other documentation on how these
>> goals are to be used? The main question here is how impacting this might
>> be to the various environments we allow Xen to be built in: Would at
>> least modern versions of all Linux distros we care about allow using
>> these rules? What about non-Linux?
>>
>> And could you at least bail when PARSE_FILE_LIST ends up empty, with a
>> clear error message augmenting the one grep would have issued?
> 
> An empty PARSE_FILE_LIST should not generate an error, it just means there are no
> justifications, but I see it can be problematic in case grep does not work.
> 
> What about this? They should be standard options right?
> 
> PARSE_FILE_LIST := $(addsuffix .safparse,$(shell find $(srctree) -type f \
>     \( -name '*.c' -o -name '*.h' \) -exec \
>     grep -El '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' {} + ))

Coming closer to being generally usable. You now have the problem of
potentially exceeding command line limits (iirc there were issues in
find and/or kernels), but I agree it looks standard-conforming now.
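
If that ever becomes an issue, piping find's output through xargs would be a
possible fallback (merely a sketch, and -print0/-0 are extensions themselves;
note the make-level $$ becomes a single $ outside of make, and $srctree
stands for the source tree):

    find "$srctree" -type f \( -name '*.c' -o -name '*.h' \) -print0 |
        xargs -0 grep -El '^[[:blank:]]*/\*[[:space:]]+SAF-.*\*/$'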

>>>> And then - why do you escape slashes in the ERE?
>>>>
>>>> Talking of escaping - personally I find backslash escapes harder to
>>>> read / grok than quotation, so I'd like to recommend using quotes
>>>> around each of the two --include (if they remain in the first place)
>>>> instead of the \* construct.
>>>
>>> Ok I’ve removed the escape from the * and also from slashes:
>>>
>>> PARSE_FILE_LIST := $(addsuffix .safparse,$(shell grep -ERl --include='*.h' \
>>>    --include='*.c' '^[[:blank:]]*/\*[[:space:]]+SAF-.*\*/$$' $(srctree)))
>>
>> Good - seeing things more clearly now my next question is: Isn't
>> matching just "/* SAF-...*/" a little too lax? And is there really a
>> need to permit leading blanks?
> 
> I’m permitting blanks to allow spaces or tabs, zero or more times before the start of
> the comment, I think it shall be like that.

Hmm, I withdraw my question realizing that you want these comments
indented the same as the line they relate to.

> About matching, maybe I can match also the number after SAF-, this should be enough,
> 
> […] grep -El '^[[:blank:]]*\/\*[[:space:]]+SAF-[0-9]+.*\*\/$$' […]

I'd like to suggest to go one tiny step further (and once again to
drop the escaping of slashes):

'^[[:blank:]]*/\*[[:space:]]+SAF-[0-9]+-.*\*/$$'
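
This can be sanity-checked outside of make like so (a throwaway example of
mine; again the make-level $$ is a single $ here):

    printf '    /* SAF-0-false-positive-eclair */\n' > /tmp/t.c
    grep -El '^[[:blank:]]*/\*[[:space:]]+SAF-[0-9]+-.*\*/$' /tmp/t.c
    # prints /tmp/t.c, i.e. the comment form above is accepted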

>>> Now analysis-build-coverity will be called, the best match is analysis-build-%, so again the dependency
>>> which is analysis-parse-tags-%, will be translated to analysis-parse-tags-coverity.
>>>
>>> Now analysis-parse-tags-coverity will be called, the best match is analysis-parse-tags-%, so the % will
>>> Have the ‘coverity’ value and in the dependency we will have $(objtree)/%.sed -> $(objtree)/coverity.sed.
>>>
>>> Looking for $(objtree)/coverity.sed the best match is $(objtree)/%.sed, which will have $(JUSTIFICATION_FILES)
>>> and the python script in the dependency, here we will use the second expansion to solve
>>> $(XEN_ROOT)/docs/misra/false-positive-$$*.json in $(XEN_ROOT)/docs/misra/false-positive-coverity.json
>>>
>>> So now after analysis-parse-tags-coverity has ended its dependency it will start with its recipe, after it finishes,
>>> the recipe of analysis-build-coverity will start and it will call make to actually build Xen.
>>
>> Okay, I see now - this building of Xen really _is_ independent of the
>> checker chosen. I'm not sure though whether it is a good idea to
>> integrate all this, including ...
>>
>>> After the build finishes, if the status is good, the analysis-build-coverity has finished and the _analysis-coverity
>>> recipe can now run, it will call make with the analysis-clean target, restoring any <file>.{c,h}.safparse to <file>.{c,h}.
>>
>> ... the subsequent cleaning. The state of the _source_ tree after a
>> build failure would be different from that after a successful build.
>> Personally I consider this at best surprising.
>>
>> I wonder whether instead there could be a shell(?) script driving a
>> sequence of make invocations, leaving the new make goals all be self-
>> contained. Such a script could revert the source tree to its original
>> state even upon build failure by default, with an option allowing to
>> suppress this behavior.
> 
> Instead of adding another tool, so another layer to the overall system, I would be more willing to add documentation
> about this process, explaining how to use the analysis-* build targets, what to expect after a successful run and what
> to expect after a failure.
> 
> What do you think?

Personally I'd prefer make goals to behave as such, with no surprises.

Jan
Luca Fancellu Nov. 11, 2022, 10:42 a.m. UTC | #10
> On 9 Nov 2022, at 10:36, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 09.11.2022 11:08, Luca Fancellu wrote:
>>>> On 07.11.2022 11:47, Luca Fancellu wrote:
>>>>> +Here is an example to add a new justification in false-positive-<tool>.json::
>>>> 
>>>> With <tool> already present in the name, ...
>>>> 
>>>>> +|{
>>>>> +|    "version": "1.0",
>>>>> +|    "content": [
>>>>> +|        {
>>>>> +|            "id": "SAF-0-false-positive-<tool>",
>>>>> +|            "analyser": {
>>>>> +|                "<tool>": "<proprietary-id>"
>>>> 
>>>> ... can we avoid the redundancy here? Perhaps ...
>>>> 
>>>>> +|            },
>>>>> +|            "tool-version": "<version>",
>>>> 
>>>> ... it could be
>>>> 
>>>>          "analyser": {
>>>>              "<version>": "<proprietary-id>"
>>>>          },
>> 
>> About this, I’ve investigated a bit and I don’t think this is the right solution, it wouldn't make
>> much sense to have a schema where in one file the analyser dictionary key is the tool name
>> and in another it is a version (or range of versions).
>> 
>> However I can remove the analyser dictionary and use this schema for the false-positive, which is
>> more compact:
>> 
>> |{
>> |    "version": "1.0",
>> |    "content": [
>> |        {
>> |            "id": "SAF-0-false-positive-<tool>",
>> |            "tool-proprietary-id": "<proprietary-id>",
>> |            "tool-version": "<version>",
>> |            "name": "R20.7 [...]",
>> |            "text": "[...]"
>> |        },
>> |        {
>> |            "id": "SAF-1-false-positive-<tool>",
>> |            "tool-proprietary-id": "",
>> |            "tool-version": "",
>> |            "name": "Sentinel",
>> |            "text": "Next ID to be used"
>> |        }
>> |    ]
>> |}
>> 
>> Adopting this schema however needs a change in the initial design and more documentation on the different handling
>> of the safe.json schema and the false-positive-<tool>.json schema. Is it worth it?
> 
> I think it is, but if others disagree, so be it.

So, since no one replied on that, I think everybody agrees that safe and false-positive can have different schemas,
so I will update the python tool to handle that and update the make recipe accordingly.

>>>>> 
>>>>> Hmm, not sure: --include isn't a standard option to grep, and we
>>>>> generally try to be portable. Actually -R (or -r) isn't either. It
>>>>> may still be okay that way if properly documented where the involved
>>>>> goals will work and where not.
>>>> 
>>>> Is a comment before the line ok as documentation? To state that --include and
>>>> -R are not standard options so analysis-{coverity,eclair} will not work without a
>>>> grep that takes those parameters?
>>> 
>>> A comment _might_ be okay. Is there no other documentation on how these
>>> goals are to be used? The main question here is how impacting this might
>>> be to the various environments we allow Xen to be built in: Would at
>>> least modern versions of all Linux distros we care about allow using
>>> these rules? What about non-Linux?
>>> 
>>> And could you at least bail when PARSE_FILE_LIST ends up empty, with a
>>> clear error message augmenting the one grep would have issued?
>> 
>> An empty PARSE_FILE_LIST should not generate an error, it just means there are no
>> justifications, but I see it can be problematic in case grep does not work.
>> 
>> What about this? They should be standard options right?
>> 
>> PARSE_FILE_LIST := $(addsuffix .safparse,$(shell find $(srctree) -type f \
>>    \( -name '*.c' -o -name '*.h' \) -exec \
>>    grep -El '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' {} + ))
> 
> Coming closer to being generally usable. You now have the problem of
> potentially exceeding command line limits (iirc there were issues in
> find and/or kernels), but I agree it looks standard-conforming now.
> 
>>>>> And then - why do you escape slashes in the ERE?
>>>>> 
>>>>> Talking of escaping - personally I find backslash escapes harder to
>>>>> read / grok than quotation, so I'd like to recommend using quotes
>>>>> around each of the two --include (if they remain in the first place)
>>>>> instead of the \* construct.
>>>> 
>>>> Ok I’ve removed the escape from the * and also from slashes:
>>>> 
>>>> PARSE_FILE_LIST := $(addsuffix .safparse,$(shell grep -ERl --include='*.h' \
>>>>   --include='*.c' '^[[:blank:]]*/\*[[:space:]]+SAF-.*\*/$$' $(srctree)))
>>> 
>>> Good - seeing things more clearly now my next question is: Isn't
>>> matching just "/* SAF-...*/" a little too lax? And is there really a
>>> need to permit leading blanks?
>> 
>> I’m permitting blanks to allow spaces or tabs, zero or more times before the start of
>> the comment, I think it shall be like that.
> 
> Hmm, I withdraw my question realizing that you want these comments
> indented the same as the line they relate to.
> 
>> About matching, maybe I can match also the number after SAF-, this should be enough,
>> 
>> […] grep -El '^[[:blank:]]*\/\*[[:space:]]+SAF-[0-9]+.*\*\/$$' […]
> 
> I'd like to suggest to go one tiny step further (and once again to
> drop the escaping of slashes):
> 
> '^[[:blank:]]*/\*[[:space:]]+SAF-[0-9]+-.*\*/$$'

I agree; I will use this one, which is safer and includes your suggestions:

PARSE_FILE_LIST := $(addsuffix .safparse,$(shell find $(srctree) -type f \
    \( -name '*.c' -o -name '*.h' \) -exec \
    grep -El '^[[:blank:]]*/\*[[:space:]]+SAF-[0-9]+-.*\*/$$' {} \; ))

> 
>>>> Now analysis-build-coverity will be called, the best match is analysis-build-%, so again the dependency
>>>> which is analysis-parse-tags-%, will be translated to analysis-parse-tags-coverity.
>>>> 
>>>> Now analysis-parse-tags-coverity will be called, the best match is analysis-parse-tags-%, so the % will
>>>> Have the ‘coverity’ value and in the dependency we will have $(objtree)/%.sed -> $(objtree)/coverity.sed.
>>>> 
>>>> Looking for $(objtree)/coverity.sed the best match is $(objtree)/%.sed, which will have $(JUSTIFICATION_FILES)
>>>> and the python script in the dependency, here we will use the second expansion to solve
>>>> $(XEN_ROOT)/docs/misra/false-positive-$$*.json in $(XEN_ROOT)/docs/misra/false-positive-coverity.json
>>>> 
>>>> So now after analysis-parse-tags-coverity has ended its dependency it will start with its recipe, after it finishes,
>>>> the recipe of analysis-build-coverity will start and it will call make to actually build Xen.
>>> 
>>> Okay, I see now - this building of Xen really _is_ independent of the
>>> checker chosen. I'm not sure though whether it is a good idea to
>>> integrate all this, including ...
>>> 
>>>> After the build finishes, if the status is good, the analysis-build-coverity has finished and the _analysis-coverity
>>>> recipe can now run, it will call make with the analysis-clean target, restoring any <file>.{c,h}.safparse to <file>.{c,h}.
>>> 
>>> ... the subsequent cleaning. The state of the _source_ tree after a
>>> build failure would be different from that after a successful build.
>>> Personally I consider this at best surprising.
>>> 
>>> I wonder whether instead there could be a shell(?) script driving a
>>> sequence of make invocations, leaving the new make goals all be self-
>>> contained. Such a script could revert the source tree to its original
>>> state even upon build failure by default, with an option allowing to
>>> suppress this behavior.
>> 
>> Instead of adding another tool, so another layer to the overall system, I would be more willing to add documentation
>> about this process, explaining how to use the analysis-* build targets, what to expect after a successful run and what
>> to expect after a failure.
>> 
>> What do you think?
> 
> Personally I'd prefer make goals to behave as such, with no surprises.

The analysis-* goals require a build step, otherwise no analysis can be performed by the analysis tools, so I hope we agree
we need to integrate that step as a dependency of the analysis-* targets.
I understand that analysis-clean might be a “surprise” if not well documented; this comes from the need to substitute the
tags in the tree (to keep the real paths in the report log) and to revert them at the end of the analysis.

So, such script should just hide from the user the analysis-clean invocation in case of errors (with an option not to do that)?

> 
> Jan
Jan Beulich Nov. 11, 2022, 1:10 p.m. UTC | #11
On 11.11.2022 11:42, Luca Fancellu wrote:
>> On 9 Nov 2022, at 10:36, Jan Beulich <jbeulich@suse.com> wrote:
>> On 09.11.2022 11:08, Luca Fancellu wrote:
>>>>> On 07.11.2022 11:47, Luca Fancellu wrote:
>>>>> Now analysis-build-coverity will be called, the best match is analysis-build-%, so again the dependency
>>>>> which is analysis-parse-tags-%, will be translated to analysis-parse-tags-coverity.
>>>>>
>>>>> Now analysis-parse-tags-coverity will be called, the best match is analysis-parse-tags-%, so the % will
>>>>> Have the ‘coverity’ value and in the dependency we will have $(objtree)/%.sed -> $(objtree)/coverity.sed.
>>>>>
>>>>> Looking for $(objtree)/coverity.sed the best match is $(objtree)/%.sed, which will have $(JUSTIFICATION_FILES)
>>>>> and the python script in the dependency, here we will use the second expansion to solve
>>>>> $(XEN_ROOT)/docs/misra/false-positive-$$*.json in $(XEN_ROOT)/docs/misra/false-positive-coverity.json
>>>>>
>>>>> So now after analysis-parse-tags-coverity has ended its dependency it will start with its recipe, after it finishes,
>>>>> the recipe of analysis-build-coverity will start and it will call make to actually build Xen.
>>>>
>>>> Okay, I see now - this building of Xen really _is_ independent of the
>>>> checker chosen. I'm not sure though whether it is a good idea to
>>>> integrate all this, including ...
>>>>
>>>>> After the build finishes, if the status is good, the analysis-build-coverity has finished and the _analysis-coverity
>>>>> recipe can now run, it will call make with the analysis-clean target, restoring any <file>.{c,h}.safparse to <file>.{c,h}.
>>>>
>>>> ... the subsequent cleaning. The state of the _source_ tree after a
>>>> build failure would be different from that after a successful build.
>>>> Personally I consider this at best surprising.
>>>>
>>>> I wonder whether instead there could be a shell(?) script driving a
>>>> sequence of make invocations, leaving the new make goals all be self-
>>>> contained. Such a script could revert the source tree to its original
>>>> state even upon build failure by default, with an option allowing to
>>>> suppress this behavior.
>>>
>>> Instead of adding another tool, so another layer to the overall system, I would be more willing to add documentation
>>> about this process, explaining how to use the analysis-* build targets, what to expect after a successful run and what
>>> to expect after a failure.
>>>
>>> What do you think?
>>
>> Personally I'd prefer make goals to behave as such, with no surprises.
> 
> The analysis-* goals require a build step, otherwise no analysis can be performed by the analysis tools, so I hope we agree
> we need to integrate that step as a dependency of the analysis-* targets.

No, I'm afraid we don't agree. But as I said for another piece we didn't
initially agree on - if others think what you propose is fine, so be it.
I'm specifically adding Anthony to Cc, as he's been working on make rules
the most of all of us in the recent past.

> I understand that analysis-clean might be a “surprise” if not well documented; this comes from the need to substitute the
> tags in the tree (to keep the real paths in the report log) and to revert them at the end of the analysis.
> 
> So, such script should just hide from the user the analysis-clean invocation in case of errors (with an option not to do that)?

Hmm, here you're saying "such script", which doesn't seem to fit with the
earlier part of your reply above. (Just in case that's what I was meant to read
out of this: I wouldn't see value in a script which existed _solely_ to
make the cleaning conditional.)

Did you consider the alternative approach of copying the tree, altering
it (while or after copying), running the build there, pulling out the
result files, and deleting the entire copy? Such a model would likely get
away without introducing surprising make rules.

Jan
Stefano Stabellini Nov. 11, 2022, 8:52 p.m. UTC | #12
On Fri, 11 Nov 2022, Jan Beulich wrote:
> On 11.11.2022 11:42, Luca Fancellu wrote:
> >> On 9 Nov 2022, at 10:36, Jan Beulich <jbeulich@suse.com> wrote:
> >> On 09.11.2022 11:08, Luca Fancellu wrote:
> >>>>> On 07.11.2022 11:47, Luca Fancellu wrote:
> >>>>> Now analysis-build-coverity will be called, the best match is analysis-build-%, so again the dependency
> >>>>> which is analysis-parse-tags-%, will be translated to analysis-parse-tags-coverity.
> >>>>>
> >>>>> Now analysis-parse-tags-coverity will be called, the best match is analysis-parse-tags-%, so the % will
> >>>>> Have the ‘coverity’ value and in the dependency we will have $(objtree)/%.sed -> $(objtree)/coverity.sed.
> >>>>>
> >>>>> Looking for $(objtree)/coverity.sed the best match is $(objtree)/%.sed, which will have $(JUSTIFICATION_FILES)
> >>>>> and the python script in the dependency, here we will use the second expansion to solve
> >>>>> $(XEN_ROOT)/docs/misra/false-positive-$$*.json in $(XEN_ROOT)/docs/misra/false-positive-coverity.json
> >>>>>
> >>>>> So now after analysis-parse-tags-coverity has ended its dependency it will start with its recipe, after it finishes,
> >>>>> the recipe of analysis-build-coverity will start and it will call make to actually build Xen.
> >>>>
> >>>> Okay, I see now - this building of Xen really _is_ independent of the
> >>>> checker chosen. I'm not sure though whether it is a good idea to
> >>>> integrate all this, including ...
> >>>>
> >>>>> After the build finishes, if the status is good, the analysis-build-coverity has finished and the _analysis-coverity
> >>>>> recipe can now run, it will call make with the analysis-clean target, restoring any <file>.{c,h}.safparse to <file>.{c,h}.
> >>>>
> >>>> ... the subsequent cleaning. The state of the _source_ tree after a
> >>>> build failure would be different from that after a successful build.
> >>>> Personally I consider this at best surprising.
> >>>>
> >>>> I wonder whether instead there could be a shell(?) script driving a
> >>>> sequence of make invocations, leaving the new make goals all be self-
> >>>> contained. Such a script could revert the source tree to its original
> >>>> state even upon build failure by default, with an option allowing to
> >>>> suppress this behavior.
> >>>
> >>> Instead of adding another tool, so another layer to the overall system, I would be more willing to add documentation
> >>> about this process, explaining how to use the analysis-* build targets, what to expect after a successful run and what
> >>> to expect after a failure.
> >>>
> >>> What do you think?
> >>
> >> Personally I'd prefer make goals to behave as such, with no surprises.
> > 
> > The analysis-* goal requires a build step, otherwise no analysis can be performed by the analysis tools, so I hope we agree
> > we need to integrate that step as a dependency of the analysis-*.
> 
> No, I'm afraid we don't agree. But like said for another piece we didn't
> initially agree on - if others think what you propose is fine, so be it.
> I'm specifically adding Anthony to Cc, as he's been working on make rules
> the most of all of us in the recent past.
> 
> > I understand that the analysis-clean might be a “surprise” if not well documented, this comes from the need to substitute the
> > tags in the tree (to keep the real path in the report log) and to revert them back at the end of the analysis.
> > 
> > So, such script should just mask to the user the analysis-clean invocation in case of errors (with an option to don’t do that)?
> 
> Hmm, here you're saying "such script", which looks to not fit with the
> earlier part of your reply above. (Just in case that's what I was to read
> out of this: I wouldn't see value in a script which existed _solely_ to
> make the cleaning conditional.)
> 
> Did you consider the alternative approach of copying the tree, altering
> it (while or after copying), running the build there, pulling out the
> result files, and deleting the entire copy? Such a model would likely get
> away without introducing surprising make rules.

Another, maybe simpler idea: what if the build step is not a dependency
of the analysis-* goals?

Basically, the user is supposed to:

1) call analysis-parse-tags-*
2) build Xen (in any way they like)
3) call analysis-clean

Making steps 1-3 into a single step is slightly more convenient for the
user but the downside is that dealing with build errors becomes
problematic.

On the other hand, if we let the user call steps 1-3 by hand
individually, it is slightly less convenient for the user but they can
more easily deal with any build error and sophisticated build
configurations.

This is one of those cases where I think "less is more".
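
For example (using the goal names from this series, with eclair as just an
example):

    make analysis-parse-tags-eclair   # 1) substitute the tags
    make build                        # 2) build however you like
    make analysis-clean               # 3) restore the sources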
Jan Beulich Nov. 14, 2022, 7:30 a.m. UTC | #13
On 11.11.2022 21:52, Stefano Stabellini wrote:
> On Fri, 11 Nov 2022, Jan Beulich wrote:
>> On 11.11.2022 11:42, Luca Fancellu wrote:
>>>> On 9 Nov 2022, at 10:36, Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 09.11.2022 11:08, Luca Fancellu wrote:
>>>>>>> On 07.11.2022 11:47, Luca Fancellu wrote:
>>>>>>> Now analysis-build-coverity will be called, the best match is analysis-build-%, so again the dependency
>>>>>>> which is analysis-parse-tags-%, will be translated to analysis-parse-tags-coverity.
>>>>>>>
>>>>>>> Now analysis-parse-tags-coverity will be called, the best match is analysis-parse-tags-%, so the % will
>>>>>>> Have the ‘coverity’ value and in the dependency we will have $(objtree)/%.sed -> $(objtree)/coverity.sed.
>>>>>>>
>>>>>>> Looking for $(objtree)/coverity.sed the best match is $(objtree)/%.sed, which will have $(JUSTIFICATION_FILES)
>>>>>>> and the python script in the dependency, here we will use the second expansion to solve
>>>>>>> $(XEN_ROOT)/docs/misra/false-positive-$$*.json in $(XEN_ROOT)/docs/misra/false-positive-coverity.json
>>>>>>>
>>>>>>> So now after analysis-parse-tags-coverity has ended its dependency it will start with its recipe, after it finishes,
>>>>>>> the recipe of analysis-build-coverity will start and it will call make to actually build Xen.
>>>>>>
>>>>>> Okay, I see now - this building of Xen really _is_ independent of the
>>>>>> checker chosen. I'm not sure though whether it is a good idea to
>>>>>> integrate all this, including ...
>>>>>>
>>>>>>> After the build finishes, if the status is good, the analysis-build-coverity has finished and the _analysis-coverity
>>>>>>> recipe can now run, it will call make with the analysis-clean target, restoring any <file>.{c,h}.safparse to <file>.{c,h}.
>>>>>>
>>>>>> ... the subsequent cleaning. The state of the _source_ tree after a
>>>>>> build failure would be different from that after a successful build.
>>>>>> Personally I consider this at best surprising.
>>>>>>
>>>>>> I wonder whether instead there could be a shell(?) script driving a
>>>>>> sequence of make invocations, leaving the new make goals all be self-
>>>>>> contained. Such a script could revert the source tree to its original
>>>>>> state even upon build failure by default, with an option allowing to
>>>>>> suppress this behavior.
>>>>>
>>>>> Instead of adding another tool, so another layer to the overall system, I would be more willing to add documentation
>>>>> about this process, explaining how to use the analysis-* build targets, what to expect after a successful run and what
>>>>> to expect after a failure.
>>>>>
>>>>> What do you think?
>>>>
>>>> Personally I'd prefer make goals to behave as such, with no surprises.
>>>
>>> The analysis-* goal requires a build step, otherwise no analysis can be performed by the analysis tools, so I hope we agree
>>> we need to integrate that step as a dependency of the analysis-*.
>>
>> No, I'm afraid we don't agree. But like said for another piece we didn't
>> initially agree on - if others think what you propose is fine, so be it.
>> I'm specifically adding Anthony to Cc, as he's been working on make rules
>> the most of all of us in the recent past.
>>
>>> I understand that the analysis-clean might be a “surprise” if not well documented, this comes from the need to substitute the
>>> tags in the tree (to keep the real path in the report log) and to revert them back at the end of the analysis.
>>>
>>> So, such script should just mask to the user the analysis-clean invocation in case of errors (with an option to don’t do that)?
>>
>> Hmm, here you're saying "such script", which looks to not fit with the
>> earlier part of your reply above. (Just in case that's what I was to read
>> out of this: I wouldn't see value in a script which existed _solely_ to
>> make the cleaning conditional.)
>>
>> Did you consider the alternative approach of copying the tree, altering
>> it (while or after copying), running the build there, pulling out the
>> result files, and delete the entire copy? Such a model would likely get
>> away without introducing surprising make rules.
> 
> Another, maybe simpler idea: what if the build step is not a dependency
> of the analysis-* goals?
> 
> Basically, the user is supposed to:
> 
> 1) call analysis-parse-tags-*
> 2) build Xen (in any way they like)
> 3) call analysis-clean

Well, that's exactly what I've been proposing, with the (optional)
addition of a small (shell) script doing all three for ...

> Making steps 1-3 into a single step is slightly more convenient for the
> user but the downside is that dealing with build errors becomes
> problematic.
> 
> On the other hand, if we let the user call steps 1-3 by hand
> individually, it is slightly less convenient for the user but they can
> more easily deal with any build error and sophisticated build
> configurations.

... convenience.

Jan
Luca Fancellu Nov. 14, 2022, 12:30 p.m. UTC | #14
> On 14 Nov 2022, at 07:30, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 11.11.2022 21:52, Stefano Stabellini wrote:
>> On Fri, 11 Nov 2022, Jan Beulich wrote:
>>> On 11.11.2022 11:42, Luca Fancellu wrote:
>>>>> On 9 Nov 2022, at 10:36, Jan Beulich <jbeulich@suse.com> wrote:
>>>>> On 09.11.2022 11:08, Luca Fancellu wrote:
>>>>>>>> On 07.11.2022 11:47, Luca Fancellu wrote:
>>>>>>>> Now analysis-build-coverity will be called, the best match is analysis-build-%, so again the dependency
>>>>>>>> which is analysis-parse-tags-%, will be translated to analysis-parse-tags-coverity.
>>>>>>>> 
>>>>>>>> Now analysis-parse-tags-coverity will be called, the best match is analysis-parse-tags-%, so the % will
>>>>>>>> Have the ‘coverity’ value and in the dependency we will have $(objtree)/%.sed -> $(objtree)/coverity.sed.
>>>>>>>> 
>>>>>>>> Looking for $(objtree)/coverity.sed the best match is $(objtree)/%.sed, which will have $(JUSTIFICATION_FILES)
>>>>>>>> and the python script in the dependency, here we will use the second expansion to solve
>>>>>>>> $(XEN_ROOT)/docs/misra/false-positive-$$*.json in $(XEN_ROOT)/docs/misra/false-positive-coverity.json
>>>>>>>> 
>>>>>>>> So now after analysis-parse-tags-coverity has ended its dependency it will start with its recipe, after it finishes,
>>>>>>>> the recipe of analysis-build-coverity will start and it will call make to actually build Xen.
>>>>>>> 
>>>>>>> Okay, I see now - this building of Xen really _is_ independent of the
>>>>>>> checker chosen. I'm not sure though whether it is a good idea to
>>>>>>> integrate all this, including ...
>>>>>>> 
>>>>>>>> After the build finishes, if the status is good, the analysis-build-coverity has finished and the _analysis-coverity
>>>>>>>> recipe can now run, it will call make with the analysis-clean target, restoring any <file>.{c,h}.safparse to <file>.{c,h}.
>>>>>>> 
>>>>>>> ... the subsequent cleaning. The state of the _source_ tree after a
>>>>>>> build failure would be different from that after a successful build.
>>>>>>> Personally I consider this at best surprising.
>>>>>>> 
>>>>>>> I wonder whether instead there could be a shell(?) script driving a
>>>>>>> sequence of make invocations, leaving the new make goals all be self-
>>>>>>> contained. Such a script could revert the source tree to its original
>>>>>>> state even upon build failure by default, with an option allowing to
>>>>>>> suppress this behavior.
>>>>>> 
>>>>>> Instead of adding another tool, so another layer to the overall system, I would be more willing to add documentation
>>>>>> about this process, explaining how to use the analysis-* build targets, what to expect after a successful run and what
>>>>>> to expect after a failure.
>>>>>> 
>>>>>> What do you think?
>>>>> 
>>>>> Personally I'd prefer make goals to behave as such, with no surprises.
>>>> 
>>>> The analysis-* goal requires a build step, otherwise no analysis can be performed by the analysis tools, so I hope we agree
>>>> we need to integrate that step as a dependency of the analysis-*.
>>> 
>>> No, I'm afraid we don't agree. But like said for another piece we didn't
>>> initially agree on - if others think what you propose is fine, so be it.
>>> I'm specifically adding Anthony to Cc, as he's been working on make rules
>>> the most of all of us in the recent past.
>>> 
>>>> I understand that the analysis-clean might be a “surprise” if not well documented, this comes from the need to substitute the
>>>> tags in the tree (to keep the real path in the report log) and to revert them back at the end of the analysis.
>>>> 
>>>> So, such script should just mask to the user the analysis-clean invocation in case of errors (with an option to don’t do that)?
>>> 
>>> Hmm, here you're saying "such script", which looks to not fit with the
>>> earlier part of your reply above. (Just in case that's what I was to read
>>> out of this: I wouldn't see value in a script which existed _solely_ to
>>> make the cleaning conditional.)
>>> 
>>> Did you consider the alternative approach of copying the tree, altering
>>> it (while or after copying), running the build there, pulling out the
>>> result files, and delete the entire copy? Such a model would likely get
>>> away without introducing surprising make rules.

This approach does not work because the report will contain a path that is different from the source path and
some web-based tools won’t be able to track back the origin of the finding.

e.g. /path/to/xen/arch/arm/<file> is the original file, we run the analysis on /path/to2/xen/arch/arm/<file>,
the finding is in /path/to2/xen/arch/arm/<file> but the source repository contains only /path/to/xen/arch/arm/<file>

>> 
>> Another, maybe simpler idea: what if the build step is not a dependency
>> of the analysis-* goals?
>> 
>> Basically, the user is supposed to:
>> 
>> 1) call analysis-parse-tags-*
>> 2) build Xen (in any way they like)
>> 3) call analysis-clean
> 
> Well, that's exactly what I've been proposing, with the (optional)
> addition of a small (shell) script doing all of the three for ...
> 
>> Making steps 1-3 into a single step is slightly more convenient for the
>> user but the downside is that dealing with build errors becomes
>> problematic.
>> 
>> On the other hand, if we let the user call steps 1-3 by hand
>> individually, it is slightly less convenient for the user but they can
>> more easily deal with any build error and sophisticated build
>> configurations.
> 
> ... convenience.

For coverity and eclair, it makes sense: these tools don’t require much effort to be integrated,
they are built to intercept files, compilers and environment variables during the make run in a
transparent way.

So the workflow is:

1) call analysis-parse-tags-*
2) build Xen (in any way they like)
3) call analysis-clean


If we think about cppcheck however, the story changes, as it requires all this information
to be given as input; we have to do all the work the commercial tools do under the hood.

The cppcheck workflow instead is:

1) call analysis-parse-tags-cppcheck
2) generate cppcheck suppression list
3) build Xen (and run cppcheck on built source files)
4) collect and generate report
5) call analysis-clean

So let’s think about detaching the build stage from the previous stages. I think it is not very convenient
for the user: during cppcheck analysis we build $(objtree)/include/generated/compiler-def.h and
$(objtree)/suppression-list.txt, so the user needs to build Xen where those files are created
(in-tree or out-of-tree), otherwise the analysis won’t work; that’s the first user requirement (stage #3).

The most critical input to cppcheck is Xen’s $(CC); it comes from the build system in this series. The user would
need to pass the correct one to the cppcheck wrapper, together with the cppcheck flags, and pass the wrapper as CC
to the Xen build in stage #3; that’s the second user requirement.
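
Just to sketch what such a wrapper does (a rough illustration of mine, not
the actual wrapper of this series; REAL_CC is a made-up variable):

    #!/bin/sh
    # Forward -I/-D options and C sources to cppcheck, then exec the
    # real compiler so the build itself proceeds normally.
    real_cc="${REAL_CC:-gcc}"
    args=
    for a in "$@"; do
        case "$a" in
            -I*|-D*|*.c) args="$args $a" ;;  # sketch: breaks on spaces
        esac
    done
    [ -n "$args" ] && cppcheck --quiet $args
    exec "$real_cc" "$@"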

After the analysis, the user needs to run some scripts to put together the cppcheck report fragments;
this step also requires knowing where Xen is built, in-tree or out-of-tree, so
here is the third user requirement (similar to the first one, but for stage #4).

In the end, we can see the user would not be able to call the targets individually without mastering
the system; it’s too complex to get something working. We could create a script to handle these requirements,
but it would be complex, as it would do the job of the make system, plus it would need to forward additional make arguments
to it as well (CROSS_COMPILE, XEN_TARGET_ARCH, in-tree or out-of-tree build, ... for example).

In this thread the message is that in case of errors, there will be some artifacts (<file>.safparse, modified <file>)
and this is unexpected or surprising, but we are going to add a lot of complexity to handle something that needs
just documentation (in my opinion).

If the community doesn’t agree that documentation is enough, a solution could be to provide a script that, in case of
errors, automatically calls the analysis-clean target; analysis-<tool> will also call the build step in this case.
Here is a rough sketch:

	#!/bin/bash
	set -e

	# restore the original files even if a step fails
	trap 'make analysis-clean' EXIT

	# $tool comes from the --tool argument shown below
	make "analysis-$tool"


This script however needs all the make arguments that we would otherwise have passed to make:

./script.sh --tool=<tool> [--dont-clean-on-err] -- CROSS_COMPILE="[...]" XEN_TARGET_ARCH="[...]" [others...]

> 
> Jan
Jan Beulich Nov. 14, 2022, 4:05 p.m. UTC | #15
On 14.11.2022 13:30, Luca Fancellu wrote:
>> On 14 Nov 2022, at 07:30, Jan Beulich <jbeulich@suse.com> wrote:
>> On 11.11.2022 21:52, Stefano Stabellini wrote:
>>> On Fri, 11 Nov 2022, Jan Beulich wrote:
>>>> Did you consider the alternative approach of copying the tree, altering
>>>> it (while or after copying), running the build there, pulling out the
>>>> result files, and delete the entire copy? Such a model would likely get
>>>> away without introducing surprising make rules.
> 
> This approach does not work because the report will contain a path that is different from the source path and
> some web based tools won’t be able to track back the origin of the finding.
> 
> e.g. /path/to/xen/arch/arm/<file> is the original file, we run the analysis on /path/to2/xen/arch/arm/<file>,
> the finding is in /path/to2/xen/arch/arm/<file> but the source repository contains only /path/to/xen/arch/arm/<file>

Simply run "sed" over the result?
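
E.g., using the paths from your example (sed -i being a GNU-ism, so again
just a sketch, with report.log standing for whatever the tool emits):

    sed -i 's,/path/to2/xen/,/path/to/xen/,g' report.log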

>>> Another, maybe simpler idea: what if the build step is not a dependency
>>> of the analysis-* goals?
>>>
>>> Basically, the user is supposed to:
>>>
>>> 1) call analysis-parse-tags-*
>>> 2) build Xen (in any way they like)
>>> 3) call analysis-clean
>>
>> Well, that's exactly what I've been proposing, with the (optional)
>> addition of a small (shell) script doing all of the three for ...
>>
>>> Making steps 1-3 into a single step is slightly more convenient for the
>>> user but the downside is that dealing with build errors becomes
>>> problematic.
>>>
>>> On the other hand, if we let the user call steps 1-3 by hand
>>> individually, it is slightly less convenient for the user but they can
>>> more easily deal with any build error and sophisticated build
>>> configurations.
>>
>> ... convenience.
> 
> For coverity and eclair, it makes sense: these tools don’t require much effort to be integrated,
> they are built to intercept files, compilers, environment variables during the make run in a
> transparent way.
> 
> So the workflow is:
> 
> 1) call analysis-parse-tags-*
> 2) build Xen (in any way they like)
> 3) call analysis-clean
> 
> 
> If we think about cppcheck however, the story changes, as it requires all this information
> to be given as input; we have to do all the work the commercial tools do under the hood.
> 
> The cppcheck workflow instead is:
> 
> 1) call analysis-parse-tags-cppcheck
> 2) generate cppcheck suppression list
> 3) build Xen (and run cppcheck on built source files)
> 4) collect and generate report
> 5) call analysis-clean

Which merely makes for a more involved (shell) script.

> So let’s think about detaching the build stage from the previous stages, I think it is not very convenient
> for the user, as during cppcheck analysis we build $(objtree)/include/generated/compiler-def.h, we build 
> $(objtree)/suppression-list.txt, so the user needs to build Xen where those files are created
> (in-tree or out-of-tree) otherwise the analysis won’t work and that’s the first user requirement (stage #3).
> 
> The most critical input to cppcheck is Xen’s $(CC), it comes from the build system in this series, the user would
> need to pass the correct one to cppcheck wrapper, together with cppcheck flags, and pass to Xen build stage #3
> the wrapper as CC, second user requirement.
> 
> After the analysis, the user needs to run some scripts to put together the cppcheck report fragments
> after its analysis, this step requires also the knowledge of where Xen is built, in-tree or out-of-tree, so
> here the third user requirement (similar to the first one, but the stage is #4).
> 
> In the end, we can see the user would not be able to call individually the targets if it is not mastering
> the system, it’s too complex to have something working, we could create a script to handle these requirements,
> but it would be complex as it would do the job of the make system, plus it needs to forward additional make arguments
> to it as well (CROSS_COMPILE, XEN_TARGET_ARCH, in-tree or Out-of-tree build, ... for example).
> 
> In this thread the message is that in case of errors, there will be some artifacts (<file>.safparse, modified <file>)
> and this is unexpected or surprising, but we are going to add a lot of complexity to handle something that needs
> just documentation (in my opinion).
> 
> If the community doesn’t agree that documentation is enough, a solution could be to provide a script that in case of
> errors, calls automatically the analysis-clean target, analysis-<tool> will call also the build step in this case,
> here some pseudocode:
> 
> 	#!/bin/bash
> 	set -e
> 
> 	# restore the original sources on any exit, including failures
> 	trap 'make analysis-clean' EXIT
> 
> 	make analysis-<tool> "$@"
> 
> 
> This script, however, needs all the make arguments that we would otherwise have passed to make:
> 
> ./script.sh --tool=<tool> [--dont-clean-on-err] -- CROSS_COMPILE="[...]" XEN_TARGET_ARCH="[...]" [others...]

Well, of course the suggested script would need to be passed the overrides you'd
otherwise pass with "make build" or the like.

Jan
Anthony PERARD Nov. 14, 2022, 4:25 p.m. UTC | #16
On Mon, Nov 07, 2022 at 10:47:36AM +0000, Luca Fancellu wrote:
>  xen/Makefile                            |  50 ++++++-

Hi Luca,

Could you write a shell script, which would probably be easier to
read/modify than this rather complicated-looking set of Makefile rules?

As I see it, a potential `analysis` shell script would have a single
interaction with make: it would just have to run `make build
CC=cppcheck-gcc` or similar.

Because I don't see how make is useful in this case. Or maybe you could
explain how writing this in make helps?
Also, none of this would work with out-of-tree builds, as you shouldn't
make modifications to the source tree.

Cheers,
Anthony PERARD Nov. 14, 2022, 5:16 p.m. UTC | #17
On Mon, Nov 14, 2022 at 12:30:39PM +0000, Luca Fancellu wrote:
> The cppcheck workflow instead is:
> 
> 1) call analysis-parse-tags-cppcheck
> 2) generate cppcheck suppression list
> 3) build Xen (and run cppcheck on built source files)
> 4) collect and generate report
> 5) call analysis-clean
> 
> So let's think about detaching the build stage from the previous stages. I think it is not very convenient
> for the user: during cppcheck analysis we build $(objtree)/include/generated/compiler-def.h and
> $(objtree)/suppression-list.txt, so the user needs to build Xen where those files are created
> (in-tree or out-of-tree), otherwise the analysis won't work; that's the first user requirement (stage #3).
> 
> The most critical input to cppcheck is Xen's $(CC); it comes from the build system in this series. The user would
> need to pass the correct one to the cppcheck wrapper, together with the cppcheck flags, and pass the wrapper
> as CC to the Xen build stage #3; that's the second user requirement.

You could add something like this to the Makefile:
    export-variables:
        @echo "CC='$(CC)'"

And if "the user" is a shell script, it could easily figure out what $CC
is, without having to duplicate the Makefile's logic for it.
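
For instance (a sketch, assuming the export-variables target above):

    # inside the wrapper script: import CC exactly as the build system
    # would compute it, instead of re-deriving it
    eval "$(make -s export-variables)"
    echo "running the analysis with CC=${CC}"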

> After the analysis, the user needs to run some scripts to put together the cppcheck report fragments;
> this step also requires knowing where Xen is built, in-tree or out-of-tree, hence
> the third user requirement (similar to the first one, but for stage #4).

Don't support out-of-tree builds; that would make things easier. I don't see
how that would work anyway, given the needed temporary changes to the
source code.

> In the end, we can see the user would not be able to call the targets individually without mastering
> the system; it's too complex to get something working. We could create a script to handle these requirements,
> but it would be complex, as it would duplicate the job of the make system, and it would also need to forward
> additional make arguments (CROSS_COMPILE, XEN_TARGET_ARCH, in-tree or out-of-tree build, for example).

Well, instead of running `make X XEN_TARGET_ARCH=x86`, a script would be
run as `./script XEN_TARGET_ARCH=x86`, so not much change.
Then the script can easily run `make "$@"`.

Cheers,
Luca Fancellu Nov. 25, 2022, 8:50 a.m. UTC | #18
> On 14 Nov 2022, at 16:25, Anthony PERARD <anthony.perard@citrix.com> wrote:
> 
> On Mon, Nov 07, 2022 at 10:47:36AM +0000, Luca Fancellu wrote:
>> xen/Makefile                            |  50 ++++++-
> 
> Hi Luca,

Hi,

> 
> Could you write a shell script which would probably be easier to
> read/modify than this rather complicated looking set of Makefile rules?

I admit the rules are a bit complicated

> 
> As I see it, a potential `analysis` shell script would have a single
> interaction with make, it would just have to run `make build
> CC=cppcheck-gcc` or other.
> 
> Because I don't see how make is useful in this case. Or maybe you could
> explain how writing this in make help?
> Also non of this would work with out-of-tree builds, as you shouldn't
> make modification to the source tree.

They are both good points; I will rewrite the rules as a script.

> 
> Cheers,
> 
> -- 
> Anthony PERARD
diff mbox series

Patch

diff --git a/.gitignore b/.gitignore
index 418bdfaebf36..b48e1e20c4fc 100644
--- a/.gitignore
+++ b/.gitignore
@@ -10,6 +10,7 @@ 
 *.c.cppcheck
 *.opic
 *.a
+*.safparse
 *.so
 *.so.[0-9]*
 *.bin
@@ -314,6 +315,7 @@  xen/xsm/flask/policy.*
 xen/xsm/flask/xenpolicy-*
 tools/flask/policy/policy.conf
 tools/flask/policy/xenpolicy-*
+xen/*.sed
 xen/xen
 xen/xen-cppcheck.xml
 xen/xen-syms
diff --git a/docs/misra/documenting-violations.rst b/docs/misra/documenting-violations.rst
new file mode 100644
index 000000000000..3430abfaa177
--- /dev/null
+++ b/docs/misra/documenting-violations.rst
@@ -0,0 +1,172 @@ 
+.. SPDX-License-Identifier: CC-BY-4.0
+
+Documenting violations
+======================
+
+Static analysers are used on the Xen codebase for both general static analysis
+and MISRA compliance checking.
+There might be a need to suppress some findings instead of fixing them, and
+many tools permit the use of in-code comments that suppress findings so that
+they are not shown in the final report.
+
+Xen includes a tool capable of translating a specific comment used in its
+codebase into the right proprietary in-code comment understood by the selected
+analyser, which suppresses the finding.
+
+In the Xen codebase, these tags will be used to document and suppress findings:
+
+ - SAF-X-safe: This tag means that the next line of code contains a finding,
+   but the non-compliance with the checker has been analysed and demonstrated
+   to be safe.
+ - SAF-X-false-positive-<tool>: This tag means that the next line of code
+   contains a finding, but the finding is a bug in the tool.
+
+SAF stands for Static Analyser Finding; X is a placeholder for a non-negative
+number starting from zero. The number after SAF- shall be incremental and
+unique, in base ten notation and without leading zeros.
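+
+For example, the first two allocated tags might look like this (hypothetical
+entries)::
+
+| /* SAF-0-safe */
+| /* SAF-1-false-positive-coverity */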
+
+Entries in the database shall never be removed, even if they are no longer
+used in the code (e.g. if a patch removed or modified the faulty line).
+This is to make sure that numbers are not reused, which could lead to conflicts
+with old branches or misleading justifications.
+
+An entry can be reused in multiple places in the code to suppress a finding if
+and only if the justification holds for the same non-compliance with the coding
+standard.
+
+An orphan entry, that is, an entry that used to justify a finding in the code
+but whose code was later removed with no other use of the entry remaining,
+can be reused as long as the justification for the finding holds. This is done
+to avoid allocating a new entry with exactly the same justification, which
+would waste space and create maintenance issues in the database.
+
+The files storing all the justifications are in docs/misra/ and are named
+safe.json and false-positive-<tool>.json; they are in JSON format.
+
+Here is an example to add a new justification in safe.json::
+
+|{
+|    "version": "1.0",
+|    "content": [
+|        {
+|            "id": "SAF-0-safe",
+|            "analyser": {
+|                "coverity": "misra_c_2012_rule_20_7_violation",
+|                "eclair": "MC3R1.R20.7"
+|            },
+|            "name": "R20.7 C macro parameters not used as expression",
+|            "text": "The macro parameters used in this [...]"
+|        },
+|        {
+|            "id": "SAF-1-safe",
+|            "analyser": {},
+|            "name": "Sentinel",
+|            "text": "Next ID to be used"
+|        }
+|    ]
+|}
+
+Here is an example to add a new justification in false-positive-<tool>.json::
+
+|{
+|    "version": "1.0",
+|    "content": [
+|        {
+|            "id": "SAF-0-false-positive-<tool>",
+|            "analyser": {
+|                "<tool>": "<proprietary-id>"
+|            },
+|            "tool-version": "<version>",
+|            "name": "R20.7 [...]",
+|            "text": "[...]"
+|        },
+|        {
+|            "id": "SAF-1-false-positive-<tool>",
+|            "analyser": {},
+|            "tool-version": "",
+|            "name": "Sentinel",
+|            "text": "Next ID to be used"
+|        }
+|    ]
+|}
+
+To document a finding, just add another block {[...]} before the sentinel
+block, using the id contained in the sentinel block, and increment by one the
+number contained in the id of the sentinel block.
+
+Here is an explanation of the fields inside an object of the "content" array:
+ - id: a unique string that is used to refer to the finding; many findings
+   can be tagged with the same id, if the justification holds for every
+   applied case.
+   It tells the tool to substitute a Xen in-code comment having this structure:
+   /* SAF-0-safe [...] \*/
+ - analyser: an object containing key-value string pairs; the key is the
+   analyser, so it can be coverity or eclair, and the value is the proprietary
+   id corresponding to the finding. For example, when coverity is used as the
+   analyser, the tool will translate the Xen in-code comment in this way:
+   /* SAF-0-safe [...] \*/ -> /* coverity[misra_c_2012_rule_20_7_violation] \*/
+   If the object doesn't have a key-value pair for a given analyser, the
+   corresponding in-code comment won't be translated (see also the eclair
+   example after this list).
+ - name: a simple name for the finding.
+ - text: a proper justification to turn off the finding.
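+
+For instance, reusing the SAF-0-safe entry shown earlier, the same in-code
+comment would be translated for eclair (following the substitution rules in
+xenfusa-gen-tags.py) as::
+
+| /* SAF-0-safe [...] \*/ -> /* -E> hide MC3R1.R20.7 1 "" \*/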
+
+
+Justification example
+---------------------
+
+Here is an example of using the in-code comment tags to suppress a finding
+for Rule 8.6:
+
+Eclair reports it in its web report, file xen/include/xen/kernel.h, line 68:
+
+| MC3R1.R8.6 for program 'xen/xen-syms', variable '_start' has no definition
+
+Coverity also reports it; here is an extract of the finding:
+
+| xen/include/xen/kernel.h:68:
+| 1. misra_c_2012_rule_8_6_violation: Function "_start" is declared but never
+ defined.
+
+The analysers are complaining because we have this in xen/include/xen/kernel.h
+at line 68::
+
+| extern char _start[], _end[], start[];
+
+Those are symbols exported by the linker, hence we will need a proper
+deviation for this finding.
+
+We will prepare our entry in the safe.json database::
+
+|{
+|    "version": "1.0",
+|    "content": [
+|        {
+|        [...]
+|        },
+|        {
+|            "id": "SAF-1-safe",
+|            "analyser": {
+|                "eclair": "MC3R1.R8.6",
+|                "coverity": "misra_c_2012_rule_8_6_violation"
+|            },
+|            "name": "Rule 8.6: linker script defined symbols",
+|            "text": "It is safe to declare this symbol because it is defined in the linker script."
+|        },
+|        {
+|            "id": "SAF-2-safe",
+|            "analyser": {},
+|            "name": "Sentinel",
+|            "text": "Next ID to be used"
+|        }
+|    ]
+|}
+
+And we will use the proper tag above the violation line::
+
+| /* SAF-1-safe R8.6 linker defined symbols */
+| extern char _start[], _end[], start[];
+
+This entry will also fix the violations on _end and start, because they are on
+the same line and have the same "violation ID".
+
+Also, the same tag can be used on other linker-defined symbols that are
+declared in the codebase, because the justification holds for them too.
diff --git a/docs/misra/false-positive-coverity.json b/docs/misra/false-positive-coverity.json
new file mode 100644
index 000000000000..f8e6a014acb5
--- /dev/null
+++ b/docs/misra/false-positive-coverity.json
@@ -0,0 +1,12 @@ 
+{
+    "version": "1.0",
+    "content": [
+        {
+            "id": "SAF-0-false-positive-coverity",
+            "analyser": {},
+            "tool-version": "",
+            "name": "Sentinel",
+            "text": "Next ID to be used"
+        }
+    ]
+}
diff --git a/docs/misra/false-positive-eclair.json b/docs/misra/false-positive-eclair.json
new file mode 100644
index 000000000000..63d00e160f9c
--- /dev/null
+++ b/docs/misra/false-positive-eclair.json
@@ -0,0 +1,12 @@ 
+{
+    "version": "1.0",
+    "content": [
+        {
+            "id": "SAF-0-false-positive-eclair",
+            "analyser": {},
+            "tool-version": "",
+            "name": "Sentinel",
+            "text": "Next ID to be used"
+        }
+    ]
+}
diff --git a/docs/misra/safe.json b/docs/misra/safe.json
new file mode 100644
index 000000000000..e079d3038120
--- /dev/null
+++ b/docs/misra/safe.json
@@ -0,0 +1,11 @@ 
+{
+    "version": "1.0",
+    "content": [
+        {
+            "id": "SAF-0-safe",
+            "analyser": {},
+            "name": "Sentinel",
+            "text": "Next ID to be used"
+        }
+    ]
+}
diff --git a/xen/Makefile b/xen/Makefile
index 9d0df5e2c543..3b8d1acd1697 100644
--- a/xen/Makefile
+++ b/xen/Makefile
@@ -457,7 +457,8 @@  endif # need-config
 
 __all: build
 
-main-targets := build install uninstall clean distclean MAP cppcheck cppcheck-html
+main-targets := build install uninstall clean distclean MAP cppcheck \
+    cppcheck-html analysis-coverity analysis-eclair
 .PHONY: $(main-targets)
 ifneq ($(XEN_TARGET_ARCH),x86_32)
 $(main-targets): %: _% ;
@@ -572,7 +573,7 @@  _clean:
 	rm -f $(TARGET).efi $(TARGET).efi.map $(TARGET).efi.stripped
 	rm -f asm-offsets.s arch/*/include/asm/asm-offsets.h
 	rm -f .banner .allconfig.tmp include/xen/compile.h
-	rm -f cppcheck-misra.* xen-cppcheck.xml
+	rm -f cppcheck-misra.* xen-cppcheck.xml *.sed
 
 .PHONY: _distclean
 _distclean: clean
@@ -757,6 +758,51 @@  cppcheck-version:
 $(objtree)/include/generated/compiler-def.h:
 	$(Q)$(CC) -dM -E -o $@ - < /dev/null
 
+JUSTIFICATION_FILES := $(XEN_ROOT)/docs/misra/safe.json \
+                       $(XEN_ROOT)/docs/misra/false-positive-$$*.json
+
+# The following command uses grep to find all files that contain a comment
+# of the form "SAF-<anything>" on a single line.
+# %.safparse files are copies of the original files saved by the build system;
+# these files will be restored at the end of the analysis step
+PARSE_FILE_LIST := $(addsuffix .safparse,$(filter-out %.safparse,\
+$(shell grep -ERl '^[[:blank:]]*\/\*[[:space:]]+SAF-.*\*\/$$' $(srctree))))
+
+.PRECIOUS: $(PARSE_FILE_LIST) $(objtree)/%.sed
+
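+# The $$* in JUSTIFICATION_FILES is expanded, via the secondary expansion
+# below, to the stem of the pattern rule, i.e. the tool name: for coverity
+# the prerequisite becomes docs/misra/false-positive-coverity.json.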
+.SECONDEXPANSION:
+$(objtree)/%.sed: $(JUSTIFICATION_FILES) $(srctree)/tools/xenfusa-gen-tags.py
+	$(PYTHON) $(srctree)/tools/xenfusa-gen-tags.py \
+		$(foreach file, $(filter %.json, $^), --input $(file)) --output $@ \
+		--tool $*
+
+%.safparse: %
+# Create a copy of the original file (-p also preserves the timestamp)
+	$(Q)if [ -f "$@" ]; then \
+		echo "Found $@, please check the integrity of $*"; \
+		exit 1; \
+	fi
+	$(Q)cp -p "$*" "$@"
+
+analysis-parse-tags-%: $(PARSE_FILE_LIST) $(objtree)/%.sed
+	$(Q)for file in $(patsubst %.safparse,%,$(PARSE_FILE_LIST)); do \
+		sed -i -f "$(objtree)/$*.sed" "$${file}"; \
+	done
+
+analysis-build-%: analysis-parse-tags-%
+	$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile build
+
+analysis-clean:
+# Restore the original files (-p also preserves the timestamp)
+	$(Q)find $(srctree) -type f -name "*.safparse" -print | \
+	while IFS= read file; do \
+		cp -p "$${file}" "$${file%.safparse}"; \
+		rm -f "$${file}"; \
+	done
+
+_analysis-%: analysis-build-%
+	$(Q)$(MAKE) O=$(abs_objtree) -f $(srctree)/Makefile analysis-clean
+
 endif #config-build
 endif # need-sub-make
 
diff --git a/xen/tools/xenfusa-gen-tags.py b/xen/tools/xenfusa-gen-tags.py
new file mode 100755
index 000000000000..4ab8c0f07a52
--- /dev/null
+++ b/xen/tools/xenfusa-gen-tags.py
@@ -0,0 +1,81 @@ 
+#!/usr/bin/env python3
+
+import sys, getopt, json
+
+def help():
+    print('Usage: {} [OPTION] ...'.format(sys.argv[0]))
+    print('')
+    print('This script converts the justification file to a set of sed rules')
+    print('that will replace generic tags from Xen codebase in-code comments')
+    print('to in-code comments having the proprietary syntax for the selected')
+    print('tool.')
+    print('')
+    print('Options:')
+    print('  -i/--input   Json file containing the justifications, can be')
+    print('               passed multiple times for multiple files')
+    print('  -o/--output  Sed file containing the substitution rules')
+    print('  -t/--tool    Tool that will use the in-code comments')
+    print('')
+
+# This is the dictionary for the rules that translates to proprietary comments:
+#  - cppcheck: /* cppcheck-suppress[id] */
+#  - coverity: /* coverity[id] */
+#  - eclair:   /* -E> hide id 1 "" */
+# Add entries to support more analysers
+tool_syntax = {
+    "cppcheck":"s,^.*/*[[:space:]]*TAG.*$,/* cppcheck-suppress[VID] */,g",
+    "coverity":"s,^.*/*[[:space:]]*TAG.*$,/* coverity[VID] */,g",
+    "eclair":"s,^.*/*[[:space:]]*TAG.*$,/* -E> hide VID 1 \"\" */,g"
+}
+
+def main(argv):
+    infiles = []
+    justifications = []
+    outfile = ''
+    tool = ''
+
+    try:
+        opts, args = getopt.getopt(argv,"hi:o:t:",["input=","output=","tool="])
+    except getopt.GetoptError:
+        help()
+        sys.exit(2)
+    for opt, arg in opts:
+        if opt == '-h':
+            help()
+            sys.exit(0)
+        elif opt in ("-i", "--input"):
+            infiles.append(arg)
+        elif opt in ("-o", "--output"):
+            outfile = arg
+        elif opt in ("-t", "--tool"):
+            tool = arg
+
+    # Open all input files
+    for file in infiles:
+        try:
+            handle = open(file, 'rt')
+            content = json.load(handle)
+            justifications = justifications + content['content']
+            handle.close()
+        except json.JSONDecodeError:
+            print('JSON decoding error in file: ' + file)
+            sys.exit(1)
+        except:
+            print('Error opening ' + file)
+            sys.exit(1)
+
+    try:
+        outstr = open(outfile, "w")
+    except:
+        print('Error creating ' + outfile)
+        sys.exit(1)
+
+    for j in justifications:
+        if tool in j['analyser']:
+            comment=tool_syntax[tool].replace("TAG",j['id'])
+            comment=comment.replace("VID",j['analyser'][tool])
+            outstr.write('{}\n'.format(comment))
+
+    outstr.close()
+
+if __name__ == "__main__":
+    main(sys.argv[1:])