mbox series

[v3,0/3] diff-report.py tool

Message ID 20230525083401.3838462-1-luca.fancellu@arm.com (mailing list archive)
Headers show
Series diff-report.py tool | expand

Message

Luca Fancellu May 25, 2023, 8:33 a.m. UTC
--------------------------------------------------------------------------------
This serie is dependent on this patch, in case cppcheck report are generated
using xen-analysis.py that calls the makefile with in-tree build, because
this tool (in particular the patching feature) needs the path from the
repository instead of the path from /xen/xen:
https://patchwork.kernel.org/project/xen-devel/patch/20230519093019.2131896-4-luca.fancellu@arm.com/
--------------------------------------------------------------------------------

Now that we have a tool (xen-analysis.py) that wraps cppcheck to generates
reports, we have a generic overview of how many static analysis issues and non
compliances to MISRA C we have for a certain revision of the codebase.

This is great and eventually the work to be done should be just having less and
less findings in the report until we reach zero.

This is just an ideal trend, because in practice we might have issues that
comes from existing code (macro for example) that are not going to be fixed
soon for any reason, but we would like to check and see how many issues are
introduced by the commits added (ideally zero, but if any is added and the fault
resides outside the changed code, maintainers might decide to include it
anyway).

So the idea is to check the difference between the reports of the codebase: one
called "baseline" which is basically the current codebase, the other one called
"new report" that is the codebase after the changes.
To check if any new finding is added, we need to have a look on every finding in
the "new report" that is not listed in the "baseline".

It seems very simple, but what can happen to existing findings in the code after
a commit is applied?
Basically existing findings can be shifted in position due to changes to
unrelated lines, or they can be also deleted or fixed for changes involving the
findings line (Michal was the first one to point that out).

So comparing the two report now seems quite difficult, because if we compare
them we will have all the new findings plus all the findings that changed
position due to the changes applied.

To overcome this, the diff-tool.py is created and it allows to "patch" the
"baseline" report, looking into the changes applied to the baseline codebase,
described by git diff.

This serie is organised in two patch, I've tried to split the amount of code and
to leave a meaning, so in the first patch you have everything you need to
import cppcheck reports and do a "raw" diff between reports, this gives you
an hint about new findings + old findings that has changed place.

The second patch is adding the "patching" system, having a class that parses
the git diff output and later "patching" the baseline before doing the
comparison. This last option is activated only when passing the git diff changes
to the tool, but everything is described (I hope) in the help.

Some consideration needs to be made, this tool can translate in coordinates
(file, line) the findings from the "baseline" to the "new report" using the git
diff output as, let me say, a translation matrix.
This doesn't mean it can understand the meaning of the findings and recognise
them in the new codebase, so for example, a finding related to a line that is
moved to another part of the file won't be recognised as "old finding" and will
be just removed from the "baseline patched report", however we will find it
in the new report unless it contains a fix for the reported issue.

This means that the tool is not really suited to be a gatekeeper for the merge
action, it is more suitable to help the maintainer to understand when a change
is introducing new issues without having to check and compare manually two
reports of (nowadays) hundreds of finding.
Eventually we could run it in the CI and make the CI reply to the patchwork
thread with its output!

The tool has also a debug argument that when applied, generates extra files
that can be checked against the original files, for example the reports are
imported in the tool, and afterwards the debug code will generate them back from
the imported data and they should be the same (if everything works).
Another debug check is to export the representation of the parsed git diff
output, so that the developer can check how and if the parser interpreted
correctly the data.

Future works for this tool might be to parse also Coverity reports and
eventually (don't know if it is possible) also eclair text report.

Luca Fancellu (3):
  xen/misra: add diff-report.py tool
  xen/misra: diff-report.py: add report patching feature
  maintainers: Add Xen MISRA Analysis Tools section

 MAINTAINERS                                   |  10 +
 xen/scripts/diff-report.py                    | 130 ++++++++++
 .../xen_analysis/diff_tool/__init__.py        |   0
 .../xen_analysis/diff_tool/cppcheck_report.py |  44 ++++
 xen/scripts/xen_analysis/diff_tool/debug.py   |  61 +++++
 xen/scripts/xen_analysis/diff_tool/report.py  | 187 ++++++++++++++
 .../diff_tool/unified_format_parser.py        | 232 ++++++++++++++++++
 7 files changed, 664 insertions(+)
 create mode 100755 xen/scripts/diff-report.py
 create mode 100644 xen/scripts/xen_analysis/diff_tool/__init__.py
 create mode 100644 xen/scripts/xen_analysis/diff_tool/cppcheck_report.py
 create mode 100644 xen/scripts/xen_analysis/diff_tool/debug.py
 create mode 100644 xen/scripts/xen_analysis/diff_tool/report.py
 create mode 100644 xen/scripts/xen_analysis/diff_tool/unified_format_parser.py

Comments

Luca Fancellu June 5, 2023, 9:21 a.m. UTC | #1
> On 25 May 2023, at 09:33, Luca Fancellu <Luca.Fancellu@arm.com> wrote:
> 
> --------------------------------------------------------------------------------
> This serie is dependent on this patch, in case cppcheck report are generated
> using xen-analysis.py that calls the makefile with in-tree build, because
> this tool (in particular the patching feature) needs the path from the
> repository instead of the path from /xen/xen:
> https://patchwork.kernel.org/project/xen-devel/patch/20230519093019.2131896-4-luca.fancellu@arm.com/
> --------------------------------------------------------------------------------
> 
> Now that we have a tool (xen-analysis.py) that wraps cppcheck to generates
> reports, we have a generic overview of how many static analysis issues and non
> compliances to MISRA C we have for a certain revision of the codebase.
> 
> This is great and eventually the work to be done should be just having less and
> less findings in the report until we reach zero.
> 
> This is just an ideal trend, because in practice we might have issues that
> comes from existing code (macro for example) that are not going to be fixed
> soon for any reason, but we would like to check and see how many issues are
> introduced by the commits added (ideally zero, but if any is added and the fault
> resides outside the changed code, maintainers might decide to include it
> anyway).
> 
> So the idea is to check the difference between the reports of the codebase: one
> called "baseline" which is basically the current codebase, the other one called
> "new report" that is the codebase after the changes.
> To check if any new finding is added, we need to have a look on every finding in
> the "new report" that is not listed in the "baseline".
> 
> It seems very simple, but what can happen to existing findings in the code after
> a commit is applied?
> Basically existing findings can be shifted in position due to changes to
> unrelated lines, or they can be also deleted or fixed for changes involving the
> findings line (Michal was the first one to point that out).
> 
> So comparing the two report now seems quite difficult, because if we compare
> them we will have all the new findings plus all the findings that changed
> position due to the changes applied.
> 
> To overcome this, the diff-tool.py is created and it allows to "patch" the
> "baseline" report, looking into the changes applied to the baseline codebase,
> described by git diff.
> 
> This serie is organised in two patch, I've tried to split the amount of code and
> to leave a meaning, so in the first patch you have everything you need to
> import cppcheck reports and do a "raw" diff between reports, this gives you
> an hint about new findings + old findings that has changed place.
> 
> The second patch is adding the "patching" system, having a class that parses
> the git diff output and later "patching" the baseline before doing the
> comparison. This last option is activated only when passing the git diff changes
> to the tool, but everything is described (I hope) in the help.
> 
> Some consideration needs to be made, this tool can translate in coordinates
> (file, line) the findings from the "baseline" to the "new report" using the git
> diff output as, let me say, a translation matrix.
> This doesn't mean it can understand the meaning of the findings and recognise
> them in the new codebase, so for example, a finding related to a line that is
> moved to another part of the file won't be recognised as "old finding" and will
> be just removed from the "baseline patched report", however we will find it
> in the new report unless it contains a fix for the reported issue.
> 
> This means that the tool is not really suited to be a gatekeeper for the merge
> action, it is more suitable to help the maintainer to understand when a change
> is introducing new issues without having to check and compare manually two
> reports of (nowadays) hundreds of finding.
> Eventually we could run it in the CI and make the CI reply to the patchwork
> thread with its output!
> 
> The tool has also a debug argument that when applied, generates extra files
> that can be checked against the original files, for example the reports are
> imported in the tool, and afterwards the debug code will generate them back from
> the imported data and they should be the same (if everything works).
> Another debug check is to export the representation of the parsed git diff
> output, so that the developer can check how and if the parser interpreted
> correctly the data.
> 
> Future works for this tool might be to parse also Coverity reports and
> eventually (don't know if it is possible) also eclair text report.
> 
> Luca Fancellu (3):
>  xen/misra: add diff-report.py tool
>  xen/misra: diff-report.py: add report patching feature
>  maintainers: Add Xen MISRA Analysis Tools section
> 
> MAINTAINERS                                   |  10 +
> xen/scripts/diff-report.py                    | 130 ++++++++++
> .../xen_analysis/diff_tool/__init__.py        |   0
> .../xen_analysis/diff_tool/cppcheck_report.py |  44 ++++
> xen/scripts/xen_analysis/diff_tool/debug.py   |  61 +++++
> xen/scripts/xen_analysis/diff_tool/report.py  | 187 ++++++++++++++
> .../diff_tool/unified_format_parser.py        | 232 ++++++++++++++++++
> 7 files changed, 664 insertions(+)
> create mode 100755 xen/scripts/diff-report.py
> create mode 100644 xen/scripts/xen_analysis/diff_tool/__init__.py
> create mode 100644 xen/scripts/xen_analysis/diff_tool/cppcheck_report.py
> create mode 100644 xen/scripts/xen_analysis/diff_tool/debug.py
> create mode 100644 xen/scripts/xen_analysis/diff_tool/report.py
> create mode 100644 xen/scripts/xen_analysis/diff_tool/unified_format_parser.py
> 
> -- 
> 2.34.1
> 
> 

Hi All,

Is it possible to commit this serie? I think it has T-by and A-by Stefano for all the patches
and I didn’t receive so far comments against it

Thank you, please read this mail as a gentle ping :)