[0/9] get_abi.pl: Check for missing symbols at the ABI specs

Message ID cover.1631112725.git.mchehab+huawei@kernel.org (mailing list archive)

Message

Mauro Carvalho Chehab Sept. 8, 2021, 2:58 p.m. UTC
Hi Greg,

Some time ago, I discussed with Jonathan Cameron a way to check
whether the ABI documentation is incomplete.

While it would be doable to validate the ABI by searching for __ATTR
and similar macros across the drivers, this would probably be very
complex and would take a while to parse.

So, I ended up implementing a new feature in scripts/get_abi.pl
that checks the sysfs contents of a running system: it reads
everything under /sys and reads the entire ABI from
Documentation/ABI. It then warns about symbols that weren't found,
optionally showing possible candidates that might be misdefined.
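
As a rough illustration of the check described above (this is not the
Perl code in scripts/get_abi.pl), the core idea can be sketched in
Python. The function name find_undefined and the sample data are
hypothetical; the real script gathers the paths from /sys and the
patterns from the What: fields under Documentation/ABI.

```python
import re

def find_undefined(sysfs_paths, what_regexes):
    """Return the sysfs paths that match none of the documented patterns."""
    compiled = [re.compile(r) for r in what_regexes]
    return [p for p in sysfs_paths
            if not any(c.fullmatch(p) for c in compiled)]

# Hypothetical sample data: two documented patterns, three paths found.
documented = [r"/sys/module/[^/]+/refcnt", r"/sys/kernel/mm/ksm/run"]
found = [
    "/sys/module/usbcore/refcnt",
    "/sys/kernel/mm/ksm/run",
    "/sys/module/usbcore/initstate",
]
print(find_undefined(found, documented))
# prints ['/sys/module/usbcore/initstate']
```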

I opted to split it into 3 patches:

The first patch adds the basic logic. It runs really quickly (up to 2
seconds), but it doesn't use sysfs softlinks.

Patch 2 adds support for also parsing softlinks. It slows down the
logic, which now takes ~40 seconds to run on my desktop (and ~23
seconds on a HiKey970 ARM board). There is room for performance
improvements by using a more sophisticated algorithm, at the
expense of making the code harder to understand. I ended up
opting for a simple implementation for now, as ~40 seconds sounds
acceptable to me.

Patch 3 adds an optional parameter to allow filtering the results
using a regex given by the user.
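
A minimal Python sketch of such a filter, under the assumption that
the filter is applied to the list of undefined paths after the check
has run. The function name filter_undefined and the sample paths are
hypothetical, and the actual command-line option is not shown here.

```python
import re

def filter_undefined(undefined, pattern):
    """Keep only the undefined paths that the user-supplied regex matches."""
    rx = re.compile(pattern)
    return [p for p in undefined if rx.search(p)]

# Hypothetical list of undefined symbols reported by the check.
undefined = [
    "/sys/module/usbcore/initstate",
    "/sys/kernel/slab/kmalloc-64/align",
    "/sys/module/snd/initstate",
]
only_modules = filter_undefined(undefined, r"^/sys/module/")
```

With this sample data, only the two /sys/module entries are kept.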

One of the problems with the current ABI definitions is that several
symbols define wildcards in non-standard ways. The most commonly
used wildcards there are:

	<foo>
	{foo}
	[foo]
	X
	Y
	Z
	/.../

The script converts the above wildcards into (somewhat relaxed)
regexes.
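
To make the conversion concrete, here is a Python sketch of one way
such wildcards could be turned into relaxed regexes. It is an
illustration only, not the logic in get_abi.pl: the function name
what_to_regex, the choice of "[^/]+" for single-component wildcards,
and the sample What: strings are all assumptions.

```python
import re

# Wildcard tokens: /.../, <foo>, {foo}, [foo] and the letters X, Y, Z.
TOKEN = re.compile(r"(/\.\.\./|<[^>]+>|\{[^}]+\}|\[[^\]]+\]|[XYZ])")

def what_to_regex(what):
    """Convert a What: entry with ad-hoc wildcards into a relaxed regex."""
    out = []
    # re.split() with a capturing group puts the tokens at odd indexes.
    for i, part in enumerate(TOKEN.split(what)):
        if i % 2 == 0:
            out.append(re.escape(part))   # literal text between wildcards
        elif part == "/.../":
            out.append("/.*/")            # any number of path components
        else:
            out.append("[^/]+")           # anything within one component
    return re.compile("".join(out))

# Hypothetical What: entries.
rx_port = what_to_regex("/sys/bus/foo/devices/<dev>/portX/state")
rx_deep = what_to_regex("/sys/devices/.../power/level")
```

Note that treating every X, Y or Z as a wildcard is what makes the
result relaxed: a literal uppercase X in a symbol name would also be
matched loosely.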

There's one place using "(some description)". This one is harder to
parse, as parentheses are used by the parsing regexes. As this happens
in only one file, patch 4 addresses that case.

Patches 5 to 9 fix some other ABI problems I identified.

In the long term, it would perhaps be better to just use regexes in
the What: fields, as this would avoid extra heuristics in get_abi.pl,
but that is out of scope for this series and would mean a large
number of changes.

-

For reference, I sent an early implementation of this change as an RFC:
	https://lore.kernel.org/lkml/cover.1624014140.git.mchehab+huawei@kernel.org/

Mauro Carvalho Chehab (9):
  scripts: get_abi.pl: Check for missing symbols at the ABI specs
  scripts: get_abi.pl: detect softlinks
  scripts: get_abi.pl: add an option to filter undefined results
  ABI: sysfs-bus-usb: better document variable argument
  ABI: sysfs-module: better document module name parameter
  ABI: sysfs-tty: better document module name parameter
  ABI: sysfs-kernel-slab: use a wildcard for the cache name
  ABI: security: fix location for evm and ima_policy
  ABI: sysfs-module: document initstate

 Documentation/ABI/stable/sysfs-module       |  10 +-
 Documentation/ABI/testing/evm               |   4 +-
 Documentation/ABI/testing/ima_policy        |   2 +-
 Documentation/ABI/testing/sysfs-bus-usb     |  16 +-
 Documentation/ABI/testing/sysfs-kernel-slab |  94 ++++-----
 Documentation/ABI/testing/sysfs-module      |   7 +
 Documentation/ABI/testing/sysfs-tty         |  32 +--
 scripts/get_abi.pl                          | 218 +++++++++++++++++++-
 8 files changed, 303 insertions(+), 80 deletions(-)

Comments

Greg KH Sept. 9, 2021, 1:51 p.m. UTC | #1
This is cool stuff, thanks for doing this!

I'll look at it more once 5.15-rc1 is out, thanks.

greg k-h
Mauro Carvalho Chehab Sept. 14, 2021, 2:24 p.m. UTC | #2
On Thu, 9 Sep 2021 15:51:00 +0200
Greg KH <gregkh@linuxfoundation.org> wrote:

> This is cool stuff, thanks for doing this!
> 
> I'll look at it more once 5.15-rc1 is out, thanks.

FYI, there's a new version at:

	https://git.kernel.org/pub/scm/linux/kernel/git/mchehab/devel.git/log/?h=get_undefined

In order for get_abi.pl to convert What: into regexes, changes are
needed to existing ABI files. One alternative would be to convert
everything into regexes, but that would probably mean that most ABI
files would require work.

In order to avoid a huge number of patches/changes, I opted to touch only
the files that don't follow the de-facto wildcard standards already
found in most of the ABI files. So, I added support in get_abi.pl to
treat the following patterns as wildcards:

	/.../
	*
	<foo>
	X
	Y
	Z
	[0-9] (and variants)

The files that use something else as a wildcard need changes, in order
to avoid ambiguity when the script decides whether a character is a
wildcard or not.

One of the issues there is with "N". Several files use it as a wildcard,
but USB sysfs parameters have several ABI nodes with an uppercase "N"
letter (like bNumInterfaces and such). So, this one had to be converted
too (and it accounts for the vast majority of the patches).
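
The ambiguity can be shown with a small Python sketch. This is an
illustration, not the code in get_abi.pl: naive_convert, the sample
What: string and the made-up attribute name bMaximumInterfaces are
hypothetical, and the sketch assumes that every listed letter is
replaced wherever it appears.

```python
import re

def naive_convert(what, wildcards):
    """Replace every occurrence of the listed letters by a relaxed pattern."""
    parts = re.split("([" + wildcards + "])", what)
    # Odd indexes hold the wildcard letters; the rest is literal text.
    return re.compile("".join(
        "[^/]+" if i % 2 else re.escape(p) for i, p in enumerate(parts)))

what = "/sys/bus/usb/devices/usbX/bNumInterfaces"
bogus = "/sys/bus/usb/devices/usb1/bMaximumInterfaces"  # made-up name

without_n = naive_convert(what, "XYZ")
with_n = naive_convert(what, "XYZN")

# With "N" as a wildcard, the "N" in bNumInterfaces is also replaced,
# so the pattern wrongly accepts the made-up attribute name.
print(bool(without_n.fullmatch(bogus)), bool(with_n.fullmatch(bogus)))
# prints False True
```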

Anyway, as the number of such patches is high, I'll submit the work
as three separate series:

	- What: changes needed for regex conversion;
	- get_abi.pl updates;
	- Some additions for missing symbols found on my
	  desktop.

Thanks,
Mauro