diff mbox series

[net-next,v3,2/6] tools/net/ynl: Report netlink errors without stacktrace

Message ID 20240306231046.97158-3-donald.hunter@gmail.com (mailing list archive)
State Accepted
Commit 771b7012e5f3a49739dab4be60b87517a249a1df
Delegated to: Netdev Maintainers
Headers show
Series tools/net/ynl: Add support for nlctrl netlink family | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/build_tools success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 5 of 5 maintainers
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 39 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-03-07--15-00 (tests: 892)

Commit Message

Donald Hunter March 6, 2024, 11:10 p.m. UTC
ynl does not handle NlError exceptions so they get reported like program
failures. Handle the NlError exceptions and report the netlink errors
more cleanly.

Example now:

Netlink error: No such file or directory
nl_len = 44 (28) nl_flags = 0x300 nl_type = 2
	error: -2	extack: {'bad-attr': '.op'}

Example before:

Traceback (most recent call last):
  File "/home/donaldh/net-next/./tools/net/ynl/cli.py", line 81, in <module>
    main()
  File "/home/donaldh/net-next/./tools/net/ynl/cli.py", line 69, in main
    reply = ynl.dump(args.dump, attrs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/donaldh/net-next/tools/net/ynl/lib/ynl.py", line 906, in dump
    return self._op(method, vals, [], dump=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/donaldh/net-next/tools/net/ynl/lib/ynl.py", line 872, in _op
    raise NlError(nl_msg)
lib.ynl.NlError: Netlink error: No such file or directory
nl_len = 44 (28) nl_flags = 0x300 nl_type = 2
	error: -2	extack: {'bad-attr': '.op'}

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
---
 tools/net/ynl/cli.py          | 18 +++++++++++-------
 tools/net/ynl/lib/__init__.py |  4 ++--
 2 files changed, 13 insertions(+), 9 deletions(-)

Comments

Breno Leitao March 7, 2024, 9:31 a.m. UTC | #1
Hello Donald,

On Wed, Mar 06, 2024 at 11:10:42PM +0000, Donald Hunter wrote:
> ynl does not handle NlError exceptions so they get reported like program
> failures. Handle the NlError exceptions and report the netlink errors
> more cleanly.
> 
> Example now:
> 
> Netlink error: No such file or directory
> nl_len = 44 (28) nl_flags = 0x300 nl_type = 2
> 	error: -2	extack: {'bad-attr': '.op'}
> 
> Example before:
> 
> Traceback (most recent call last):
>   File "/home/donaldh/net-next/./tools/net/ynl/cli.py", line 81, in <module>
>     main()
>   File "/home/donaldh/net-next/./tools/net/ynl/cli.py", line 69, in main
>     reply = ynl.dump(args.dump, attrs)
>             ^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/home/donaldh/net-next/tools/net/ynl/lib/ynl.py", line 906, in dump
>     return self._op(method, vals, [], dump=True)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/home/donaldh/net-next/tools/net/ynl/lib/ynl.py", line 872, in _op
>     raise NlError(nl_msg)
> lib.ynl.NlError: Netlink error: No such file or directory
> nl_len = 44 (28) nl_flags = 0x300 nl_type = 2
> 	error: -2	extack: {'bad-attr': '.op'}

Basically this is just hidding the stack, which may make it harder for
someone not used to the code to find the problem.

Usually fatal exception is handled to make the error more meaningful,
i.e, better than just the exception message + stack. Hidding the stack
and exitting may make the error less meaningful.

On a different topic, I am wondering if we want to add type hitting for
these python program. They make the review process easier, and the
development a bit more structured. (Maybe that is what we expect from
upcoming new python code in netdev?!)

If that is where we want to go, this is *not*, by any mean, a blocker to
this code. Maybe something we can add to our public ToDo list (CC:
Jakub).
Donald Hunter March 7, 2024, 11:56 a.m. UTC | #2
On Thu, 7 Mar 2024 at 09:31, Breno Leitao <leitao@debian.org> wrote:
>
> Hello Donald,
>
> On Wed, Mar 06, 2024 at 11:10:42PM +0000, Donald Hunter wrote:
> > ynl does not handle NlError exceptions so they get reported like program
> > failures. Handle the NlError exceptions and report the netlink errors
> > more cleanly.
> >
> > Example now:
> >
> > Netlink error: No such file or directory
> > nl_len = 44 (28) nl_flags = 0x300 nl_type = 2
> >       error: -2       extack: {'bad-attr': '.op'}
> >
> > Example before:
> >
> > Traceback (most recent call last):
> >   File "/home/donaldh/net-next/./tools/net/ynl/cli.py", line 81, in <module>
> >     main()
> >   File "/home/donaldh/net-next/./tools/net/ynl/cli.py", line 69, in main
> >     reply = ynl.dump(args.dump, attrs)
> >             ^^^^^^^^^^^^^^^^^^^^^^^^^^
> >   File "/home/donaldh/net-next/tools/net/ynl/lib/ynl.py", line 906, in dump
> >     return self._op(method, vals, [], dump=True)
> >            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >   File "/home/donaldh/net-next/tools/net/ynl/lib/ynl.py", line 872, in _op
> >     raise NlError(nl_msg)
> > lib.ynl.NlError: Netlink error: No such file or directory
> > nl_len = 44 (28) nl_flags = 0x300 nl_type = 2
> >       error: -2       extack: {'bad-attr': '.op'}
>
> Basically this is just hidding the stack, which may make it harder for
> someone not used to the code to find the problem.
>
> Usually fatal exception is handled to make the error more meaningful,
> i.e, better than just the exception message + stack. Hidding the stack
> and exitting may make the error less meaningful.

NlError is used to report a usage error reported by netlink as opposed
to a fatal exception. My thinking here is that it is better UX to
report netlink error responses without the stack trace precisely
because they are not exceptional. An NlError is not ynl program
breakage or subsystem breakage, it's e.g. nlctrl telling you that you
requested an op that does not exist.

> On a different topic, I am wondering if we want to add type hitting for
> these python program. They make the review process easier, and the
> development a bit more structured. (Maybe that is what we expect from
> upcoming new python code in netdev?!)

It's a good suggestion. I have never used python type hints so I'll
need to learn about them. I defer to the netdev maintainers about
whether this is something they want.

> If that is where we want to go, this is *not*, by any mean, a blocker to
> this code. Maybe something we can add to our public ToDo list (CC:
> Jakub).
Jakub Kicinski March 7, 2024, 3:58 p.m. UTC | #3
On Thu, 7 Mar 2024 11:56:59 +0000 Donald Hunter wrote:
> > Basically this is just hidding the stack, which may make it harder for
> > someone not used to the code to find the problem.
> >
> > Usually fatal exception is handled to make the error more meaningful,
> > i.e, better than just the exception message + stack. Hidding the stack
> > and exitting may make the error less meaningful.  
> 
> NlError is used to report a usage error reported by netlink as opposed
> to a fatal exception. My thinking here is that it is better UX to
> report netlink error responses without the stack trace precisely
> because they are not exceptional. An NlError is not ynl program
> breakage or subsystem breakage, it's e.g. nlctrl telling you that you
> requested an op that does not exist.

Right, I think the YNL library should still throw, but since this is
a case of "kernel gave us this specific error in response" the stack
trace adds relatively little for the CLI.

> > On a different topic, I am wondering if we want to add type hitting for
> > these python program. They make the review process easier, and the
> > development a bit more structured. (Maybe that is what we expect from
> > upcoming new python code in netdev?!)  
> 
> It's a good suggestion. I have never used python type hints so I'll
> need to learn about them. I defer to the netdev maintainers about
> whether this is something they want.

I'm far from a Python expert, so up to you :)
I used type hints a couple of times in the past, they are somewhat
useful, but didn't feel useful enough to bother. Happy for someone
else to do the work, tho :)

FWIW I reckon that trying to get the CLI ready for distro packaging 
may be higher prio. Apart from basic requirements to packaging python
code (I have no idea what they are), we should probably extend the
script to search some system paths? My thinking is that if someone
installs the CLI as an RPM, they should be able to use it like this:

 $ ynl-cli --family nlctrl \
	--do getfamily --json '{"family-name": "nlctrl"}'

the --family would be used instead of --spec and look for the exact
spec file in /usr/share/.../specs/ and probably also imply --no-schema,
since hopefully the schema is already validated during development,
and no point wasting time validating it on every user invocation.

WDYT?
Breno Leitao March 8, 2024, 6:27 p.m. UTC | #4
On Thu, Mar 07, 2024 at 07:58:15AM -0800, Jakub Kicinski wrote:
> On Thu, 7 Mar 2024 11:56:59 +0000 Donald Hunter wrote:
> > > Basically this is just hidding the stack, which may make it harder for
> > > someone not used to the code to find the problem.
> > >
> > > Usually fatal exception is handled to make the error more meaningful,
> > > i.e, better than just the exception message + stack. Hidding the stack
> > > and exitting may make the error less meaningful.  
> > 
> > NlError is used to report a usage error reported by netlink as opposed
> > to a fatal exception. My thinking here is that it is better UX to
> > report netlink error responses without the stack trace precisely
> > because they are not exceptional. An NlError is not ynl program
> > breakage or subsystem breakage, it's e.g. nlctrl telling you that you
> > requested an op that does not exist.
> 
> Right, I think the YNL library should still throw, but since this is
> a case of "kernel gave us this specific error in response" the stack
> trace adds relatively little for the CLI.
> 
> > > On a different topic, I am wondering if we want to add type hitting for
> > > these python program. They make the review process easier, and the
> > > development a bit more structured. (Maybe that is what we expect from
> > > upcoming new python code in netdev?!)  
> > 
> > It's a good suggestion. I have never used python type hints so I'll
> > need to learn about them. I defer to the netdev maintainers about
> > whether this is something they want.
> 
> I'm far from a Python expert, so up to you :)
> I used type hints a couple of times in the past, they are somewhat
> useful, but didn't feel useful enough to bother. Happy for someone
> else to do the work, tho :)

I am a big fan of type hitting, since it help in reviewing code, as also
with tooling that help you to find problems, since the function returns
and arguments now have a type.

What are the top 3 python scripts we have in network today? I can try to
find some time to help.

> FWIW I reckon that trying to get the CLI ready for distro packaging 
> may be higher prio. Apart from basic requirements to packaging python
> code (I have no idea what they are), we should probably extend the
> script to search some system paths? My thinking is that if someone
> installs the CLI as an RPM, they should be able to use it like this:
> 
>  $ ynl-cli --family nlctrl \
> 	--do getfamily --json '{"family-name": "nlctrl"}'
> 
> the --family would be used instead of --spec and look for the exact
> spec file in /usr/share/.../specs/ and probably also imply --no-schema,
> since hopefully the schema is already validated during development,
> and no point wasting time validating it on every user invocation.
> 
> WDYT?

This is a good idea. I've had a chat with Michel, and he can help with
the distro part. Adding him in the CC.
Jakub Kicinski March 8, 2024, 7:16 p.m. UTC | #5
On Fri, 8 Mar 2024 10:27:01 -0800 Breno Leitao wrote:
> > I'm far from a Python expert, so up to you :)
> > I used type hints a couple of times in the past, they are somewhat
> > useful, but didn't feel useful enough to bother. Happy for someone
> > else to do the work, tho :)  
> 
> I am a big fan of type hitting, since it help in reviewing code, as also
> with tooling that help you to find problems, since the function returns
> and arguments now have a type.
> 
> What are the top 3 python scripts we have in network today? I can try to
> find some time to help.

I think ynl.py (and nlspec.py) is by far the most used / active piece
of Python we have today.
diff mbox series

Patch

diff --git a/tools/net/ynl/cli.py b/tools/net/ynl/cli.py
index e8a65fbc3698..f131e33ac3ee 100755
--- a/tools/net/ynl/cli.py
+++ b/tools/net/ynl/cli.py
@@ -6,7 +6,7 @@  import json
 import pprint
 import time
 
-from lib import YnlFamily, Netlink
+from lib import YnlFamily, Netlink, NlError
 
 
 class YnlEncoder(json.JSONEncoder):
@@ -66,12 +66,16 @@  def main():
     if args.sleep:
         time.sleep(args.sleep)
 
-    if args.do:
-        reply = ynl.do(args.do, attrs, args.flags)
-        output(reply)
-    if args.dump:
-        reply = ynl.dump(args.dump, attrs)
-        output(reply)
+    try:
+        if args.do:
+            reply = ynl.do(args.do, attrs, args.flags)
+            output(reply)
+        if args.dump:
+            reply = ynl.dump(args.dump, attrs)
+            output(reply)
+    except NlError as e:
+        print(e)
+        exit(1)
 
     if args.ntf:
         ynl.check_ntf()
diff --git a/tools/net/ynl/lib/__init__.py b/tools/net/ynl/lib/__init__.py
index f7eaa07783e7..9137b83e580a 100644
--- a/tools/net/ynl/lib/__init__.py
+++ b/tools/net/ynl/lib/__init__.py
@@ -2,7 +2,7 @@ 
 
 from .nlspec import SpecAttr, SpecAttrSet, SpecEnumEntry, SpecEnumSet, \
     SpecFamily, SpecOperation
-from .ynl import YnlFamily, Netlink
+from .ynl import YnlFamily, Netlink, NlError
 
 __all__ = ["SpecAttr", "SpecAttrSet", "SpecEnumEntry", "SpecEnumSet",
-           "SpecFamily", "SpecOperation", "YnlFamily", "Netlink"]
+           "SpecFamily", "SpecOperation", "YnlFamily", "Netlink", "NlError"]