[v2,01/12] simpletrace: Improve parsing of sys.argv; fix files never closed.

Message ID 20230502092339.27341-2-mads@ynddal.dk (mailing list archive)
State New, archived
Series simpletrace: refactor and general improvements

Commit Message

Mads Ynddal May 2, 2023, 9:23 a.m. UTC
From: Mads Ynddal <m.ynddal@samsung.com>

The arguments extracted from `sys.argv` are now named and unpacked to make
it clear what the arguments are and what they're used for.

The two input files were opened, but never explicitly closed. File handling
now uses a `with` statement to take care of this. At the same time,
ownership of the file objects is moved up to the `run` function. The secondary
`open` inside `process` is removed so there's only one place that handles `open`.

Signed-off-by: Mads Ynddal <m.ynddal@samsung.com>
---
 scripts/simpletrace.py | 28 +++++++++++-----------------
 1 file changed, 11 insertions(+), 17 deletions(-)
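
For illustration, the star-unpacking approach described in the commit message
can be sketched as a standalone function. This is a minimal sketch, not the
code from the patch: the name `parse_args` and its return tuple are
hypothetical, and the explicit `if` check is swapped in for the patch's
`assert`, because Python strips `assert` statements when run with `-O`.

```python
def parse_args(argv):
    """Return (read_header, trace_event_path, trace_file_path).

    Hypothetical helper for illustration; raises ValueError on bad input.
    """
    # Star-unpacking raises ValueError by itself when fewer than two
    # arguments follow the program name.
    *no_header, trace_event_path, trace_file_path = argv[1:]

    # Anything captured before the two paths must be exactly '--no-header'.
    if no_header not in ([], ['--no-header']):
        raise ValueError('invalid arguments before <trace-events>')

    # An empty list is falsy, so the header is read unless the flag is given.
    return not no_header, trace_event_path, trace_file_path


with_header = parse_args(['simpletrace.py', 'trace-events', 'trace.bin'])
without_header = parse_args(
    ['simpletrace.py', '--no-header', 'trace-events', 'trace.bin'])
```

A caller could catch `ValueError`, print the usage line and exit, which keeps
the single error path that the patch aims for.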

Comments

Stefan Hajnoczi May 4, 2023, 6:03 p.m. UTC | #1
On Tue, May 02, 2023 at 11:23:28AM +0200, Mads Ynddal wrote:
> From: Mads Ynddal <m.ynddal@samsung.com>
> 
> The arguments extracted from `sys.argv` named and unpacked to make it
> clear what the arguments are and what they're used for.
> 
> The two input files were opened, but never explicitly closed. File usage
> changed to use `with` statement to take care of this. At the same time,
> ownership of the file-object is moved up to `run` function. Secondary `open`
> inside `process` removed so there's only one place to handle `open`.
> 
> Signed-off-by: Mads Ynddal <m.ynddal@samsung.com>
> ---
>  scripts/simpletrace.py | 28 +++++++++++-----------------
>  1 file changed, 11 insertions(+), 17 deletions(-)
> 
> diff --git a/scripts/simpletrace.py b/scripts/simpletrace.py
> index 1f6d1ae1f3..9211caaec1 100755
> --- a/scripts/simpletrace.py
> +++ b/scripts/simpletrace.py
> @@ -9,6 +9,7 @@
>  #
>  # For help see docs/devel/tracing.rst
>  
> +import sys
>  import struct
>  import inspect
>  from tracetool import read_events, Event
> @@ -44,7 +45,6 @@ def get_record(edict, idtoname, rechdr, fobj):
>          try:
>              event = edict[name]
>          except KeyError as e:
> -            import sys
>              sys.stderr.write('%s event is logged but is not declared ' \
>                               'in the trace events file, try using ' \
>                               'trace-events-all instead.\n' % str(e))
> @@ -166,11 +166,6 @@ def end(self):
>  
>  def process(events, log, analyzer, read_header=True):
>      """Invoke an analyzer on each event in a log."""
> -    if isinstance(events, str):
> -        events = read_events(open(events, 'r'), events)
> -    if isinstance(log, str):
> -        log = open(log, 'rb')
> -
>      if read_header:
>          read_trace_header(log)

simpletrace.py is both a command-line tool and a Python module. The
Python module has a public API that people's scripts may rely on. Let's
avoid breaking API changes unless necessary so that existing scripts
that import simpletrace continue to work.

It's not very clear what the public API of simpletrace.py is; the file
should really have __all__ = ['Analyzer', 'process', 'run'].
Nevertheless, Analyzer's doc comments mention process() and the
process() function itself also has doc comments, so it's a public API.

Please drop this change to avoid breaking the public API.
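
As a rough illustration of what an `__all__` declaration does, the sketch
below builds a throwaway module and star-imports from it. The module name
`simpletrace_sketch` and the empty function bodies are placeholders, not the
real simpletrace.py; only the names listed in `__all__` come from the
discussion above.

```python
import sys
import types

# Placeholder source standing in for simpletrace.py (bodies are stubs).
SOURCE = '''
__all__ = ['Analyzer', 'process', 'run']

class Analyzer:
    pass

def process(events, log, analyzer, read_header=True):
    pass

def run(analyzer):
    pass

def get_record(edict, idtoname, rechdr, fobj):
    pass
'''

# Register the throwaway module so a normal import statement can find it.
module = types.ModuleType('simpletrace_sketch')
exec(SOURCE, module.__dict__)
sys.modules['simpletrace_sketch'] = module

# A star-import copies only the names listed in __all__.
namespace = {}
exec('from simpletrace_sketch import *', namespace)
exported = sorted(name for name in namespace if not name.startswith('__'))
print(exported)  # ['Analyzer', 'process', 'run']
```

Note that `__all__` only affects `from module import *` and serves as
documentation; helpers such as `get_record` remain reachable through an
explicit import.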

>  
> @@ -223,19 +218,18 @@ def run(analyzer):
>  
>      This function is useful as a driver for simple analysis scripts.  More
>      advanced scripts will want to call process() instead."""
> -    import sys
> -
> -    read_header = True
> -    if len(sys.argv) == 4 and sys.argv[1] == '--no-header':
> -        read_header = False
> -        del sys.argv[1]
> -    elif len(sys.argv) != 3:
> -        sys.stderr.write('usage: %s [--no-header] <trace-events> ' \
> -                         '<trace-file>\n' % sys.argv[0])
> +
> +    try:
> +        # NOTE: See built-in `argparse` module for a more robust cli interface
> +        *no_header, trace_event_path, trace_file_path = sys.argv[1:]
> +        assert no_header == [] or no_header == ['--no-header'], 'Invalid no-header argument'
> +    except (AssertionError, ValueError):
> +        sys.stderr.write(f'usage: {sys.argv[0]} [--no-header] <trace-events> <trace-file>\n')
>          sys.exit(1)
>  
> -    events = read_events(open(sys.argv[1], 'r'), sys.argv[1])
> -    process(events, sys.argv[2], analyzer, read_header=read_header)
> +    with open(trace_event_path, 'r') as events_fobj, open(trace_file_path, 'rb') as log_fobj:
> +        events = read_events(events_fobj, trace_event_path)
> +        process(events, log_fobj, analyzer, read_header=not no_header)
>  
>  if __name__ == '__main__':
>      class Formatter(Analyzer):
> -- 
> 2.38.1
>
Mads Ynddal May 8, 2023, 1:18 p.m. UTC | #2
> simpletrace.py is both a command-line tool and a Python module. The
> Python module has a public API that people's scripts may rely on. Let's
> avoid breaking API changes unless necessary so that existing scripts
> that import simpletrace continue to work.
> 
> It's not very clear what is a public API in simpletrace.py, the file
> should really have __all__ = ['Analyzer', 'process', 'run'].
> Nevertheless, Analyzer's doc comments mention process() and the
> process() function itself also has doc comments, so it's a public API.
> 
> Please drop this change to avoid breaking the public API.

I agree, I'll revert the changes. I can add an `__all__` too.

I'd like to avoid having the same `open`, `read_trace_header` and `read_events`
calls in multiple places. Would it be acceptable to let `process` be more of a
stub and move the logic to an internal `_process` function?
Stefan Hajnoczi May 8, 2023, 3:08 p.m. UTC | #3
On Mon, May 08, 2023 at 01:18:40PM +0000, Mads Ynddal wrote:
> > simpletrace.py is both a command-line tool and a Python module. The
> > Python module has a public API that people's scripts may rely on. Let's
> > avoid breaking API changes unless necessary so that existing scripts
> > that import simpletrace continue to work.
> > 
> > It's not very clear what is a public API in simpletrace.py, the file
> > should really have __all__ = ['Analyzer', 'process', 'run'].
> > Nevertheless, Analyzer's doc comments mention process() and the
> > process() function itself also has doc comments, so it's a public API.
> > 
> > Please drop this change to avoid breaking the public API.
> 
> I agree, I'll revert the changes. I can add an `__all__` too.
> 
> I'd like to avoid having the same `open`, `read_trace_header` and `read_events`
> multiple places. Would it be acceptable to let `process` be more of a stub and
> move the logic to an internal `_process` function maybe?

Yes, as long as the existing public API doesn't change that would be
fine.

Stefan
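
One possible shape for the stub agreed on above is sketched here. It is an
assumption about how the refactor could look, not the code that was
eventually merged: `_read_events` is a stand-in for `tracetool.read_events`,
the 4-byte header read is a stand-in for `read_trace_header`, and the
analyzer is a plain callable rather than an `Analyzer` instance, so that the
example runs on its own.

```python
import contextlib
import os
import tempfile


def _read_events(fobj, fname):
    """Stand-in for tracetool.read_events(): one event name per line."""
    return [line.strip() for line in fobj if line.strip()]


def _process(events, log_fobj, analyzer, read_header=True):
    """Internal worker: only ever sees parsed events and an open log file."""
    if read_header:
        log_fobj.read(4)  # stand-in for read_trace_header(log_fobj)
    analyzer(events, log_fobj.read())


def process(events, log, analyzer, read_header=True):
    """Public API, signature unchanged: accepts paths or open objects."""
    with contextlib.ExitStack() as stack:
        # Only files opened here are registered, so only they get closed.
        if isinstance(events, str):
            events_fobj = stack.enter_context(open(events, 'r'))
            events = _read_events(events_fobj, events)
        if isinstance(log, str):
            log = stack.enter_context(open(log, 'rb'))
        _process(events, log, analyzer, read_header=read_header)


seen = []
with tempfile.TemporaryDirectory() as tmp:
    events_path = os.path.join(tmp, 'trace-events')
    log_path = os.path.join(tmp, 'trace.bin')
    with open(events_path, 'w') as f:
        f.write('event_a\nevent_b\n')
    with open(log_path, 'wb') as f:
        f.write(b'HDR0payload')

    # Existing scripts that pass paths keep working.
    process(events_path, log_path, lambda ev, data: seen.append((ev, data)))

    # Callers that pass open objects keep ownership of them.
    with open(log_path, 'rb') as log_fobj:
        process(['event_a'], log_fobj,
                lambda ev, data: seen.append((ev, data)))
        caller_file_still_open = not log_fobj.closed
```

With this split, `run` could open its files with `with` and call `_process`
directly, while `process` keeps accepting the string arguments that existing
scripts may pass.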

Patch

diff --git a/scripts/simpletrace.py b/scripts/simpletrace.py
index 1f6d1ae1f3..9211caaec1 100755
--- a/scripts/simpletrace.py
+++ b/scripts/simpletrace.py
@@ -9,6 +9,7 @@ 
 #
 # For help see docs/devel/tracing.rst
 
+import sys
 import struct
 import inspect
 from tracetool import read_events, Event
@@ -44,7 +45,6 @@  def get_record(edict, idtoname, rechdr, fobj):
         try:
             event = edict[name]
         except KeyError as e:
-            import sys
             sys.stderr.write('%s event is logged but is not declared ' \
                              'in the trace events file, try using ' \
                              'trace-events-all instead.\n' % str(e))
@@ -166,11 +166,6 @@  def end(self):
 
 def process(events, log, analyzer, read_header=True):
     """Invoke an analyzer on each event in a log."""
-    if isinstance(events, str):
-        events = read_events(open(events, 'r'), events)
-    if isinstance(log, str):
-        log = open(log, 'rb')
-
     if read_header:
         read_trace_header(log)
 
@@ -223,19 +218,18 @@  def run(analyzer):
 
     This function is useful as a driver for simple analysis scripts.  More
     advanced scripts will want to call process() instead."""
-    import sys
-
-    read_header = True
-    if len(sys.argv) == 4 and sys.argv[1] == '--no-header':
-        read_header = False
-        del sys.argv[1]
-    elif len(sys.argv) != 3:
-        sys.stderr.write('usage: %s [--no-header] <trace-events> ' \
-                         '<trace-file>\n' % sys.argv[0])
+
+    try:
+        # NOTE: See built-in `argparse` module for a more robust cli interface
+        *no_header, trace_event_path, trace_file_path = sys.argv[1:]
+        assert no_header == [] or no_header == ['--no-header'], 'Invalid no-header argument'
+    except (AssertionError, ValueError):
+        sys.stderr.write(f'usage: {sys.argv[0]} [--no-header] <trace-events> <trace-file>\n')
         sys.exit(1)
 
-    events = read_events(open(sys.argv[1], 'r'), sys.argv[1])
-    process(events, sys.argv[2], analyzer, read_header=read_header)
+    with open(trace_event_path, 'r') as events_fobj, open(trace_file_path, 'rb') as log_fobj:
+        events = read_events(events_fobj, trace_event_path)
+        process(events, log_fobj, analyzer, read_header=not no_header)
 
 if __name__ == '__main__':
     class Formatter(Analyzer):