[0/2] xfsdump whitespace changes

From: Jan Tulak <jtulak@redhat.com>
To: linux-xfs@vger.kernel.org
Subject: [PATCH 0/2] xfsdump whitespace changes
Date: Thu, 1 Nov 2018 12:01:28 +0100
Message-Id: <20181101110130.19489-1-jtulak@redhat.com>

Jan Tulak Nov. 1, 2018, 11:01 a.m. UTC

(I'm splitting the previous set into smaller ones, so the changes don't
 have to wait.)

This set deals with whitespace only; no functional changes, code
shuffling, etc. should be present. The first patch removes trailing
whitespace, the second one sorts out the crazy mishmash of tabs and
spaces.

I know that this kind of change is usually not welcome as it
conflicts with other people's work, but that should not be an issue for
xfsdump.

Jan Tulak (2):
  xfsdump: remove trailing whitespaces
  xfsdump: change indentation to tabs

 common/arch_xlate.c     |  330 ++--
 common/cldmgr.c         |   38 +-
 common/cldmgr.h         |    6 +-
 common/cleanup.c        |   10 +-
 common/cleanup.h        |   12 +-
 common/content.h        |   14 +-
 common/content_common.c |   28 +-
 common/content_inode.h  |   14 +-
 common/dlog.c           |   94 +-
 common/dlog.h           |   42 +-
 common/drive.c          |   38 +-
 common/drive.h          |   74 +-
 common/drive_minrmt.c   |  926 +++++-----
 common/drive_scsitape.c | 1377 ++++++++-------
 common/drive_simple.c   |  308 ++--
 common/fs.c             |   28 +-
 common/fs.h             |   22 +-
 common/getdents.c       |   14 +-
 common/global.c         |  120 +-
 common/global.h         |    4 +-
 common/hsmapi.c         |   50 +-
 common/inventory.c      |  142 +-
 common/inventory.h      |   46 +-
 common/main.c           |  604 +++----
 common/media.c          |   62 +-
 common/media_rmvtape.h  |    2 +-
 common/mlog.c           |   78 +-
 common/mlog.h           |    2 +-
 common/openutil.c       |   14 +-
 common/qlock.c          |   26 +-
 common/rec_hdr.h        |    2 +-
 common/ring.c           |   32 +-
 common/ring.h           |   46 +-
 common/stream.c         |   16 +-
 common/stream.h         |   20 +-
 common/ts_mtio.h        |  110 +-
 common/types.h          |   76 +-
 common/util.c           |  178 +-
 common/util.h           |   46 +-
 dump/content.c          | 2172 ++++++++++++------------
 dump/inomap.c           |  402 ++---
 dump/inomap.h           |   34 +-
 dump/var.c              |   38 +-
 inventory/inv_api.c     |  354 ++--
 inventory/inv_core.c    |   40 +-
 inventory/inv_files.c   |    2 +-
 inventory/inv_fstab.c   |   92 +-
 inventory/inv_idx.c     |  197 ++-
 inventory/inv_mgr.c     |  175 +-
 inventory/inv_oref.c    |  140 +-
 inventory/inv_oref.h    |   82 +-
 inventory/inv_priv.h    |  128 +-
 inventory/inv_stobj.c   |  516 +++---
 inventory/inventory.h   |   70 +-
 inventory/testmain.c    |  147 +-
 invutil/cmenu.c         |  654 +++----
 invutil/cmenu.h         |   82 +-
 invutil/fstab.c         |  414 ++---
 invutil/invidx.c        | 1574 +++++++++--------
 invutil/invutil.c       | 1161 +++++++------
 invutil/invutil.h       |   12 +-
 invutil/list.c          |  148 +-
 invutil/list.h          |   37 +-
 invutil/menu.c          |  292 ++--
 invutil/screen.c        |   64 +-
 invutil/stobj.c         |  654 +++----
 librmt/rmtabort.c       |    2 +-
 librmt/rmtaccess.c      |    2 -
 librmt/rmtclose.c       |    2 -
 librmt/rmtcommand.c     |    3 -
 librmt/rmtcreat.c       |    2 -
 librmt/rmtdev.c         |    2 -
 librmt/rmtfstat.c       |    2 +-
 librmt/rmtioctl.c       |  222 +--
 librmt/rmtlib.h         |    2 +-
 librmt/rmtlseek.c       |    2 -
 librmt/rmtmsg.c         |   26 +-
 librmt/rmtopen.c        |   72 +-
 librmt/rmtstatus.c      |    3 -
 restore/bag.c           |   22 +-
 restore/bag.h           |    4 +-
 restore/content.c       | 3568 +++++++++++++++++++--------------------
 restore/dirattr.c       |  308 ++--
 restore/dirattr.h       |    6 +-
 restore/inomap.c        |  118 +-
 restore/mmap.c          |   16 +-
 restore/namreg.c        |   82 +-
 restore/node.c          |   98 +-
 restore/node.h          |   12 +-
 restore/tree.c          | 1082 ++++++------
 restore/tree.h          |    8 +-
 restore/win.c           |   28 +-
 restore/win.h           |   10 +-
 93 files changed, 10202 insertions(+), 10234 deletions(-)

Dave Chinner Nov. 2, 2018, 1:36 a.m. UTC | #1

On Thu, Nov 01, 2018 at 12:01:28PM +0100, Jan Tulak wrote:
> (I'm splitting the previous set into smaller ones, so changes doesn't
>  have to wait.)
> 
> This set is dealing with whitespaces only, no functional change, code
> shuffling, etc. should be present. The first patch clears out trailing
> whitespaces, the second one sorts out the crazy mishmash of tabs and
> spaces.

Patch 2 didn't come through - it'll be too large for the list.

However, if it's the same change as what you originally posted to a
git tree, then it needs revision. Basically, most of the change was
converting vertically aligned function call parameters to use tabs,
and that broke the vertical alignment.

I'd suggest that this is the least of the whitespace sins of
xfsdump. Yes, it's easy to script, but I don't think it's the right
thing to do. The biggest problems I think we need to start with are:

- reformat all the function definitions according to common XFS style
- get rid of all the "( foo )" style whitespace around parentheses
- convert all the code with 4 space indents to tabs, leaving
  vertically aligned function call parameters alone.

This will be a much smaller set of cleanups than a blanket
space-to-tab script does...

Cheers,

Dave.

Jan Tulak Nov. 2, 2018, 11:43 a.m. UTC | #2

On Fri, Nov 2, 2018 at 2:36 AM Dave Chinner <david@fromorbit.com> wrote:
>
> On Thu, Nov 01, 2018 at 12:01:28PM +0100, Jan Tulak wrote:
> > (I'm splitting the previous set into smaller ones, so changes doesn't
> >  have to wait.)
> >
> > This set is dealing with whitespaces only, no functional change, code
> > shuffling, etc. should be present. The first patch clears out trailing
> > whitespaces, the second one sorts out the crazy mishmash of tabs and
> > spaces.
>
> patch 2 Didn't come through - it'll be too large for the list.

Ah, I see.

>
> However, it's is the same change as what you originally posted to a

Yes, it is the same thing, with changes where I found something
misaligned on top.

> git tree, then it needs revision. basically, most of the change was
> converting vertically aligned function call parameters to use tabs,
> and that broke the vertical alignment.

It is "s/    /\t/" limited to the beginning of the line. The function
parameters were caught in it too, but I don't think they make up
most of the lines.

>
> I'd suggest that this is the least of the whitespace sins of
> xfsdump. yes, it's easy to script, but I don't think it's the right
> thing to do. The biggest problems I think we need to start with are:
>
> - reformat all the function definitions according to common XFS style
> - get rid of all the "( foo )" style white space aroudn parenthesis

I already started working on it. It is too complex for awk/sed, so I'm
trying to find some better way through IDEs with autoformatting
capabilities.

> - convert all the code with 4 space indents to tabs, leaving
>   vertically aligned function call parameters alone.

Part of the issue is that xfsdump does not even use a consistent
number of spaces. E.g. in just two random functions in
invutil/stobj.c:335-486 in master, you find that both 2 and 4 spaces
per indent level are used, and even stuff like \t<space*4> inside of
an "if", without it being a function parameter.
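
One way to quantify this kind of inconsistency is to classify the leading whitespace of every line. The following is a minimal Python sketch and not something used in this thread; the function names `indent_style` and `indent_histogram` and the sample text are made up for illustration.

```python
import re
from collections import Counter

def indent_style(line: str) -> str:
    """Classify the leading whitespace of one source line."""
    lead = re.match(r"[ \t]*", line).group(0)
    if not lead:
        return "none"
    if set(lead) == {"\t"}:
        return "tabs"
    if set(lead) == {" "}:
        return "spaces"
    return "mixed"  # e.g. a tab followed by four spaces

def indent_histogram(text: str) -> Counter:
    """Count how many non-blank lines use each indentation style."""
    return Counter(indent_style(l) for l in text.splitlines() if l.strip())

# Hypothetical sample mixing 2-space, 4-space, tab and tab+space indents.
sample = "int f(void)\n{\n  int a;\n    int b;\n\tif (a)\n\t    b = 1;\n}\n"
print(indent_histogram(sample))
```

Running something like this per file would show which files are dominated by which style before deciding what to convert.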

And the indentation of the parameters will change anyway, as we have
parameters indented with tabs only, and with spaces only, and with
some mix that makes no sense unless you configure your editor to show
\t as an odd number of spaces except on the first Thursday of a month,
but only on years that make a power of 2... Same as the rest of the
code. :-) So I expect the resulting set to be roughly the same size as
it is now. But I will split the patch by component, so it should get
through to the list then.

Thanks,
Jan

>
> This will be a much smaller set of cleanups than a blanket
> space-to-tab script does...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

Jan Tulak Nov. 2, 2018, 4:54 p.m. UTC | #3

On Fri, Nov 2, 2018 at 12:43 PM Jan Tulak <jtulak@redhat.com> wrote:
>
> [snip]
>

Update: I found a reasonably working code prettifier with a kernel-style
config. The disadvantage is that I'm not really able to split the
patches by type of change (indentation, spaces inside of parentheses,
vertical alignment...), so it did it all at once, but hopefully that is
not a big issue. You can look at it here [1] - the patch files are
2.4 MB in size (formatting changes only) and I want to check it a
bit better before sending it here as emails, e.g. whether there are any
new gcc warnings. But it compiles and passes xfstests, so feel free
to peek at it - I will be back online on Monday.

[1] https://github.com/jtulak/xfsdump/tree/for-review  (git clone
https://github.com/jtulak/xfsdump.git)

Thanks,
Jan


Dave Chinner Nov. 2, 2018, 10:34 p.m. UTC | #4

On Fri, Nov 02, 2018 at 12:43:18PM +0100, Jan Tulak wrote:
> On Fri, Nov 2, 2018 at 2:36 AM Dave Chinner <david@fromorbit.com> wrote:
> > On Thu, Nov 01, 2018 at 12:01:28PM +0100, Jan Tulak wrote:
> > However, it's is the same change as what you originally posted to a
> 
> Yes, it is the same thing, with changes where I found something
> misaligned on top.
> 
> > git tree, then it needs revision. basically, most of the change was
> > converting vertically aligned function call parameters to use tabs,
> > and that broke the vertical alignment.
> 
> It is "s/    /\t/" limited to the beginning of the line.

You mean 's/^    /\t/'?

Because the above regex is "change the first occurrence anywhere on
the line", not at the start of the line. IIRC, that's what the
original patch did. I can't check, because you've removed it from
your git tree... :/
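
The difference between the two expressions can be reproduced with a minimal sketch, written here in Python rather than sed. The helper names `unanchored` and `anchored` and the sample lines are illustrative assumptions, not taken from the patch.

```python
import re

def unanchored(line: str) -> str:
    # Like sed 's/    /\t/': replaces the first 4-space run anywhere on the line.
    return re.sub(r" {4}", "\t", line, count=1)

def anchored(line: str) -> str:
    # Like sed 's/^    /\t/': replaces a 4-space run only at the start of the line.
    return re.sub(r"^ {4}", "\t", line)

aligned = "foo(a,    b);"      # no indent, but spaces used for alignment
indented = "    foo(a, b);"    # one 4-space indent level

print(repr(unanchored(aligned)))   # alignment spaces are converted
print(repr(anchored(aligned)))     # line is left alone
print(repr(anchored(indented)))    # only the indent is converted
```

The unanchored form touches alignment whitespace in the middle of a line, which is the kind of damage to vertically aligned parameters discussed above.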

> The function
> parameters were caught in it too, but I don't think they would make
> the most of the lines.

It was enough for me to notice that both declarations and call sites
made up a large amount of the change in the first third of the diff I
scrolled through... :/

> > I'd suggest that this is the least of the whitespace sins of
> > xfsdump. yes, it's easy to script, but I don't think it's the right
> > thing to do. The biggest problems I think we need to start with are:
> >
> > - reformat all the function definitions according to common XFS style
> > - get rid of all the "( foo )" style white space aroudn parenthesis
> 
> I already started working on it. It is too complex for awk/sed, so I'm
> trying to find some better way through IDEs with autoformatting
> capabilities.

It should be relatively straightforward with sed. E.g. off the top
of my head, stripping the space after "(" is this expression:

$ echo "foo( bar" | sed -e 's/\([[:alnum:]](\) \([[:alnum:]]\)/\1\2/' 
foo(bar
$

/me is slightly worried that he can now write non-trivial line noise
in sed that does the right thing first go....
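
For comparison, here is a minimal Python sketch of the same expression, together with a few inputs it does not touch. It assumes `[[:alnum:]]` can be approximated by the ASCII class `[A-Za-z0-9]`; the helper name `strip_space_after_paren` is made up for illustration.

```python
import re

# Python spelling of the sed expression above.
# Assumption: [[:alnum:]] is approximated by the ASCII class [A-Za-z0-9].
PAREN_SPACE = re.compile(r"([A-Za-z0-9]\() ([A-Za-z0-9])")

def strip_space_after_paren(line: str) -> str:
    # count=1 mirrors sed's default of replacing only the first match.
    return PAREN_SPACE.sub(r"\1\2", line, count=1)

print(strip_space_after_paren("foo( bar"))            # common case is handled
print(strip_space_after_paren("foo( &bar )"))         # '&' is not alphanumeric
print(strip_space_after_paren("if ( x )"))            # '(' is preceded by a space
print(strip_space_after_paren("foo( a, bar( b ) )"))  # only the first call site
```

Each missed case needs either a broader character class, a global flag, or a separate expression, which is where the regex approach starts to accumulate special cases.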

Another option is coccinelle - it was specifically designed to do
such semantic transformations to C code:

http://coccinelle.lip6.fr/
http://coccinellery.org/

It's been widely used on the Linux kernel for automated, widespread
code changes for bug fixes and cleanups.

> > - convert all the code with 4 space indents to tabs, leaving
> >   vertically aligned function call parameters alone.
> 
> Part of the issue is that xfsdump is not even consistently using the
> number of spaces. E.g. just a two random functions in
> invutil/stobj.c:335-486 in master, you find that both 2 and 4 spaces
> per indent level are used and even stuff like \t<space*4> inside of
> "if", without it being a function parameter.

Yes.

$ printf 'foo(  bar\t    baz\n' | sed -e 's/\t \{4\}/_tab_and_4_spaces_/'
foo(  bar_tab_and_4_spaces_baz
$

You'll end up building lots of specific regexes to catch each of
these cases. Which is painful, and you'll still have to manually
clean stuff up afterwards.

> And the indentation of the parameters will change anyway, as we have
> parameters indented with tabs only, and with spaces only, and with
> some mix that makes no sense unless you configure your editor to show
> \t as an odd number of spaces except on the first Thursday of a month,
> but only on years that make a power of 2... Same as the rest of the
> code. :-) So I expect the resulting set to be roughly the same size as
> it is now. But I will break the patch by component, it should pass
> through then.

Yup, which is precisely why you should be looking at using a semantic
patching tool like coccinelle, not a literal one like sed that uses
regexes.

Cheers,

Dave.

Dave Chinner Nov. 2, 2018, 10:57 p.m. UTC | #5

On Fri, Nov 02, 2018 at 05:54:59PM +0100, Jan Tulak wrote:
> [snip]
>
> Update: I found a reasonably working code prettifier with kernel-style
> config. The disadvantage is that I'm not really able to split the
> patches by type of change (indentation, spaces inside of parentheses,
> vertical align...) and it did it all at once, but hopefully that is
> not a big issue. You can look at it here [1] - it is 2,4MB in size of
> the patch files (only the formatting changes) and I want to check it a
> bit better before sending it here as emails, whether there is no new
> warning in gcc, etc. But it compiles and passes xfstests, so feel free
> to peek at it - I will be back online in the Monday.
> 
> [1] https://github.com/jtulak/xfsdump/tree/for-review  (git clone
> https://github.com/jtulak/xfsdump.git)

IMO, it's largely unreviewable because not only does whitespace
change, so does the way the code is laid out.

E.g. it removes {} around single-line if/for/while bodies, so that could
introduce bugs if there are multi-statement macros that aren't
correctly encapsulated (and there are!).

And that's really hard to see in amongst all the indenting changes,
the whitespace removal, etc.  I'd much prefer "one type of
change at a time" patches because they are much easier to review.

Cheers,

Dave.

Jan Tulak Nov. 5, 2018, 10:15 a.m. UTC | #6

On Fri, Nov 2, 2018 at 11:34 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Fri, Nov 02, 2018 at 12:43:18PM +0100, Jan Tulak wrote:
> > On Fri, Nov 2, 2018 at 2:36 AM Dave Chinner <david@fromorbit.com> wrote:
> > > On Thu, Nov 01, 2018 at 12:01:28PM +0100, Jan Tulak wrote:
> > > However, it's is the same change as what you originally posted to a
> >
> > Yes, it is the same thing, with changes where I found something
> > misaligned on top.
> >
> > > git tree, then it needs revision. basically, most of the change was
> > > converting vertically aligned function call parameters to use tabs,
> > > and that broke the vertical alignment.
> >
> > It is "s/    /\t/" limited to the beginning of the line.
>
> You mean 's/^    /\t/'?

Yes, but in multiple iterations to get \t, \t\t, \t\t\t, ...

[snip]

> > > I'd suggest that this is the least of the whitespace sins of
> > > xfsdump. yes, it's easy to script, but I don't think it's the right
> > > thing to do. The biggest problems I think we need to start with are:
> > >
> > > - reformat all the function definitions according to common XFS style
> > > - get rid of all the "( foo )" style white space aroudn parenthesis
> >
> > I already started working on it. It is too complex for awk/sed, so I'm
> > trying to find some better way through IDEs with autoformatting
> > capabilities.
>
> It should be relatively straight forward with sed. e.g. off the top
> of my head, stripping the space after "(" is this expression:
>
> $ echo "foo( bar" | sed -e 's/\([[:alnum:]](\) \([[:alnum:]]\)/\1\2/'
> foo(bar
> $

I tried something similar and the result was not ideal - it worked ok
for common cases, but not always.

>
> /me is slightly worried that he can now write non-trivial line noise
> in sed that does the right thing first go....
>
> Another option is coccinelle - it was specifically designed to do
> such semantic transformations to C code:

Thanks for the tip, I will look at it.

Cheers,
Jan

Jan Tulak Nov. 5, 2018, 10:17 a.m. UTC | #7

On Fri, Nov 2, 2018 at 11:57 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Fri, Nov 02, 2018 at 05:54:59PM +0100, Jan Tulak wrote:
> >
> > Update: I found a reasonably working code prettifier with kernel-style
> > config. The disadvantage is that I'm not really able to split the
> > patches by type of change (indentation, spaces inside of parentheses,
> > vertical align...) and it did it all at once, but hopefully that is
> > not a big issue. You can look at it here [1] - it is 2,4MB in size of
> > the patch files (only the formatting changes) and I want to check it a
> > bit better before sending it here as emails, whether there is no new
> > warning in gcc, etc. But it compiles and passes xfstests, so feel free
> > to peek at it - I will be back online in the Monday.
> >
> > [1] https://github.com/jtulak/xfsdump/tree/for-review  (git clone
> > https://github.com/jtulak/xfsdump.git)
>
> IMO, it's largely unreviewable because not only does whitespace
> change, so does the way the code is laid out.
>
> e.g it removes {} around single line if/for/while, so that could
> introduce bugs if there are multi-expression macros that aren't
> correctly encapsulated (and there are!).

I didn't realize that possibility. Fixing...

>
> And that's really hard to see in amongst all the indenting changes,
> the whitespace removal, etc.  I'd much prefer "one type of
> change at a time" patches because they are much easier to review.

Yeah, I understand. Let's see what I can do... Anyway, thanks for the feedback.

Cheers,
Jan

Dave Chinner Nov. 5, 2018, 11:48 a.m. UTC | #8

On Mon, Nov 05, 2018 at 11:15:34AM +0100, Jan Tulak wrote:
> On Fri, Nov 2, 2018 at 11:34 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Fri, Nov 02, 2018 at 12:43:18PM +0100, Jan Tulak wrote:
> > > On Fri, Nov 2, 2018 at 2:36 AM Dave Chinner <david@fromorbit.com> wrote:
> > > > On Thu, Nov 01, 2018 at 12:01:28PM +0100, Jan Tulak wrote:
> > > > However, it's is the same change as what you originally posted to a
> > >
> > > Yes, it is the same thing, with changes where I found something
> > > misaligned on top.
> > >
> > > > git tree, then it needs revision. basically, most of the change was
> > > > converting vertically aligned function call parameters to use tabs,
> > > > and that broke the vertical alignment.
> > >
> > > It is "s/    /\t/" limited to the beginning of the line.
> >
> > You mean 's/^    /\t/'?
> 
> Yes, but in multiple iterations to get \t, \t\t, \t\t\t, ...

Which is handled by this regex: 's/^\(\t*\)*    /\1\t/'

In this case, I'm using "*", which means "match zero or more of the
preceding expression" - which in this case is \t. That regex is
enclosed in \(...\) to group the result, which is then back
referenced in the output expression by \1 (first group backref).

Regexes are extremely flexible once you've learnt how the
multiple object matching rules work.
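
The "tabs already converted, then one more 4-space group" idea can be sketched in Python as a loop that re-applies the substitution until nothing changes. This is a simplified variant under stated assumptions: it uses `^(\t*) {4}` instead of the exact expression above, and the function name `indent_to_tabs` is hypothetical.

```python
import re

# Simplified form of the idea: any leading tabs (group 1), then four spaces.
LEADING = re.compile(r"^(\t*) {4}")

def indent_to_tabs(line: str) -> str:
    """Convert leading 4-space groups to tabs, one group per pass."""
    while True:
        # Keep the tabs matched so far and turn the next 4 spaces into a tab.
        new = LEADING.sub(lambda m: m.group(1) + "\t", line)
        if new == line:
            return line
        line = new

print(repr(indent_to_tabs("        x = 1;")))    # two levels of spaces
print(repr(indent_to_tabs("\t    if (a)")))      # tab followed by 4 spaces
print(repr(indent_to_tabs("\tfoo(a,    b);")))   # alignment spaces untouched
```

Because the pattern is anchored and only ever extends the run of leading tabs, spaces used for alignment later in the line are never matched.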

Cheers,

Dave.

Jan Tulak Nov. 5, 2018, 12:25 p.m. UTC | #9

On Mon, Nov 5, 2018 at 12:48 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Mon, Nov 05, 2018 at 11:15:34AM +0100, Jan Tulak wrote:
> > On Fri, Nov 2, 2018 at 11:34 PM Dave Chinner <david@fromorbit.com> wrote:
> > >
> > > On Fri, Nov 02, 2018 at 12:43:18PM +0100, Jan Tulak wrote:
> > > > On Fri, Nov 2, 2018 at 2:36 AM Dave Chinner <david@fromorbit.com> wrote:
> > > > > On Thu, Nov 01, 2018 at 12:01:28PM +0100, Jan Tulak wrote:
> > > > > However, it's is the same change as what you originally posted to a
> > > >
> > > > Yes, it is the same thing, with changes where I found something
> > > > misaligned on top.
> > > >
> > > > > git tree, then it needs revision. basically, most of the change was
> > > > > converting vertically aligned function call parameters to use tabs,
> > > > > and that broke the vertical alignment.
> > > >
> > > > It is "s/    /\t/" limited to the beginning of the line.
> > >
> > > You mean 's/^    /\t/'?
> >
> > Yes, but in multiple iterations to get \t, \t\t, \t\t\t, ...
>
> Which is handled by this regex: 's/^\(\t*\)*    /\1\t/'
>
> In this case, I'm using "*", which means "match zero or more of the
> preceding expression" - which in this case is \t. That regex is
> enclosed in \(...\) to group the result, which is then back
> referenced in the output expression by \1 (first group backref).
>
> Regexes are extremely and flexible once you've learnt how the
> multiple object matching rules work.

I know. But I don't see how your regex would take the number of
four-space groups and insert the same number of \t, which is what I
was trying to do, and AFAIK there is no way to do it with sed. I know
it could be done with awk, but writing it would take me more time
than re-running s/^        /\t\t/ with a manually changed number of
occurrences, from one to say 5 levels (or until I stop getting any
changes).

Cheers,
Jan

Dave Chinner Nov. 5, 2018, 9:52 p.m. UTC | #10

On Mon, Nov 05, 2018 at 01:25:45PM +0100, Jan Tulak wrote:
> On Mon, Nov 5, 2018 at 12:48 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Mon, Nov 05, 2018 at 11:15:34AM +0100, Jan Tulak wrote:
> > > On Fri, Nov 2, 2018 at 11:34 PM Dave Chinner <david@fromorbit.com> wrote:
> > > >
> > > > On Fri, Nov 02, 2018 at 12:43:18PM +0100, Jan Tulak wrote:
> > > > > On Fri, Nov 2, 2018 at 2:36 AM Dave Chinner <david@fromorbit.com> wrote:
> > > > > > On Thu, Nov 01, 2018 at 12:01:28PM +0100, Jan Tulak wrote:
> > > > > > However, it's is the same change as what you originally posted to a
> > > > >
> > > > > Yes, it is the same thing, with changes where I found something
> > > > > misaligned on top.
> > > > >
> > > > > > git tree, then it needs revision. basically, most of the change was
> > > > > > converting vertically aligned function call parameters to use tabs,
> > > > > > and that broke the vertical alignment.
> > > > >
> > > > > It is "s/    /\t/" limited to the beginning of the line.
> > > >
> > > > You mean 's/^    /\t/'?
> > >
> > > Yes, but in multiple iterations to get \t, \t\t, \t\t\t, ...
> >
> > Which is handled by this regex: 's/^\(\t*\)*    /\1\t/'
> >
> > In this case, I'm using "*", which means "match zero or more of the
> > preceding expression" - which in this case is \t. That regex is
> > enclosed in \(...\) to group the result, which is then back
> > referenced in the output expression by \1 (first group backref).
> >
> > Regexes are extremely and flexible once you've learnt how the
> > multiple object matching rules work.
> 
> I know. But I don't see how your regex would take the number of
> four-space groups and inserted the same number of \t,

I thought you were asking about having multiple tabs preceding
the "4 space group". If you simply want to change all 4 space
groups, it's 's/\(    \)/\t/g':

$ echo "                " |sed -e 's/\(    \)/T/g'
TTTT
$

The positional match selector suffix is the key here. 'g' means
"global match" and replaces every occurrence on the line. If you use
a number, it replaces the N'th occurrence:

$ echo "                " |sed -e 's/\(    \)/T/1'
T            
$ echo "                " |sed -e 's/\(    \)/T/2'
    T        
$ echo "                " |sed -e 's/\(    \)/T/3'
        T    
$ echo "                " |sed -e 's/\(    \)/T/4'
            T
> which is what I
> was trying to do and AFAIK there is no way to do it with sed. I know
> it could be done with awk,

awk is still regex based, it just allows you to get away with simple
regexes by adding complex code :P

> but writing it would take more time for me
> than re-running s/^        /\t\t/ with a manually changed number of
> occurrences, from one to say 5 levels (or until I stop getting any
> changes).

Grouping and positional selection is the answer here.
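
As a point of comparison, a minimal Python sketch can contrast a global replacement with one that only counts the 4-space groups at the start of the line and emits the same number of tabs. Python's `re.sub` accepts a function as the replacement, which sed's `s` command does not; the function names below are made up for illustration.

```python
import re

def all_groups_to_tabs(line: str) -> str:
    # Like sed 's/    /\t/g': every 4-space run on the line becomes a tab.
    return re.sub(r" {4}", "\t", line)

def leading_groups_to_tabs(line: str) -> str:
    # Only the run of 4-space groups at the start of the line is converted,
    # one tab per group; the replacement is computed from the match length.
    return re.sub(r"^(?: {4})+",
                  lambda m: "\t" * (len(m.group(0)) // 4),
                  line)

line = "        foo(a,    b);"   # two indent levels plus aligned argument
print(repr(all_groups_to_tabs(line)))      # alignment spaces also converted
print(repr(leading_groups_to_tabs(line)))  # alignment spaces preserved
```

The second form does in one pass what re-running the anchored substitution with a growing number of spaces achieves, while leaving whitespace after the first non-blank character alone.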

Cheers,

Dave.

Jan Tulak Nov. 8, 2018, 5:39 p.m. UTC | #11

On Mon, Nov 5, 2018 at 10:53 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Mon, Nov 05, 2018 at 01:25:45PM +0100, Jan Tulak wrote:
> > On Mon, Nov 5, 2018 at 12:48 PM Dave Chinner <david@fromorbit.com> wrote:
> > >
> > > On Mon, Nov 05, 2018 at 11:15:34AM +0100, Jan Tulak wrote:
> > > > On Fri, Nov 2, 2018 at 11:34 PM Dave Chinner <david@fromorbit.com> wrote:
> > > > >
> > > > > On Fri, Nov 02, 2018 at 12:43:18PM +0100, Jan Tulak wrote:
> > > > > > On Fri, Nov 2, 2018 at 2:36 AM Dave Chinner <david@fromorbit.com> wrote:
> > > > > > > On Thu, Nov 01, 2018 at 12:01:28PM +0100, Jan Tulak wrote:
> > > > > > > However, it's is the same change as what you originally posted to a
> > > > > >
> > > > > > Yes, it is the same thing, with changes where I found something
> > > > > > misaligned on top.
> > > > > >
> > > > > > > git tree, then it needs revision. basically, most of the change was
> > > > > > > converting vertically aligned function call parameters to use tabs,
> > > > > > > and that broke the vertical alignment.
> > > > > >
> > > > > > It is "s/    /\t/" limited to the beginning of the line.
> > > > >
> > > > > You mean 's/^    /\t/'?
> > > >
> > > > Yes, but in multiple iterations to get \t, \t\t, \t\t\t, ...
> > >
> > > Which is handled by this regex: 's/^\(\t*\)*    /\1\t/'
> > >
> > > In this case, I'm using "*", which means "match zero or more of the
> > > preceding expression" - which in this case is \t. That regex is
> > > enclosed in \(...\) to group the result, which is then back
> > > referenced in the output expression by \1 (first group backref).
> > >
> > > Regexes are extremely and flexible once you've learnt how the
> > > multiple object matching rules work.
> >
> > I know. But I don't see how your regex would take the number of
> > four-space groups and inserted the same number of \t,
>
> I thought you were asking about having multiple tabs preceding
> the "4 space group". If you simply want to change all 4 space
> groups, it's 's/\(    \)/\t/g':
>

[snip]

Thanks, but I know regular expressions, even if I usually don't get a
nontrivial expression right on the first try. :-D

Anyway, how about this? I pushed it into a new branch (style-nov-8th)
https://github.com/jtulak/xfsdump/tree/style-nov-8th
(git clone --single-branch -b style-nov-8th
https://github.com/jtulak/xfsdump.git)

It is now 16 patches, usually with one kind of change per patch. Three
of them are over 500k in size, so I will need to split them further
before sending them to the mailing list. But I ended up with a small
set of scripts that does it all, including the creation of commits, so
I don't have to deal with conflicts if I need to change anything.

Btw, I wonder if these formatting scripts could be useful if I push
them somewhere publicly... About 2/3 are sed replacements, the last
third are gradually stricter configurations for Uncrustify
(http://uncrustify.sourceforge.net/, available in distro repos).
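
To illustrate the kind of sed replacement discussed earlier in the
thread, here is a minimal sketch that turns every leading group of four
spaces into one tab and leaves spaces later in the line alone. It is not
one of the scripts from the branch: the function name indent_to_tabs is
hypothetical, and GNU sed is assumed because of the \t escapes.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the style-nov-8th branch.
# Assumes GNU sed, which understands \t in both pattern and replacement.
indent_to_tabs() {
    # :a  defines a label.
    # s   replaces the first four-space group that follows zero or more
    #     leading tabs with one more tab.
    # ta  jumps back to the label while a substitution was made, so
    #     8 leading spaces become 2 tabs, 12 become 3, and so on.
    sed -e ':a' -e 's/^\(\t*\)    /\1\t/' -e 'ta'
}

# Usage: the eight leading spaces are converted, the spaces between the
# arguments are kept as they are.
printf '        foo(a,    b);\n' | indent_to_tabs
```

Because the pattern is anchored at the start of the line and only allows
tabs before the four spaces, one to three leftover spaces (for example
from vertical alignment) stay in place after the tabs.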

Thanks,
Jan

Dave Chinner Nov. 9, 2018, 1:04 a.m. UTC | #12

On Thu, Nov 08, 2018 at 06:39:21PM +0100, Jan Tulak wrote:
> On Mon, Nov 5, 2018 at 10:53 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Mon, Nov 05, 2018 at 01:25:45PM +0100, Jan Tulak wrote:
> > > On Mon, Nov 5, 2018 at 12:48 PM Dave Chinner <david@fromorbit.com> wrote:
> > > >
> > > > On Mon, Nov 05, 2018 at 11:15:34AM +0100, Jan Tulak wrote:
> > > > > On Fri, Nov 2, 2018 at 11:34 PM Dave Chinner <david@fromorbit.com> wrote:
> > > > > >
> > > > > > On Fri, Nov 02, 2018 at 12:43:18PM +0100, Jan Tulak wrote:
> > > > > > > On Fri, Nov 2, 2018 at 2:36 AM Dave Chinner <david@fromorbit.com> wrote:
> > > > > > > > On Thu, Nov 01, 2018 at 12:01:28PM +0100, Jan Tulak wrote:
> > > > > > > > However, it's is the same change as what you originally posted to a
> > > > > > >
> > > > > > > Yes, it is the same thing, with changes where I found something
> > > > > > > misaligned on top.
> > > > > > >
> > > > > > > > git tree, then it needs revision. basically, most of the change was
> > > > > > > > converting vertically aligned function call parameters to use tabs,
> > > > > > > > and that broke the vertical alignment.
> > > > > > >
> > > > > > > It is "s/    /\t/" limited to the beginning of the line.
> > > > > >
> > > > > > You mean 's/^    /\t/'?
> > > > >
> > > > > Yes, but in multiple iterations to get \t, \t\t, \t\t\t, ...
> > > >
> > > > Which is handled by this regex: 's/^\(\t*\)*    /\1\t/'
> > > >
> > > > In this case, I'm using "*", which means "match zero or more of the
> > > > preceding expression" - which in this case is \t. That regex is
> > > > enclosed in \(...\) to group the result, which is then back
> > > > referenced in the output expression by \1 (first group backref).
> > > >
> > > > Regexes are extremely and flexible once you've learnt how the
> > > > multiple object matching rules work.
> > >
> > > I know. But I don't see how your regex would take the number of
> > > four-space groups and inserted the same number of \t,
> >
> > I thought you were asking about having multiple tabs preceding
> > the "4 space group". If you simply want to change all 4 space
> > groups, it's 's/\(    \)/\t/g':
> >
> 
> [snip]
> 
> Thanks, but I know regular expressions, even if I usually don't get a
> nontrivial expression on a first try. :-D
> 
> Anyway, how about this? I pushed it into a new branch (style-nov-8th)
> https://github.com/jtulak/xfsdump/tree/style-nov-8th
> (git clone --single-branch -b style-nov-8th
> https://github.com/jtulak/xfsdump.git)
> 
> It is now in 16 patches, split usually one change at a time. Three of
> them are over 500k in size, so I will need to cut them somehow into
> more parts before sending them into the mailing list.

Split them by dump/restore/inventory/common directories.
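
A rough sketch of one way to do that split, assuming each oversized
patch is a single commit. The function name split_commit_by_dir and the
split-*.diff file names are made up for illustration; only git diff with
a pathspec is relied on.

```shell
#!/bin/sh
# Hypothetical sketch: write one diff per top-level directory for a
# single commit. Names of the function and output files are invented.
split_commit_by_dir() {
    commit=$1
    for dir in common dump inventory restore; do
        # --quiet exits 0 when there is no difference under $dir,
        # so directories the commit does not touch are skipped.
        if git diff --quiet "$commit^" "$commit" -- "$dir"; then
            continue
        fi
        # Limit the diff between the commit and its parent to $dir.
        git diff "$commit^" "$commit" -- "$dir" > "split-$dir.diff"
    done
}

# Usage (inside the repository): split_commit_by_dir HEAD
```

These are plain diffs without commit metadata, so each part would still
need to be committed with its own message before being mailed.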

> But I ended up
> with a small script set that does it all, including the creation of
> commits, so I don't have to deal with conflicts if I will need to
> change anything.
> 
> Btw, I wonder if these formatting scripts could be useful if I push
> them somewhere publicly... About 2/3 are sed replacements, the last
> third are gradually stricter configurations for Uncrustify
> (http://uncrustify.sourceforge.net/, available in distro repos).

The scripts used for the transformation should be documented in the
commit messages, e.g.:

https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git/commit/?id=37b3b4d6c38aaf501dd7714d9670bff4ac282923

Cheers,

Dave.
