generic/581: remove extra escape character from awk line

Message ID	ZV40XOGNJdru13XE@suse.de (mailing list archive)
State	New, archived
Headers	show Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="bygTLAOw"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="0/7K2Wim" Date: Wed, 22 Nov 2023 17:03:24 +0000 From: Luis Henriques <lhenriques@suse.de> To: fstests@vger.kernel.org Subject: [PATCH] generic/581: remove extra escape character from awk line Message-ID: <ZV40XOGNJdru13XE@suse.de> Precedence: bulk MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline
Series	generic/581: remove extra escape character from awk line \| expand generic/581: remove extra escape character from awk line

Luis Henriques Nov. 22, 2023, 5:03 p.m. UTC

Checking the keys in /proc/key-users is buggy, as there's an extra '\'
character: in '{print \$4}' the '$4' shouldn't be escaped otherwise the
'awk' command will fail.  This has passed unnoticed because the output
is sent to '_user_do' function and the result assigned to a variable.

While there, replace 'awk' by $AWK_PROG.

Signed-off-by: Luis Henriques <lhenriques@suse.de>
---
 tests/generic/581 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Hi!

Please note that I'm not an 'awk' expert and I may be wrong!  But if I do
see an error if I run something like:

$ awk '/^[[:space:]]*1000:/{print \$4}' /proc/key-users 
awk: cmd. line:1: /^[[:space:]]*1000:/{print \$4}
awk: cmd. line:1:                            ^ backslash not last character on line

But maybe this depends on the awk implementation, although I've tried a few.

Eric Biggers Nov. 22, 2023, 11:16 p.m. UTC | #1

On Wed, Nov 22, 2023 at 05:03:24PM +0000, Luis Henriques wrote:
> Checking the keys in /proc/key-users is buggy, as there's an extra '\'
> character: in '{print \$4}' the '$4' shouldn't be escaped otherwise the
> 'awk' command will fail.  This has passed unnoticed because the output
> is sent to '_user_do' function and the result assigned to a variable.
> 
> While there, replace 'awk' by $AWK_PROG.
> 
> Signed-off-by: Luis Henriques <lhenriques@suse.de>
> ---
>  tests/generic/581 | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Hi!
> 
> Please note that I'm not an 'awk' expert and I may be wrong!  But if I do
> see an error if I run something like:
> 
> $ awk '/^[[:space:]]*1000:/{print \$4}' /proc/key-users 
> awk: cmd. line:1: /^[[:space:]]*1000:/{print \$4}
> awk: cmd. line:1:                            ^ backslash not last character on line
> 
> But maybe this depends on the awk implementation, although I've tried a few.
> 
> diff --git a/tests/generic/581 b/tests/generic/581
> index cabc7e1c69ab..1a4b571d40ce 100755
> --- a/tests/generic/581
> +++ b/tests/generic/581
> @@ -92,7 +92,7 @@ while grep -E -q '^[0-9a-f]+ [^ ]*i[^ ]*' /proc/keys; do
>  done
>  
>  # Set the user key quota to the fsgqa user's current number of keys plus 5.
> -orig_keys=$(_user_do "awk '/^[[:space:]]*$(id -u fsgqa):/{print \$4}' /proc/key-users | cut -d/ -f1")
> +orig_keys=$(_user_do "$AWK_PROG '/^[[:space:]]*$(id -u fsgqa):/{print $4}' /proc/key-users | cut -d/ -f1")

The backslash is needed to prevent $4 from being expanded by bash, because the
whole pipeline with 'awk' and 'cut' is in a double-quoted string:

    "awk '/^[[:space:]]*$(id -u fsgqa):/{print \$4}' /proc/key-users | cut -d/ -f1"

Without escaping the $, bash would replace $4 with the empty string while it's
doing expansions on the whole double-quoted string.

- Eric

Luís Henriques Nov. 23, 2023, 11:46 a.m. UTC | #2

(Sorry for replying from private email address)

On Wed, Nov 22, 2023 at 03:16:48PM -0800, Eric Biggers wrote:
> On Wed, Nov 22, 2023 at 05:03:24PM +0000, Luis Henriques wrote:
> > Checking the keys in /proc/key-users is buggy, as there's an extra '\'
> > character: in '{print \$4}' the '$4' shouldn't be escaped otherwise the
> > 'awk' command will fail.  This has passed unnoticed because the output
> > is sent to '_user_do' function and the result assigned to a variable.
> > 
> > While there, replace 'awk' by $AWK_PROG.
> > 
> > Signed-off-by: Luis Henriques <lhenriques@suse.de>
> > ---
> >  tests/generic/581 | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > Hi!
> > 
> > Please note that I'm not an 'awk' expert and I may be wrong!  But if I do
> > see an error if I run something like:
> > 
> > $ awk '/^[[:space:]]*1000:/{print \$4}' /proc/key-users 
> > awk: cmd. line:1: /^[[:space:]]*1000:/{print \$4}
> > awk: cmd. line:1:                            ^ backslash not last character on line
> > 
> > But maybe this depends on the awk implementation, although I've tried a few.
> > 
> > diff --git a/tests/generic/581 b/tests/generic/581
> > index cabc7e1c69ab..1a4b571d40ce 100755
> > --- a/tests/generic/581
> > +++ b/tests/generic/581
> > @@ -92,7 +92,7 @@ while grep -E -q '^[0-9a-f]+ [^ ]*i[^ ]*' /proc/keys; do
> >  done
> >  
> >  # Set the user key quota to the fsgqa user's current number of keys plus 5.
> > -orig_keys=$(_user_do "awk '/^[[:space:]]*$(id -u fsgqa):/{print \$4}' /proc/key-users | cut -d/ -f1")
> > +orig_keys=$(_user_do "$AWK_PROG '/^[[:space:]]*$(id -u fsgqa):/{print $4}' /proc/key-users | cut -d/ -f1")
> 
> The backslash is needed to prevent $4 from being expanded by bash, because the
> whole pipeline with 'awk' and 'cut' is in a double-quoted string:
> 
>     "awk '/^[[:space:]]*$(id -u fsgqa):/{print \$4}' /proc/key-users | cut -d/ -f1"
> 
> Without escaping the $, bash would replace $4 with the empty string while it's
> doing expansions on the whole double-quoted string.

/me blushes

Yeah, looking closer it makes sense.  Sorry for the noise.  I'm currently
investigating a test failure (which I can't reproduce locally) where
'orig_key' unexpectedly is set to '1' and makes the test fail because it
was supposed to be '0'.  That's when this caught my attention.  Anyway,
I'll go look somewhere else.

Cheers,
--
Luís

Luis Henriques Nov. 28, 2023, 2:16 p.m. UTC | #3

Luís Henriques <henrix@camandro.org> writes:

> (Sorry for replying from private email address)
>
> On Wed, Nov 22, 2023 at 03:16:48PM -0800, Eric Biggers wrote:
>> On Wed, Nov 22, 2023 at 05:03:24PM +0000, Luis Henriques wrote:
>> > Checking the keys in /proc/key-users is buggy, as there's an extra '\'
>> > character: in '{print \$4}' the '$4' shouldn't be escaped otherwise the
>> > 'awk' command will fail.  This has passed unnoticed because the output
>> > is sent to '_user_do' function and the result assigned to a variable.
>> > 
>> > While there, replace 'awk' by $AWK_PROG.
>> > 
>> > Signed-off-by: Luis Henriques <lhenriques@suse.de>
>> > ---
>> >  tests/generic/581 | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > 
>> > Hi!
>> > 
>> > Please note that I'm not an 'awk' expert and I may be wrong!  But if I do
>> > see an error if I run something like:
>> > 
>> > $ awk '/^[[:space:]]*1000:/{print \$4}' /proc/key-users 
>> > awk: cmd. line:1: /^[[:space:]]*1000:/{print \$4}
>> > awk: cmd. line:1:                            ^ backslash not last character on line
>> > 
>> > But maybe this depends on the awk implementation, although I've tried a few.
>> > 
>> > diff --git a/tests/generic/581 b/tests/generic/581
>> > index cabc7e1c69ab..1a4b571d40ce 100755
>> > --- a/tests/generic/581
>> > +++ b/tests/generic/581
>> > @@ -92,7 +92,7 @@ while grep -E -q '^[0-9a-f]+ [^ ]*i[^ ]*' /proc/keys; do
>> >  done
>> >  
>> >  # Set the user key quota to the fsgqa user's current number of keys plus 5.
>> > -orig_keys=$(_user_do "awk '/^[[:space:]]*$(id -u fsgqa):/{print \$4}' /proc/key-users | cut -d/ -f1")
>> > +orig_keys=$(_user_do "$AWK_PROG '/^[[:space:]]*$(id -u fsgqa):/{print $4}' /proc/key-users | cut -d/ -f1")
>> 
>> The backslash is needed to prevent $4 from being expanded by bash, because the
>> whole pipeline with 'awk' and 'cut' is in a double-quoted string:
>> 
>>     "awk '/^[[:space:]]*$(id -u fsgqa):/{print \$4}' /proc/key-users | cut -d/ -f1"
>> 
>> Without escaping the $, bash would replace $4 with the empty string while it's
>> doing expansions on the whole double-quoted string.
>
> /me blushes
>
> Yeah, looking closer it makes sense.  Sorry for the noise.  I'm currently
> investigating a test failure (which I can't reproduce locally) where
> 'orig_key' unexpectedly is set to '1' and makes the test fail because it
> was supposed to be '0'.  That's when this caught my attention.  Anyway,
> I'll go look somewhere else.

OK, I'm not 100% sure yet, but I've an idea about what's going on with
this test failure.

I _think_ that even after the following is done in the test:

    _user_do_rm_enckey $SCRATCH_MNT $keyid
    _scratch_cycle_mount

the key garbage collector may not have finish running.  And then, when we
read '/proc/key-users', we can race against key_user_put(), which needs
key_user_lock, which is also grabbed while the proc file seq_operations
are run.

Eric, does this make any sense?  There is a loop in the test to wait for
invalidated keys, but I believe it's not relevant anymore since commit
d7e7b9af104c ("fscrypt: stop using keyrings subsystem for
fscrypt_master_key").  But I might be misunderstanding the code.

Cheers,

Eric Biggers Nov. 28, 2023, 5:37 p.m. UTC | #4

On Tue, Nov 28, 2023 at 02:16:44PM +0000, Luis Henriques wrote:
> >
> > Yeah, looking closer it makes sense.  Sorry for the noise.  I'm currently
> > investigating a test failure (which I can't reproduce locally) where
> > 'orig_key' unexpectedly is set to '1' and makes the test fail because it
> > was supposed to be '0'.  That's when this caught my attention.  Anyway,
> > I'll go look somewhere else.
> 
> OK, I'm not 100% sure yet, but I've an idea about what's going on with
> this test failure.
> 
> I _think_ that even after the following is done in the test:
> 
>     _user_do_rm_enckey $SCRATCH_MNT $keyid
>     _scratch_cycle_mount
> 
> the key garbage collector may not have finish running.  And then, when we
> read '/proc/key-users', we can race against key_user_put(), which needs
> key_user_lock, which is also grabbed while the proc file seq_operations
> are run.
> 
> Eric, does this make any sense?  There is a loop in the test to wait for
> invalidated keys, but I believe it's not relevant anymore since commit
> d7e7b9af104c ("fscrypt: stop using keyrings subsystem for
> fscrypt_master_key").  But I might be misunderstanding the code.

Thanks for looking into this!  I had noticed this test is still flaky on arm64
but haven't had a chance to look into it.  Yes, it's probably related to the key
garbage collector again.  The test needs to wait for the fscrypt "user" keys
(key_type_fscrypt_user in the kernel) to be released from the quota.  I think
that loop in the test does not have the intended effect because it waits for
"invalidated" keys, but the fscrypt "user" keys (which are charged to the quota)
are never invalidated; they're just released normally.  There used to be another
key (in the "keyrings" subsystem sense of the word "key") associated with each
fscrypt key, and that key was indeed invalidated, but that's no longer the case.

Maybe there's a better way to wait for the key garbage collector to finish.

Or the kernel could be changed to make releasing the key quota be synchronous.
Unfortunately the keyrings subsystem doesn't seem to work that way, and fscrypt
is tying into the key quota from the keyrings subsystem, so it is subject to its
limitations.  But maybe there's a way to do it.

- Eric

Luis Henriques Nov. 28, 2023, 6:25 p.m. UTC | #5

Eric Biggers <ebiggers@kernel.org> writes:

> On Tue, Nov 28, 2023 at 02:16:44PM +0000, Luis Henriques wrote:
>> >
>> > Yeah, looking closer it makes sense.  Sorry for the noise.  I'm currently
>> > investigating a test failure (which I can't reproduce locally) where
>> > 'orig_key' unexpectedly is set to '1' and makes the test fail because it
>> > was supposed to be '0'.  That's when this caught my attention.  Anyway,
>> > I'll go look somewhere else.
>> 
>> OK, I'm not 100% sure yet, but I've an idea about what's going on with
>> this test failure.
>> 
>> I _think_ that even after the following is done in the test:
>> 
>>     _user_do_rm_enckey $SCRATCH_MNT $keyid
>>     _scratch_cycle_mount
>> 
>> the key garbage collector may not have finish running.  And then, when we
>> read '/proc/key-users', we can race against key_user_put(), which needs
>> key_user_lock, which is also grabbed while the proc file seq_operations
>> are run.
>> 
>> Eric, does this make any sense?  There is a loop in the test to wait for
>> invalidated keys, but I believe it's not relevant anymore since commit
>> d7e7b9af104c ("fscrypt: stop using keyrings subsystem for
>> fscrypt_master_key").  But I might be misunderstanding the code.
>
> Thanks for looking into this!  I had noticed this test is still flaky on arm64
> but haven't had a chance to look into it.  Yes, it's probably related to the key
> garbage collector again.  The test needs to wait for the fscrypt "user" keys
> (key_type_fscrypt_user in the kernel) to be released from the quota.  I think
> that loop in the test does not have the intended effect because it waits for
> "invalidated" keys, but the fscrypt "user" keys (which are charged to the quota)
> are never invalidated; they're just released normally.  There used to be another
> key (in the "keyrings" subsystem sense of the word "key") associated with each
> fscrypt key, and that key was indeed invalidated, but that's no longer the case.
>

Awesome, thanks for confirming this.  That loop probably made sense back
when keys were invalidated -- that behaviour was changed by the commit I
mentioned, I believe.  Anyway, it's probably better to keep it the loop
for testing old kernels, as it doesn't really hurt.

> 
> Maybe there's a better way to wait for the key garbage collector to
> finish.
>
> Or the kernel could be changed to make releasing the key quota be synchronous.
> Unfortunately the keyrings subsystem doesn't seem to work that way, and fscrypt
> is tying into the key quota from the keyrings subsystem, so it is subject to its
> limitations.  But maybe there's a way to do it.

Hmm... yeah, that requires a closer look at that subsystem to see if
something can be done.  I'll try to find something there that would make
that test more reliable.

Cheers,

Luis Henriques Dec. 5, 2023, 5:28 p.m. UTC | #6

Luis Henriques <lhenriques@suse.de> writes:

> Eric Biggers <ebiggers@kernel.org> writes:
>
>> On Tue, Nov 28, 2023 at 02:16:44PM +0000, Luis Henriques wrote:
>>> >
>>> > Yeah, looking closer it makes sense.  Sorry for the noise.  I'm currently
>>> > investigating a test failure (which I can't reproduce locally) where
>>> > 'orig_key' unexpectedly is set to '1' and makes the test fail because it
>>> > was supposed to be '0'.  That's when this caught my attention.  Anyway,
>>> > I'll go look somewhere else.
>>> 
>>> OK, I'm not 100% sure yet, but I've an idea about what's going on with
>>> this test failure.
>>> 
>>> I _think_ that even after the following is done in the test:
>>> 
>>>     _user_do_rm_enckey $SCRATCH_MNT $keyid
>>>     _scratch_cycle_mount
>>> 
>>> the key garbage collector may not have finish running.  And then, when we
>>> read '/proc/key-users', we can race against key_user_put(), which needs
>>> key_user_lock, which is also grabbed while the proc file seq_operations
>>> are run.
>>> 
>>> Eric, does this make any sense?  There is a loop in the test to wait for
>>> invalidated keys, but I believe it's not relevant anymore since commit
>>> d7e7b9af104c ("fscrypt: stop using keyrings subsystem for
>>> fscrypt_master_key").  But I might be misunderstanding the code.
>>
>> Thanks for looking into this!  I had noticed this test is still flaky on arm64
>> but haven't had a chance to look into it.  Yes, it's probably related to the key
>> garbage collector again.  The test needs to wait for the fscrypt "user" keys
>> (key_type_fscrypt_user in the kernel) to be released from the quota.  I think
>> that loop in the test does not have the intended effect because it waits for
>> "invalidated" keys, but the fscrypt "user" keys (which are charged to the quota)
>> are never invalidated; they're just released normally.  There used to be another
>> key (in the "keyrings" subsystem sense of the word "key") associated with each
>> fscrypt key, and that key was indeed invalidated, but that's no longer the case.
>>
>
> Awesome, thanks for confirming this.  That loop probably made sense back
> when keys were invalidated -- that behaviour was changed by the commit I
> mentioned, I believe.  Anyway, it's probably better to keep it the loop
> for testing old kernels, as it doesn't really hurt.
>
>> 
>> Maybe there's a better way to wait for the key garbage collector to
>> finish.
>>
>> Or the kernel could be changed to make releasing the key quota be synchronous.
>> Unfortunately the keyrings subsystem doesn't seem to work that way, and fscrypt
>> is tying into the key quota from the keyrings subsystem, so it is subject to its
>> limitations.  But maybe there's a way to do it.
>
> Hmm... yeah, that requires a closer look at that subsystem to see if
> something can be done.  I'll try to find something there that would make
> that test more reliable.

OK, I've finally spent some time looking at this and I see two options to
fix this.

- Option 1:

Add some logic to the test to try to close this race.  The loop that adds
keys and expects the last key to fail could be changed to update the keys
quota if the number of keys in '/proc/key-users' isn't the one it was
expected after adding a new one.  It's hacky and still racy, as the GC
could still run in between.  I've an attempt to implement this, which
_seems_ to be working, but I haven't spent too much time testing it,
because... it's ugly and option 2 looks better.

- Option 2:

Modify the '/proc/key-users' ->start() handler so that the first thing it
does is to call flush_work() and force the GC to run.  This doesn't
completely close the race window, but I think it makes things a bit more
reliable to userspace scripts/applications relying on the data.

Also, this flushing should probably be done for '/proc/keys' as well, but
the patch I've been hammering is doing it for '/proc/key-users' only.  So
far, it seems to do the job and I can't reproduce the test failure with it
applied.

Obviously, forcing GC to run will have a performance impact, but I'm also
not sure how bad that would be and how much userspace expects a read to
these procfs files to perform.

Anyway, just for reference, I'm attaching bellow the patch I've been
testing.  Let me know what you think about it.  I'm still running more
tests, but if the patch looks good to you I can send out a proper RFC.

Cheers,

generic/581: remove extra escape character from awk line

Commit Message

Comments

Patch