[HACK] fs: dodge atomic in putname if ref == 1

Message ID 20240604132448.101183-1-mjguzik@gmail.com (mailing list archive)
State New

Commit Message

Mateusz Guzik June 4, 2024, 1:24 p.m. UTC
The struct used to be refcounted with regular inc/dec ops, atomic usage
showed up in commit 03adc61edad4 ("audit,io_uring: io_uring openat
triggers audit reference count underflow").

If putname spots a count of 1 there is no legitimate way for anyone to
bump it and these modifications are low traffic (names are not heavily
shared), thus one can do a load first and if the value of 1 is found the
atomic can be elided -- this is the last reference.
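
A minimal userspace sketch of the load-first idea, using C11 atomics
rather than the kernel's atomic_t API. struct name_ref and name_ref_put
are hypothetical names, not anything in fs/namei.c, and the acquire
ordering on the load is an assumption of this sketch (the patch itself
uses a plain atomic_read):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct filename; only the refcount matters. */
struct name_ref {
	atomic_int refcnt;
};

/*
 * Drop one reference. Returns true if the caller held the last
 * reference and must now free the object.
 */
static bool name_ref_put(struct name_ref *n)
{
	/* Plain load first: no read-modify-write on the common path. */
	int cnt = atomic_load_explicit(&n->refcnt, memory_order_acquire);

	/* Sole owner: nobody else can legitimately bump the count. */
	if (cnt == 1)
		return true;

	/* Underflow; the kernel code warns and bails out here. */
	if (cnt == 0)
		return false;

	/* Shared name: fall back to the atomic decrement. */
	return atomic_fetch_sub_explicit(&n->refcnt, 1,
					 memory_order_acq_rel) == 1;
}
```

As in the patch, the fast path leaves the counter at 1 instead of
writing 0, since the object is about to be freed anyway.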

When performing a failed open this reduces putname on the profile from
~1.60% to ~0.2% and bumps the syscall rate by just shy of 1% (the
discrepancy is due to now bigger stalls elsewhere).

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---

This is a lazy hack.

The race is only possible with io_uring which has a dedicated entry
point, thus a getname variant which takes it into account could store
the need to use atomics as a flag in struct filename. To that end
getname could take a boolean indicating this, fronted with some inlines
and the current entry point renamed to __getname_flags to hide it.

Option B is to add a routine which "upgrades" to atomics after getname
returns, but that's a little fishy vs audit_reusename.

At the end of the day all spots which modify the ref could branch on the
atomics flag.
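
A rough userspace sketch of what branching on such a flag could look
like, again with C11 atomics. Everything here is hypothetical: struct
filename has no needs_atomic field today, and flagged_name_init/get/put
are made-up names standing in for the getname variant, the ref-grabbing
routine and putname:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout; the flag would be set once, at allocation time. */
struct flagged_name {
	atomic_int refcnt;
	bool needs_atomic;	/* true only for the io_uring entry point */
};

static void flagged_name_init(struct flagged_name *n, bool needs_atomic)
{
	atomic_init(&n->refcnt, 1);
	n->needs_atomic = needs_atomic;
}

/* Take an extra reference. */
static void flagged_name_get(struct flagged_name *n)
{
	if (n->needs_atomic) {
		atomic_fetch_add_explicit(&n->refcnt, 1, memory_order_relaxed);
	} else {
		/* Single owner: separate load and store, no atomic RMW. */
		int cnt = atomic_load_explicit(&n->refcnt, memory_order_relaxed);
		atomic_store_explicit(&n->refcnt, cnt + 1, memory_order_relaxed);
	}
}

/* Drop a reference. Returns true when the last one is gone. */
static bool flagged_name_put(struct flagged_name *n)
{
	int cnt;

	if (n->needs_atomic)
		return atomic_fetch_sub_explicit(&n->refcnt, 1,
						 memory_order_acq_rel) == 1;

	cnt = atomic_load_explicit(&n->refcnt, memory_order_relaxed) - 1;
	atomic_store_explicit(&n->refcnt, cnt, memory_order_relaxed);
	return cnt == 0;
}
```

The non-atomic branches are only safe if the flag is fixed before the
name becomes visible to a second thread, which is what makes the
"upgrade after getname returns" variant the fishier of the two.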

I opted to not do it since the hack below undoes the problem for me.

I'm not going to fight for this hack though, it is merely a placeholder
until someone(tm) fixes things.

If the hack is considered a no-go and the approach described above is
considered fine, I can submit a patch some time this month to sort it
out, provided someone tells me how to name a routine which grabs a ref
-- the op is currently open-coded and "getname" allocates instead of
merely taking a ref. Would "refname" do it?

 fs/namei.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Comments

Christian Brauner June 5, 2024, 3:20 p.m. UTC | #1
On Tue, Jun 04, 2024 at 03:24:48PM +0200, Mateusz Guzik wrote:
> The struct used to be refcounted with regular inc/dec ops, atomic usage
> showed up in commit 03adc61edad4 ("audit,io_uring: io_uring openat
> triggers audit reference count underflow").
> 
> If putname spots a count of 1 there is no legitimate way for anyone to
> bump it and these modifications are low traffic (names are not heavily
> shared), thus one can do a load first and if the value of 1 is found the
> atomic can be elided -- this is the last reference.
> 
> When performing a failed open this reduces putname on the profile from
> ~1.60% to ~0.2% and bumps the syscall rate by just shy of 1% (the
> discrepancy is due to now bigger stalls elsewhere).

I suspect you haven't turned audit on in general because that would give
you performance impact in a bunch of places. Can't we just do something
where we e.g., use plain refcounts if audit isn't turned on?
(audit_dummy_context() or whatever it's called).

> 
> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
> ---
> 
> This is a lazy hack.
> 
> The race is only possible with io_uring which has a dedicated entry
> point, thus a getname variant which takes it into account could store
> the need to use atomics as a flag in struct filename. To that end
> getname could take a boolean indicating this, fronted with some inlines
> and the current entry point renamed to __getname_flags to hide it.
> 
> Option B is to add a routine which "upgrades" to atomics after getname
> returns, but that's a little fishy vs audit_reusename.
> 
> At the end of the day all spots which modify the ref could branch on the
> atomics flag.
> 
> I opted to not do it since the hack below undoes the problem for me.
> 
> I'm not going to fight for this hack though, it is merely a placeholder
> until someone(tm) fixes things.
> 
> If the hack is considered a no-go and the approach described above is
> considered fine, I can submit a patch some time this month to sort it
> out, provided someone tells me how to name a routine which grabs a ref
> -- the op is currently open-coded and "getname" allocates instead of
> merely taking a ref. Would "refname" do it?
> 
>  fs/namei.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index 37fb0a8aa09a..f9440bdb21d0 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -260,11 +260,13 @@ void putname(struct filename *name)
>  	if (IS_ERR(name))
>  		return;
>  
> -	if (WARN_ON_ONCE(!atomic_read(&name->refcnt)))
> -		return;
> +	if (unlikely(atomic_read(&name->refcnt) != 1)) {
> +		if (WARN_ON_ONCE(!atomic_read(&name->refcnt)))
> +			return;
>  
> -	if (!atomic_dec_and_test(&name->refcnt))
> -		return;
> +		if (!atomic_dec_and_test(&name->refcnt))
> +			return;
> +	}
>  
>  	if (name->name != name->iname) {
>  		__putname(name->name);
> -- 
> 2.39.2
>
Mateusz Guzik June 5, 2024, 3:23 p.m. UTC | #2
On Wed, Jun 5, 2024 at 5:20 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Tue, Jun 04, 2024 at 03:24:48PM +0200, Mateusz Guzik wrote:
> > The struct used to be refcounted with regular inc/dec ops, atomic usage
> > showed up in commit 03adc61edad4 ("audit,io_uring: io_uring openat
> > triggers audit reference count underflow").
> >
> > If putname spots a count of 1 there is no legitimate way for anyone to
> > bump it and these modifications are low traffic (names are not heavily
> > shared), thus one can do a load first and if the value of 1 is found the
> > atomic can be elided -- this is the last reference.
> >
> > When performing a failed open this reduces putname on the profile from
> > ~1.60% to ~0.2% and bumps the syscall rate by just shy of 1% (the
> > discrepancy is due to now bigger stalls elsewhere).
>
> I suspect you haven't turned audit on in general because that would give
> you performance impact in a bunch of places. Can't we just do something
> where we e.g., use plain refcounts if audit isn't turned on?
> (audit_dummy_context() or whatever it's called).
>

That would still give atomics for audit users which don't play with io_uring.

The part below the --- describes one idea for what to do about this.

> >
> > Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
> > ---
> >
> > This is a lazy hack.
> >
> > The race is only possible with io_uring which has a dedicated entry
> > point, thus a getname variant which takes it into account could store
> > the need to use atomics as a flag in struct filename. To that end
> > getname could take a boolean indicating this, fronted with some inlines
> > and the current entry point renamed to __getname_flags to hide it.
> >
> > Option B is to add a routine which "upgrades" to atomics after getname
> > returns, but that's a little fishy vs audit_reusename.
> >
> > At the end of the day all spots which modify the ref could branch on the
> > atomics flag.
> >
> > I opted to not do it since the hack below undoes the problem for me.
> >
> > I'm not going to fight for this hack though, it is merely a placeholder
> > until someone(tm) fixes things.
> >
> > If the hack is considered a no-go and the approach described above is
> > considered fine, I can submit a patch some time this month to sort it
> > out, provided someone tells me how to name a routine which grabs a ref
> > -- the op is currently open-coded and "getname" allocates instead of
> > merely taking a ref. Would "refname" do it?
> >
> >  fs/namei.c | 10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/namei.c b/fs/namei.c
> > index 37fb0a8aa09a..f9440bdb21d0 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -260,11 +260,13 @@ void putname(struct filename *name)
> >       if (IS_ERR(name))
> >               return;
> >
> > -     if (WARN_ON_ONCE(!atomic_read(&name->refcnt)))
> > -             return;
> > +     if (unlikely(atomic_read(&name->refcnt) != 1)) {
> > +             if (WARN_ON_ONCE(!atomic_read(&name->refcnt)))
> > +                     return;
> >
> > -     if (!atomic_dec_and_test(&name->refcnt))
> > -             return;
> > +             if (!atomic_dec_and_test(&name->refcnt))
> > +                     return;
> > +     }
> >
> >       if (name->name != name->iname) {
> >               __putname(name->name);
> > --
> > 2.39.2
> >
Patch

diff --git a/fs/namei.c b/fs/namei.c
index 37fb0a8aa09a..f9440bdb21d0 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -260,11 +260,13 @@  void putname(struct filename *name)
 	if (IS_ERR(name))
 		return;
 
-	if (WARN_ON_ONCE(!atomic_read(&name->refcnt)))
-		return;
+	if (unlikely(atomic_read(&name->refcnt) != 1)) {
+		if (WARN_ON_ONCE(!atomic_read(&name->refcnt)))
+			return;
 
-	if (!atomic_dec_and_test(&name->refcnt))
-		return;
+		if (!atomic_dec_and_test(&name->refcnt))
+			return;
+	}
 
 	if (name->name != name->iname) {
 		__putname(name->name);