
[net-next] once: add DO_ONCE_SLOW() for sleepable contexts

Message ID 20221001205102.2319658-1-eric.dumazet@gmail.com (mailing list archive)
State Accepted
Commit 62c07983bef9d3e78e71189441e1a470f0d1e653
Delegated to: Netdev Maintainers
Series [net-next] once: add DO_ONCE_SLOW() for sleepable contexts

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 17974 this patch: 17976
netdev/cc_maintainers warning 4 maintainers not CCed: wuchi.zero@gmail.com dsahern@kernel.org akpm@linux-foundation.org yoshfuji@linux-ipv6.org
netdev/build_clang success Errors and warnings before: 4490 this patch: 4490
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn fail Errors and warnings before: 18764 this patch: 18766
netdev/checkpatch fail ERROR: do not initialise statics to false
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Eric Dumazet Oct. 1, 2022, 8:51 p.m. UTC
From: Eric Dumazet <edumazet@google.com>

Christophe Leroy reported a ~80ms latency spike
happening at first TCP connect() time.

This is because __inet_hash_connect() uses get_random_once()
to populate a perturbation table which became quite big
after commit 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16")

get_random_once() uses DO_ONCE(), which blocks hard irqs for the duration
of the operation.

This patch adds DO_ONCE_SLOW() which uses a mutex instead of a spinlock
for operations where we prefer to stay in process context.

Then __inet_hash_connect() can use get_random_slow_once()
to populate its perturbation table.

Fixes: 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16")
Fixes: 190cc82489f4 ("tcp: change source port randomizarion at connect() time")
Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://lore.kernel.org/netdev/CANn89iLAEYBaoYajy0Y9UmGFff5GPxDUoG-ErVB2jDdRNQ5Tug@mail.gmail.com/T/#t
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willy Tarreau <w@1wt.eu>
---
 include/linux/once.h       | 28 ++++++++++++++++++++++++++++
 lib/once.c                 | 30 ++++++++++++++++++++++++++++++
 net/ipv4/inet_hashtables.c |  4 ++--
 3 files changed, 60 insertions(+), 2 deletions(-)
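
For reference, a minimal sketch of the intended call-site usage, assuming a
hypothetical example_connect_path() running in process context; only
get_random_slow_once() comes from this patch, and the real call site is
__inet_hash_connect() in the diff below:

#include <linux/once.h>
#include <linux/random.h>

static u32 example_table[1 << 16];	/* large table, as in the TCP case */

static void example_connect_path(void)
{
	/* The first caller fills the table under a mutex (and may sleep);
	 * later callers skip everything via the static branch.
	 */
	get_random_slow_once(example_table, sizeof(example_table));
}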

Comments

Willy Tarreau Oct. 1, 2022, 9:15 p.m. UTC | #1
Hi Eric,

On Sat, Oct 01, 2022 at 01:51:02PM -0700, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Christophe Leroy reported a ~80ms latency spike
> happening at first TCP connect() time.

Seeing Christophe's message also made me wonder if we didn't break
something back then :-/

> This is because __inet_hash_connect() uses get_random_once()
> to populate a perturbation table which became quite big
> after commit 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16")
> 
> get_random_once() uses DO_ONCE(), which block hard irqs for the duration
> of the operation.
> 
> This patch adds DO_ONCE_SLOW() which uses a mutex instead of a spinlock
> for operations where we prefer to stay in process context.

That's a nice improvement I think. I was wondering if, for this special
case, we *really* need an exclusive DO_ONCE(). I mean, we're getting
random bytes, we really do not care if two CPUs change them in parallel
provided that none uses them before the table is entirely filled. Thus
that could probably end up as something like:

    if (!atomic_read(&done)) {
        get_random_bytes(array);
        atomic_set(&done, 1);
    }

In any case, your solution remains cleaner and more robust, though.

Thanks,
Willy
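
Willy's pattern above, filled out as a self-contained sketch with hypothetical
names; the point is that parallel fillers are tolerated, as long as nothing
reads the table before it has been fully written:

#include <linux/atomic.h>
#include <linux/random.h>

static atomic_t table_filled = ATOMIC_INIT(0);
static u32 example_table[1 << 16];

static void example_fill_table(void)
{
	if (!atomic_read(&table_filled)) {
		/* Two CPUs may race here and overwrite each other's random
		 * bytes; for this use case that is harmless.
		 */
		get_random_bytes(example_table, sizeof(example_table));
		atomic_set(&table_filled, 1);
	}
}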
Jason A. Donenfeld Oct. 1, 2022, 10:44 p.m. UTC | #2
On Sat, Oct 01, 2022 at 01:51:02PM -0700, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Christophe Leroy reported a ~80ms latency spike
> happening at first TCP connect() time.
> 
> This is because __inet_hash_connect() uses get_random_once()
> to populate a perturbation table which became quite big
> after commit 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16")
> 
> get_random_once() uses DO_ONCE(), which block hard irqs for the duration
> of the operation.
> 
> This patch adds DO_ONCE_SLOW() which uses a mutex instead of a spinlock
> for operations where we prefer to stay in process context.
> 
> Then __inet_hash_connect() can use get_random_slow_once()
> to populate its perturbation table.
> 
> Fixes: 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16")
> Fixes: 190cc82489f4 ("tcp: change source port randomizarion at connect() time")
> Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> Link: https://lore.kernel.org/netdev/CANn89iLAEYBaoYajy0Y9UmGFff5GPxDUoG-ErVB2jDdRNQ5Tug@mail.gmail.com/T/#t
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Willy Tarreau <w@1wt.eu>
> ---
>  include/linux/once.h       | 28 ++++++++++++++++++++++++++++
>  lib/once.c                 | 30 ++++++++++++++++++++++++++++++
>  net/ipv4/inet_hashtables.c |  4 ++--
>  3 files changed, 60 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/once.h b/include/linux/once.h
> index b14d8b309d52b198bb144689fe67d9ed235c2b3e..176ab75b42df740a738d04d8480821a0b3b65ba9 100644
> --- a/include/linux/once.h
> +++ b/include/linux/once.h
> @@ -5,10 +5,18 @@
>  #include <linux/types.h>
>  #include <linux/jump_label.h>
>  
> +/* Helpers used from arbitrary contexts.
> + * Hard irqs are blocked, be cautious.
> + */
>  bool __do_once_start(bool *done, unsigned long *flags);
>  void __do_once_done(bool *done, struct static_key_true *once_key,
>  		    unsigned long *flags, struct module *mod);
>  
> +/* Variant for process contexts only. */
> +bool __do_once_slow_start(bool *done);
> +void __do_once_slow_done(bool *done, struct static_key_true *once_key,
> +			 struct module *mod);
> +
>  /* Call a function exactly once. The idea of DO_ONCE() is to perform
>   * a function call such as initialization of random seeds, etc, only
>   * once, where DO_ONCE() can live in the fast-path. After @func has
> @@ -52,7 +60,27 @@ void __do_once_done(bool *done, struct static_key_true *once_key,
>  		___ret;							     \
>  	})
>  
> +/* Variant of DO_ONCE() for process/sleepable contexts. */
> +#define DO_ONCE_SLOW(func, ...)						     \
> +	({								     \
> +		bool ___ret = false;					     \
> +		static bool __section(".data.once") ___done = false;	     \
> +		static DEFINE_STATIC_KEY_TRUE(___once_key);		     \
> +		if (static_branch_unlikely(&___once_key)) {		     \
> +			___ret = __do_once_slow_start(&___done);	     \
> +			if (unlikely(___ret)) {				     \
> +				func(__VA_ARGS__);			     \
> +				__do_once_slow_done(&___done, &___once_key,  \
> +						    THIS_MODULE);	     \
> +			}						     \
> +		}							     \
> +		___ret;							     \
> +	})
> +

Hmm, I dunno about this macro-choice explosion here. The whole thing
with DO_ONCE() is that the static branch makes it zero cost most of the
time while being somewhat expensive the rest of the time, but who cares,
because "the rest" is just once.

So instead, why not just branch on whether or not we can sleep here, if
that can be worked out dynamically? If not, and if you really do need
two sets of macros and functions, at least you can call the new one
something other than "slow"? Maybe something about being _SLEEPABLE()
instead?

Also, the __do_once_slow_done() function misses a really nice
optimization, which is that the static branch can be changed
synchronously instead of having to allocate and fire off that workqueue,
since by definition we're in sleepable context here.

Jason
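
For concreteness, a sketch of the synchronous variant Jason describes, as it
might sit next to the posted helpers in lib/once.c; the function name is
hypothetical, module refcount handling is omitted, and the posted patch
deliberately keeps the deferred workqueue (see Eric's reply below):

void __do_once_sleepable_done_sync(bool *done,
				   struct static_key_true *once_key)
	__releases(once_mutex)
{
	*done = true;
	mutex_unlock(&once_mutex);
	/* We are in sleepable context, so flip the key in place instead of
	 * deferring to a workqueue as once_disable_jump() does.
	 */
	static_branch_disable(once_key);
}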
Eric Dumazet Oct. 1, 2022, 10:50 p.m. UTC | #3
On Sat, Oct 1, 2022 at 3:44 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> On Sat, Oct 01, 2022 at 01:51:02PM -0700, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > Christophe Leroy reported a ~80ms latency spike
> > happening at first TCP connect() time.
> >
> > This is because __inet_hash_connect() uses get_random_once()
> > to populate a perturbation table which became quite big
> > after commit 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16")
> >
> > get_random_once() uses DO_ONCE(), which block hard irqs for the duration
> > of the operation.
> >
> > This patch adds DO_ONCE_SLOW() which uses a mutex instead of a spinlock
> > for operations where we prefer to stay in process context.
> >
> > Then __inet_hash_connect() can use get_random_slow_once()
> > to populate its perturbation table.
> >
> > Fixes: 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16")
> > Fixes: 190cc82489f4 ("tcp: change source port randomizarion at connect() time")
> > Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> > Link: https://lore.kernel.org/netdev/CANn89iLAEYBaoYajy0Y9UmGFff5GPxDUoG-ErVB2jDdRNQ5Tug@mail.gmail.com/T/#t
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Cc: Willy Tarreau <w@1wt.eu>
> > ---
> >  include/linux/once.h       | 28 ++++++++++++++++++++++++++++
> >  lib/once.c                 | 30 ++++++++++++++++++++++++++++++
> >  net/ipv4/inet_hashtables.c |  4 ++--
> >  3 files changed, 60 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/once.h b/include/linux/once.h
> > index b14d8b309d52b198bb144689fe67d9ed235c2b3e..176ab75b42df740a738d04d8480821a0b3b65ba9 100644
> > --- a/include/linux/once.h
> > +++ b/include/linux/once.h
> > @@ -5,10 +5,18 @@
> >  #include <linux/types.h>
> >  #include <linux/jump_label.h>
> >
> > +/* Helpers used from arbitrary contexts.
> > + * Hard irqs are blocked, be cautious.
> > + */
> >  bool __do_once_start(bool *done, unsigned long *flags);
> >  void __do_once_done(bool *done, struct static_key_true *once_key,
> >                   unsigned long *flags, struct module *mod);
> >
> > +/* Variant for process contexts only. */
> > +bool __do_once_slow_start(bool *done);
> > +void __do_once_slow_done(bool *done, struct static_key_true *once_key,
> > +                      struct module *mod);
> > +
> >  /* Call a function exactly once. The idea of DO_ONCE() is to perform
> >   * a function call such as initialization of random seeds, etc, only
> >   * once, where DO_ONCE() can live in the fast-path. After @func has
> > @@ -52,7 +60,27 @@ void __do_once_done(bool *done, struct static_key_true *once_key,
> >               ___ret;                                                      \
> >       })
> >
> > +/* Variant of DO_ONCE() for process/sleepable contexts. */
> > +#define DO_ONCE_SLOW(func, ...)                                                   \
> > +     ({                                                                   \
> > +             bool ___ret = false;                                         \
> > +             static bool __section(".data.once") ___done = false;         \
> > +             static DEFINE_STATIC_KEY_TRUE(___once_key);                  \
> > +             if (static_branch_unlikely(&___once_key)) {                  \
> > +                     ___ret = __do_once_slow_start(&___done);             \
> > +                     if (unlikely(___ret)) {                              \
> > +                             func(__VA_ARGS__);                           \
> > +                             __do_once_slow_done(&___done, &___once_key,  \
> > +                                                 THIS_MODULE);            \
> > +                     }                                                    \
> > +             }                                                            \
> > +             ___ret;                                                      \
> > +     })
> > +
>
> Hmm, I dunno about this macro-choice explosion here. The whole thing
> with DO_ONCE() is that the static branch makes it zero cost most of the
> time while being somewhat expensive the rest of the time, but who cares,
> because "the rest" is just once.
>
> So instead, why not just branch on whether or not we can sleep here, if
> that can be worked out dynamically? If not, and if you really do need
> two sets of macros and functions, at least you can call the new one
> something other than "slow"? Maybe something about being _SLEEPABLE()
> instead?

No idea what you mean. I do not want to over-engineer code that has yet to be
adopted by other callers. If you think you want to spend weekend time on this,
feel free to take over at this point.

>
> Also, the __do_once_slow_done() function misses a really nice
> optimization, which is that the static branch can be changed
> synchronously instead of having to allocate and fire off that workqueue,
> since by definition we're in sleepable context here.
>

This was deliberate. We already spent a lot of time in the called function;
better to just return to the caller as fast as possible.

This really does not matter; the work queue is fired only once, by definition.
Jason A. Donenfeld Oct. 1, 2022, 10:50 p.m. UTC | #4
On Sat, Oct 01, 2022 at 11:15:29PM +0200, Willy Tarreau wrote:
> Hi Eric,
> 
> On Sat, Oct 01, 2022 at 01:51:02PM -0700, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> > 
> > Christophe Leroy reported a ~80ms latency spike
> > happening at first TCP connect() time.
> 
> Seeing Christophe's message also made me wonder if we didn't break
> something back then :-/
> 
> > This is because __inet_hash_connect() uses get_random_once()
> > to populate a perturbation table which became quite big
> > after commit 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16")
> > 
> > get_random_once() uses DO_ONCE(), which block hard irqs for the duration
> > of the operation.
> > 
> > This patch adds DO_ONCE_SLOW() which uses a mutex instead of a spinlock
> > for operations where we prefer to stay in process context.
> 
> That's a nice improvement I think. I was wondering if, for this special
> case, we *really* need an exclusive DO_ONCE(). I mean, we're getting
> random bytes, we really do not care if two CPUs change them in parallel
> provided that none uses them before the table is entirely filled. Thus
> that could probably end up as something like:
> 
>     if (!atomic_read(&done)) {
>         get_random_bytes(array);
>         atomic_set(&done, 1);
>     }

If you don't care about the tables being consistent between CPUs, then
yea, sure, that seems like a reasonable approach, and I like not
polluting once.{c,h} with some _SLOW() special cases. If you don't want
the atomic read in there you could also do the same pattern with a
static branch, like what DO_ONCE() does:

   if (static_branch_unlikely(&need_bytes)) {
      get_random_bytes(array);
      static_branch_disable(&need_bytes);
   }

Anyway, same thing as your suggestion more or less.

Jason
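
The same idea with Jason's static branch, as a self-contained sketch with
hypothetical names; like the atomic version it tolerates parallel fillers, and
static_branch_disable() may sleep, so this form is also limited to process
context:

#include <linux/jump_label.h>
#include <linux/random.h>

static DEFINE_STATIC_KEY_TRUE(need_bytes);
static u32 example_table[1 << 16];

static void example_fill_table(void)
{
	if (static_branch_unlikely(&need_bytes)) {
		get_random_bytes(example_table, sizeof(example_table));
		/* Patches the branch in place; may sleep. */
		static_branch_disable(&need_bytes);
	}
}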
Willy Tarreau Oct. 2, 2022, 5:38 a.m. UTC | #5
On Sun, Oct 02, 2022 at 12:50:38AM +0200, Jason A. Donenfeld wrote:
> > > This patch adds DO_ONCE_SLOW() which uses a mutex instead of a spinlock
> > > for operations where we prefer to stay in process context.
> > 
> > That's a nice improvement I think. I was wondering if, for this special
> > case, we *really* need an exclusive DO_ONCE(). I mean, we're getting
> > random bytes, we really do not care if two CPUs change them in parallel
> > provided that none uses them before the table is entirely filled. Thus
> > that could probably end up as something like:
> > 
> >     if (!atomic_read(&done)) {
> >         get_random_bytes(array);
> >         atomic_set(&done, 1);
> >     }
> 
> If you don't care about the tables being consistent between CPUs, then
> yea, sure, that seems like a reasonable approach, and I like not
> polluting once.{c,h} with some _SLOW() special cases.

I don't see this as pollution, it possibly is a nice addition for certain
use cases or early fast paths where the risk of contention is high.

> If you don't want
> the atomic read in there you could also do the same pattern with a
> static branch, like what DO_ONCE() does:
> 
>    if (static_branch_unlikely(&need_bytes)) {
>       get_random_bytes(array);
>       static_branch_disable(&need_bytes);
>    }
> 
> Anyway, same thing as your suggestion more or less.

What I don't know in fact is if the code patching itself can be
responsible for a measurable part of the extra time Christophe noticed.
Anyway at least Christophe now has a few approaches to try, let's first
see if any of them fixes the regression.

Willy
Christophe Leroy Oct. 2, 2022, 8:58 a.m. UTC | #6
On 01/10/2022 at 22:51, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Christophe Leroy reported a ~80ms latency spike
> happening at first TCP connect() time.
> 
> This is because __inet_hash_connect() uses get_random_once()
> to populate a perturbation table which became quite big
> after commit 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16")
> 
> get_random_once() uses DO_ONCE(), which block hard irqs for the duration
> of the operation.
> 
> This patch adds DO_ONCE_SLOW() which uses a mutex instead of a spinlock
> for operations where we prefer to stay in process context.
> 
> Then __inet_hash_connect() can use get_random_slow_once()
> to populate its perturbation table.

Many thanks for your quick answer and your patch.

It works great: the irqsoff tracer now reports a 2 ms latency in an SPI
transfer, so the issue with TCP connect is gone.

> 
> Fixes: 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16")
> Fixes: 190cc82489f4 ("tcp: change source port randomizarion at connect() time")
> Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> Link: https://lore.kernel.org/netdev/CANn89iLAEYBaoYajy0Y9UmGFff5GPxDUoG-ErVB2jDdRNQ5Tug@mail.gmail.com/T/#t
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Willy Tarreau <w@1wt.eu>

Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>


> ---
>   include/linux/once.h       | 28 ++++++++++++++++++++++++++++
>   lib/once.c                 | 30 ++++++++++++++++++++++++++++++
>   net/ipv4/inet_hashtables.c |  4 ++--
>   3 files changed, 60 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/once.h b/include/linux/once.h
> index b14d8b309d52b198bb144689fe67d9ed235c2b3e..176ab75b42df740a738d04d8480821a0b3b65ba9 100644
> --- a/include/linux/once.h
> +++ b/include/linux/once.h
> @@ -5,10 +5,18 @@
>   #include <linux/types.h>
>   #include <linux/jump_label.h>
> 
> +/* Helpers used from arbitrary contexts.
> + * Hard irqs are blocked, be cautious.
> + */
>   bool __do_once_start(bool *done, unsigned long *flags);
>   void __do_once_done(bool *done, struct static_key_true *once_key,
>                      unsigned long *flags, struct module *mod);
> 
> +/* Variant for process contexts only. */
> +bool __do_once_slow_start(bool *done);
> +void __do_once_slow_done(bool *done, struct static_key_true *once_key,
> +                        struct module *mod);
> +
>   /* Call a function exactly once. The idea of DO_ONCE() is to perform
>    * a function call such as initialization of random seeds, etc, only
>    * once, where DO_ONCE() can live in the fast-path. After @func has
> @@ -52,7 +60,27 @@ void __do_once_done(bool *done, struct static_key_true *once_key,
>                  ___ret;                                                      \
>          })
> 
> +/* Variant of DO_ONCE() for process/sleepable contexts. */
> +#define DO_ONCE_SLOW(func, ...)                                                     \
> +       ({                                                                   \
> +               bool ___ret = false;                                         \
> +               static bool __section(".data.once") ___done = false;         \
> +               static DEFINE_STATIC_KEY_TRUE(___once_key);                  \
> +               if (static_branch_unlikely(&___once_key)) {                  \
> +                       ___ret = __do_once_slow_start(&___done);             \
> +                       if (unlikely(___ret)) {                              \
> +                               func(__VA_ARGS__);                           \
> +                               __do_once_slow_done(&___done, &___once_key,  \
> +                                                   THIS_MODULE);            \
> +                       }                                                    \
> +               }                                                            \
> +               ___ret;                                                      \
> +       })
> +
>   #define get_random_once(buf, nbytes)                                        \
>          DO_ONCE(get_random_bytes, (buf), (nbytes))
> 
> +#define get_random_slow_once(buf, nbytes)                                   \
> +       DO_ONCE_SLOW(get_random_bytes, (buf), (nbytes))
> +
>   #endif /* _LINUX_ONCE_H */
> diff --git a/lib/once.c b/lib/once.c
> index 59149bf3bfb4a97e4fa7febee737155d700bae48..351f66aad310a47f17d0636da0ed5b2b4460522d 100644
> --- a/lib/once.c
> +++ b/lib/once.c
> @@ -66,3 +66,33 @@ void __do_once_done(bool *done, struct static_key_true *once_key,
>          once_disable_jump(once_key, mod);
>   }
>   EXPORT_SYMBOL(__do_once_done);
> +
> +static DEFINE_MUTEX(once_mutex);
> +
> +bool __do_once_slow_start(bool *done)
> +       __acquires(once_mutex)
> +{
> +       mutex_lock(&once_mutex);
> +       if (*done) {
> +               mutex_unlock(&once_mutex);
> +               /* Keep sparse happy by restoring an even lock count on
> +                * this mutex. In case we return here, we don't call into
> +                * __do_once_done but return early in the DO_ONCE_SLOW() macro.
> +                */
> +               __acquire(once_mutex);
> +               return false;
> +       }
> +
> +       return true;
> +}
> +EXPORT_SYMBOL(__do_once_slow_start);
> +
> +void __do_once_slow_done(bool *done, struct static_key_true *once_key,
> +                        struct module *mod)
> +       __releases(once_mutex)
> +{
> +       *done = true;
> +       mutex_unlock(&once_mutex);
> +       once_disable_jump(once_key, mod);
> +}
> +EXPORT_SYMBOL(__do_once_slow_done);
> diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
> index 49db8c597eea83a27e91edc429c2c4779b0a5cd7..dc1c5629cd0d61716d6d99131c57b49717785709 100644
> --- a/net/ipv4/inet_hashtables.c
> +++ b/net/ipv4/inet_hashtables.c
> @@ -958,8 +958,8 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
>          if (likely(remaining > 1))
>                  remaining &= ~1U;
> 
> -       net_get_random_once(table_perturb,
> -                           INET_TABLE_PERTURB_SIZE * sizeof(*table_perturb));
> +       get_random_slow_once(table_perturb,
> +                            INET_TABLE_PERTURB_SIZE * sizeof(*table_perturb));
>          index = port_offset & (INET_TABLE_PERTURB_SIZE - 1);
> 
>          offset = READ_ONCE(table_perturb[index]) + (port_offset >> 32);
> --
> 2.38.0.rc1.362.ged0d419d3c-goog
>
patchwork-bot+netdevbpf@kernel.org Oct. 3, 2022, 12:40 p.m. UTC | #7
Hello:

This patch was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:

On Sat,  1 Oct 2022 13:51:02 -0700 you wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Christophe Leroy reported a ~80ms latency spike
> happening at first TCP connect() time.
> 
> This is because __inet_hash_connect() uses get_random_once()
> to populate a perturbation table which became quite big
> after commit 4c2c8f03a5ab ("tcp: increase source port perturb table to 2^16")
> 
> [...]

Here is the summary with links:
  - [net-next] once: add DO_ONCE_SLOW() for sleepable contexts
    https://git.kernel.org/netdev/net-next/c/62c07983bef9

You are awesome, thank you!
Jakub Kicinski Oct. 3, 2022, 5:25 p.m. UTC | #8
On Sun, 2 Oct 2022 00:44:28 +0200 Jason A. Donenfeld wrote:
> So instead, why not just branch on whether or not we can sleep here, if
> that can be worked out dynamically? 

IDK if we can dynamically work out if _all_ _possible_ callers are 
in a specific context, can we?

> If not, and if you really do need two sets of macros and functions,
> at least you can call the new one something other than "slow"? Maybe
> something about being _SLEEPABLE() instead?

+1 for s/SLOW/SLEEPABLE/. I was about to suggest s/SLOW/TASK/.
But I guess it's already applied..
Jason A. Donenfeld Oct. 3, 2022, 5:43 p.m. UTC | #9
On Mon, Oct 3, 2022 at 7:25 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sun, 2 Oct 2022 00:44:28 +0200 Jason A. Donenfeld wrote:
> > So instead, why not just branch on whether or not we can sleep here, if
> > that can be worked out dynamically?
>
> IDK if we can dynamically work out if _all_ _possible_ callers are
> in a specific context, can we?
>
> > If not, and if you really do need two sets of macros and functions,
> > at least you can call the new one something other than "slow"? Maybe
> > something about being _SLEEPABLE() instead?
>
> +1 for s/SLOW/SLEEPABLE/. I was about to suggest s/SLOW/TASK/.
> But I guess it's already applied..

I'll send a patch to change it in a minute.

Patch

diff --git a/include/linux/once.h b/include/linux/once.h
index b14d8b309d52b198bb144689fe67d9ed235c2b3e..176ab75b42df740a738d04d8480821a0b3b65ba9 100644
--- a/include/linux/once.h
+++ b/include/linux/once.h
@@ -5,10 +5,18 @@ 
 #include <linux/types.h>
 #include <linux/jump_label.h>
 
+/* Helpers used from arbitrary contexts.
+ * Hard irqs are blocked, be cautious.
+ */
 bool __do_once_start(bool *done, unsigned long *flags);
 void __do_once_done(bool *done, struct static_key_true *once_key,
 		    unsigned long *flags, struct module *mod);
 
+/* Variant for process contexts only. */
+bool __do_once_slow_start(bool *done);
+void __do_once_slow_done(bool *done, struct static_key_true *once_key,
+			 struct module *mod);
+
 /* Call a function exactly once. The idea of DO_ONCE() is to perform
  * a function call such as initialization of random seeds, etc, only
  * once, where DO_ONCE() can live in the fast-path. After @func has
@@ -52,7 +60,27 @@  void __do_once_done(bool *done, struct static_key_true *once_key,
 		___ret;							     \
 	})
 
+/* Variant of DO_ONCE() for process/sleepable contexts. */
+#define DO_ONCE_SLOW(func, ...)						     \
+	({								     \
+		bool ___ret = false;					     \
+		static bool __section(".data.once") ___done = false;	     \
+		static DEFINE_STATIC_KEY_TRUE(___once_key);		     \
+		if (static_branch_unlikely(&___once_key)) {		     \
+			___ret = __do_once_slow_start(&___done);	     \
+			if (unlikely(___ret)) {				     \
+				func(__VA_ARGS__);			     \
+				__do_once_slow_done(&___done, &___once_key,  \
+						    THIS_MODULE);	     \
+			}						     \
+		}							     \
+		___ret;							     \
+	})
+
 #define get_random_once(buf, nbytes)					     \
 	DO_ONCE(get_random_bytes, (buf), (nbytes))
 
+#define get_random_slow_once(buf, nbytes)				     \
+	DO_ONCE_SLOW(get_random_bytes, (buf), (nbytes))
+
 #endif /* _LINUX_ONCE_H */
diff --git a/lib/once.c b/lib/once.c
index 59149bf3bfb4a97e4fa7febee737155d700bae48..351f66aad310a47f17d0636da0ed5b2b4460522d 100644
--- a/lib/once.c
+++ b/lib/once.c
@@ -66,3 +66,33 @@  void __do_once_done(bool *done, struct static_key_true *once_key,
 	once_disable_jump(once_key, mod);
 }
 EXPORT_SYMBOL(__do_once_done);
+
+static DEFINE_MUTEX(once_mutex);
+
+bool __do_once_slow_start(bool *done)
+	__acquires(once_mutex)
+{
+	mutex_lock(&once_mutex);
+	if (*done) {
+		mutex_unlock(&once_mutex);
+		/* Keep sparse happy by restoring an even lock count on
+		 * this mutex. In case we return here, we don't call into
+		 * __do_once_done but return early in the DO_ONCE_SLOW() macro.
+		 */
+		__acquire(once_mutex);
+		return false;
+	}
+
+	return true;
+}
+EXPORT_SYMBOL(__do_once_slow_start);
+
+void __do_once_slow_done(bool *done, struct static_key_true *once_key,
+			 struct module *mod)
+	__releases(once_mutex)
+{
+	*done = true;
+	mutex_unlock(&once_mutex);
+	once_disable_jump(once_key, mod);
+}
+EXPORT_SYMBOL(__do_once_slow_done);
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 49db8c597eea83a27e91edc429c2c4779b0a5cd7..dc1c5629cd0d61716d6d99131c57b49717785709 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -958,8 +958,8 @@  int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 	if (likely(remaining > 1))
 		remaining &= ~1U;
 
-	net_get_random_once(table_perturb,
-			    INET_TABLE_PERTURB_SIZE * sizeof(*table_perturb));
+	get_random_slow_once(table_perturb,
+			     INET_TABLE_PERTURB_SIZE * sizeof(*table_perturb));
 	index = port_offset & (INET_TABLE_PERTURB_SIZE - 1);
 
 	offset = READ_ONCE(table_perturb[index]) + (port_offset >> 32);