diff mbox series

random: handle creditable entropy from atomic process context

Message ID 20240417120217.3814215-2-Jason@zx2c4.com (mailing list archive)
State New, archived
Headers show
Series random: handle creditable entropy from atomic process context | expand

Commit Message

Jason A. Donenfeld April 17, 2024, 12:01 p.m. UTC
The entropy accounting changes a static key when the RNG has
initialized, since it only ever initializes once. Static key changes,
however, cannot be made from atomic context, so depending on where the
last creditable entropy comes from, the static key change might need to
be deferred to a worker.

Previously the code used the execute_in_process_context() helper
function, which accounts for whether or not the caller is
in_interrupt(). However, that doesn't account for the case where the
caller is actually in process context but is holding a spinlock.

This turned out to be the case with input_handle_event() in
drivers/input/input.c contributing entropy:

  [<ffffffd613025ba0>] die+0xa8/0x2fc
  [<ffffffd613027428>] bug_handler+0x44/0xec
  [<ffffffd613016964>] brk_handler+0x90/0x144
  [<ffffffd613041e58>] do_debug_exception+0xa0/0x148
  [<ffffffd61400c208>] el1_dbg+0x60/0x7c
  [<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
  [<ffffffd613011294>] el1h_64_sync+0x64/0x6c
  [<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
  [<ffffffd613102b54>] __might_sleep+0x44/0x7c
  [<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
  [<ffffffd6132c2820>] static_key_enable+0x14/0x38
  [<ffffffd61400ac08>] crng_set_ready+0x14/0x28
  [<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
  [<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
  [<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
  [<ffffffd613857e54>] add_input_randomness+0x38/0x48
  [<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
  [<ffffffd613a81310>] input_event+0x6c/0x98

According to Guoyong, it's not really possible to refactor the various
drivers to never hold a spinlock there. And in_atomic() isn't reliable.

So, rather than trying to be too fancy, just punt the change in the
static key to a workqueue always. There's basically no drawback of doing
this, as the code already needed to account for the static key not
changing immediately, and given that it's just an optimization, there's
not exactly a hurry to change the static key right away, so deferal is
fine.

Reported-by: Guoyong Wang <guoyong.wang@mediatek.com>
Cc: stable@vger.kernel.org
Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
Guoyong- can you test this and tell me whether it fixes the problem you
were seeing? If so, I'll try to get this sent up for 6.9. -Jason

 drivers/char/random.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Comments

Guoyong Wang April 19, 2024, 8:41 a.m. UTC | #1
On Wed, 17 Apr 2024 14:01:11 +0200, Jason A. Donenfeld wrote:
> The entropy accounting changes a static key when the RNG has
> initialized, since it only ever initializes once. Static key changes,
> however, cannot be made from atomic context, so depending on where the
> last creditable entropy comes from, the static key change might need to
> be deferred to a worker.
> 
> Previously the code used the execute_in_process_context() helper
> function, which accounts for whether or not the caller is
> in_interrupt(). However, that doesn't account for the case where the
> caller is actually in process context but is holding a spinlock.
> 
> This turned out to be the case with input_handle_event() in
> drivers/input/input.c contributing entropy:
> 
>   [<ffffffd613025ba0>] die+0xa8/0x2fc
>   [<ffffffd613027428>] bug_handler+0x44/0xec
>   [<ffffffd613016964>] brk_handler+0x90/0x144
>   [<ffffffd613041e58>] do_debug_exception+0xa0/0x148
>   [<ffffffd61400c208>] el1_dbg+0x60/0x7c
>   [<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
>   [<ffffffd613011294>] el1h_64_sync+0x64/0x6c
>   [<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
>   [<ffffffd613102b54>] __might_sleep+0x44/0x7c
>   [<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
>   [<ffffffd6132c2820>] static_key_enable+0x14/0x38
>   [<ffffffd61400ac08>] crng_set_ready+0x14/0x28
>   [<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
>   [<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
>   [<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
>   [<ffffffd613857e54>] add_input_randomness+0x38/0x48
>   [<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
>   [<ffffffd613a81310>] input_event+0x6c/0x98
> 
> According to Guoyong, it's not really possible to refactor the various
> drivers to never hold a spinlock there. And in_atomic() isn't reliable.
> 
> So, rather than trying to be too fancy, just punt the change in the
> static key to a workqueue always. There's basically no drawback of doing
> this, as the code already needed to account for the static key not
> changing immediately, and given that it's just an optimization, there's
> not exactly a hurry to change the static key right away, so deferal is
> fine.
> 
> Reported-by: Guoyong Wang <guoyong.wang@mediatek.com>
> Cc: stable@vger.kernel.org
> Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> ---
> Guoyong- can you test this and tell me whether it fixes the problem you
> were seeing? If so, I'll try to get this sent up for 6.9. -Jason
> 
>  drivers/char/random.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index 456be28ba67c..2597cb43f438 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -702,7 +702,7 @@ static void extract_entropy(void *buf, size_t len)
>  
>  static void __cold _credit_init_bits(size_t bits)
>  {
> -       static struct execute_work set_ready;
> +       static DECLARE_WORK(set_ready, crng_set_ready);
>         unsigned int new, orig, add;
>         unsigned long flags;
>  
> @@ -718,8 +718,8 @@ static void __cold _credit_init_bits(size_t bits)
>  
>         if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
>                 crng_reseed(NULL); /* Sets crng_init to CRNG_READY under base_crng.lock. */
> -               if (static_key_initialized)
> -                       execute_in_process_context(crng_set_ready, &set_ready);
> +               if (static_key_initialized && system_unbound_wq)
> +                       queue_work(system_unbound_wq, &set_ready);
>                 atomic_notifier_call_chain(&random_ready_notifier, 0, NULL);
>                 wake_up_interruptible(&crng_init_wait);
>                 kill_fasync(&fasync, SIGIO, POLL_IN);
> @@ -890,8 +890,8 @@ void __init random_init(void)
>  
>         /*
>          * If we were initialized by the cpu or bootloader before jump labels
> -        * are initialized, then we should enable the static branch here, where
> -        * it's guaranteed that jump labels have been initialized.
> +        * or workqueues are initialized, then we should enable the static
> +        * branch here, where it's guaranteed that these have been initialized.
>          */
>         if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
>                 crng_set_ready(NULL);
> -- 
> 2.44.0

Hi Jason,

Thanks for your feedback. We concur with the proposed change and have verified that it works well 
in our tests. Next, I will provide a patch v2 for the changes discussed.
Jason A. Donenfeld April 19, 2024, 8:55 a.m. UTC | #2
On Fri, Apr 19, 2024 at 04:41:12PM +0800, Guoyong Wang wrote:
> On Wed, 17 Apr 2024 14:01:11 +0200, Jason A. Donenfeld wrote:
> > The entropy accounting changes a static key when the RNG has
> > initialized, since it only ever initializes once. Static key changes,
> > however, cannot be made from atomic context, so depending on where the
> > last creditable entropy comes from, the static key change might need to
> > be deferred to a worker.
> > 
> > Previously the code used the execute_in_process_context() helper
> > function, which accounts for whether or not the caller is
> > in_interrupt(). However, that doesn't account for the case where the
> > caller is actually in process context but is holding a spinlock.
> > 
> > This turned out to be the case with input_handle_event() in
> > drivers/input/input.c contributing entropy:
> > 
> >   [<ffffffd613025ba0>] die+0xa8/0x2fc
> >   [<ffffffd613027428>] bug_handler+0x44/0xec
> >   [<ffffffd613016964>] brk_handler+0x90/0x144
> >   [<ffffffd613041e58>] do_debug_exception+0xa0/0x148
> >   [<ffffffd61400c208>] el1_dbg+0x60/0x7c
> >   [<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
> >   [<ffffffd613011294>] el1h_64_sync+0x64/0x6c
> >   [<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
> >   [<ffffffd613102b54>] __might_sleep+0x44/0x7c
> >   [<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
> >   [<ffffffd6132c2820>] static_key_enable+0x14/0x38
> >   [<ffffffd61400ac08>] crng_set_ready+0x14/0x28
> >   [<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
> >   [<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
> >   [<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
> >   [<ffffffd613857e54>] add_input_randomness+0x38/0x48
> >   [<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
> >   [<ffffffd613a81310>] input_event+0x6c/0x98
> > 
> > According to Guoyong, it's not really possible to refactor the various
> > drivers to never hold a spinlock there. And in_atomic() isn't reliable.
> > 
> > So, rather than trying to be too fancy, just punt the change in the
> > static key to a workqueue always. There's basically no drawback of doing
> > this, as the code already needed to account for the static key not
> > changing immediately, and given that it's just an optimization, there's
> > not exactly a hurry to change the static key right away, so deferal is
> > fine.
> > 
> > Reported-by: Guoyong Wang <guoyong.wang@mediatek.com>
> > Cc: stable@vger.kernel.org
> > Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
> > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > ---
> > Guoyong- can you test this and tell me whether it fixes the problem you
> > were seeing? If so, I'll try to get this sent up for 6.9. -Jason
> > 
> >  drivers/char/random.c | 10 +++++-----
> >  1 file changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/char/random.c b/drivers/char/random.c
> > index 456be28ba67c..2597cb43f438 100644
> > --- a/drivers/char/random.c
> > +++ b/drivers/char/random.c
> > @@ -702,7 +702,7 @@ static void extract_entropy(void *buf, size_t len)
> >  
> >  static void __cold _credit_init_bits(size_t bits)
> >  {
> > -       static struct execute_work set_ready;
> > +       static DECLARE_WORK(set_ready, crng_set_ready);
> >         unsigned int new, orig, add;
> >         unsigned long flags;
> >  
> > @@ -718,8 +718,8 @@ static void __cold _credit_init_bits(size_t bits)
> >  
> >         if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
> >                 crng_reseed(NULL); /* Sets crng_init to CRNG_READY under base_crng.lock. */
> > -               if (static_key_initialized)
> > -                       execute_in_process_context(crng_set_ready, &set_ready);
> > +               if (static_key_initialized && system_unbound_wq)
> > +                       queue_work(system_unbound_wq, &set_ready);
> >                 atomic_notifier_call_chain(&random_ready_notifier, 0, NULL);
> >                 wake_up_interruptible(&crng_init_wait);
> >                 kill_fasync(&fasync, SIGIO, POLL_IN);
> > @@ -890,8 +890,8 @@ void __init random_init(void)
> >  
> >         /*
> >          * If we were initialized by the cpu or bootloader before jump labels
> > -        * are initialized, then we should enable the static branch here, where
> > -        * it's guaranteed that jump labels have been initialized.
> > +        * or workqueues are initialized, then we should enable the static
> > +        * branch here, where it's guaranteed that these have been initialized.
> >          */
> >         if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
> >                 crng_set_ready(NULL);
> > -- 
> > 2.44.0
> 
> Hi Jason,
> 
> Thanks for your feedback. We concur with the proposed change and have verified that it works well 
> in our tests. Next, I will provide a patch v2 for the changes discussed.

No need. This is already upstream and has your Reported-by. It'll get
backported to stable too.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e871abcda3b67d0820b4182ebe93435624e9c6a4
diff mbox series

Patch

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 456be28ba67c..2597cb43f438 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -702,7 +702,7 @@  static void extract_entropy(void *buf, size_t len)
 
 static void __cold _credit_init_bits(size_t bits)
 {
-	static struct execute_work set_ready;
+	static DECLARE_WORK(set_ready, crng_set_ready);
 	unsigned int new, orig, add;
 	unsigned long flags;
 
@@ -718,8 +718,8 @@  static void __cold _credit_init_bits(size_t bits)
 
 	if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
 		crng_reseed(NULL); /* Sets crng_init to CRNG_READY under base_crng.lock. */
-		if (static_key_initialized)
-			execute_in_process_context(crng_set_ready, &set_ready);
+		if (static_key_initialized && system_unbound_wq)
+			queue_work(system_unbound_wq, &set_ready);
 		atomic_notifier_call_chain(&random_ready_notifier, 0, NULL);
 		wake_up_interruptible(&crng_init_wait);
 		kill_fasync(&fasync, SIGIO, POLL_IN);
@@ -890,8 +890,8 @@  void __init random_init(void)
 
 	/*
 	 * If we were initialized by the cpu or bootloader before jump labels
-	 * are initialized, then we should enable the static branch here, where
-	 * it's guaranteed that jump labels have been initialized.
+	 * or workqueues are initialized, then we should enable the static
+	 * branch here, where it's guaranteed that these have been initialized.
 	 */
 	if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
 		crng_set_ready(NULL);