diff mbox series

[3/3] kfence: test: try to avoid test_gfpzero trigger rcu_stall

Message ID 20220309014705.1265861-4-liupeng256@huawei.com (mailing list archive)
State Accepted
Commit 3cb1c9620eeeb67c614c0732a35861b0b1efdc53
Headers show
Series kunit: fix a UAF bug and do some optimization | expand

Commit Message

Peng Liu March 9, 2022, 1:47 a.m. UTC
When CONFIG_KFENCE_DYNAMIC_OBJECTS is set to a big number, kfence
kunit-test-case test_gfpzero will eat up nearly all the CPU's
resources and rcu_stall is reported as the following log which is
cut from a physical server.

  rcu: INFO: rcu_sched self-detected stall on CPU
  rcu: 	68-....: (14422 ticks this GP) idle=6ce/1/0x4000000000000002
  softirq=592/592 fqs=7500 (t=15004 jiffies g=10677 q=20019)
  Task dump for CPU 68:
  task:kunit_try_catch state:R  running task
  stack:    0 pid: 9728 ppid:     2 flags:0x0000020a
  Call trace:
   dump_backtrace+0x0/0x1e4
   show_stack+0x20/0x2c
   sched_show_task+0x148/0x170
   ...
   rcu_sched_clock_irq+0x70/0x180
   update_process_times+0x68/0xb0
   tick_sched_handle+0x38/0x74
   ...
   gic_handle_irq+0x78/0x2c0
   el1_irq+0xb8/0x140
   kfree+0xd8/0x53c
   test_alloc+0x264/0x310 [kfence_test]
   test_gfpzero+0xf4/0x840 [kfence_test]
   kunit_try_run_case+0x48/0x20c
   kunit_generic_run_threadfn_adapter+0x28/0x34
   kthread+0x108/0x13c
   ret_from_fork+0x10/0x18

To avoid rcu_stall and unacceptable latency, a schedule point is
added to test_gfpzero.

Signed-off-by: Peng Liu <liupeng256@huawei.com>
---
 mm/kfence/kfence_test.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Shuah Khan March 9, 2022, 9:39 p.m. UTC | #1
On 3/8/22 6:47 PM, Peng Liu wrote:
> When CONFIG_KFENCE_DYNAMIC_OBJECTS is set to a big number, kfence
> kunit-test-case test_gfpzero will eat up nearly all the CPU's
> resources and rcu_stall is reported as the following log which is
> cut from a physical server.
> 
>    rcu: INFO: rcu_sched self-detected stall on CPU
>    rcu: 	68-....: (14422 ticks this GP) idle=6ce/1/0x4000000000000002
>    softirq=592/592 fqs=7500 (t=15004 jiffies g=10677 q=20019)
>    Task dump for CPU 68:
>    task:kunit_try_catch state:R  running task
>    stack:    0 pid: 9728 ppid:     2 flags:0x0000020a
>    Call trace:
>     dump_backtrace+0x0/0x1e4
>     show_stack+0x20/0x2c
>     sched_show_task+0x148/0x170
>     ...
>     rcu_sched_clock_irq+0x70/0x180
>     update_process_times+0x68/0xb0
>     tick_sched_handle+0x38/0x74
>     ...
>     gic_handle_irq+0x78/0x2c0
>     el1_irq+0xb8/0x140
>     kfree+0xd8/0x53c
>     test_alloc+0x264/0x310 [kfence_test]
>     test_gfpzero+0xf4/0x840 [kfence_test]
>     kunit_try_run_case+0x48/0x20c
>     kunit_generic_run_threadfn_adapter+0x28/0x34
>     kthread+0x108/0x13c
>     ret_from_fork+0x10/0x18
> 
> To avoid rcu_stall and unacceptable latency, a schedule point is
> added to test_gfpzero.
> 
> Signed-off-by: Peng Liu <liupeng256@huawei.com>
> ---
>   mm/kfence/kfence_test.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
> index caed6b4eba94..1b50f70a4c0f 100644
> --- a/mm/kfence/kfence_test.c
> +++ b/mm/kfence/kfence_test.c
> @@ -627,6 +627,7 @@ static void test_gfpzero(struct kunit *test)
>   			kunit_warn(test, "giving up ... cannot get same object back\n");
>   			return;
>   		}
> +		cond_resched();

This sounds like a band-aid - is there a better way to fix this?

>   	}
>   
>   	for (i = 0; i < size; i++)
> 

thanks,
-- Shuah
diff mbox series

Patch

diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index caed6b4eba94..1b50f70a4c0f 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -627,6 +627,7 @@  static void test_gfpzero(struct kunit *test)
 			kunit_warn(test, "giving up ... cannot get same object back\n");
 			return;
 		}
+		cond_resched();
 	}
 
 	for (i = 0; i < size; i++)