Message ID | 20210402202237.20334-2-urezki@gmail.com (mailing list archive)
State | New, archived
Series | [PATCH-next,1/5] lib/test_vmalloc.c: remove two kvfree_rcu() tests
On Fri, 2 Apr 2021 22:22:34 +0200 "Uladzislau Rezki (Sony)" <urezki@gmail.com> wrote:

> By using this parameter we can specify how many workers are
> created to perform vmalloc tests. By default it is one CPU.
> The maximum value is set to 1024.
>
> As a result of this change a 'single_cpu_test' one becomes
> obsolete, therefore it is no longer needed.

Why limit to 1024?  Maybe testers want more - what's the downside to
permitting that?

We may need to replace that kcalloc() with kvmalloc() though...
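(For context: a sketch of the API distinction behind that last remark. The kvcalloc()/kvfree() pair is the actual kernel API; the wrapper function names below are illustrative only, not from the thread.)

/*
 * kcalloc() hands back physically contiguous memory and will fail once
 * the request grows past what the page allocator can provide in one
 * chunk; kvcalloc() tries kmalloc() first and transparently falls back
 * to vmalloc() for large sizes.
 */
static struct test_driver *alloc_tdrivers(int nr_threads)
{
	return kvcalloc(nr_threads, sizeof(struct test_driver), GFP_KERNEL);
}

static void free_tdrivers(struct test_driver *tdriver)
{
	/* kvfree() releases either kmalloc- or vmalloc-backed memory. */
	kvfree(tdriver);
}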
> On Fri, 2 Apr 2021 22:22:34 +0200 "Uladzislau Rezki (Sony)" <urezki@gmail.com> wrote:
>
> > By using this parameter we can specify how many workers are
> > created to perform vmalloc tests. By default it is one CPU.
> > The maximum value is set to 1024.
> >
> > As a result of this change a 'single_cpu_test' one becomes
> > obsolete, therefore it is no longer needed.
>
> Why limit to 1024? Maybe testers want more - what's the downside to
> permitting that?
>
I was thinking mainly about a tester issuing an enormous number of
kthreads, so that the system is not able to handle it. Therefore I
clamped that value to 1024. On the other hand, we can give wider
permissions; in that case a user should think more carefully about
what is passed. For example, we can limit the max value to USHRT_MAX,
i.e. 65536.

> We may need to replace that kcalloc() with kvmalloc() though...
>
Yep. If we limit to USHRT_MAX, the maximum amount of memory for
internal data would be ~12MB. Something like below:

diff --git a/lib/test_vmalloc.c b/lib/test_vmalloc.c
index d337985e4c5e..a5103e3461bf 100644
--- a/lib/test_vmalloc.c
+++ b/lib/test_vmalloc.c
@@ -24,7 +24,7 @@
 	MODULE_PARM_DESC(name, msg)				\
 
 __param(int, nr_threads, 0,
-	"Number of workers to perform tests(min: 1 max: 1024)");
+	"Number of workers to perform tests(min: 1 max: 65536)");
 
 __param(bool, sequential_test_order, false,
 	"Use sequential stress tests order");
@@ -469,13 +469,13 @@ init_test_configurtion(void)
 {
 	/*
 	 * A maximum number of workers is defined as hard-coded
-	 * value and set to 1024. We add such gap just in case
+	 * value and set to 65536. We add such gap just in case
 	 * and for potential heavy stressing.
 	 */
-	nr_threads = clamp(nr_threads, 1, 1024);
+	nr_threads = clamp(nr_threads, 1, 65536);
 
 	/* Allocate the space for test instances. */
-	tdriver = kcalloc(nr_threads, sizeof(*tdriver), GFP_KERNEL);
+	tdriver = kvcalloc(nr_threads, sizeof(*tdriver), GFP_KERNEL);
 	if (tdriver == NULL)
 		return -1;
 
@@ -555,7 +555,7 @@ static void do_concurrent_test(void)
 			i, t->stop - t->start);
 	}
 
-	kfree(tdriver);
+	kvfree(tdriver);
 }
 
 static int vmalloc_test_init(void)

Does it sound reasonable to you?

--
Vlad Rezki
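(A back-of-the-envelope check of the ~12MB figure. The per-instance size is an editorial assumption, not stated in the thread: with roughly ten struct test_case_data entries of two ints plus a u64 each, sizeof(struct test_driver) lands near 192 bytes on a 64-bit build.)

	65536 workers * ~192 bytes/worker = 12,582,912 bytes ≈ 12 MiB

An allocation of that size is well past what kcalloc() can reliably satisfy as one physically contiguous chunk, which is what motivates the kvcalloc()/kvfree() switch in the diff above.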
On Sat, 3 Apr 2021 14:31:43 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:

> > We may need to replace that kcalloc() with kvmalloc() though...
> >
> Yep. If we limit to USHRT_MAX, the maximum amount of memory for
> internal data would be ~12MB. Something like below:
>
> [...]
>
> Does it sound reasonable to you?

I think so.  It's a test thing so let's give testers more flexibility,
remembering that they don't need as much protection from their own
mistakes.
On Mon, Apr 05, 2021 at 07:39:20PM -0700, Andrew Morton wrote:
> On Sat, 3 Apr 2021 14:31:43 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
>
> > > We may need to replace that kcalloc() with kvmalloc() though...
> > >
> > Yep. If we limit to USHRT_MAX, the maximum amount of memory for
> > internal data would be ~12MB. Something like below:
> >
> > [...]
> >
> > Does it sound reasonable to you?
>
> I think so.  It's a test thing so let's give testers more flexibility,
> remembering that they don't need as much protection from their own
> mistakes.
>
OK. I will send one more extra patch then.

--
Vlad Rezki
diff --git a/lib/test_vmalloc.c b/lib/test_vmalloc.c
index 4eb6abdaa74e..d337985e4c5e 100644
--- a/lib/test_vmalloc.c
+++ b/lib/test_vmalloc.c
@@ -23,8 +23,8 @@
 	module_param(name, type, 0444);			\
 	MODULE_PARM_DESC(name, msg)				\
 
-__param(bool, single_cpu_test, false,
-	"Use single first online CPU to run tests");
+__param(int, nr_threads, 0,
+	"Number of workers to perform tests(min: 1 max: 1024)");
 
 __param(bool, sequential_test_order, false,
 	"Use sequential stress tests order");
@@ -50,13 +50,6 @@ __param(int, run_test_mask, INT_MAX,
 		/* Add a new test case description here. */
 );
 
-/*
- * Depends on single_cpu_test parameter. If it is true, then
- * use first online CPU to trigger a test on, otherwise go with
- * all online CPUs.
- */
-static cpumask_t cpus_run_test_mask = CPU_MASK_NONE;
-
 /*
  * Read write semaphore for synchronization of setup
  * phase that is done in main thread and workers.
@@ -386,16 +379,13 @@ struct test_case_data {
 	u64 time;
 };
 
-/* Split it to get rid of: WARNING: line over 80 characters */
-static struct test_case_data
-	per_cpu_test_data[NR_CPUS][ARRAY_SIZE(test_case_array)];
-
 static struct test_driver {
 	struct task_struct *task;
+	struct test_case_data data[ARRAY_SIZE(test_case_array)];
+
 	unsigned long start;
 	unsigned long stop;
-	int cpu;
-} per_cpu_test_driver[NR_CPUS];
+} *tdriver;
 
 static void shuffle_array(int *arr, int n)
 {
@@ -423,9 +413,6 @@ static int test_func(void *private)
 	ktime_t kt;
 	u64 delta;
 
-	if (set_cpus_allowed_ptr(current, cpumask_of(t->cpu)) < 0)
-		pr_err("Failed to set affinity to %d CPU\n", t->cpu);
-
 	for (i = 0; i < ARRAY_SIZE(test_case_array); i++)
 		random_array[i] = i;
 
@@ -450,9 +437,9 @@ static int test_func(void *private)
 		kt = ktime_get();
 		for (j = 0; j < test_repeat_count; j++) {
 			if (!test_case_array[index].test_func())
-				per_cpu_test_data[t->cpu][index].test_passed++;
+				t->data[index].test_passed++;
 			else
-				per_cpu_test_data[t->cpu][index].test_failed++;
+				t->data[index].test_failed++;
 		}
 
 		/*
@@ -461,7 +448,7 @@ static int test_func(void *private)
 		delta = (u64) ktime_us_delta(ktime_get(), kt);
 		do_div(delta, (u32) test_repeat_count);
 
-		per_cpu_test_data[t->cpu][index].time = delta;
+		t->data[index].time = delta;
 	}
 
 	t->stop = get_cycles();
@@ -477,53 +464,56 @@ static int test_func(void *private)
 	return 0;
 }
 
-static void
+static int
 init_test_configurtion(void)
 {
 	/*
-	 * Reset all data of all CPUs.
+	 * A maximum number of workers is defined as hard-coded
+	 * value and set to 1024. We add such gap just in case
+	 * and for potential heavy stressing.
 	 */
-	memset(per_cpu_test_data, 0, sizeof(per_cpu_test_data));
+	nr_threads = clamp(nr_threads, 1, 1024);
 
-	if (single_cpu_test)
-		cpumask_set_cpu(cpumask_first(cpu_online_mask),
-			&cpus_run_test_mask);
-	else
-		cpumask_and(&cpus_run_test_mask, cpu_online_mask,
-			cpu_online_mask);
+	/* Allocate the space for test instances. */
+	tdriver = kcalloc(nr_threads, sizeof(*tdriver), GFP_KERNEL);
+	if (tdriver == NULL)
+		return -1;
 
 	if (test_repeat_count <= 0)
 		test_repeat_count = 1;
 
 	if (test_loop_count <= 0)
 		test_loop_count = 1;
+
+	return 0;
 }
 
 static void do_concurrent_test(void)
 {
-	int cpu, ret;
+	int i, ret;
 
 	/*
 	 * Set some basic configurations plus sanity check.
 	 */
-	init_test_configurtion();
+	ret = init_test_configurtion();
+	if (ret < 0)
+		return;
 
 	/*
 	 * Put on hold all workers.
 	 */
 	down_write(&prepare_for_test_rwsem);
 
-	for_each_cpu(cpu, &cpus_run_test_mask) {
-		struct test_driver *t = &per_cpu_test_driver[cpu];
+	for (i = 0; i < nr_threads; i++) {
+		struct test_driver *t = &tdriver[i];
 
-		t->cpu = cpu;
-		t->task = kthread_run(test_func, t, "vmalloc_test/%d", cpu);
+		t->task = kthread_run(test_func, t, "vmalloc_test/%d", i);
 		if (!IS_ERR(t->task))
 			/* Success. */
 			atomic_inc(&test_n_undone);
 		else
-			pr_err("Failed to start kthread for %d CPU\n", cpu);
+			pr_err("Failed to start %d kthread\n", i);
 	}
 
 	/*
@@ -541,29 +531,31 @@ static void do_concurrent_test(void)
 		ret = wait_for_completion_timeout(&test_all_done_comp, HZ);
 	} while (!ret);
 
-	for_each_cpu(cpu, &cpus_run_test_mask) {
-		struct test_driver *t = &per_cpu_test_driver[cpu];
-		int i;
+	for (i = 0; i < nr_threads; i++) {
+		struct test_driver *t = &tdriver[i];
+		int j;
 
 		if (!IS_ERR(t->task))
 			kthread_stop(t->task);
 
-		for (i = 0; i < ARRAY_SIZE(test_case_array); i++) {
-			if (!((run_test_mask & (1 << i)) >> i))
+		for (j = 0; j < ARRAY_SIZE(test_case_array); j++) {
+			if (!((run_test_mask & (1 << j)) >> j))
 				continue;
 
 			pr_info(
 				"Summary: %s passed: %d failed: %d repeat: %d loops: %d avg: %llu usec\n",
-				test_case_array[i].test_name,
-				per_cpu_test_data[cpu][i].test_passed,
-				per_cpu_test_data[cpu][i].test_failed,
+				test_case_array[j].test_name,
+				t->data[j].test_passed,
+				t->data[j].test_failed,
 				test_repeat_count, test_loop_count,
-				per_cpu_test_data[cpu][i].time);
+				t->data[j].time);
 		}
 
-		pr_info("All test took CPU%d=%lu cycles\n",
-			cpu, t->stop - t->start);
+		pr_info("All test took worker%d=%lu cycles\n",
+			i, t->stop - t->start);
 	}
+
+	kfree(tdriver);
 }
 
 static int vmalloc_test_init(void)
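(A usage sketch, not part of the patch: because init_test_configurtion() applies clamp(nr_threads, 1, 1024), the default nr_threads=0 becomes a single worker and oversized requests are capped. A tiny userspace stand-in for the kernel's clamp() macro shows the effective worker counts.)

#include <stdio.h>

/* Illustrative userspace stand-in for the kernel's clamp() macro. */
#define clamp(val, lo, hi) ((val) < (lo) ? (lo) : ((val) > (hi) ? (hi) : (val)))

int main(void)
{
	int requested[] = { 0, 1, 64, 5000 };

	for (int i = 0; i < 4; i++)
		printf("nr_threads=%d -> %d workers\n",
		       requested[i], clamp(requested[i], 1, 1024));
	/*
	 * nr_threads=0    -> 1     (module default: one worker)
	 * nr_threads=5000 -> 1024  (capped at the hard-coded maximum)
	 */
	return 0;
}

So, for example, 'modprobe test_vmalloc nr_threads=64' starts 64 test kthreads, while loading the module with no parameters runs a single worker.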
By using this parameter we can specify how many workers are created
to perform vmalloc tests. By default it is one CPU, i.e. a single
worker. The maximum value is set to 1024.

As a result of this change the 'single_cpu_test' parameter becomes
obsolete, therefore it is no longer needed.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 lib/test_vmalloc.c | 88 +++++++++++++++++++++-------------------------
 1 file changed, 40 insertions(+), 48 deletions(-)