Message ID | 20230203232409.163847-3-frederic@kernel.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | sched/isolation: Prep work for pcp cache draining isolation | expand |
On 2/3/23 18:24, Frederic Weisbecker wrote: > Provide this new API to check if a CPU has been isolated either through > isolcpus= or nohz_full= kernel parameter. > > It aims at avoiding kernel load deemed to be safely spared on CPUs > running sensitive workload that can't bear any disturbance, such as > pcp cache draining. > > Suggested-by: Michal Hocko <mhocko@suse.com> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org> > --- > include/linux/sched/isolation.h | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h > index b645cc81fe01..088672f08469 100644 > --- a/include/linux/sched/isolation.h > +++ b/include/linux/sched/isolation.h > @@ -53,4 +53,10 @@ static inline bool housekeeping_cpu(int cpu, enum hk_type type) > return true; > } > > +static inline bool cpu_is_isolated(int cpu) > +{ > + return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) || > + !housekeeping_test_cpu(cpu, HK_TYPE_KERNEL_NOISE); > +} > + > #endif /* _LINUX_SCHED_ISOLATION_H */ CPUs in an isolated cpuset partition is similar to HK_TYPE_DOMAIN CPUs as load balancing is disabled. I can add an API to access the cpumask and add to this API. However, that list is dynamic as it can be changed at run time. Will that be a problem? Or should that be used separately? Cheers, Longman
Hi Frederic, I love your patch! Yet something to improve: [auto build test ERROR on tip/sched/core] [also build test ERROR on horms-ipvs-next/master horms-ipvs/master linus/master v6.2-rc6 next-20230203] [cannot apply to paulmck-rcu/dev] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Frederic-Weisbecker/sched-isolation-Merge-individual-nohz_full-features-into-a-common-housekeeping-flag/20230204-072510 patch link: https://lore.kernel.org/r/20230203232409.163847-3-frederic%40kernel.org patch subject: [PATCH 2/2] sched/isolation: Add cpu_is_isolated() API config: alpha-defconfig (https://download.01.org/0day-ci/archive/20230204/202302041801.0Xt5eNbS-lkp@intel.com/config) compiler: alpha-linux-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/89596a035dc10e00cb66d4e75e49d69b75413807 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Frederic-Weisbecker/sched-isolation-Merge-individual-nohz_full-features-into-a-common-housekeeping-flag/20230204-072510 git checkout 89596a035dc10e00cb66d4e75e49d69b75413807 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=alpha olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=alpha SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): In file included from drivers/char/random.c:56: include/linux/sched/isolation.h: In function 'cpu_is_isolated': >> include/linux/sched/isolation.h:58:17: error: implicit declaration of function 'housekeeping_test_cpu'; did you mean 'housekeeping_any_cpu'? [-Werror=implicit-function-declaration] 58 | return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) || | ^~~~~~~~~~~~~~~~~~~~~ | housekeeping_any_cpu cc1: some warnings being treated as errors -- In file included from init/main.c:56: include/linux/sched/isolation.h: In function 'cpu_is_isolated': >> include/linux/sched/isolation.h:58:17: error: implicit declaration of function 'housekeeping_test_cpu'; did you mean 'housekeeping_any_cpu'? [-Werror=implicit-function-declaration] 58 | return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) || | ^~~~~~~~~~~~~~~~~~~~~ | housekeeping_any_cpu init/main.c: At top level: init/main.c:775:20: warning: no previous prototype for 'arch_post_acpi_subsys_init' [-Wmissing-prototypes] 775 | void __init __weak arch_post_acpi_subsys_init(void) { } | ^~~~~~~~~~~~~~~~~~~~~~~~~~ init/main.c:787:20: warning: no previous prototype for 'mem_encrypt_init' [-Wmissing-prototypes] 787 | void __init __weak mem_encrypt_init(void) { } | ^~~~~~~~~~~~~~~~ init/main.c:789:20: warning: no previous prototype for 'poking_init' [-Wmissing-prototypes] 789 | void __init __weak poking_init(void) { } | ^~~~~~~~~~~ cc1: some warnings being treated as errors -- In file included from kernel/sched/fair.c:38: include/linux/sched/isolation.h: In function 'cpu_is_isolated': >> include/linux/sched/isolation.h:58:17: error: implicit declaration of function 'housekeeping_test_cpu'; did you mean 'housekeeping_any_cpu'? [-Werror=implicit-function-declaration] 58 | return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) || | ^~~~~~~~~~~~~~~~~~~~~ | housekeeping_any_cpu kernel/sched/fair.c: At top level: kernel/sched/fair.c:688:5: warning: no previous prototype for 'sched_update_scaling' [-Wmissing-prototypes] 688 | int sched_update_scaling(void) | ^~~~~~~~~~~~~~~~~~~~ kernel/sched/fair.c:6067:6: warning: no previous prototype for 'init_cfs_bandwidth' [-Wmissing-prototypes] 6067 | void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b) {} | ^~~~~~~~~~~~~~~~~~ kernel/sched/fair.c:12493:6: warning: no previous prototype for 'free_fair_sched_group' [-Wmissing-prototypes] 12493 | void free_fair_sched_group(struct task_group *tg) { } | ^~~~~~~~~~~~~~~~~~~~~ kernel/sched/fair.c:12495:5: warning: no previous prototype for 'alloc_fair_sched_group' [-Wmissing-prototypes] 12495 | int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent) | ^~~~~~~~~~~~~~~~~~~~~~ kernel/sched/fair.c:12500:6: warning: no previous prototype for 'online_fair_sched_group' [-Wmissing-prototypes] 12500 | void online_fair_sched_group(struct task_group *tg) { } | ^~~~~~~~~~~~~~~~~~~~~~~ kernel/sched/fair.c:12502:6: warning: no previous prototype for 'unregister_fair_sched_group' [-Wmissing-prototypes] 12502 | void unregister_fair_sched_group(struct task_group *tg) { } | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: some warnings being treated as errors vim +58 include/linux/sched/isolation.h 55 56 static inline bool cpu_is_isolated(int cpu) 57 { > 58 return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) || 59 !housekeeping_test_cpu(cpu, HK_TYPE_KERNEL_NOISE); 60 } 61
Hi Frederic, I love your patch! Yet something to improve: [auto build test ERROR on tip/sched/core] [also build test ERROR on horms-ipvs-next/master horms-ipvs/master linus/master v6.2-rc6 next-20230203] [cannot apply to paulmck-rcu/dev] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Frederic-Weisbecker/sched-isolation-Merge-individual-nohz_full-features-into-a-common-housekeeping-flag/20230204-072510 patch link: https://lore.kernel.org/r/20230203232409.163847-3-frederic%40kernel.org patch subject: [PATCH 2/2] sched/isolation: Add cpu_is_isolated() API config: x86_64-randconfig-a001 (https://download.01.org/0day-ci/archive/20230205/202302051316.1kdX7ylk-lkp@intel.com/config) compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/89596a035dc10e00cb66d4e75e49d69b75413807 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Frederic-Weisbecker/sched-isolation-Merge-individual-nohz_full-features-into-a-common-housekeeping-flag/20230204-072510 git checkout 89596a035dc10e00cb66d4e75e49d69b75413807 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): In file included from drivers/pci/pci-driver.c:15: >> include/linux/sched/isolation.h:58:10: error: implicit declaration of function 'housekeeping_test_cpu' is invalid in C99 [-Werror,-Wimplicit-function-declaration] return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) || ^ include/linux/sched/isolation.h:58:10: note: did you mean 'housekeeping_any_cpu'? include/linux/sched/isolation.h:27:19: note: 'housekeeping_any_cpu' declared here static inline int housekeeping_any_cpu(enum hk_type type) ^ 1 error generated. -- In file included from kernel/sched/fair.c:38: >> include/linux/sched/isolation.h:58:10: error: implicit declaration of function 'housekeeping_test_cpu' is invalid in C99 [-Werror,-Wimplicit-function-declaration] return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) || ^ include/linux/sched/isolation.h:58:10: note: did you mean 'housekeeping_any_cpu'? include/linux/sched/isolation.h:27:19: note: 'housekeeping_any_cpu' declared here static inline int housekeeping_any_cpu(enum hk_type type) ^ kernel/sched/fair.c:688:5: warning: no previous prototype for function 'sched_update_scaling' [-Wmissing-prototypes] int sched_update_scaling(void) ^ kernel/sched/fair.c:688:1: note: declare 'static' if the function is not intended to be used outside of this translation unit int sched_update_scaling(void) ^ static 1 warning and 1 error generated. vim +/housekeeping_test_cpu +58 include/linux/sched/isolation.h 55 56 static inline bool cpu_is_isolated(int cpu) 57 { > 58 return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) || 59 !housekeeping_test_cpu(cpu, HK_TYPE_KERNEL_NOISE); 60 } 61
Hello. On Fri, Feb 03, 2023 at 10:53:46PM -0500, Waiman Long <longman@redhat.com> wrote: > CPUs in an isolated cpuset partition is similar to HK_TYPE_DOMAIN CPUs as > load balancing is disabled. I can add an API to access the cpumask and add > to this API. However, that list is dynamic as it can be changed at run time. > Will that be a problem? I can see a problem already -- as a CPU can be dynamically switched to "isolated" mode so should all dependent operations support that (switch) too, i.e. the CPUs local PCP caches would have to be drained when the CPU enters isolation. > Or should that be used separately? It'd be nice to have both (cpuset and cmdline flags) eventually unified. Alas, it only leads me conservatively to: #ifndef CONFIG_CPUSETS // the proposed implementaion else static inline bool cpu_is_isolated(int cpu) { return true; } #endif My 0.02€, Michal
On 2/6/23 10:47, Michal Koutný wrote: > Hello. > > On Fri, Feb 03, 2023 at 10:53:46PM -0500, Waiman Long <longman@redhat.com> wrote: >> CPUs in an isolated cpuset partition is similar to HK_TYPE_DOMAIN CPUs as >> load balancing is disabled. I can add an API to access the cpumask and add >> to this API. However, that list is dynamic as it can be changed at run time. >> Will that be a problem? > I can see a problem already -- as a CPU can be dynamically switched to > "isolated" mode so should all dependent operations support that (switch) > too, i.e. the CPUs local PCP caches would have to be drained when the > CPU enters isolation. I see the long term goal is to have more isolation capability to be done dynamically. However, we are not there yet. There is still a lot of work to do to achieve that. > >> Or should that be used separately? > It'd be nice to have both (cpuset and cmdline flags) eventually unified. > > Alas, it only leads me conservatively to: > > #ifndef CONFIG_CPUSETS > // the proposed implementaion > else > static inline bool cpu_is_isolated(int cpu) { > return true; > } > #endif That is too conservative from my point of view. We can have further discussion when a patch is ready. Cheers, Longman
On Fri, Feb 03, 2023 at 10:53:46PM -0500, Waiman Long wrote: > On 2/3/23 18:24, Frederic Weisbecker wrote: > > Provide this new API to check if a CPU has been isolated either through > > isolcpus= or nohz_full= kernel parameter. > > > > It aims at avoiding kernel load deemed to be safely spared on CPUs > > running sensitive workload that can't bear any disturbance, such as > > pcp cache draining. > > > > Suggested-by: Michal Hocko <mhocko@suse.com> > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org> > > --- > > include/linux/sched/isolation.h | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h > > index b645cc81fe01..088672f08469 100644 > > --- a/include/linux/sched/isolation.h > > +++ b/include/linux/sched/isolation.h > > @@ -53,4 +53,10 @@ static inline bool housekeeping_cpu(int cpu, enum hk_type type) > > return true; > > } > > +static inline bool cpu_is_isolated(int cpu) > > +{ > > + return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) || > > + !housekeeping_test_cpu(cpu, HK_TYPE_KERNEL_NOISE); > > +} > > + > > #endif /* _LINUX_SCHED_ISOLATION_H */ > > CPUs in an isolated cpuset partition is similar to HK_TYPE_DOMAIN CPUs as > load balancing is disabled. I can add an API to access the cpumask and add > to this API. However, that list is dynamic as it can be changed at run time. > Will that be a problem? Or should that be used separately? So that's what I intended first but the dynamic part of cpuset made me postpone that to better days. But yes ideally it should look like: static inline bool cpu_is_isolated(int cpu) { return !housekeeping_test_cpu(cpu, HK_TYPE_KERNEL_NOISE) || on_null_domain(cpu_rq(cpu)); } And there should be a hook in something like detach_destroy_domains() to flush the pcp cache when a CPU is attached to a NULL domain. All that with proper RCU synchronization: UPDATE READER ------ ------ rcu_assign_pointer(cpu_rq(cpu)->sd, NULL); rcu_read_lock(); synchronize_rcu(); if (!cpu_is_isolated(cpu)) stock = &per_cpu(memcg_stock, cpu); schedule_work_on(cpu, &stock->work); flush_work(&stock->work); rcu_read_unlock() Thanks.
On Sat 04-02-23 00:24:09, Frederic Weisbecker wrote: > Provide this new API to check if a CPU has been isolated either through > isolcpus= or nohz_full= kernel parameter. > > It aims at avoiding kernel load deemed to be safely spared on CPUs > running sensitive workload that can't bear any disturbance, such as > pcp cache draining. > > Suggested-by: Michal Hocko <mhocko@suse.com> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Is there any locking required? I do not think so as these should be boot time configured AFAIR. From the discussion around this I have understood that this might change in the future once cpusets gain a better isolation support. Maybe this should be documented at this stage? Thanks! > --- > include/linux/sched/isolation.h | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h > index b645cc81fe01..088672f08469 100644 > --- a/include/linux/sched/isolation.h > +++ b/include/linux/sched/isolation.h > @@ -53,4 +53,10 @@ static inline bool housekeeping_cpu(int cpu, enum hk_type type) > return true; > } > > +static inline bool cpu_is_isolated(int cpu) > +{ > + return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) || > + !housekeeping_test_cpu(cpu, HK_TYPE_KERNEL_NOISE); > +} > + > #endif /* _LINUX_SCHED_ISOLATION_H */ > -- > 2.34.1
diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h index b645cc81fe01..088672f08469 100644 --- a/include/linux/sched/isolation.h +++ b/include/linux/sched/isolation.h @@ -53,4 +53,10 @@ static inline bool housekeeping_cpu(int cpu, enum hk_type type) return true; } +static inline bool cpu_is_isolated(int cpu) +{ + return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) || + !housekeeping_test_cpu(cpu, HK_TYPE_KERNEL_NOISE); +} + #endif /* _LINUX_SCHED_ISOLATION_H */
Provide this new API to check if a CPU has been isolated either through isolcpus= or nohz_full= kernel parameter. It aims at avoiding kernel load deemed to be safely spared on CPUs running sensitive workload that can't bear any disturbance, such as pcp cache draining. Suggested-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> --- include/linux/sched/isolation.h | 6 ++++++ 1 file changed, 6 insertions(+)