Message ID | 20200214073320.28735-1-richardw.yang@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] mm/vmscan.c: only adjust related kswapd cpu affinity when online cpu |
On Fri 14-02-20 15:33:20, Wei Yang wrote:
> When onlining a cpu, kswapd_cpu_online() is called to adjust kswapd cpu
> affinity.
>
> Current routine does like this:
>
> a) Iterate all the numa node
> b) Adjust cpu affinity when node has an online cpu
>
> For a) this is not necessary, since the particular online cpu belongs to
> a particular numa node. So it is not necessary to iterate on every nodes
> on the system. This new onlined cpu just affect kswapd cpu affinity of
> this particular node.
>
> For b) several cpumask operation is used to check whether the node has
> an online CPU. Since at this point we are sure one of our CPU onlined,
> we can set the cpu affinity directly to current cpumask_of_node().
>
> This patch simplifies the logic by set cpu affinity of the affected
> kswapd.

How have you tested this patch?

Also this is an old code and quite convoluted but does it still work as
intended? I mean, I do not see any cpu offline callback to reduce the
cpu mask as all the CPUs for the given node go offline? Wouldn't the
scheduler simply go and fallback to no affinity if that happens?
In other words what is the value of kswapd_cpu_online in the first
place?
On Fri, Feb 14, 2020 at 09:51:13AM +0100, Michal Hocko wrote:
>On Fri 14-02-20 15:33:20, Wei Yang wrote:
>> When onlining a cpu, kswapd_cpu_online() is called to adjust kswapd cpu
>> affinity.
>>
>> Current routine does like this:
>>
>> a) Iterate all the numa node
>> b) Adjust cpu affinity when node has an online cpu
>>
>> For a) this is not necessary, since the particular online cpu belongs to
>> a particular numa node. So it is not necessary to iterate on every nodes
>> on the system. This new onlined cpu just affect kswapd cpu affinity of
>> this particular node.
>>
>> For b) several cpumask operation is used to check whether the node has
>> an online CPU. Since at this point we are sure one of our CPU onlined,
>> we can set the cpu affinity directly to current cpumask_of_node().
>>
>> This patch simplifies the logic by set cpu affinity of the affected
>> kswapd.
>
>How have you tested this patch?
>

I online one cpu and confirm the "cpu" is the one we just onlined.

If my understanding is correct, this is the expected behavior.

>Also this is an old code and quite convoluted but does it still work as
>intended? I mean, I do not see any cpu offline callback to reduce the
>cpu mask as all the CPUs for the given node go offline? Wouldn't the

You are right, I didn't see the counterpart for cpu offline. This is the
question I want to ask. Seems we didn't handle it at the very beginning.

>scheduler simply go and fallback to no affinity if that happens?
>In other words what is the value of kswapd_cpu_online in the first
>place?

Some cases may this function be useful.

If we have a memory node which doesn't have any online cpu, the default
cpumask is not set. After one of the cpu online, we want to change cpu
affinity.

Or we want to add more cpu to the system, we could allow kswapd use more cpu
resources. Otherwise, kswapd would be limited to those original cpus.

>--
>Michal Hocko
>SUSE Labs
On Sat 15-02-20 08:37:53, Wei Yang wrote:
> On Fri, Feb 14, 2020 at 09:51:13AM +0100, Michal Hocko wrote:
> >On Fri 14-02-20 15:33:20, Wei Yang wrote:
> >> When onlining a cpu, kswapd_cpu_online() is called to adjust kswapd cpu
> >> affinity.
> >>
> >> Current routine does like this:
> >>
> >> a) Iterate all the numa node
> >> b) Adjust cpu affinity when node has an online cpu
> >>
> >> For a) this is not necessary, since the particular online cpu belongs to
> >> a particular numa node. So it is not necessary to iterate on every nodes
> >> on the system. This new onlined cpu just affect kswapd cpu affinity of
> >> this particular node.
> >>
> >> For b) several cpumask operation is used to check whether the node has
> >> an online CPU. Since at this point we are sure one of our CPU onlined,
> >> we can set the cpu affinity directly to current cpumask_of_node().
> >>
> >> This patch simplifies the logic by set cpu affinity of the affected
> >> kswapd.
> >
> >How have you tested this patch?
> >
>
> I online one cpu and confirm the "cpu" is the one we just onlined.
>
> If my understanding is correct, this is the expected behavior.
>
> >Also this is an old code and quite convoluted but does it still work as
> >intended? I mean, I do not see any cpu offline callback to reduce the
> >cpu mask as all the CPUs for the given node go offline? Wouldn't the
>
> You are right, I didn't see the counterpart for cpu offline. This is the
> question I want to ask. Seems we didn't handle it at the very beginning.
>
> >scheduler simply go and fallback to no affinity if that happens?
> >In other words what is the value of kswapd_cpu_online in the first
> >place?
>
> Some cases may this function be useful.
>
> If we have a memory node which doesn't have any online cpu, the default
> cpumask is not set. After one of the cpu online, we want to change cpu
> affinity.
>
> Or we want to add more cpu to the system, we could allow kswapd use more cpu
> resources. Otherwise, kswapd would be limited to those original cpus.

OK, so the usecase is when a NUMA node gains a new CPU which wasn't
there at the time when the node got onlined. Is this a scenario we
really do care about? While not completely impossible I haven't seen
a system which would allow such a runtime configurability. Maybe it
would be simply easier to drop the callback for now until we have a real
world usecase to support it and have it documented.
On Mon, Feb 17, 2020 at 10:31:24AM +0100, Michal Hocko wrote:
>On Sat 15-02-20 08:37:53, Wei Yang wrote:
>> On Fri, Feb 14, 2020 at 09:51:13AM +0100, Michal Hocko wrote:
>> >On Fri 14-02-20 15:33:20, Wei Yang wrote:
>> >> When onlining a cpu, kswapd_cpu_online() is called to adjust kswapd cpu
>> >> affinity.
>> >>
>> >> Current routine does like this:
>> >>
>> >> a) Iterate all the numa node
>> >> b) Adjust cpu affinity when node has an online cpu
>> >>
>> >> For a) this is not necessary, since the particular online cpu belongs to
>> >> a particular numa node. So it is not necessary to iterate on every nodes
>> >> on the system. This new onlined cpu just affect kswapd cpu affinity of
>> >> this particular node.
>> >>
>> >> For b) several cpumask operation is used to check whether the node has
>> >> an online CPU. Since at this point we are sure one of our CPU onlined,
>> >> we can set the cpu affinity directly to current cpumask_of_node().
>> >>
>> >> This patch simplifies the logic by set cpu affinity of the affected
>> >> kswapd.
>> >
>> >How have you tested this patch?
>> >
>>
>> I online one cpu and confirm the "cpu" is the one we just onlined.
>>
>> If my understanding is correct, this is the expected behavior.
>>
>> >Also this is an old code and quite convoluted but does it still work as
>> >intended? I mean, I do not see any cpu offline callback to reduce the
>> >cpu mask as all the CPUs for the given node go offline? Wouldn't the
>>
>> You are right, I didn't see the counterpart for cpu offline. This is the
>> question I want to ask. Seems we didn't handle it at the very beginning.
>>
>> >scheduler simply go and fallback to no affinity if that happens?
>> >In other words what is the value of kswapd_cpu_online in the first
>> >place?
>>
>> Some cases may this function be useful.
>>
>> If we have a memory node which doesn't have any online cpu, the default
>> cpumask is not set. After one of the cpu online, we want to change cpu
>> affinity.
>>
>> Or we want to add more cpu to the system, we could allow kswapd use more cpu
>> resources. Otherwise, kswapd would be limited to those original cpus.
>
>OK, so the usecase is when a NUMA node gains a new CPU which wasn't
>there at the time when the node got onlined. Is this a scenario we
>really do care about? While not completely impossible I haven't seen
>a system which would allow such a runtime configurability. Maybe it
>would be simply easier to drop the callback for now until we have a real
>world usecase to support it and have it documented.

I am fine with this suggestion. Let me prepare v3.

>--
>Michal Hocko
>SUSE Labs
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 665f33258cd7..acc5af82b6ed 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4029,18 +4029,19 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
    restore their cpu bindings. */
 static int kswapd_cpu_online(unsigned int cpu)
 {
-	int nid;
+	int nid = cpu_to_node(cpu);
+	pg_data_t *pgdat;
+	const struct cpumask *mask;
 
-	for_each_node_state(nid, N_MEMORY) {
-		pg_data_t *pgdat = NODE_DATA(nid);
-		const struct cpumask *mask;
+	if (!node_state(nid, N_MEMORY))
+		return 0;
 
-		mask = cpumask_of_node(pgdat->node_id);
+	pgdat = NODE_DATA(nid);
+	mask = cpumask_of_node(nid);
+
+	/* One of our CPUs online: restore mask */
+	set_cpus_allowed_ptr(pgdat->kswapd, mask);
 
-		if (cpumask_any_and(cpu_online_mask, mask) < nr_cpu_ids)
-			/* One of our CPUs online: restore mask */
-			set_cpus_allowed_ptr(pgdat->kswapd, mask);
-	}
 	return 0;
 }
When onlining a cpu, kswapd_cpu_online() is called to adjust kswapd cpu
affinity.

Current routine does like this:

a) Iterate all the numa node
b) Adjust cpu affinity when node has an online cpu

For a) this is not necessary, since the particular online cpu belongs to
a particular numa node. So it is not necessary to iterate on every nodes
on the system. This new onlined cpu just affect kswapd cpu affinity of
this particular node.

For b) several cpumask operation is used to check whether the node has
an online CPU. Since at this point we are sure one of our CPU onlined,
we can set the cpu affinity directly to current cpumask_of_node().

This patch simplifies the logic by set cpu affinity of the affected
kswapd.

---
v2:
* rephrase the changelog

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>

---
 mm/vmscan.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)
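
For readers who want to see points a) and b) of the changelog side by side, here is a rough userspace model in C. It is not kernel code: the two-node topology, the bitmask type, and every name in it (`node_cpus`, `online_old`, `online_new`, `model_cpu_to_node`, `set_affinity`, and so on) are assumptions made up for illustration. `online_old` mirrors the loop over all memory nodes, and `online_new` mirrors the direct lookup of the node that owns the newly onlined CPU.

```c
#include <assert.h>

/* Userspace model only; none of this is kernel code and all names are hypothetical. */
#define NR_NODES 2

typedef unsigned int cpu_mask_t;	/* one bit per CPU, CPUs 0-7 only */

/* Assumed topology: CPUs 0-3 on node 0, CPUs 4-7 on node 1. */
static const cpu_mask_t node_cpus[NR_NODES] = { 0x0Fu, 0xF0u };
static const int node_has_memory[NR_NODES] = { 1, 1 };

static cpu_mask_t online_mask;			/* stands in for cpu_online_mask */
static cpu_mask_t kswapd_affinity[NR_NODES];	/* per-node kswapd affinity */
static int affinity_updates;			/* how many times affinity was set */

static void reset_model(cpu_mask_t online)
{
	online_mask = online;
	for (int nid = 0; nid < NR_NODES; nid++)
		kswapd_affinity[nid] = 0;
	affinity_updates = 0;
}

static void set_affinity(int nid, cpu_mask_t mask)
{
	kswapd_affinity[nid] = mask;
	affinity_updates++;
}

static int model_cpu_to_node(unsigned int cpu)
{
	for (int nid = 0; nid < NR_NODES; nid++)
		if (node_cpus[nid] & (1u << cpu))
			return nid;
	return -1;
}

/* Old shape: visit every memory node, update each one that has an online CPU. */
static void online_old(unsigned int cpu)
{
	(void)cpu;	/* the old routine never looks at which CPU came online */
	for (int nid = 0; nid < NR_NODES; nid++) {
		if (!node_has_memory[nid])
			continue;
		if (online_mask & node_cpus[nid])
			set_affinity(nid, node_cpus[nid]);
	}
}

/* New shape: only touch the node that owns the CPU that just came online. */
static void online_new(unsigned int cpu)
{
	int nid = model_cpu_to_node(cpu);

	if (nid < 0 || !node_has_memory[nid])
		return;
	set_affinity(nid, node_cpus[nid]);
}

/* Online CPU 5 while node 0 is already up and compare node 1's result. */
static void demo_online_cpu5(void)
{
	cpu_mask_t online = 0x0Fu | (1u << 5);
	cpu_mask_t old_result;

	reset_model(online);
	online_old(5);
	old_result = kswapd_affinity[1];

	reset_model(online);
	online_new(5);
	assert(kswapd_affinity[1] == old_result);
}
```

In this model both shapes leave the affected node with the same affinity. The only difference is that the old shape also rewrites the affinity of node 0, which did not change. The model says nothing about the offline case raised in the thread.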