Message ID | 157013e978468241de4a4c05d5337a44638ecb0e.1697711415.git.zhengqi.arch@bytedance.com |
---|---|
State | New |
Series | handle memoryless nodes more appropriately |
* Qi Zheng <zhengqi.arch@bytedance.com> wrote:

> In find_next_best_node(), We skipped the memoryless nodes

s/We
/we

s/the memoryless nodes
/memoryless nodes

> when building the zonelists of other normal nodes (N_NORMAL),
> but did not skip the memoryless node itself when building
> the zonelist. This will cause it to be traversed at runtime.
>
> For example, say we have node0 and node1, node0 is memoryless
> node, then the fallback order of node0 and node1 as follows:
>
> [ 0.153005] Fallback order for Node 0: 0 1
> [ 0.153564] Fallback order for Node 1: 1
>
> After this patch, we skip memoryless node0 entirely, then
> the fallback order of node0 and node1 as follows:

s/fallback
/fall back

>
> [ 0.155236] Fallback order for Node 0: 1
> [ 0.155806] Fallback order for Node 1: 1
>
> So it becomes completely invisible, which will reduce runtime
> overhead.
>
> And in this way, we will not try to allocate pages from memoryless
> node0, then the panic mentioned in [1] will also be fixed. Even though
> this problem has been solved by dropping the NODE_MIN_SIZE constrain
> in x86 [2], it would be better to fix it in core MM as well.

s/in core MM
/in the core MM

> [1]. https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.com/
> [2]. https://lore.kernel.org/all/20231017062215.171670-1-rppt@kernel.org/
>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> Acked-by: David Hildenbrand <david@redhat.com>

> +	/*
> +	 * Use the local node if we haven't already. But for memoryless local
> +	 * node, we should skip it and fallback to other nodes.

s/fallback
/fall back

s/already. But
/already, but

Acked-by: Ingo Molnar <mingo@kernel.org>

Thanks,

	Ingo
On 2023/10/20 16:31, Ingo Molnar wrote:
>
> * Qi Zheng <zhengqi.arch@bytedance.com> wrote:
>
>> In find_next_best_node(), We skipped the memoryless nodes
>
> s/We
> /we
>
> s/the memoryless nodes
> /memoryless nodes
>
>> when building the zonelists of other normal nodes (N_NORMAL),
>> but did not skip the memoryless node itself when building
>> the zonelist. This will cause it to be traversed at runtime.
>>
>> For example, say we have node0 and node1, node0 is memoryless
>> node, then the fallback order of node0 and node1 as follows:
>>
>> [ 0.153005] Fallback order for Node 0: 0 1
>> [ 0.153564] Fallback order for Node 1: 1
>>
>> After this patch, we skip memoryless node0 entirely, then
>> the fallback order of node0 and node1 as follows:
>
> s/fallback
> /fall back
>
>>
>> [ 0.155236] Fallback order for Node 0: 1
>> [ 0.155806] Fallback order for Node 1: 1
>>
>> So it becomes completely invisible, which will reduce runtime
>> overhead.
>>
>> And in this way, we will not try to allocate pages from memoryless
>> node0, then the panic mentioned in [1] will also be fixed. Even though
>> this problem has been solved by dropping the NODE_MIN_SIZE constrain
>> in x86 [2], it would be better to fix it in core MM as well.
>
> s/in core MM
> /in the core MM
>
>> [1]. https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.com/
>> [2]. https://lore.kernel.org/all/20231017062215.171670-1-rppt@kernel.org/
>>
>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>> Acked-by: David Hildenbrand <david@redhat.com>
>
>> +	/*
>> +	 * Use the local node if we haven't already. But for memoryless local
>> +	 * node, we should skip it and fallback to other nodes.
>
> s/fallback
> /fall back
>
> s/already. But
> /already, but

Will fix the typos above.

>
> Acked-by: Ingo Molnar <mingo@kernel.org>

Thanks.

>
> Thanks,
>
> 	Ingo
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ee392a324802..e978272699d3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5052,8 +5052,11 @@ int find_next_best_node(int node, nodemask_t *used_node_mask)
 	int min_val = INT_MAX;
 	int best_node = NUMA_NO_NODE;
 
-	/* Use the local node if we haven't already */
-	if (!node_isset(node, *used_node_mask)) {
+	/*
+	 * Use the local node if we haven't already. But for memoryless local
+	 * node, we should skip it and fallback to other nodes.
+	 */
+	if (!node_isset(node, *used_node_mask) && node_state(node, N_MEMORY)) {
 		node_set(node, *used_node_mask);
 		return node;
 	}
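
For readers who want to see why the extra node_state(node, N_MEMORY) check changes the fallback order, here is a rough userspace sketch. It is not kernel code: the toy_* names, the bitmask standing in for nodemask_t, and the |n - node| distance are all assumptions made for illustration, and the real function weighs candidates using more than distance. It only models the two pieces the patch touches: the local-node shortcut, and a search that is assumed to consider only nodes with memory.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

#define TOY_MAX_NODES 8
#define TOY_NO_NODE   (-1)

/* Hypothetical stand-in for node_state(n, N_MEMORY): bit n set = node n has memory. */
static int toy_has_memory(unsigned mem_mask, int n)
{
	return (mem_mask >> n) & 1u;
}

/*
 * Toy model of the selection step. 'patched' selects whether the local
 * node must have memory before it is returned (the behaviour the diff adds).
 */
static int toy_find_next_best_node(int node, unsigned *used, unsigned mem_mask,
				   int nr_nodes, int patched)
{
	int min_val = INT_MAX;
	int best = TOY_NO_NODE;
	int n;

	/* Local-node shortcut; with 'patched' set, a memoryless local node is skipped. */
	if (!((*used >> node) & 1u) &&
	    (!patched || toy_has_memory(mem_mask, node))) {
		*used |= 1u << node;
		return node;
	}

	/* Search only nodes that have memory and are not used yet. */
	for (n = 0; n < nr_nodes; n++) {
		int val;

		if (!toy_has_memory(mem_mask, n) || ((*used >> n) & 1u))
			continue;
		val = abs(n - node);	/* assumed distance metric, illustration only */
		if (val < min_val) {
			min_val = val;
			best = n;
		}
	}

	if (best != TOY_NO_NODE)
		*used |= 1u << best;
	return best;
}

/* Collect the fallback order for 'node' into out[]; returns the number of entries. */
static int toy_fallback_order(int node, unsigned mem_mask, int nr_nodes,
			      int patched, int *out)
{
	unsigned used = 0;
	int count = 0;
	int n;

	while ((n = toy_find_next_best_node(node, &used, mem_mask,
					    nr_nodes, patched)) != TOY_NO_NODE)
		out[count++] = n;
	return count;
}

/* Print the order in the same shape as the boot log lines quoted in the changelog. */
static void toy_print_fallback_order(int node, const int *order, int count)
{
	int i;

	printf("Fallback order for Node %d:", node);
	for (i = 0; i < count; i++)
		printf(" %d", order[i]);
	printf("\n");
}
```

Calling toy_fallback_order(0, 0x2, 2, 0, out) models the two-node example from the changelog (node0 memoryless, node1 with memory) before the patch, and passing 1 as the fourth argument models it after the patch; toy_print_fallback_order() then prints the resulting order.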