Message ID | 20190213094315.3504-1-mhocko@kernel.org (mailing list archive) |
---|---|
State | New, archived |
On Wed, Feb 13, 2019 at 10:43:15AM +0100, Michal Hocko wrote:
> @@ -5259,6 +5261,11 @@ static void build_zonelists(pg_data_t *pgdat)
>
>  	build_zonelists_in_node_order(pgdat, node_order, nr_nodes);
>  	build_thisnode_zonelists(pgdat);
> +
> +	pr_info("node[%d] zonelist: ", pgdat->node_id);
> +	for_each_zone_zonelist(zone, z, &pgdat->node_zonelists[ZONELIST_FALLBACK], MAX_NR_ZONES-1)
> +		pr_cont("%d:%s ", zone_to_nid(zone), zone->name);
> +	pr_cont("\n");
>  }

Have you run this by the SGI and other stupid large machine vendors?
Traditionally they tend to want to remove such things instead of adding
them.

On Wed 13-02-19 11:32:31, Peter Zijlstra wrote:
> On Wed, Feb 13, 2019 at 10:43:15AM +0100, Michal Hocko wrote:
> > @@ -5259,6 +5261,11 @@ static void build_zonelists(pg_data_t *pgdat)
> >
> >  	build_zonelists_in_node_order(pgdat, node_order, nr_nodes);
> >  	build_thisnode_zonelists(pgdat);
> > +
> > +	pr_info("node[%d] zonelist: ", pgdat->node_id);
> > +	for_each_zone_zonelist(zone, z, &pgdat->node_zonelists[ZONELIST_FALLBACK], MAX_NR_ZONES-1)
> > +		pr_cont("%d:%s ", zone_to_nid(zone), zone->name);
> > +	pr_cont("\n");
> >  }
>
> Have you run this by the SGI and other stupid large machine vendors?

I do not have such a large machine handy. The biggest I have has a
handful (say a dozen) of NUMA nodes.

> Traditionally they tend to want to remove such things instead of adding
> them.

I do not insist on this patch but I find it handy. If there is
opposition I will not miss it much.

On Wed, Feb 13, 2019 at 12:50:14PM +0100, Michal Hocko wrote:
> On Wed 13-02-19 11:32:31, Peter Zijlstra wrote:
> > On Wed, Feb 13, 2019 at 10:43:15AM +0100, Michal Hocko wrote:
> > > @@ -5259,6 +5261,11 @@ static void build_zonelists(pg_data_t *pgdat)
> > >
> > >  	build_zonelists_in_node_order(pgdat, node_order, nr_nodes);
> > >  	build_thisnode_zonelists(pgdat);
> > > +
> > > +	pr_info("node[%d] zonelist: ", pgdat->node_id);
> > > +	for_each_zone_zonelist(zone, z, &pgdat->node_zonelists[ZONELIST_FALLBACK], MAX_NR_ZONES-1)
> > > +		pr_cont("%d:%s ", zone_to_nid(zone), zone->name);
> > > +	pr_cont("\n");
> > >  }
> >
> > Have you run this by the SGI and other stupid large machine vendors?
>
> I do not have such a large machine handy. The biggest I have has a
> handful (say a dozen) of NUMA nodes.
>
> > Traditionally they tend to want to remove such things instead of adding
> > them.
>
> I do not insist on this patch but I find it handy. If there is
> opposition I will not miss it much.

Well, I don't have machines like that either and don't mind the patch.
Just raising the issue; I've had the big iron boys complain about
similar things (typically printing something for every CPU, which gets
out of hand much faster than zones, but still).

On Wed 13-02-19 14:11:31, Peter Zijlstra wrote:
> On Wed, Feb 13, 2019 at 12:50:14PM +0100, Michal Hocko wrote:
> > On Wed 13-02-19 11:32:31, Peter Zijlstra wrote:
> > > On Wed, Feb 13, 2019 at 10:43:15AM +0100, Michal Hocko wrote:
> > > > @@ -5259,6 +5261,11 @@ static void build_zonelists(pg_data_t *pgdat)
> > > >
> > > >  	build_zonelists_in_node_order(pgdat, node_order, nr_nodes);
> > > >  	build_thisnode_zonelists(pgdat);
> > > > +
> > > > +	pr_info("node[%d] zonelist: ", pgdat->node_id);
> > > > +	for_each_zone_zonelist(zone, z, &pgdat->node_zonelists[ZONELIST_FALLBACK], MAX_NR_ZONES-1)
> > > > +		pr_cont("%d:%s ", zone_to_nid(zone), zone->name);
> > > > +	pr_cont("\n");
> > > >  }
> > >
> > > Have you run this by the SGI and other stupid large machine vendors?
> >
> > I do not have such a large machine handy. The biggest I have has a
> > handful (say a dozen) of NUMA nodes.
> >
> > > Traditionally they tend to want to remove such things instead of adding
> > > them.
> >
> > I do not insist on this patch but I find it handy. If there is
> > opposition I will not miss it much.
>
> Well, I don't have machines like that either and don't mind the patch.
> Just raising the issue; I've had the big iron boys complain about
> similar things (typically printing something for every CPU, which gets
> out of hand much faster than zones, but still).

Maybe we can try to push this through and revert if somebody complains
about excessive output.

On 2/13/19 1:43 AM, Michal Hocko wrote:
>
> We have seen several bugs where zonelists have not been initialized
> properly and it is not really straightforward to track those bugs down.
> One way to help a bit at least is to dump zonelists of each node when
> they are (re)initialized.

Were you thinking of boot-time bugs and crashes, or just stuff going
wonky after boot?

We don't have the zonelists dumped in /proc anywhere, do we? Would that
help?

On Wed 13-02-19 08:14:50, Dave Hansen wrote:
> On 2/13/19 1:43 AM, Michal Hocko wrote:
> >
> > We have seen several bugs where zonelists have not been initialized
> > properly and it is not really straightforward to track those bugs down.
> > One way to help a bit at least is to dump zonelists of each node when
> > they are (re)initialized.
>
> Were you thinking of boot-time bugs and crashes, or just stuff going
> wonky after boot?

Mostly boot time. I haven't seen hotplug-related bugs in this
direction. All the issues I have seen so far are cases where we forget
a node altogether and it ends up with no zonelists at all. But who
knows, maybe we have some hidden bugs where zonelists are initialized
only partially for some reason and there is no real way to find out.

> We don't have the zonelists dumped in /proc anywhere, do we? Would that
> help?

I would prefer not to export such an implementation detail into proc.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2e097f336126..52e54d16662a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5234,6 +5234,8 @@ static void build_zonelists(pg_data_t *pgdat)
 	int node, load, nr_nodes = 0;
 	nodemask_t used_mask;
 	int local_node, prev_node;
+	struct zone *zone;
+	struct zoneref *z;
 
 	/* NUMA-aware ordering of nodes */
 	local_node = pgdat->node_id;
@@ -5259,6 +5261,11 @@ static void build_zonelists(pg_data_t *pgdat)
 
 	build_zonelists_in_node_order(pgdat, node_order, nr_nodes);
 	build_thisnode_zonelists(pgdat);
+
+	pr_info("node[%d] zonelist: ", pgdat->node_id);
+	for_each_zone_zonelist(zone, z, &pgdat->node_zonelists[ZONELIST_FALLBACK], MAX_NR_ZONES-1)
+		pr_cont("%d:%s ", zone_to_nid(zone), zone->name);
+	pr_cont("\n");
 }
 
 #ifdef CONFIG_HAVE_MEMORYLESS_NODES