Message ID | 20181128210815.2134-1-richard.weiyang@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | mm, show_mem: drop pgdat_resize_lock in show_mem() |
On Thu, 29 Nov 2018 05:08:15 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:

> Function show_mem() is used to print system memory status when user
> requires or fail to allocate memory. Generally, this is a best effort
> information and not willing to affect core mm subsystem.
>
> The data protected by pgdat_resize_lock is mostly correct except there is:
>
>   * page struct defer init
>   * memory hotplug

What is the advantage in doing this? What problem does the taking of
that lock cause?

On Wed, Nov 28, 2018 at 02:07:51PM -0800, Andrew Morton wrote:
>On Thu, 29 Nov 2018 05:08:15 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:
>
>> Function show_mem() is used to print system memory status when user
>> requires or fail to allocate memory. Generally, this is a best effort
>> information and not willing to affect core mm subsystem.
>>
>> The data protected by pgdat_resize_lock is mostly correct except there is:
>>
>>   * page struct defer init
>>   * memory hotplug
>
>What is the advantage in doing this? What problem does the taking of
>that lock cause?

Michal and I had a discussion in https://patchwork.kernel.org/patch/10689759/

The purpose of this is to see whether it is necessary to make
pgdat_resize_lock IRQ context safe. After going through the code, most of
the users are not called from IRQ context.

If my understanding is correct, Michal's suggestion is to drop the lock
here. (See the second-to-last reply from Michal.)

On Thu 29-11-18 05:08:15, Wei Yang wrote:
> Function show_mem() is used to print system memory status when user
> requires or fail to allocate memory. Generally, this is a best effort
> information and not willing to affect core mm subsystem.

I would drop the part after "and"

> The data protected by pgdat_resize_lock is mostly correct except there is:
>
>   * page struct defer init
>   * memory hotplug

This is more confusing than helpful. I would just drop it.

The changelog doesn't explain what is done and why. The second one is
much more important. I would say this

"
Function show_mem() is used to print system memory status when user
requires or fail to allocate memory. Generally, this is a best effort
information so any races with memory hotplug (or very theoretically an
early initialization) should be tolerable and the worst that could
happen is to print an imprecise node state.

Drop the resize lock because this is the only place which might hold the
lock from the interrupt context and so all other callers might use a
simple spinlock. Even though this doesn't solve any real issue it makes
the code easier to follow and a tiny bit more efficient.
"

>
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> ---
>  lib/show_mem.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/lib/show_mem.c b/lib/show_mem.c
> index 0beaa1d899aa..1d996e5771ab 100644
> --- a/lib/show_mem.c
> +++ b/lib/show_mem.c
> @@ -21,7 +21,6 @@ void show_mem(unsigned int filter, nodemask_t *nodemask)
>  		unsigned long flags;

btw. you want to drop flags.

>  		int zoneid;
>  
> -		pgdat_resize_lock(pgdat, &flags);
>  		for (zoneid = 0; zoneid < MAX_NR_ZONES; zoneid++) {
>  			struct zone *zone = &pgdat->node_zones[zoneid];
>  			if (!populated_zone(zone))
> @@ -33,7 +32,6 @@ void show_mem(unsigned int filter, nodemask_t *nodemask)
>  			if (is_highmem_idx(zoneid))
>  				highmem += zone->present_pages;
>  		}
> -		pgdat_resize_unlock(pgdat, &flags);
>  	}
>  
>  	printk("%lu pages RAM\n", total);
> -- 
> 2.15.1
>

On Thu, Nov 29, 2018 at 09:17:03AM +0100, Michal Hocko wrote:
>On Thu 29-11-18 05:08:15, Wei Yang wrote:
>> Function show_mem() is used to print system memory status when user
>> requires or fail to allocate memory. Generally, this is a best effort
>> information and not willing to affect core mm subsystem.
>
>I would drop the part after "and"
>
>> The data protected by pgdat_resize_lock is mostly correct except there is:
>>
>>   * page struct defer init
>>   * memory hotplug
>
>This is more confusing than helpful. I would just drop it.
>
>The changelog doesn't explain what is done and why. The second one is
>much more important. I would say this
>
>"
>Function show_mem() is used to print system memory status when user
>requires or fail to allocate memory. Generally, this is a best effort
>information so any races with memory hotplug (or very theoretically an
>early initialization) should be tolerable and the worst that could
>happen is to print an imprecise node state.
>
>Drop the resize lock because this is the only place which might hold the
>lock from the interrupt context and so all other callers might use a
>simple spinlock. Even though this doesn't solve any real issue it makes
>the code easier to follow and a tiny bit more efficient.
>"

Ah, I have to admit this is much clearer and makes the reason easier for
the audience to understand. Thanks a lot.

>
>>
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> ---
>>  lib/show_mem.c | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/lib/show_mem.c b/lib/show_mem.c
>> index 0beaa1d899aa..1d996e5771ab 100644
>> --- a/lib/show_mem.c
>> +++ b/lib/show_mem.c
>> @@ -21,7 +21,6 @@ void show_mem(unsigned int filter, nodemask_t *nodemask)
>>  		unsigned long flags;
>
>btw. you want to drop flags.

Oops, what a shame. :-(

>>  		int zoneid;
>>  
>> -		pgdat_resize_lock(pgdat, &flags);
>>  		for (zoneid = 0; zoneid < MAX_NR_ZONES; zoneid++) {
>>  			struct zone *zone = &pgdat->node_zones[zoneid];
>>  			if (!populated_zone(zone))
>> @@ -33,7 +32,6 @@ void show_mem(unsigned int filter, nodemask_t *nodemask)
>>  			if (is_highmem_idx(zoneid))
>>  				highmem += zone->present_pages;
>>  		}
>> -		pgdat_resize_unlock(pgdat, &flags);
>>  	}
>>  
>>  	printk("%lu pages RAM\n", total);
>> -- 
>> 2.15.1
>>
>
>-- 
>Michal Hocko
>SUSE Labs

On Thu, Nov 29, 2018 at 09:17:03AM +0100, Michal Hocko wrote:
>On Thu 29-11-18 05:08:15, Wei Yang wrote:
>> Function show_mem() is used to print system memory status when user
>> requires or fail to allocate memory. Generally, this is a best effort
>> information and not willing to affect core mm subsystem.
>
>I would drop the part after "and"
>
>> The data protected by pgdat_resize_lock is mostly correct except there is:
>>
>>   * page struct defer init
>>   * memory hotplug
>
>This is more confusing than helpful. I would just drop it.
>
>The changelog doesn't explain what is done and why. The second one is
>much more important. I would say this
>
>"
>Function show_mem() is used to print system memory status when user
>requires or fail to allocate memory. Generally, this is a best effort
>information so any races with memory hotplug (or very theoretically an
>early initialization) should be tolerable and the worst that could
>happen is to print an imprecise node state.
>
>Drop the resize lock because this is the only place which might hold the

As I mentioned in https://patchwork.kernel.org/patch/10689759/, the lock
is also used in __remove_zone(). I don't understand your suggestion for
this place. And could __remove_zone() be called in IRQ context?

>lock from the interrupt context and so all other callers might use a
>simple spinlock. Even though this doesn't solve any real issue it makes
>the code easier to follow and a tiny bit more efficient.
>"
>

On Thu 29-11-18 15:04:49, Wei Yang wrote:
> On Thu, Nov 29, 2018 at 09:17:03AM +0100, Michal Hocko wrote:
> >On Thu 29-11-18 05:08:15, Wei Yang wrote:
> >> Function show_mem() is used to print system memory status when user
> >> requires or fail to allocate memory. Generally, this is a best effort
> >> information and not willing to affect core mm subsystem.
> >
> >I would drop the part after "and"
> >
> >> The data protected by pgdat_resize_lock is mostly correct except there is:
> >>
> >>   * page struct defer init
> >>   * memory hotplug
> >
> >This is more confusing than helpful. I would just drop it.
> >
> >The changelog doesn't explain what is done and why. The second one is
> >much more important. I would say this
> >
> >"
> >Function show_mem() is used to print system memory status when user
> >requires or fail to allocate memory. Generally, this is a best effort
> >information so any races with memory hotplug (or very theoretically an
> >early initialization) should be tolerable and the worst that could
> >happen is to print an imprecise node state.
> >
> >Drop the resize lock because this is the only place which might hold the
>
> As I mentioned in https://patchwork.kernel.org/patch/10689759/, the lock
> is also used in __remove_zone(). I don't understand your suggestion for
> this place. And could __remove_zone() be called in IRQ context?

It is only called from __remove_pages and that one calls cond_resched so
obviously not.

On Thu, Nov 29, 2018 at 04:49:22PM +0100, Michal Hocko wrote:
>On Thu 29-11-18 15:04:49, Wei Yang wrote:
>> On Thu, Nov 29, 2018 at 09:17:03AM +0100, Michal Hocko wrote:
>> >On Thu 29-11-18 05:08:15, Wei Yang wrote:
>> >> Function show_mem() is used to print system memory status when user
>> >> requires or fail to allocate memory. Generally, this is a best effort
>> >> information and not willing to affect core mm subsystem.
>> >
>> >I would drop the part after "and"
>> >
>> >> The data protected by pgdat_resize_lock is mostly correct except there is:
>> >>
>> >>   * page struct defer init
>> >>   * memory hotplug
>> >
>> >This is more confusing than helpful. I would just drop it.
>> >
>> >The changelog doesn't explain what is done and why. The second one is
>> >much more important. I would say this
>> >
>> >"
>> >Function show_mem() is used to print system memory status when user
>> >requires or fail to allocate memory. Generally, this is a best effort
>> >information so any races with memory hotplug (or very theoretically an
>> >early initialization) should be tolerable and the worst that could
>> >happen is to print an imprecise node state.
>> >
>> >Drop the resize lock because this is the only place which might hold the
>>
>> As I mentioned in https://patchwork.kernel.org/patch/10689759/, the lock
>> is also used in __remove_zone(). I don't understand your suggestion for
>> this place. And could __remove_zone() be called in IRQ context?
>
>It is only called from __remove_pages and that one calls cond_resched so
>obviously not.
>

Forgive my poor background knowledge. I went through the code, but did not
find where cond_resched is called.

  __remove_pages()
      release_mem_region_adjustable()
      clear_zone_contiguous()
      __remove_section()
          unregister_memory_section()
          __remove_zone()
          sparse_remove_one_section()
      set_zone_contiguous()

Would you mind giving me a hint?

>-- 
>Michal Hocko
>SUSE Labs

On Thu 29-11-18 16:05:24, Wei Yang wrote:
> On Thu, Nov 29, 2018 at 04:49:22PM +0100, Michal Hocko wrote:
[...]
> >It is only called from __remove_pages and that one calls cond_resched so
> >obviously not.
> >
>
> Forgive my poor background knowledge. I went through the code, but did not
> find where cond_resched is called.
>
>   __remove_pages()
>       release_mem_region_adjustable()
>       clear_zone_contiguous()
>       __remove_section()
>           unregister_memory_section()
>           __remove_zone()
>           sparse_remove_one_section()
>       set_zone_contiguous()
>
> Would you mind giving me a hint?

This is the code as of 4.20-rc2

	for (i = 0; i < sections_to_remove; i++) {
		unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION;

		cond_resched();
		ret = __remove_section(zone, __pfn_to_section(pfn), map_offset,
				altmap);
		map_offset = 0;
		if (ret)
			break;
	}

Maybe things have changed in the meantime but in general the code is
sleepable (e.g. release_mem_region_adjustable does GFP_KERNEL
allocation) and that rules out IRQ context.

On Thu, Nov 29, 2018 at 05:18:47PM +0100, Michal Hocko wrote:
>On Thu 29-11-18 16:05:24, Wei Yang wrote:
>> On Thu, Nov 29, 2018 at 04:49:22PM +0100, Michal Hocko wrote:
>[...]
>> >It is only called from __remove_pages and that one calls cond_resched so
>> >obviously not.
>> >
>>
>> Forgive my poor background knowledge. I went through the code, but did not
>> find where cond_resched is called.
>>
>>   __remove_pages()
>>       release_mem_region_adjustable()
>>       clear_zone_contiguous()
>>       __remove_section()
>>           unregister_memory_section()
>>           __remove_zone()
>>           sparse_remove_one_section()
>>       set_zone_contiguous()
>>
>> Would you mind giving me a hint?
>
>This is the code as of 4.20-rc2
>
>	for (i = 0; i < sections_to_remove; i++) {
>		unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION;
>
>		cond_resched();
>		ret = __remove_section(zone, __pfn_to_section(pfn), map_offset,
>				altmap);
>		map_offset = 0;
>		if (ret)
>			break;
>	}
>
>Maybe things have changed in the meantime but in general the code is
>sleepable (e.g. release_mem_region_adjustable does GFP_KERNEL
>allocation) and that rules out IRQ context.

Thanks, my code is not up to date.

>-- 
>Michal Hocko
>SUSE Labs

diff --git a/lib/show_mem.c b/lib/show_mem.c
index 0beaa1d899aa..1d996e5771ab 100644
--- a/lib/show_mem.c
+++ b/lib/show_mem.c
@@ -21,7 +21,6 @@ void show_mem(unsigned int filter, nodemask_t *nodemask)
 		unsigned long flags;
 		int zoneid;
 
-		pgdat_resize_lock(pgdat, &flags);
 		for (zoneid = 0; zoneid < MAX_NR_ZONES; zoneid++) {
 			struct zone *zone = &pgdat->node_zones[zoneid];
 			if (!populated_zone(zone))
@@ -33,7 +32,6 @@ void show_mem(unsigned int filter, nodemask_t *nodemask)
 			if (is_highmem_idx(zoneid))
 				highmem += zone->present_pages;
 		}
-		pgdat_resize_unlock(pgdat, &flags);
 	}
 
 	printk("%lu pages RAM\n", total);

Function show_mem() is used to print system memory status when user
requires or fail to allocate memory. Generally, this is a best effort
information and not willing to affect core mm subsystem.

The data protected by pgdat_resize_lock is mostly correct except there is:

  * page struct defer init
  * memory hotplug

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 lib/show_mem.c | 2 --
 1 file changed, 2 deletions(-)
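
For readers following the review, here is a minimal user-space sketch of what the per-node loop in show_mem() could look like once both the lock calls and the now unused `flags` variable are gone. It is only a model, not the kernel code: `struct mem_totals`, `count_node()`, the value of `MAX_NR_ZONES`, and `ZONE_HIGHMEM_IDX` are hypothetical names chosen for illustration, the `struct zone` and `struct pglist_data` definitions are cut down to the fields the diff touches, and the `total +=` line is an assumption, since the lines between the two hunks are not shown in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NR_ZONES     4   /* hypothetical value for this model */
#define ZONE_HIGHMEM_IDX 3   /* hypothetical index of the highmem zone */

/* Cut-down stand-ins for the kernel structures named in the diff. */
struct zone {
	unsigned long present_pages;
};

struct pglist_data {
	struct zone node_zones[MAX_NR_ZONES];
};

/* Hypothetical container for the counters that show_mem() prints. */
struct mem_totals {
	unsigned long total;
	unsigned long highmem;
};

/* Model of populated_zone(): a zone counts only if it has pages. */
static bool populated_zone(const struct zone *zone)
{
	return zone->present_pages != 0;
}

/* Model of is_highmem_idx(), assuming one fixed highmem zone index. */
static bool is_highmem_idx(int zoneid)
{
	return zoneid == ZONE_HIGHMEM_IDX;
}

/*
 * Walk the zones of one node without taking any lock. The values are
 * read best effort: a concurrent resize could make them slightly stale,
 * which the thread argues is acceptable for a diagnostic printout.
 */
static void count_node(const struct pglist_data *pgdat, struct mem_totals *t)
{
	int zoneid;

	assert(pgdat != NULL && t != NULL);

	for (zoneid = 0; zoneid < MAX_NR_ZONES; zoneid++) {
		const struct zone *zone = &pgdat->node_zones[zoneid];

		if (!populated_zone(zone))
			continue;

		/* Assumed: this accumulation sits in the elided context lines. */
		t->total += zone->present_pages;

		if (is_highmem_idx(zoneid))
			t->highmem += zone->present_pages;
	}
}
```

The point of the sketch is that nothing in the loop needs `flags` anymore, which is why the review asks for that declaration to be dropped together with the two lock calls.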