Message ID | 20210416112411.9826-4-osalvador@suse.de (mailing list archive) |
---|---|
State | New, archived |
Series | [v9,1/8] drivers/base/memory: Introduce memory_block_{online,offline} |
On Fri 16-04-21 13:24:06, Oscar Salvador wrote:
> From: David Hildenbrand <david@redhat.com>
>
> Let's have a single place (inspired by adjust_managed_page_count()) where
> we adjust present pages.
> In contrast to adjust_managed_page_count(), only memory onlining/offlining
> is allowed to modify the number of present pages.
>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>

Not sure self review counts ;)

Acked-by: Michal Hocko <mhocko@suse.com>

Btw. I strongly suspect the resize lock is quite pointless here.
Something for a follow up patch.

> ---
>  mm/memory_hotplug.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 25e59d5dc13c..d05056b3c173 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -829,6 +829,16 @@ struct zone * zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
>  	return default_zone_for_pfn(nid, start_pfn, nr_pages);
>  }
>
> +static void adjust_present_page_count(struct zone *zone, long nr_pages)
> +{
> +	unsigned long flags;
> +
> +	zone->present_pages += nr_pages;
> +	pgdat_resize_lock(zone->zone_pgdat, &flags);
> +	zone->zone_pgdat->node_present_pages += nr_pages;
> +	pgdat_resize_unlock(zone->zone_pgdat, &flags);
> +}
> +
>  int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
>  		       int online_type, int nid)
>  {
> @@ -882,11 +892,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
>  	}
>
>  	online_pages_range(pfn, nr_pages);
> -	zone->present_pages += nr_pages;
> -
> -	pgdat_resize_lock(zone->zone_pgdat, &flags);
> -	zone->zone_pgdat->node_present_pages += nr_pages;
> -	pgdat_resize_unlock(zone->zone_pgdat, &flags);
> +	adjust_present_page_count(zone, nr_pages);
>
>  	node_states_set_node(nid, &arg);
>  	if (need_zonelists_rebuild)
> @@ -1701,11 +1707,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>
>  	/* removal success */
>  	adjust_managed_page_count(pfn_to_page(start_pfn), -nr_pages);
> -	zone->present_pages -= nr_pages;
> -
> -	pgdat_resize_lock(zone->zone_pgdat, &flags);
> -	zone->zone_pgdat->node_present_pages -= nr_pages;
> -	pgdat_resize_unlock(zone->zone_pgdat, &flags);
> +	adjust_present_page_count(zone, -nr_pages);
>
>  	init_per_zone_wmark_min();
>
> --
> 2.16.3
On Tue, Apr 20, 2021 at 11:45:55AM +0200, Michal Hocko wrote:
> On Fri 16-04-21 13:24:06, Oscar Salvador wrote:
> > From: David Hildenbrand <david@redhat.com>
> >
> > Let's have a single place (inspired by adjust_managed_page_count()) where
> > we adjust present pages.
> > In contrast to adjust_managed_page_count(), only memory onlining/offlining
> > is allowed to modify the number of present pages.
> >
> > Signed-off-by: David Hildenbrand <david@redhat.com>
> > Signed-off-by: Oscar Salvador <osalvador@suse.de>
> > Reviewed-by: Oscar Salvador <osalvador@suse.de>
>
> Not sure self review counts ;)

Uhm, the original author is David, I just added my Signed-off-by as the deliverer.
I thought that in that case it was ok to stick my Reviewed-by.
Or maybe my Signed-off-by carries that implicitly.

> Acked-by: Michal Hocko <mhocko@suse.com>
>
> Btw. I strongly suspect the resize lock is quite pointless here.
> Something for a follow up patch.

What makes you think that?
I have been thinking about this; let us ignore this patch for a moment.

If I poked the code correctly, node_size_lock is taken in:

remove_pfn_range_from_zone()
move_pfn_range_to_zone()

both of them handling {zone,node}->spanned_pages.

Then we take it in {offline,online}_pages() for {zone,node}->present_pages.

The other places where we take it are __init functions, so not of interest.

Given that {offline,online}_pages() is serialized by the memory_hotplug lock,
I would say that {node,zone}->{spanned,present}_pages is, at any time, stable?
So, no need for the lock even without considering this patch?

Now, getting back to this patch.
adjust_present_page_count() will be called from memory_block_online(), which
is not holding the memory_hotplug lock yet.
But, we only fiddle with present pages out of {online,offline}_pages() if
we have vmemmap pages, and since that operates on the same memory block,
its lock should serialize that.

I think I went down a rabbit hole; I am slightly confused now.
On 21.04.21 10:00, Oscar Salvador wrote:
> On Tue, Apr 20, 2021 at 11:45:55AM +0200, Michal Hocko wrote:
>> On Fri 16-04-21 13:24:06, Oscar Salvador wrote:
>>> From: David Hildenbrand <david@redhat.com>
>>>
>>> Let's have a single place (inspired by adjust_managed_page_count()) where
>>> we adjust present pages.
>>> In contrast to adjust_managed_page_count(), only memory onlining/offlining
>>> is allowed to modify the number of present pages.
>>>
>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>> Signed-off-by: Oscar Salvador <osalvador@suse.de>
>>> Reviewed-by: Oscar Salvador <osalvador@suse.de>
>>
>> Not sure self review counts ;)
>
> Uhm, the original author is David, I just added my signed-off-by as a deliverer.
> I thought that in that case was ok to stick my Reviewed-by.
> Or maybe my signed-off-by carries that implicitly.
>
>> Acked-by: Michal Hocko <mhocko@suse.com>
>>
>> Btw. I strongly suspect the resize lock is quite pointless here.
>> Something for a follow up patch.
>
> What makes you think that?
> I have been thinking about this, let us ignore this patch for a moment.
>
> If I poked the code correctly, node_size_lock is taken in:
>
> remove_pfn_range_from_zone()
> move_pfn_range_to_zone()
>
> both of them handling {zone,node}->spanned_pages
>
> Then we take it in {offline,online}_pages() for {zone,node}->present_pages.
>
> The other places where we take it are __init functions, so not of interest.
>
> Given that {offline,online}_pages() is serialized by the memory_hotplug lock,
> I would say that {node,zone}->{spanned,present}_pages is, at any time, stable?
> So, no need for the lock even without considering this patch?
>
> Now, getting back to this patch.
> adjust_present_page_count() will be called from memory_block_online(), which
> is not holding the memory_hotplug lock yet.
> But, we only fiddle with present pages out of {online,offline}_pages() if
> we have vmemmap pages, and since that operates on the same memory block,
> its lock should serialize that.
>
> I think I went down a rabbit hole, I am slightly confused now.

We always hold the device hotplug lock when onlining/offlining memory.

I agree that the lock might be unnecessary (had the same thoughts a while
ago); we can look into that in the future.
On Wed 21-04-21 10:00:36, Oscar Salvador wrote:
> On Tue, Apr 20, 2021 at 11:45:55AM +0200, Michal Hocko wrote:
> > On Fri 16-04-21 13:24:06, Oscar Salvador wrote:
> > > From: David Hildenbrand <david@redhat.com>
> > >
> > > Let's have a single place (inspired by adjust_managed_page_count()) where
> > > we adjust present pages.
> > > In contrast to adjust_managed_page_count(), only memory onlining/offlining
> > > is allowed to modify the number of present pages.
> > >
> > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > Signed-off-by: Oscar Salvador <osalvador@suse.de>
> > > Reviewed-by: Oscar Salvador <osalvador@suse.de>
> >
> > Not sure self review counts ;)
>
> Uhm, the original author is David, I just added my signed-off-by as a deliverer.
> I thought that in that case was ok to stick my Reviewed-by.
> Or maybe my signed-off-by carries that implicitly.

Yeah, I do expect that one should review one's own changes, but this is not
really anything to lose sleep over.

> > Acked-by: Michal Hocko <mhocko@suse.com>
> >
> > Btw. I strongly suspect the resize lock is quite pointless here.
> > Something for a follow up patch.
>
> What makes you think that?

 * Write access to present_pages at runtime should be protected by
 * mem_hotplug_begin/end(). Any reader who can't tolerant drift of
 * present_pages should get_online_mems() to get a stable value.

> I have been thinking about this, let us ignore this patch for a moment.
>
> If I poked the code correctly, node_size_lock is taken in:
>
> remove_pfn_range_from_zone()
> move_pfn_range_to_zone()
>
> both of them handling {zone,node}->spanned_pages
>
> Then we take it in {offline,online}_pages() for {zone,node}->present_pages.
>
> The other places where we take it are __init functions, so not of interest.
>
> Given that {offline,online}_pages() is serialized by the memory_hotplug lock,
> I would say that {node,zone}->{spanned,present}_pages is, at any time, stable?
> So, no need for the lock even without considering this patch?

Yes. The resize lock is really only relevant to the parallel struct page
initialization during early boot. The hotplug usage seems just a leftover
from the past, or maybe it has never been really relevant in that context.

> Now, getting back to this patch.
> adjust_present_page_count() will be called from memory_block_online(), which
> is not holding the memory_hotplug lock yet.
> But, we only fiddle with present pages out of {online,offline}_pages() if
> we have vmemmap pages, and since that operates on the same memory block,
> its lock should serialize that.

Memory hotplug is always synchronized on the device level.
On Wed, Apr 21, 2021 at 10:31:30AM +0200, Michal Hocko wrote:
> > Given that {offline,online}_pages() is serialized by the memory_hotplug lock,
> > I would say that {node,zone}->{spanned,present}_pages is, at any time, stable?
> > So, no need for the lock even without considering this patch?
>
> Yes. The resize lock is really only relevant to the parallel struct page
> initialization during early boot. The hotplug usage seems just a left
> over from the past or maybe it has never been really relevant in that
> context.

Ok, I will prepare a follow-up patch to remove the lock in such situations
when this work goes in.
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 25e59d5dc13c..d05056b3c173 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -829,6 +829,16 @@ struct zone * zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
 	return default_zone_for_pfn(nid, start_pfn, nr_pages);
 }
 
+static void adjust_present_page_count(struct zone *zone, long nr_pages)
+{
+	unsigned long flags;
+
+	zone->present_pages += nr_pages;
+	pgdat_resize_lock(zone->zone_pgdat, &flags);
+	zone->zone_pgdat->node_present_pages += nr_pages;
+	pgdat_resize_unlock(zone->zone_pgdat, &flags);
+}
+
 int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 		       int online_type, int nid)
 {
@@ -882,11 +892,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 	}
 
 	online_pages_range(pfn, nr_pages);
-	zone->present_pages += nr_pages;
-
-	pgdat_resize_lock(zone->zone_pgdat, &flags);
-	zone->zone_pgdat->node_present_pages += nr_pages;
-	pgdat_resize_unlock(zone->zone_pgdat, &flags);
+	adjust_present_page_count(zone, nr_pages);
 
 	node_states_set_node(nid, &arg);
 	if (need_zonelists_rebuild)
@@ -1701,11 +1707,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 
 	/* removal success */
 	adjust_managed_page_count(pfn_to_page(start_pfn), -nr_pages);
-	zone->present_pages -= nr_pages;
-
-	pgdat_resize_lock(zone->zone_pgdat, &flags);
-	zone->zone_pgdat->node_present_pages -= nr_pages;
-	pgdat_resize_unlock(zone->zone_pgdat, &flags);
+	adjust_present_page_count(zone, -nr_pages);
 
 	init_per_zone_wmark_min();