Message ID | 20200302134941.315212-8-david@redhat.com (mailing list archive)
State      | New, archived
Series     | virtio-mem: paravirtualized memory
On Mon 02-03-20 14:49:37, David Hildenbrand wrote:
[...]
> +static void virtio_mem_notify_going_offline(struct virtio_mem *vm,
> +                                            unsigned long mb_id)
> +{
> +        const unsigned long nr_pages = PFN_DOWN(vm->subblock_size);
> +        unsigned long pfn;
> +        int sb_id, i;
> +
> +        for (sb_id = 0; sb_id < vm->nb_sb_per_mb; sb_id++) {
> +                if (virtio_mem_mb_test_sb_plugged(vm, mb_id, sb_id, 1))
> +                        continue;
> +                /*
> +                 * Drop our reference to the pages so the memory can get
> +                 * offlined and add the unplugged pages to the managed
> +                 * page counters (so offlining code can correctly subtract
> +                 * them again).
> +                 */
> +                pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
> +                               sb_id * vm->subblock_size);
> +                adjust_managed_page_count(pfn_to_page(pfn), nr_pages);
> +                for (i = 0; i < nr_pages; i++)
> +                        page_ref_dec(pfn_to_page(pfn + i));

Is there ever a situation where this might be a different transition
than 1->0?
On 10.03.20 12:43, Michal Hocko wrote:
> On Mon 02-03-20 14:49:37, David Hildenbrand wrote:
> [...]
>> +static void virtio_mem_notify_going_offline(struct virtio_mem *vm,
>> +                                            unsigned long mb_id)
>> +{
>> +        const unsigned long nr_pages = PFN_DOWN(vm->subblock_size);
>> +        unsigned long pfn;
>> +        int sb_id, i;
>> +
>> +        for (sb_id = 0; sb_id < vm->nb_sb_per_mb; sb_id++) {
>> +                if (virtio_mem_mb_test_sb_plugged(vm, mb_id, sb_id, 1))
>> +                        continue;
>> +                /*
>> +                 * Drop our reference to the pages so the memory can get
>> +                 * offlined and add the unplugged pages to the managed
>> +                 * page counters (so offlining code can correctly subtract
>> +                 * them again).
>> +                 */
>> +                pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
>> +                               sb_id * vm->subblock_size);
>> +                adjust_managed_page_count(pfn_to_page(pfn), nr_pages);
>> +                for (i = 0; i < nr_pages; i++)
>> +                        page_ref_dec(pfn_to_page(pfn + i));
>
> Is there ever a situation where this might be a different transition
> than 1->0?

Only if some other code would be taking a reference. At least, not from
virtio-mem's perspective.
On Tue 10-03-20 12:46:05, David Hildenbrand wrote:
> On 10.03.20 12:43, Michal Hocko wrote:
> > On Mon 02-03-20 14:49:37, David Hildenbrand wrote:
> > [...]
> >> +static void virtio_mem_notify_going_offline(struct virtio_mem *vm,
> >> +                                            unsigned long mb_id)
> >> +{
> >> +        const unsigned long nr_pages = PFN_DOWN(vm->subblock_size);
> >> +        unsigned long pfn;
> >> +        int sb_id, i;
> >> +
> >> +        for (sb_id = 0; sb_id < vm->nb_sb_per_mb; sb_id++) {
> >> +                if (virtio_mem_mb_test_sb_plugged(vm, mb_id, sb_id, 1))
> >> +                        continue;
> >> +                /*
> >> +                 * Drop our reference to the pages so the memory can get
> >> +                 * offlined and add the unplugged pages to the managed
> >> +                 * page counters (so offlining code can correctly subtract
> >> +                 * them again).
> >> +                 */
> >> +                pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
> >> +                               sb_id * vm->subblock_size);
> >> +                adjust_managed_page_count(pfn_to_page(pfn), nr_pages);
> >> +                for (i = 0; i < nr_pages; i++)
> >> +                        page_ref_dec(pfn_to_page(pfn + i));
> >
> > Is there ever a situation where this might be a different transition
> > than 1->0?
>
> Only if some other code would be taking a reference. At least, not from
> virtio-mem's perspective.

OK, so that is essentially an error condition. I think it shouldn't go
silent and you want something like:

	if (WARN_ON(!page_ref_sub_and_test(page)))
		dump_page(pfn_to_page(pfn + i), "YOUR REASON");
On 10.03.20 12:59, Michal Hocko wrote:
> On Tue 10-03-20 12:46:05, David Hildenbrand wrote:
>> On 10.03.20 12:43, Michal Hocko wrote:
>>> On Mon 02-03-20 14:49:37, David Hildenbrand wrote:
>>> [...]
>>>> +static void virtio_mem_notify_going_offline(struct virtio_mem *vm,
>>>> +                                            unsigned long mb_id)
>>>> +{
>>>> +        const unsigned long nr_pages = PFN_DOWN(vm->subblock_size);
>>>> +        unsigned long pfn;
>>>> +        int sb_id, i;
>>>> +
>>>> +        for (sb_id = 0; sb_id < vm->nb_sb_per_mb; sb_id++) {
>>>> +                if (virtio_mem_mb_test_sb_plugged(vm, mb_id, sb_id, 1))
>>>> +                        continue;
>>>> +                /*
>>>> +                 * Drop our reference to the pages so the memory can get
>>>> +                 * offlined and add the unplugged pages to the managed
>>>> +                 * page counters (so offlining code can correctly subtract
>>>> +                 * them again).
>>>> +                 */
>>>> +                pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
>>>> +                               sb_id * vm->subblock_size);
>>>> +                adjust_managed_page_count(pfn_to_page(pfn), nr_pages);
>>>> +                for (i = 0; i < nr_pages; i++)
>>>> +                        page_ref_dec(pfn_to_page(pfn + i));
>>>
>>> Is there ever a situation where this might be a different transition
>>> than 1->0?
>>
>> Only if some other code would be taking a reference. At least, not from
>> virtio-mem's perspective.
>
> OK, so that is essentially an error condition. I think it shouldn't go
> silent and you want something like:
>
> 	if (WARN_ON(!page_ref_sub_and_test(page)))
> 		dump_page(pfn_to_page(pfn + i), "YOUR REASON");

Makes sense - I'll most probably convert this to a WARN_ON_ONCE. Thanks!
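For reference, a minimal sketch of what the unplug loop could look like
with the check discussed above. This is not the driver's final code: the
helper name and the dump_page() reason string are made up, and the
dec-and-test primitive spelled out here is page_ref_dec_and_test().

#include <linux/mm.h>
#include <linux/mmdebug.h>
#include <linux/page_ref.h>

/*
 * Hypothetical helper, for illustration only: drop the reference the
 * driver holds on each fake-offline page and complain loudly if the
 * refcount was not 1, i.e. if somebody else still held a reference.
 */
static void drop_fake_offline_refs(unsigned long pfn, unsigned long nr_pages)
{
	unsigned long i;

	for (i = 0; i < nr_pages; i++) {
		struct page *page = pfn_to_page(pfn + i);

		/* page_ref_dec_and_test() returns true on a 1->0 transition */
		if (WARN_ON_ONCE(!page_ref_dec_and_test(page)))
			dump_page(page, "fake-offline page still referenced");
	}
}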
diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 5b26d57be551..2916f8b970fa 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -570,6 +570,53 @@ static void virtio_mem_notify_online(struct virtio_mem *vm, unsigned long mb_id,
 		virtio_mem_retry(vm);
 }
 
+static void virtio_mem_notify_going_offline(struct virtio_mem *vm,
+					    unsigned long mb_id)
+{
+	const unsigned long nr_pages = PFN_DOWN(vm->subblock_size);
+	unsigned long pfn;
+	int sb_id, i;
+
+	for (sb_id = 0; sb_id < vm->nb_sb_per_mb; sb_id++) {
+		if (virtio_mem_mb_test_sb_plugged(vm, mb_id, sb_id, 1))
+			continue;
+		/*
+		 * Drop our reference to the pages so the memory can get
+		 * offlined and add the unplugged pages to the managed
+		 * page counters (so offlining code can correctly subtract
+		 * them again).
+		 */
+		pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
+			       sb_id * vm->subblock_size);
+		adjust_managed_page_count(pfn_to_page(pfn), nr_pages);
+		for (i = 0; i < nr_pages; i++)
+			page_ref_dec(pfn_to_page(pfn + i));
+	}
+}
+
+static void virtio_mem_notify_cancel_offline(struct virtio_mem *vm,
+					     unsigned long mb_id)
+{
+	const unsigned long nr_pages = PFN_DOWN(vm->subblock_size);
+	unsigned long pfn;
+	int sb_id, i;
+
+	for (sb_id = 0; sb_id < vm->nb_sb_per_mb; sb_id++) {
+		if (virtio_mem_mb_test_sb_plugged(vm, mb_id, sb_id, 1))
+			continue;
+		/*
+		 * Get the reference we dropped when going offline and
+		 * subtract the unplugged pages from the managed page
+		 * counters.
+		 */
+		pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
+			       sb_id * vm->subblock_size);
+		adjust_managed_page_count(pfn_to_page(pfn), -nr_pages);
+		for (i = 0; i < nr_pages; i++)
+			page_ref_inc(pfn_to_page(pfn + i));
+	}
+}
+
 /*
  * This callback will either be called synchronously from add_memory() or
  * asynchronously (e.g., triggered via user space). We have to be careful
@@ -616,6 +663,7 @@ static int virtio_mem_memory_notifier_cb(struct notifier_block *nb,
 			break;
 		}
 		vm->hotplug_active = true;
+		virtio_mem_notify_going_offline(vm, mb_id);
 		break;
 	case MEM_GOING_ONLINE:
 		mutex_lock(&vm->hotplug_mutex);
@@ -640,6 +688,12 @@ static int virtio_mem_memory_notifier_cb(struct notifier_block *nb,
 		mutex_unlock(&vm->hotplug_mutex);
 		break;
 	case MEM_CANCEL_OFFLINE:
+		if (!vm->hotplug_active)
+			break;
+		virtio_mem_notify_cancel_offline(vm, mb_id);
+		vm->hotplug_active = false;
+		mutex_unlock(&vm->hotplug_mutex);
+		break;
 	case MEM_CANCEL_ONLINE:
 		if (!vm->hotplug_active)
 			break;
@@ -666,8 +720,11 @@ static void virtio_mem_set_fake_offline(unsigned long pfn,
 		struct page *page = pfn_to_page(pfn);
 
 		__SetPageOffline(page);
-		if (!onlined)
+		if (!onlined) {
 			SetPageDirty(page);
+			/* FIXME: remove after cleanups */
+			ClearPageReserved(page);
+		}
 	}
 }
 
@@ -1717,6 +1774,11 @@ static void virtio_mem_remove(struct virtio_device *vdev)
 		rc = virtio_mem_mb_remove(vm, mb_id);
 		BUG_ON(rc);
 	}
 
+	/*
+	 * After we unregistered our callbacks, user space can no longer
+	 * offline partially plugged online memory blocks. No need to worry
+	 * about them.
+	 */
 	/* unregister callbacks */
 	unregister_virtio_mem_device(vm);
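The MEM_GOING_OFFLINE and MEM_CANCEL_OFFLINE cases handled above arrive
via the memory hotplug notifier chain. As a rough, stripped-down sketch
of that mechanism (all names and the printed messages are made up, and
virtio-mem's real callback does considerably more work):

#include <linux/init.h>
#include <linux/memory.h>
#include <linux/module.h>
#include <linux/notifier.h>
#include <linux/printk.h>

/* Illustrative only: react to memory block offlining events. */
static int demo_memory_notifier_cb(struct notifier_block *nb,
				   unsigned long action, void *arg)
{
	const struct memory_notify *mhp = arg;

	switch (action) {
	case MEM_GOING_OFFLINE:
		/* prepare the range [start_pfn, start_pfn + nr_pages) */
		pr_info("going offline: pfn 0x%lx + 0x%lx\n",
			mhp->start_pfn, mhp->nr_pages);
		break;
	case MEM_CANCEL_OFFLINE:
		/* offlining was aborted: undo the preparation */
		pr_info("offlining canceled: pfn 0x%lx + 0x%lx\n",
			mhp->start_pfn, mhp->nr_pages);
		break;
	default:
		break;
	}
	return NOTIFY_OK;
}

static struct notifier_block demo_memory_nb = {
	.notifier_call = demo_memory_notifier_cb,
};

static int __init demo_init(void)
{
	/* A driver would typically do this from its probe()/init path. */
	return register_memory_notifier(&demo_memory_nb);
}
module_init(demo_init);

static void __exit demo_exit(void)
{
	unregister_memory_notifier(&demo_memory_nb);
}
module_exit(demo_exit);

MODULE_LICENSE("GPL");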
Dropping the reference count of PageOffline() pages allows offlining
code to skip them. However, we also have to convert PG_reserved to
another flag - let's use PG_dirty - so has_unmovable_pages() will
properly handle them. PG_reserved pages get detected as unmovable right
away. We need the flag to see if we are onlining pages the first time,
or if we allocated them via alloc_contig_range().

Properly take care of the offlining code also modifying the stats, and
add special handling for the case where the driver gets unloaded.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/virtio/virtio_mem.c | 64 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 63 insertions(+), 1 deletion(-)
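As a side note on the PG_dirty usage described above, a hypothetical
predicate (the name is made up; this only illustrates the commit
message and is not code from the driver) could express the distinction
like this:

#include <linux/mm.h>
#include <linux/page-flags.h>

/*
 * Per the commit message above, PG_dirty on a PageOffline() page is
 * used to tell whether the pages would be onlined for the first time,
 * as opposed to having been allocated via alloc_contig_range().
 */
static bool demo_fake_offline_never_onlined(struct page *page)
{
	return PageOffline(page) && PageDirty(page);
}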