Message ID | 20250201095726.3768796-7-william.roche@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Series | Poisoned memory recovery on reboot |
> /*
> @@ -595,6 +628,7 @@ static const TypeInfo host_memory_backend_info = {
>      .instance_size = sizeof(HostMemoryBackend),
>      .instance_init = host_memory_backend_init,
>      .instance_post_init = host_memory_backend_post_init,
> +    .instance_finalize = host_memory_backend_finalize,
>      .interfaces = (InterfaceInfo[]) {
>          { TYPE_USER_CREATABLE },
>          { }
> diff --git a/include/system/hostmem.h b/include/system/hostmem.h
> index 5c21ca55c0..170849e8a4 100644
> --- a/include/system/hostmem.h
> +++ b/include/system/hostmem.h
> @@ -83,6 +83,7 @@ struct HostMemoryBackend {
>      HostMemPolicy policy;
>
>      MemoryRegion mr;
> +    RAMBlockNotifier ram_notifier;
>  };

Thinking about Peter's comment, it would be a nice improvement to have a
single global memory-backend notifier that looks up the fitting memory
backend, instead of having one per memory backend.

A per-ramblock notifier might also be possible, but that's a bit
harder/awkward to configure: e.g., the resize callback is passed to
memory_region_init_resizeable_ram() right now.
On Tue, Feb 04, 2025 at 06:50:17PM +0100, David Hildenbrand wrote:
> > /*
> > @@ -595,6 +628,7 @@ static const TypeInfo host_memory_backend_info = {
> >      .instance_size = sizeof(HostMemoryBackend),
> >      .instance_init = host_memory_backend_init,
> >      .instance_post_init = host_memory_backend_post_init,
> > +    .instance_finalize = host_memory_backend_finalize,
> >      .interfaces = (InterfaceInfo[]) {
> >          { TYPE_USER_CREATABLE },
> >          { }
> > diff --git a/include/system/hostmem.h b/include/system/hostmem.h
> > index 5c21ca55c0..170849e8a4 100644
> > --- a/include/system/hostmem.h
> > +++ b/include/system/hostmem.h
> > @@ -83,6 +83,7 @@ struct HostMemoryBackend {
> >      HostMemPolicy policy;
> >      MemoryRegion mr;
> > +    RAMBlockNotifier ram_notifier;
> > };
>
> Thinking about Peter's comment, it would be a nice improvement to have a
> single global memory-backend notifier that looks up the fitting memory
> backend, instead of having one per memory backend.

Yes, this could also avoid O(N**2).

> A per-ramblock notifier might also be possible, but that's a bit
> harder/awkward to configure: e.g., the resize callback is passed to
> memory_region_init_resizeable_ram() right now.

Yes, that could mean some code churn. We could avoid passing that in when
creating the ramblock; instead we could allow ramblocks to opt in to hooks
after the ramblock is created. Maybe we could move resize() out like that
too. Either way looks good.
On 04.02.25 18:58, Peter Xu wrote:
> On Tue, Feb 04, 2025 at 06:50:17PM +0100, David Hildenbrand wrote:
>>> /*
>>> @@ -595,6 +628,7 @@ static const TypeInfo host_memory_backend_info = {
>>>      .instance_size = sizeof(HostMemoryBackend),
>>>      .instance_init = host_memory_backend_init,
>>>      .instance_post_init = host_memory_backend_post_init,
>>> +    .instance_finalize = host_memory_backend_finalize,
>>>      .interfaces = (InterfaceInfo[]) {
>>>          { TYPE_USER_CREATABLE },
>>>          { }
>>> diff --git a/include/system/hostmem.h b/include/system/hostmem.h
>>> index 5c21ca55c0..170849e8a4 100644
>>> --- a/include/system/hostmem.h
>>> +++ b/include/system/hostmem.h
>>> @@ -83,6 +83,7 @@ struct HostMemoryBackend {
>>>      HostMemPolicy policy;
>>>      MemoryRegion mr;
>>> +    RAMBlockNotifier ram_notifier;
>>> };
>>
>> Thinking about Peter's comment, it would be a nice improvement to have a
>> single global memory-backend notifier that looks up the fitting memory
>> backend, instead of having one per memory backend.
>
> Yes, this could also avoid O(N**2).

Ah, and now I remember where these 3 patches originate from: virtio-mem
handling.

For virtio-mem I also want to register a remap handler, for example, to
perform the custom preallocation handling.

So there will be at least two instances getting notified (memory backend,
virtio-mem), and the per-ramblock one would only have allowed triggering
one (at least with a simple callback, as we have today for ->resize).
On Tue, Feb 04, 2025 at 07:55:52PM +0100, David Hildenbrand wrote:
> Ah, and now I remember where these 3 patches originate from: virtio-mem
> handling.
>
> For virtio-mem I also want to register a remap handler, for example, to
> perform the custom preallocation handling.
>
> So there will be at least two instances getting notified (memory backend,
> virtio-mem), and the per-ramblock one would only have allowed triggering
> one (at least with a simple callback, as we have today for ->resize).

I see; we can put something into the commit log with such decisions, then
we'll remember.

That said, this still sounds like a per-ramblock thing, so instead of one
hook function we can also have per-ramblock notifier lists.

But I agree the perf issue isn't an immediate concern, so I'll leave that
to you and William. If so, I think we should discuss that in the commit
log too, so we decide not to care about perf until necessary (or we just
make it per-ramblock..).

Thanks,
On 2/4/25 21:16, Peter Xu wrote:
> On Tue, Feb 04, 2025 at 07:55:52PM +0100, David Hildenbrand wrote:
>> Ah, and now I remember where these 3 patches originate from: virtio-mem
>> handling.
>>
>> For virtio-mem I also want to register a remap handler, for example, to
>> perform the custom preallocation handling.
>>
>> So there will be at least two instances getting notified (memory backend,
>> virtio-mem), and the per-ramblock one would only have allowed triggering
>> one (at least with a simple callback, as we have today for ->resize).
>
> I see; we can put something into the commit log with such decisions, then
> we'll remember.
>
> That said, this still sounds like a per-ramblock thing, so instead of one
> hook function we can also have per-ramblock notifier lists.
>
> But I agree the perf issue isn't an immediate concern, so I'll leave that
> to you and William. If so, I think we should discuss that in the commit
> log too, so we decide not to care about perf until necessary (or we just
> make it per-ramblock..).
>
> Thanks,

I agree that we could split this fix into two parts: one fixing the
hugetlbfs problem (ignoring the preallocation setting for the moment),
and the notification mechanism as a second set of patches.

The first part would be the first three patches (including a corrected
version of patch 2), and the second part could be an adaptation of the
next three patches, with their notification implementation dealing with
merging, dump *and* preallocation setup.

But I'd be happy to help with the implementation of this second aspect
too:

In order to apply settings like preallocation to a RAMBlock, we need to
find its associated HostMemoryBackend (where we have the 'prealloc'
flag).

To do so, we record a RAMBlockNotifier in the HostMemoryBackend struct,
so that the notification triggered by the remap action:
    ram_block_notify_remap(block->host, offset, page_size);
will go through the list of notifiers (ram_list.ramblock_notifiers) and
run the non-NULL ram_block_remapped entries on all of them.

For each of them, we know the associated HostMemoryBackend (as it
contains the RAMBlockNotifier), and we verify which one corresponds to
the given host address, so that we can apply the appropriate settings.

IIUC, my proposal (with David's code) currently has a
per-HostMemoryBackend notification.

Now if I want to implement a per-RAMBlock notification, would you
suggest considering that the 'mr' attribute of a RAMBlock always points
to a HostMemoryBackend.mr, so that we could get the HostMemoryBackend
associated with the block from a
    container_of(block->mr, HostMemoryBackend, mr) ?

If this is valid, then we could apply the appropriate settings from
there, but can't we have RAMBlocks not pointing to a
HostMemoryBackend.mr?

I'm probably confused about what you are referring to.
So how would you suggest that I make the notification per-ramblock?
Thanks in advance for your feedback.

I'll send a corrected version of the first 3 patches, unless you want to
go with the current version of patches 4/6, 5/6 and 6/6, so that we can
deal with preallocation. Please let me know.

Thanks,
William.
On Wed, Feb 05, 2025 at 05:27:50PM +0100, William Roche wrote:
> On 2/4/25 21:16, Peter Xu wrote:
> > On Tue, Feb 04, 2025 at 07:55:52PM +0100, David Hildenbrand wrote:
> > > Ah, and now I remember where these 3 patches originate from: virtio-mem
> > > handling.
> > >
> > > For virtio-mem I also want to register a remap handler, for example,
> > > to perform the custom preallocation handling.
> > >
> > > So there will be at least two instances getting notified (memory
> > > backend, virtio-mem), and the per-ramblock one would only have allowed
> > > triggering one (at least with a simple callback, as we have today for
> > > ->resize).
> >
> > I see; we can put something into the commit log with such decisions,
> > then we'll remember.
> >
> > That said, this still sounds like a per-ramblock thing, so instead of
> > one hook function we can also have per-ramblock notifier lists.
> >
> > But I agree the perf issue isn't an immediate concern, so I'll leave
> > that to you and William. If so, I think we should discuss that in the
> > commit log too, so we decide not to care about perf until necessary (or
> > we just make it per-ramblock..).
> >
> > Thanks,
>
> I agree that we could split this fix into two parts: one fixing the
> hugetlbfs problem (ignoring the preallocation setting for the moment),
> and the notification mechanism as a second set of patches.
>
> The first part would be the first three patches (including a corrected
> version of patch 2), and the second part could be an adaptation of the
> next three patches, with their notification implementation dealing with
> merging, dump *and* preallocation setup.
>
> But I'd be happy to help with the implementation of this second aspect
> too:
>
> In order to apply settings like preallocation to a RAMBlock, we need to
> find its associated HostMemoryBackend (where we have the 'prealloc'
> flag).
>
> To do so, we record a RAMBlockNotifier in the HostMemoryBackend struct,
> so that the notification triggered by the remap action:
>     ram_block_notify_remap(block->host, offset, page_size);
> will go through the list of notifiers (ram_list.ramblock_notifiers) and
> run the non-NULL ram_block_remapped entries on all of them.
>
> For each of them, we know the associated HostMemoryBackend (as it
> contains the RAMBlockNotifier), and we verify which one corresponds to
> the given host address, so that we can apply the appropriate settings.
>
> IIUC, my proposal (with David's code) currently has a
> per-HostMemoryBackend notification.
>
> Now if I want to implement a per-RAMBlock notification, would you
> suggest considering that the 'mr' attribute of a RAMBlock always points
> to a HostMemoryBackend.mr, so that we could get the HostMemoryBackend
> associated with the block from a
>     container_of(block->mr, HostMemoryBackend, mr) ?
>
> If this is valid, then we could apply the appropriate settings from
> there, but can't we have RAMBlocks not pointing to a
> HostMemoryBackend.mr?

Yes, QEMU definitely has ramblocks that are not backed by memory
backends. However, each memory backend must have its own ramblock.

IIUC, what we need to do is let host_memory_backend_memory_complete()
register a per-ramblock notifier on top of its ramblock, which can be
referenced via backend->mr.ramblock.

> I'm probably confused about what you are referring to.
> So how would you suggest that I make the notification per-ramblock?
> Thanks in advance for your feedback.
>
> I'll send a corrected version of the first 3 patches, unless you want to
> go with the current version of patches 4/6, 5/6 and 6/6, so that we can
> deal with preallocation.

I don't feel strongly, but I can explain how the per-ramblock approach
can be done.

One interesting thing I found is that we actually have such a notifier
list in ramblocks already.. see:

    struct RAMBlock {
        ...
        QLIST_HEAD(, RAMBlockNotifier) ramblock_notifiers;
        ...
    }

I guess that's some leftover from the global ramblock notifier.. e.g., I
tried removing that line and QEMU compiles fine. Then, instead of
removing it, we could make that the per-ramblock list.

One way to do this is:

- Patch 1: refactor the current code, letting RAMBlock.resized() be a
  notifier instead of a fn() pointer passed over from
  memory_region_init_resizeable_ram(). It means we can remove
  RAMBlock.resized() and make fw_cfg_resized() become a notifier, taking
  a RAM_BLOCK_RESIZED event instead.

- Patch 2: introduce another RAM_BLOCK_REMAPPED event, so that host
  backends (probably in host_memory_backend_memory_complete(), after
  alloc() is done, so that the ramblock will always be available..) can
  register a notifier only looking for REMAPPED.

Then in the future virtio-mem can similarly register to a specific
ramblock for REMAPPED only.

Thanks,
diff --git a/backends/hostmem.c b/backends/hostmem.c
index 46d80f98b4..4589467c77 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -361,11 +361,37 @@ static void host_memory_backend_set_prealloc_threads(Object *obj, Visitor *v,
     backend->prealloc_threads = value;
 }
 
+static void host_memory_backend_ram_remapped(RAMBlockNotifier *n, void *host,
+                                             size_t offset, size_t size)
+{
+    HostMemoryBackend *backend = container_of(n, HostMemoryBackend,
+                                              ram_notifier);
+    Error *err = NULL;
+
+    if (!host_memory_backend_mr_inited(backend) ||
+        memory_region_get_ram_ptr(&backend->mr) != host) {
+        return;
+    }
+
+    host_memory_backend_apply_settings(backend, host + offset, size, &err);
+    if (err) {
+        /*
+         * If memory settings can't be successfully applied on remap,
+         * don't take the risk to continue without them.
+         */
+        error_report_err(err);
+        exit(1);
+    }
+}
+
 static void host_memory_backend_init(Object *obj)
 {
     HostMemoryBackend *backend = MEMORY_BACKEND(obj);
     MachineState *machine = MACHINE(qdev_get_machine());
 
+    backend->ram_notifier.ram_block_remapped = host_memory_backend_ram_remapped;
+    ram_block_notifier_add(&backend->ram_notifier);
+
     /* TODO: convert access to globals to compat properties */
     backend->merge = machine_mem_merge(machine);
     backend->dump = machine_dump_guest_core(machine);
@@ -379,6 +405,13 @@ static void host_memory_backend_post_init(Object *obj)
     object_apply_compat_props(obj);
 }
 
+static void host_memory_backend_finalize(Object *obj)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+
+    ram_block_notifier_remove(&backend->ram_notifier);
+}
+
 bool host_memory_backend_mr_inited(HostMemoryBackend *backend)
 {
     /*
@@ -595,6 +628,7 @@ static const TypeInfo host_memory_backend_info = {
     .instance_size = sizeof(HostMemoryBackend),
     .instance_init = host_memory_backend_init,
     .instance_post_init = host_memory_backend_post_init,
+    .instance_finalize = host_memory_backend_finalize,
     .interfaces = (InterfaceInfo[]) {
         { TYPE_USER_CREATABLE },
         { }
diff --git a/include/system/hostmem.h b/include/system/hostmem.h
index 5c21ca55c0..170849e8a4 100644
--- a/include/system/hostmem.h
+++ b/include/system/hostmem.h
@@ -83,6 +83,7 @@ struct HostMemoryBackend {
     HostMemPolicy policy;
 
     MemoryRegion mr;
+    RAMBlockNotifier ram_notifier;
 };
 
 bool host_memory_backend_mr_inited(HostMemoryBackend *backend);
diff --git a/system/physmem.c b/system/physmem.c
index 561b2c38c0..a047fd680e 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2223,7 +2223,6 @@ void qemu_ram_remap(ram_addr_t addr)
 {
     RAMBlock *block;
     uint64_t offset;
-    void *vaddr;
     size_t page_size;
 
     RAMBLOCK_FOREACH(block) {
@@ -2233,7 +2232,6 @@ void qemu_ram_remap(ram_addr_t addr)
             page_size = qemu_ram_pagesize(block);
             offset = QEMU_ALIGN_DOWN(offset, page_size);
-            vaddr = ramblock_ptr(block, offset);
             if (block->flags & RAM_PREALLOC) {
                 ;
             } else if (xen_enabled()) {
@@ -2257,8 +2255,6 @@ void qemu_ram_remap(ram_addr_t addr)
                     exit(1);
                 }
             }
-            memory_try_enable_merging(vaddr, page_size);
-            qemu_ram_setup_dump(vaddr, page_size);
             ram_block_notify_remap(block->host, offset, page_size);
         }