Message ID | 3ebadf898bf9e165d657a31c0aa98bbd300ffcb2.1648561546.git.tamas.lengyel@intel.com (mailing list archive)
---|---
State | New, archived
Series | [v2,1/3] x86/mem_sharing: option to enforce fork starting with empty p2m
On 29.03.2022 16:03, Tamas K Lengyel wrote:
> --- a/xen/arch/x86/include/asm/mem_sharing.h
> +++ b/xen/arch/x86/include/asm/mem_sharing.h
> @@ -85,6 +85,9 @@ static inline bool mem_sharing_is_fork(const struct domain *d)
>  int mem_sharing_fork_page(struct domain *d, gfn_t gfn,
>                            bool unsharing);
>
> +int mem_sharing_fork_reset(struct domain *d, bool reset_state,
> +                           bool reset_memory);

Please avoid passing multiple booleans, even more so when you already
derive them from a single "flags" value. This would likely be easiest
if you re-used the VM_EVENT_FLAG_RESET_FORK_* values also for
XENMEM_FORK_RESET_*, with suitable BUILD_BUG_ON() put in place to
prove they're the same.

> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -1890,15 +1890,24 @@ static int fork(struct domain *cd, struct domain *d, uint16_t flags)
>   * footprints the hypercall continuation should be implemented (or if this
>   * feature needs to be become "stable").
>   */
> -static int mem_sharing_fork_reset(struct domain *d)
> +int mem_sharing_fork_reset(struct domain *d, bool reset_state,
> +                           bool reset_memory)
>  {
> -    int rc;
> +    int rc = 0;
>      struct domain *pd = d->parent;
>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
>      struct page_info *page, *tmp;
>
> +    ASSERT(reset_state || reset_memory);
> +
> +    if ( !d->arch.hvm.mem_sharing.fork_complete )
> +        return -ENOSYS;

Either EOPNOTSUPP (in case you think this operation could make sense
to implement for incomplete forks) or e.g. EINVAL, but please not
ENOSYS.

> @@ -394,6 +399,16 @@ static int vm_event_resume(struct domain *d, struct vm_event_domain *ved)
>          if ( rsp.reason == VM_EVENT_REASON_MEM_PAGING )
>              p2m_mem_paging_resume(d, &rsp);
>  #endif
> +#ifdef CONFIG_MEM_SHARING
> +        if ( mem_sharing_is_fork(d) )
> +        {
> +            bool reset_state = rsp.flags & VM_EVENT_FLAG_RESET_FORK_STATE;
> +            bool reset_mem = rsp.flags & VM_EVENT_FLAG_RESET_FORK_MEMORY;
> +
> +            if ( reset_state || reset_mem )
> +                ASSERT(!mem_sharing_fork_reset(d, reset_state, reset_mem));
> +        }
> +#endif

Should the two flags be rejected in the "else" case, rather than
being silently ignored?

> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -541,12 +541,14 @@ struct xen_mem_sharing_op {
>                  uint32_t gref;     /* IN: gref to debug         */
>              } u;
>          } debug;
> -        struct mem_sharing_op_fork {      /* OP_FORK */
> +        struct mem_sharing_op_fork {      /* OP_FORK/_RESET */

I consider the notation in the comment misleading - I would read it to
mean OP_FORK and OP_RESET, supported by the earlier
OP_SHARE/ADD_PHYSMAP. Commonly we write OP_FORK{,_RESET} in such cases.

Jan
On Wed, Mar 30, 2022, 6:31 AM Jan Beulich <jbeulich@suse.com> wrote:
> On 29.03.2022 16:03, Tamas K Lengyel wrote:
> > --- a/xen/arch/x86/include/asm/mem_sharing.h
> > +++ b/xen/arch/x86/include/asm/mem_sharing.h
> > @@ -85,6 +85,9 @@ static inline bool mem_sharing_is_fork(const struct domain *d)
> >  int mem_sharing_fork_page(struct domain *d, gfn_t gfn,
> >                            bool unsharing);
> >
> > +int mem_sharing_fork_reset(struct domain *d, bool reset_state,
> > +                           bool reset_memory);
>
> Please avoid passing multiple booleans, even more so when you already
> derive them from a single "flags" value. This would likely be easiest
> if you re-used the VM_EVENT_FLAG_RESET_FORK_* values also for
> XENMEM_FORK_RESET_*, with suitable BUILD_BUG_ON() put in place to
> prove they're the same.

I don't see why that would be an improvement in any way. I also don't want
to make VM_EVENT flags tied to the XENMEM ones as they are separate
interfaces. I'd rather just drop the changes to the XENMEM interface than
do that.

> > --- a/xen/arch/x86/mm/mem_sharing.c
> > +++ b/xen/arch/x86/mm/mem_sharing.c
> > @@ -1890,15 +1890,24 @@ static int fork(struct domain *cd, struct domain *d, uint16_t flags)
> >   * footprints the hypercall continuation should be implemented (or if this
> >   * feature needs to be become "stable").
> >   */
> > -static int mem_sharing_fork_reset(struct domain *d)
> > +int mem_sharing_fork_reset(struct domain *d, bool reset_state,
> > +                           bool reset_memory)
> >  {
> > -    int rc;
> > +    int rc = 0;
> >      struct domain *pd = d->parent;
> >      struct p2m_domain *p2m = p2m_get_hostp2m(d);
> >      struct page_info *page, *tmp;
> >
> > +    ASSERT(reset_state || reset_memory);
> > +
> > +    if ( !d->arch.hvm.mem_sharing.fork_complete )
> > +        return -ENOSYS;
>
> Either EOPNOTSUPP (in case you think this operation could make sense
> to implement for incomplete forks) or e.g. EINVAL, but please not
> ENOSYS.

> > @@ -394,6 +399,16 @@ static int vm_event_resume(struct domain *d, struct vm_event_domain *ved)
> >          if ( rsp.reason == VM_EVENT_REASON_MEM_PAGING )
> >              p2m_mem_paging_resume(d, &rsp);
> >  #endif
> > +#ifdef CONFIG_MEM_SHARING
> > +        if ( mem_sharing_is_fork(d) )
> > +        {
> > +            bool reset_state = rsp.flags & VM_EVENT_FLAG_RESET_FORK_STATE;
> > +            bool reset_mem = rsp.flags & VM_EVENT_FLAG_RESET_FORK_MEMORY;
> > +
> > +            if ( reset_state || reset_mem )
> > +                ASSERT(!mem_sharing_fork_reset(d, reset_state, reset_mem));
> > +        }
> > +#endif
>
> Should the two flags be rejected in the "else" case, rather than
> being silently ignored?

What do you mean by rejected? There is no feasible "rejection" that could
be done in this path other than skipping it.

> > --- a/xen/include/public/memory.h
> > +++ b/xen/include/public/memory.h
> > @@ -541,12 +541,14 @@ struct xen_mem_sharing_op {
> >                  uint32_t gref;     /* IN: gref to debug         */
> >              } u;
> >          } debug;
> > -        struct mem_sharing_op_fork {      /* OP_FORK */
> > +        struct mem_sharing_op_fork {      /* OP_FORK/_RESET */
>
> I consider the notation in the comment misleading - I would read it to
> mean OP_FORK and OP_RESET, supported by the earlier
> OP_SHARE/ADD_PHYSMAP. Commonly we write OP_FORK{,_RESET} in such cases.

Ack.

Tamas
On 30.03.2022 15:19, Tamas K Lengyel wrote:
> On Wed, Mar 30, 2022, 6:31 AM Jan Beulich <jbeulich@suse.com> wrote:
>> On 29.03.2022 16:03, Tamas K Lengyel wrote:
>>> --- a/xen/arch/x86/include/asm/mem_sharing.h
>>> +++ b/xen/arch/x86/include/asm/mem_sharing.h
>>> @@ -85,6 +85,9 @@ static inline bool mem_sharing_is_fork(const struct domain *d)
>>>  int mem_sharing_fork_page(struct domain *d, gfn_t gfn,
>>>                            bool unsharing);
>>>
>>> +int mem_sharing_fork_reset(struct domain *d, bool reset_state,
>>> +                           bool reset_memory);
>>
>> Please avoid passing multiple booleans, even more so when you already
>> derive them from a single "flags" value. This would likely be easiest
>> if you re-used the VM_EVENT_FLAG_RESET_FORK_* values also for
>> XENMEM_FORK_RESET_*, with suitable BUILD_BUG_ON() put in place to
>> prove they're the same.
>
> I don't see why that would be an improvement in any way. I also don't want
> to make VM_EVENT flags tied to the XENMEM ones as they are separate
> interfaces. I'd rather just drop the changes to the XENMEM interface than
> do that.

If the function gained two or three more flags, how would that look to
you? And how would you easily identify the correct order of all the
booleans?

>>> @@ -394,6 +399,16 @@ static int vm_event_resume(struct domain *d, struct vm_event_domain *ved)
>>>          if ( rsp.reason == VM_EVENT_REASON_MEM_PAGING )
>>>              p2m_mem_paging_resume(d, &rsp);
>>>  #endif
>>> +#ifdef CONFIG_MEM_SHARING
>>> +        if ( mem_sharing_is_fork(d) )
>>> +        {
>>> +            bool reset_state = rsp.flags & VM_EVENT_FLAG_RESET_FORK_STATE;
>>> +            bool reset_mem = rsp.flags & VM_EVENT_FLAG_RESET_FORK_MEMORY;
>>> +
>>> +            if ( reset_state || reset_mem )
>>> +                ASSERT(!mem_sharing_fork_reset(d, reset_state, reset_mem));
>>> +        }
>>> +#endif
>>
>> Should the two flags be rejected in the "else" case, rather than
>> being silently ignored?
>
> What do you mean by rejected? There is no feasible "rejection" that could
> be done in this path other than skipping it.

IOW no return value being handed back to the requester? The function
does have an error return path, so I don't see why you couldn't return
-EINVAL.

Jan
On Wed, Mar 30, 2022, 9:34 AM Jan Beulich <jbeulich@suse.com> wrote:
> On 30.03.2022 15:19, Tamas K Lengyel wrote:
> > On Wed, Mar 30, 2022, 6:31 AM Jan Beulich <jbeulich@suse.com> wrote:
> >> On 29.03.2022 16:03, Tamas K Lengyel wrote:
> >>> --- a/xen/arch/x86/include/asm/mem_sharing.h
> >>> +++ b/xen/arch/x86/include/asm/mem_sharing.h
> >>> @@ -85,6 +85,9 @@ static inline bool mem_sharing_is_fork(const struct domain *d)
> >>>  int mem_sharing_fork_page(struct domain *d, gfn_t gfn,
> >>>                            bool unsharing);
> >>>
> >>> +int mem_sharing_fork_reset(struct domain *d, bool reset_state,
> >>> +                           bool reset_memory);
> >>
> >> Please avoid passing multiple booleans, even more so when you already
> >> derive them from a single "flags" value. This would likely be easiest
> >> if you re-used the VM_EVENT_FLAG_RESET_FORK_* values also for
> >> XENMEM_FORK_RESET_*, with suitable BUILD_BUG_ON() put in place to
> >> prove they're the same.
> >
> > I don't see why that would be an improvement in any way. I also don't
> > want to make VM_EVENT flags tied to the XENMEM ones as they are separate
> > interfaces. I'd rather just drop the changes to the XENMEM interface
> > than do that.
>
> If the function gained two or three more flags, how would that look to
> you? And how would you easily identify the correct order of all the
> booleans?

IMHO we can cross that bridge when and if that becomes necessary. Also not
sure what you mean by the order of the booleans - that's irrelevant since
the booleans are named?

> >>> @@ -394,6 +399,16 @@ static int vm_event_resume(struct domain *d, struct vm_event_domain *ved)
> >>>          if ( rsp.reason == VM_EVENT_REASON_MEM_PAGING )
> >>>              p2m_mem_paging_resume(d, &rsp);
> >>>  #endif
> >>> +#ifdef CONFIG_MEM_SHARING
> >>> +        if ( mem_sharing_is_fork(d) )
> >>> +        {
> >>> +            bool reset_state = rsp.flags & VM_EVENT_FLAG_RESET_FORK_STATE;
> >>> +            bool reset_mem = rsp.flags & VM_EVENT_FLAG_RESET_FORK_MEMORY;
> >>> +
> >>> +            if ( reset_state || reset_mem )
> >>> +                ASSERT(!mem_sharing_fork_reset(d, reset_state, reset_mem));
> >>> +        }
> >>> +#endif
> >>
> >> Should the two flags be rejected in the "else" case, rather than
> >> being silently ignored?
> >
> > What do you mean by rejected? There is no feasible "rejection" that
> > could be done in this path other than skipping it.
>
> IOW no return value being handed back to the requester? The function
> does have an error return path, so I don't see why you couldn't return
> -EINVAL.

The way vm_event response flags are right now is "best effort". Error code
from this path never reaches the caller, which are usually callback
functions that return these flags. The best way for an error code to reach
the caller would be by making a separate vm_event on the ring and sending
that, but that's non-existent today and also hasn't been needed. It
certainly isn't needed here since there should be no feasible way for a
fork to fail a reset request (hence the assert).

Tamas
diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
index 26766ec19f..8a5b125aae 100644
--- a/tools/include/xenctrl.h
+++ b/tools/include/xenctrl.h
@@ -2291,7 +2291,8 @@ int xc_memshr_fork(xc_interface *xch,
  *
  * With VMs that have a lot of memory this call may block for a long time.
  */
-int xc_memshr_fork_reset(xc_interface *xch, uint32_t forked_domain);
+int xc_memshr_fork_reset(xc_interface *xch, uint32_t forked_domain,
+                         bool reset_state, bool reset_memory);
 
 /* Debug calls: return the number of pages referencing the shared frame backing
  *              the input argument. Should be one or greater.
diff --git a/tools/libs/ctrl/xc_memshr.c b/tools/libs/ctrl/xc_memshr.c
index 0143f9ddea..5044d5b83e 100644
--- a/tools/libs/ctrl/xc_memshr.c
+++ b/tools/libs/ctrl/xc_memshr.c
@@ -260,12 +260,17 @@ int xc_memshr_fork(xc_interface *xch, uint32_t pdomid, uint32_t domid,
     return xc_memshr_memop(xch, domid, &mso);
 }
 
-int xc_memshr_fork_reset(xc_interface *xch, uint32_t domid)
+int xc_memshr_fork_reset(xc_interface *xch, uint32_t domid, bool reset_state,
+                         bool reset_memory)
 {
     xen_mem_sharing_op_t mso;
 
     memset(&mso, 0, sizeof(mso));
     mso.op = XENMEM_sharing_op_fork_reset;
+    if ( reset_state )
+        mso.u.fork.flags |= XENMEM_FORK_RESET_STATE;
+    if ( reset_memory )
+        mso.u.fork.flags |= XENMEM_FORK_RESET_MEMORY;
 
     return xc_memshr_memop(xch, domid, &mso);
 }
diff --git a/xen/arch/x86/include/asm/mem_sharing.h b/xen/arch/x86/include/asm/mem_sharing.h
index cf7a12f4d2..2c00069bc9 100644
--- a/xen/arch/x86/include/asm/mem_sharing.h
+++ b/xen/arch/x86/include/asm/mem_sharing.h
@@ -85,6 +85,9 @@ static inline bool mem_sharing_is_fork(const struct domain *d)
 int mem_sharing_fork_page(struct domain *d, gfn_t gfn,
                           bool unsharing);
 
+int mem_sharing_fork_reset(struct domain *d, bool reset_state,
+                           bool reset_memory);
+
 /*
  * If called by a foreign domain, possible errors are
  *   -EBUSY -> ring full
@@ -148,6 +151,12 @@ static inline int mem_sharing_fork_page(struct domain *d, gfn_t gfn, bool lock)
     return -EOPNOTSUPP;
 }
 
+static inline int mem_sharing_fork_reset(struct domain *d, bool reset_state,
+                                         bool reset_memory)
+{
+    return -EOPNOTSUPP;
+}
+
 #endif
 
 #endif /* __MEM_SHARING_H__ */
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index bfde342a38..11c74e3905 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1890,15 +1890,24 @@ static int fork(struct domain *cd, struct domain *d, uint16_t flags)
  * footprints the hypercall continuation should be implemented (or if this
  * feature needs to be become "stable").
  */
-static int mem_sharing_fork_reset(struct domain *d)
+int mem_sharing_fork_reset(struct domain *d, bool reset_state,
+                           bool reset_memory)
 {
-    int rc;
+    int rc = 0;
     struct domain *pd = d->parent;
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     struct page_info *page, *tmp;
 
+    ASSERT(reset_state || reset_memory);
+
+    if ( !d->arch.hvm.mem_sharing.fork_complete )
+        return -ENOSYS;
+
     domain_pause(d);
 
+    if ( !reset_memory )
+        goto state;
+
     /* need recursive lock because we will free pages */
     spin_lock_recursive(&d->page_alloc_lock);
     page_list_for_each_safe(page, tmp, &d->page_list)
@@ -1931,7 +1940,9 @@ static int mem_sharing_fork_reset(struct domain *d)
     }
     spin_unlock_recursive(&d->page_alloc_lock);
 
-    rc = copy_settings(d, pd, d->arch.hvm.mem_sharing.empty_p2m);
+ state:
+    if ( reset_state )
+        rc = copy_settings(d, pd, d->arch.hvm.mem_sharing.empty_p2m);
 
     domain_unpause(d);
 
@@ -2237,15 +2248,21 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
 
     case XENMEM_sharing_op_fork_reset:
     {
+        bool reset_state = mso.u.fork.flags & XENMEM_FORK_RESET_STATE;
+        bool reset_memory = mso.u.fork.flags & XENMEM_FORK_RESET_MEMORY;
+
         rc = -EINVAL;
-        if ( mso.u.fork.pad || mso.u.fork.flags )
+        if ( mso.u.fork.pad || (!reset_state && !reset_memory) )
+            goto out;
+        if ( mso.u.fork.flags &
+             ~(XENMEM_FORK_RESET_STATE | XENMEM_FORK_RESET_MEMORY) )
             goto out;
 
         rc = -ENOSYS;
         if ( !d->parent )
             goto out;
 
-        rc = mem_sharing_fork_reset(d);
+        rc = mem_sharing_fork_reset(d, reset_state, reset_memory);
 
         break;
     }
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 84cf52636b..d26a6699fc 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -28,6 +28,11 @@
 #include <asm/p2m.h>
 #include <asm/monitor.h>
 #include <asm/vm_event.h>
+
+#ifdef CONFIG_MEM_SHARING
+#include <asm/mem_sharing.h>
+#endif
+
 #include <xsm/xsm.h>
 #include <public/hvm/params.h>
 
@@ -394,6 +399,16 @@ static int vm_event_resume(struct domain *d, struct vm_event_domain *ved)
         if ( rsp.reason == VM_EVENT_REASON_MEM_PAGING )
             p2m_mem_paging_resume(d, &rsp);
 #endif
+#ifdef CONFIG_MEM_SHARING
+        if ( mem_sharing_is_fork(d) )
+        {
+            bool reset_state = rsp.flags & VM_EVENT_FLAG_RESET_FORK_STATE;
+            bool reset_mem = rsp.flags & VM_EVENT_FLAG_RESET_FORK_MEMORY;
+
+            if ( reset_state || reset_mem )
+                ASSERT(!mem_sharing_fork_reset(d, reset_state, reset_mem));
+        }
+#endif
 
         /*
          * Check emulation flags in the arch-specific handler only, as it
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index d44c256b3c..2a4edfc84b 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -541,12 +541,14 @@ struct xen_mem_sharing_op {
                 uint32_t gref;     /* IN: gref to debug         */
             } u;
         } debug;
-        struct mem_sharing_op_fork {      /* OP_FORK */
+        struct mem_sharing_op_fork {      /* OP_FORK/_RESET */
             domid_t parent_domain;        /* IN: parent's domain id */
 /* These flags only makes sense for short-lived forks */
 #define XENMEM_FORK_WITH_IOMMU_ALLOWED (1u << 0)
 #define XENMEM_FORK_BLOCK_INTERRUPTS   (1u << 1)
 #define XENMEM_FORK_EMPTY_P2M          (1u << 2)
+#define XENMEM_FORK_RESET_STATE        (1u << 3)
+#define XENMEM_FORK_RESET_MEMORY       (1u << 4)
             uint16_t flags;               /* IN: optional settings */
             uint32_t pad;                 /* Must be set to 0 */
         } fork;
diff --git a/xen/include/public/vm_event.h b/xen/include/public/vm_event.h
index bb003d21d0..81c2ee28cc 100644
--- a/xen/include/public/vm_event.h
+++ b/xen/include/public/vm_event.h
@@ -127,6 +127,14 @@
  * Reset the vmtrace buffer (if vmtrace is enabled)
  */
 #define VM_EVENT_FLAG_RESET_VMTRACE      (1 << 13)
+/*
+ * Reset the VM state (if VM is fork)
+ */
+#define VM_EVENT_FLAG_RESET_FORK_STATE   (1 << 14)
+/*
+ * Remove unshared entried from physmap (if VM is fork)
+ */
+#define VM_EVENT_FLAG_RESET_FORK_MEMORY  (1 << 15)
 
 /*
  * Reasons for the vm event request
Allow specifying distinct parts of the fork VM to be reset. This is useful
when a fuzzing operation involves mapping in only a handful of pages that
are known ahead of time. Throwing these pages away just to be re-copied
immediately is expensive, thus allowing to specify partial resets can
speed things up.

Also allow resetting to be initiated from vm_event responses as an
optimization.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
---
v2: address review comments and add more sanity checking
---
 tools/include/xenctrl.h                |  3 ++-
 tools/libs/ctrl/xc_memshr.c            |  7 ++++++-
 xen/arch/x86/include/asm/mem_sharing.h |  9 +++++++++
 xen/arch/x86/mm/mem_sharing.c          | 27 +++++++++++++++++++++-----
 xen/common/vm_event.c                  | 15 ++++++++++++++
 xen/include/public/memory.h            |  4 +++-
 xen/include/public/vm_event.h          |  8 ++++++++
 7 files changed, 65 insertions(+), 8 deletions(-)