Message ID | c21708c84c850ff8c976a3934099c54da044e7d9.1631802816.git.tamas.lengyel@intel.com (mailing list archive) |
---|---|
State | Superseded |
Series | x86/mem_sharing: don't lock parent during fork reset |
On 16.09.2021 17:04, Tamas K Lengyel wrote:
> During fork reset operation the parent domain doesn't need to be gathered using
> rcu_lock_live_remote_domain_by_id as the fork reset doesn't modify anything on
> the parent. The parent is also guaranteed to be paused while forks are active.
> This patch reduces lock contention when performing resets in parallel.

I'm a little in trouble following you here: RCU locks aren't really
locks in that sense, so "lock contention" seems misleading to me. I
can see that rcu_lock_domain_by_id()'s loop is extra overhead.

Furthermore - does the parent being paused really mean the parent
can't go away behind the back of the fork reset? In fork() I see

    if ( rc && rc != -ERESTART )
    {
        domain_unpause(d);
        put_domain(d);
        cd->parent = NULL;
    }

i.e. the ref gets dropped before the parent pointer gets cleared. If
the parent having a reference kept was indeed properly guaranteed, I
agree the code change itself is fine.

(The sequence looks correct at the other put_domain() site [dealing
with the success case of fork(), when the reference gets retained]
in domain_relinquish_resources().)

Jan
On Fri, Sep 17, 2021 at 3:26 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 16.09.2021 17:04, Tamas K Lengyel wrote:
> > During fork reset operation the parent domain doesn't need to be gathered using
> > rcu_lock_live_remote_domain_by_id as the fork reset doesn't modify anything on
> > the parent. The parent is also guaranteed to be paused while forks are active.
> > This patch reduces lock contention when performing resets in parallel.
>
> I'm a little in trouble following you here: RCU locks aren't really
> locks in that sense, so "lock contention" seems misleading to me. I
> can see that rcu_lock_domain_by_id()'s loop is extra overhead.
>
> Furthermore - does the parent being paused really mean the parent
> can't go away behind the back of the fork reset? In fork() I see
>
>     if ( rc && rc != -ERESTART )
>     {
>         domain_unpause(d);
>         put_domain(d);
>         cd->parent = NULL;
>     }
>
> i.e. the ref gets dropped before the parent pointer gets cleared. If
> the parent having a reference kept was indeed properly guaranteed, I
> agree the code change itself is fine.
>
> (The sequence looks correct at the other put_domain() site [dealing
> with the success case of fork(), when the reference gets retained]
> in domain_relinquish_resources().)

This code above you copied is when the fork() fails. Calling
fork_reset() before fork() successfully returns is not a sane sequence
and it is not "supported" by any means. If someone would try to do
that it would be racy as-is already with or without this patch.
Clearing the cd->parent pointer first here on the error path wouldn't
guarantee that sequence to be safe or sane either. Adding an extra
field to struct domain that signifies that "fork is complete" would be
a way to make that safe. But since the toolstack using this interface
is already sane (ie. never calls fork_reset before a successful fork)
I really don't think that's necessary. It would just grow struct
domain for very little benefit.

Tamas
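For readers following the thread, the "fork is complete" field floated above might look roughly like the sketch below. This is only an illustration under stated assumptions: `struct sketch_domain`, `fork_complete`, `sketch_fork()` and `sketch_fork_reset()` are hypothetical names, not part of Xen, and the model is single-threaded, so it ignores the barriers or atomics that real hypervisor code would need when publishing such a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for Xen's struct domain. */
struct sketch_domain {
    struct sketch_domain *parent;
    bool fork_complete;          /* hypothetical field; does not exist in Xen */
};

/*
 * Model of fork(): setup_rc stands in for the result of all the setup work.
 * The flag is only set once everything has succeeded.
 */
static int sketch_fork(struct sketch_domain *cd, struct sketch_domain *d,
                       int setup_rc)
{
    assert(cd != NULL && d != NULL);

    cd->parent = d;
    if ( setup_rc )
    {
        /* Error path: the fork never becomes usable. */
        cd->parent = NULL;
        return setup_rc;
    }

    cd->fork_complete = true;
    return 0;
}

/* Model of fork reset: refuse to run unless the fork finished successfully. */
static int sketch_fork_reset(struct sketch_domain *cd)
{
    if ( !cd->parent || !cd->fork_complete )
        return -EINVAL;

    /* ... the actual reset work would go here ... */
    return 0;
}
```

The cost noted in the message above is visible here: every domain would carry the extra field, even though only forks ever consult it.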
On 17.09.2021 16:21, Tamas K Lengyel wrote:
> On Fri, Sep 17, 2021 at 3:26 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 16.09.2021 17:04, Tamas K Lengyel wrote:
>>> During fork reset operation the parent domain doesn't need to be gathered using
>>> rcu_lock_live_remote_domain_by_id as the fork reset doesn't modify anything on
>>> the parent. The parent is also guaranteed to be paused while forks are active.
>>> This patch reduces lock contention when performing resets in parallel.
>>
>> I'm a little in trouble following you here: RCU locks aren't really
>> locks in that sense, so "lock contention" seems misleading to me. I
>> can see that rcu_lock_domain_by_id()'s loop is extra overhead.
>>
>> Furthermore - does the parent being paused really mean the parent
>> can't go away behind the back of the fork reset? In fork() I see
>>
>>     if ( rc && rc != -ERESTART )
>>     {
>>         domain_unpause(d);
>>         put_domain(d);
>>         cd->parent = NULL;
>>     }
>>
>> i.e. the ref gets dropped before the parent pointer gets cleared. If
>> the parent having a reference kept was indeed properly guaranteed, I
>> agree the code change itself is fine.
>>
>> (The sequence looks correct at the other put_domain() site [dealing
>> with the success case of fork(), when the reference gets retained]
>> in domain_relinquish_resources().)
>
> This code above you copied is when the fork() fails. Calling
> fork_reset() before fork() successfully returns is not a sane sequence
> and it is not "supported" by any means. If someone would try to do
> that it would be racy as-is already with or without this patch.
> Clearing the cd->parent pointer first here on the error path wouldn't
> guarantee that sequence to be safe or sane either. Adding an extra
> field to struct domain that signifies that "fork is complete" would be
> a way to make that safe. But since the toolstack using this interface
> is already sane (ie. never calls fork_reset before a successful fork)
> I really don't think that's necessary. It would just grow struct
> domain for very little benefit.

The point of this latter part of my comments wasn't to suggest that
fork-reset ought to work before fork completed. That's fine to not be
'"supported" by any means'. What your change here does, though, is to
add a dependency (maybe not the first one) on there being a ref held
as long as ->parent is non-NULL. That requirement is violated by the
error path I've quoted. IOW my request isn't really fork or even
mem-sharing specific, but it instead is asking that the code in
question please follow a common, safe model (as soon as at least one
such dependency exists).

If there are pre-existing cases where the wrong order of operations is
an issue, then adjusting that sequence in a separate prereq patch
might be better than folding the fix in here. Whereas if there isn't
any other such case or it's simply unknown (without extended audit)
whether there is, then I see no issue folding that adjustment in here.

Of course - you're the maintainer of this code, so if you think the
adjustment isn't needed, so be it. It's just that then I can't give
you an R-b, so you'd need someone else's for your change to actually
go in. (Of course you could also convince me of your pov, but for now
I can't see this happening.)

Jan
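The ordering concern raised in the review (a reference should be held for as long as ->parent is non-NULL) can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not Xen code: `struct mock_domain` and every `mock_*` / `unwind_*` helper are hypothetical, the `holds_parent_ref` field exists only to make the invariant checkable, and the model is single-threaded, so it shows program order only and says nothing about the memory ordering a real fix would have to consider.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Xen's struct domain; not the real definition. */
struct mock_domain {
    int refcnt;                  /* models the domain reference count */
    int pause_count;             /* models the pause count */
    struct mock_domain *parent;  /* models ->parent on a fork */
    int holds_parent_ref;        /* bookkeeping for this sketch only */
};

/* Invariant from the review: non-NULL ->parent implies a held reference. */
static int invariant_ok(const struct mock_domain *cd)
{
    return cd->parent == NULL || cd->holds_parent_ref;
}

/* Models the start of fork(): pause the parent, take a ref, set ->parent. */
static void mock_fork_setup(struct mock_domain *cd, struct mock_domain *d)
{
    d->pause_count++;
    d->refcnt++;
    cd->holds_parent_ref = 1;
    cd->parent = d;
}

/* Models put_domain(d) for the reference the fork took on its parent. */
static void mock_put_parent_ref(struct mock_domain *cd, struct mock_domain *d)
{
    assert(cd->holds_parent_ref && d->refcnt > 0);
    d->refcnt--;
    cd->holds_parent_ref = 0;
}

/* Order quoted in the thread: drop the ref, then clear ->parent. */
static int unwind_put_then_clear(struct mock_domain *cd)
{
    struct mock_domain *d = cd->parent;
    int violations = 0;

    d->pause_count--;                 /* domain_unpause(d) */
    mock_put_parent_ref(cd, d);       /* put_domain(d) */
    violations += !invariant_ok(cd);  /* ->parent still set, ref gone */
    cd->parent = NULL;
    violations += !invariant_ok(cd);
    return violations;
}

/* Order requested in the review: clear ->parent, then drop the ref. */
static int unwind_clear_then_put(struct mock_domain *cd)
{
    struct mock_domain *d = cd->parent;
    int violations = 0;

    d->pause_count--;                 /* domain_unpause(d) */
    cd->parent = NULL;
    violations += !invariant_ok(cd);
    mock_put_parent_ref(cd, d);       /* put_domain(d) */
    violations += !invariant_ok(cd);
    return violations;
}
```

In this model the first ordering passes through one state in which ->parent is still set but no reference is held, while the second never does. Both leave the parent's reference and pause counts where they started.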
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 8d5d44bdbc..b80b978ef3 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1879,9 +1879,10 @@ static int fork(struct domain *cd, struct domain *d)
  * footprints the hypercall continuation should be implemented (or if this
  * feature needs to be become "stable").
  */
-static int mem_sharing_fork_reset(struct domain *d, struct domain *pd)
+static int mem_sharing_fork_reset(struct domain *d)
 {
     int rc;
+    struct domain *pd = d->parent;
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     struct page_info *page, *tmp;
 
@@ -2226,8 +2227,6 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
 
     case XENMEM_sharing_op_fork_reset:
     {
-        struct domain *pd;
-
         rc = -EINVAL;
         if ( mso.u.fork.pad || mso.u.fork.flags )
             goto out;
@@ -2236,13 +2235,7 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
         if ( !d->parent )
             goto out;
 
-        rc = rcu_lock_live_remote_domain_by_id(d->parent->domain_id, &pd);
-        if ( rc )
-            goto out;
-
-        rc = mem_sharing_fork_reset(d, pd);
-
-        rcu_unlock_domain(pd);
+        rc = mem_sharing_fork_reset(d);
         break;
     }
 
During fork reset operation the parent domain doesn't need to be gathered using
rcu_lock_live_remote_domain_by_id as the fork reset doesn't modify anything on
the parent. The parent is also guaranteed to be paused while forks are active.
This patch reduces lock contention when performing resets in parallel.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
---
 xen/arch/x86/mm/mem_sharing.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)