Message ID | 20210526152152.26251-1-julien@xen.org (mailing list archive) |
---|---|
State | Superseded |
Series | xen/grant-table: Simplify the update to the per-vCPU maptrack freelist |
On 26.05.2021 17:21, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
>
> Since XSA-288 (commit 02cbeeb62075 "gnttab: split maptrack lock to make

XSA-228 I suppose?

> it fulfill its purpose again"), v->maptrack_head and v->maptrack_tail
> are with the lock v->maptrack_freelist_lock held.

Nit: missing "accessed" or alike?

> Therefore it is not necessary to update the fields using cmpxchg()
> and also read them atomically.

Ah yes, very good observation. Should have noticed this back at the
time, for an immediate follow-up change.

> Note that there are two cases where v->maptrack_tail is accessed without
> the lock. They both happen _get_maptrack_handle() when the current vCPU
> list is empty. Therefore there is no possible race.

I think you mean the other function here, without a leading underscore
in its name. And if you want to explain the absence of a race, wouldn't
you then better also mention that the list can get initially filled
only on the local vCPU?

> I am not sure whether we should try to protect the remaining unprotected
> access with the lock or maybe add a comment?

As per above I don't view adding locking as sensible. If you feel like
adding a helpful comment, perhaps. I will admit that it took me more
than just a moment to recall that "local vCPU only" argument.

> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -543,34 +543,26 @@ double_gt_unlock(struct grant_table *lgt, struct grant_table *rgt)
>  static inline grant_handle_t
>  _get_maptrack_handle(struct grant_table *t, struct vcpu *v)
>  {
> -    unsigned int head, next, prev_head;
> +    unsigned int head, next;
>
>      spin_lock(&v->maptrack_freelist_lock);
>
> -    do {
> -        /* No maptrack pages allocated for this VCPU yet? */
> -        head = read_atomic(&v->maptrack_head);
> -        if ( unlikely(head == MAPTRACK_TAIL) )
> -        {
> -            spin_unlock(&v->maptrack_freelist_lock);
> -            return INVALID_MAPTRACK_HANDLE;

Where did this and ...

> -        }
> -
> -        /*
> -         * Always keep one entry in the free list to make it easier to
> -         * add free entries to the tail.
> -         */
> -        next = read_atomic(&maptrack_entry(t, head).ref);
> -        if ( unlikely(next == MAPTRACK_TAIL) )
> -        {
> -            spin_unlock(&v->maptrack_freelist_lock);
> -            return INVALID_MAPTRACK_HANDLE;

... this use of INVALID_MAPTRACK_HANDLE go? It is at present merely
coincidence that INVALID_MAPTRACK_HANDLE == MAPTRACK_TAIL. If you
want to fold them, you will need to do so properly (by eliminating
one of the two constants). But I think they're separate on purpose.

> -        }
> +    /* No maptrack pages allocated for this VCPU yet? */
> +    head = v->maptrack_head;
> +    if ( unlikely(head == MAPTRACK_TAIL) )
> +        goto out;
>
> -        prev_head = head;
> -        head = cmpxchg(&v->maptrack_head, prev_head, next);
> -    } while ( head != prev_head );
> +    /*
> +     * Always keep one entry in the free list to make it easier to
> +     * add free entries to the tail.
> +     */
> +    next = read_atomic(&maptrack_entry(t, head).ref);

Since the lock protects the entire free list, why do you need to
keep read_atomic() here?

> +    if ( unlikely(next == MAPTRACK_TAIL) )
> +        head = MAPTRACK_TAIL;
> +    else
> +        v->maptrack_head = next;
>
> +out:

Please indent labels by at least one blank, to avoid issues with
diff's -p option. In fact if you didn't introduce a goto here in
the first place, there'd be less code churn overall, as you'd
need to alter the indentation of fewer lines.

> @@ -623,7 +615,7 @@ put_maptrack_handle(
>  {
>      struct domain *currd = current->domain;
>      struct vcpu *v;
> -    unsigned int prev_tail, cur_tail;
> +    unsigned int prev_tail;
>
>      /* 1. Set entry to be a tail. */
>      maptrack_entry(t, handle).ref = MAPTRACK_TAIL;
> @@ -633,11 +625,8 @@ put_maptrack_handle(
>
>      spin_lock(&v->maptrack_freelist_lock);
>
> -    cur_tail = read_atomic(&v->maptrack_tail);
> -    do {
> -        prev_tail = cur_tail;
> -        cur_tail = cmpxchg(&v->maptrack_tail, prev_tail, handle);
> -    } while ( cur_tail != prev_tail );
> +    prev_tail = v->maptrack_tail;
> +    v->maptrack_tail = handle;
>
>      /* 3. Update the old tail entry to point to the new entry. */
>      write_atomic(&maptrack_entry(t, prev_tail).ref, handle);

Since the write_atomic() here can then also be converted, may I
ask that you then rename the local variable to just "tail" as
well?

> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -255,7 +255,10 @@ struct vcpu
>      /* VCPU paused by system controller. */
>      int              controller_pause_count;
>
> -    /* Grant table map tracking. */
> +    /*
> +     * Grant table map tracking. The lock maptrack_freelist_lock protect

Nit: protects

> +     * the access to maptrack_head and maptrack_tail.
> +     */

I'm inclined to suggest this doesn't need spelling out, considering ...

>      spinlock_t       maptrack_freelist_lock;
>      unsigned int     maptrack_head;
>      unsigned int     maptrack_tail;

... both the name of the lock and its placement next to the two
fields it protects. Also as per the docs change of the XSA-228 change,
the lock protects more than just these two fields, so the comment may
be misleading the way you have it now.

Jan
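[The conversion Jan agrees with above follows a general pattern: a
lock-free cmpxchg() retry loop is only needed while several CPUs may
update a field concurrently; once every reader and writer takes the
same lock, a plain read-modify-write is equivalent. Below is a minimal
standalone sketch of the before/after shapes. It is not Xen code: a
pthread mutex stands in for the per-vCPU spinlock, GCC's __atomic
builtins for Xen's read_atomic()/cmpxchg(), and TAIL, list_head and
next_of() are illustrative names only.]

#include <pthread.h>

#define TAIL (~0u)

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int list_head = TAIL;

/* Stub standing in for following the free list's chaining field. */
static unsigned int next_of(unsigned int h) { (void)h; return TAIL; }

/* Before: lock-free pop; must retry until the compare-and-swap wins. */
static unsigned int pop_head_cmpxchg(void)
{
    unsigned int head = __atomic_load_n(&list_head, __ATOMIC_SEQ_CST);
    unsigned int next;

    do {
        if ( head == TAIL )
            return TAIL;
        next = next_of(head);
        /* On failure, 'head' is refreshed and the loop goes round again. */
    } while ( !__atomic_compare_exchange_n(&list_head, &head, next, 0,
                                           __ATOMIC_SEQ_CST,
                                           __ATOMIC_SEQ_CST) );

    return head;
}

/*
 * After: the lock excludes all concurrent accessors, so plain reads
 * and writes suffice and a retry loop could never actually repeat.
 */
static unsigned int pop_head_locked(void)
{
    unsigned int head;

    pthread_mutex_lock(&list_lock);

    head = list_head;
    if ( head != TAIL )
        list_head = next_of(head);

    pthread_mutex_unlock(&list_lock);

    return head;
}

[This is also why Jan questions the read_atomic() left in the hunk
above: with the lock held, nothing can modify the entry concurrently,
so the atomic accessor no longer buys anything.]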
Hi Jan,

On 28/05/2021 14:29, Jan Beulich wrote:
> On 26.05.2021 17:21, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> Since XSA-288 (commit 02cbeeb62075 "gnttab: split maptrack lock to make
>
> XSA-228 I suppose?

Yes, I will update it in the next version.

>> it fulfill its purpose again"), v->maptrack_head and v->maptrack_tail
>> are with the lock v->maptrack_freelist_lock held.
>
> Nit: missing "accessed" or alike?

I have added "accessed".

>> Therefore it is not necessary to update the fields using cmpxchg()
>> and also read them atomically.
>
> Ah yes, very good observation. Should have noticed this back at the
> time, for an immediate follow-up change.
>
>> Note that there are two cases where v->maptrack_tail is accessed without
>> the lock. They both happen _get_maptrack_handle() when the current vCPU
>> list is empty. Therefore there is no possible race.
>
> I think you mean the other function here, without a leading underscore
> in its name.

Hmmm... Yes. I will update it.

> And if you want to explain the absence of a race, wouldn't
> you then better also mention that the list can get initially filled
> only on the local vCPU?

Sure. I will reword it.

>> I am not sure whether we should try to protect the remaining unprotected
>> access with the lock or maybe add a comment?
>
> As per above I don't view adding locking as sensible. If you feel like
> adding a helpful comment, perhaps. I will admit that it took me more
> than just a moment to recall that "local vCPU only" argument.

I will try to come up with a helpful comment.

>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
>> @@ -543,34 +543,26 @@ double_gt_unlock(struct grant_table *lgt, struct grant_table *rgt)
>>  static inline grant_handle_t
>>  _get_maptrack_handle(struct grant_table *t, struct vcpu *v)
>>  {
>> -    unsigned int head, next, prev_head;
>> +    unsigned int head, next;
>>
>>      spin_lock(&v->maptrack_freelist_lock);
>>
>> -    do {
>> -        /* No maptrack pages allocated for this VCPU yet? */
>> -        head = read_atomic(&v->maptrack_head);
>> -        if ( unlikely(head == MAPTRACK_TAIL) )
>> -        {
>> -            spin_unlock(&v->maptrack_freelist_lock);
>> -            return INVALID_MAPTRACK_HANDLE;
>
> Where did this and ...
>
>> -        }
>> -
>> -        /*
>> -         * Always keep one entry in the free list to make it easier to
>> -         * add free entries to the tail.
>> -         */
>> -        next = read_atomic(&maptrack_entry(t, head).ref);
>> -        if ( unlikely(next == MAPTRACK_TAIL) )
>> -        {
>> -            spin_unlock(&v->maptrack_freelist_lock);
>> -            return INVALID_MAPTRACK_HANDLE;
>
> ... this use of INVALID_MAPTRACK_HANDLE go? It is at present merely
> coincidence that INVALID_MAPTRACK_HANDLE == MAPTRACK_TAIL. If you
> want to fold them, you will need to do so properly (by eliminating
> one of the two constants). But I think they're separate on purpose.

Hmmm... Somehow I thought one was an alias to the other. But they are
clearly not. I will update it in the next version.

>> -        }
>> +    /* No maptrack pages allocated for this VCPU yet? */
>> +    head = v->maptrack_head;
>> +    if ( unlikely(head == MAPTRACK_TAIL) )
>> +        goto out;
>>
>> -        prev_head = head;
>> -        head = cmpxchg(&v->maptrack_head, prev_head, next);
>> -    } while ( head != prev_head );
>> +    /*
>> +     * Always keep one entry in the free list to make it easier to
>> +     * add free entries to the tail.
>> +     */
>> +    next = read_atomic(&maptrack_entry(t, head).ref);
>
> Since the lock protects the entire free list, why do you need to
> keep read_atomic() here?

Because I wasn't sure whether dropping {write, read}_atomic() when
accessing the freelist would be fine. Anyway, I can drop it in the
next version.

>> +    if ( unlikely(next == MAPTRACK_TAIL) )
>> +        head = MAPTRACK_TAIL;
>> +    else
>> +        v->maptrack_head = next;
>>
>> +out:
>
> Please indent labels by at least one blank, to avoid issues with
> diff's -p option. In fact if you didn't introduce a goto here in
> the first place, there'd be less code churn overall, as you'd
> need to alter the indentation of fewer lines.

I will have a look.

>> @@ -623,7 +615,7 @@ put_maptrack_handle(
>>  {
>>      struct domain *currd = current->domain;
>>      struct vcpu *v;
>> -    unsigned int prev_tail, cur_tail;
>> +    unsigned int prev_tail;
>>
>>      /* 1. Set entry to be a tail. */
>>      maptrack_entry(t, handle).ref = MAPTRACK_TAIL;
>> @@ -633,11 +625,8 @@ put_maptrack_handle(
>>
>>      spin_lock(&v->maptrack_freelist_lock);
>>
>> -    cur_tail = read_atomic(&v->maptrack_tail);
>> -    do {
>> -        prev_tail = cur_tail;
>> -        cur_tail = cmpxchg(&v->maptrack_tail, prev_tail, handle);
>> -    } while ( cur_tail != prev_tail );
>> +    prev_tail = v->maptrack_tail;
>> +    v->maptrack_tail = handle;
>>
>>      /* 3. Update the old tail entry to point to the new entry. */
>>      write_atomic(&maptrack_entry(t, prev_tail).ref, handle);
>
> Since the write_atomic() here can then also be converted, may I
> ask that you then rename the local variable to just "tail" as
> well?

Sure.

>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -255,7 +255,10 @@ struct vcpu
>>      /* VCPU paused by system controller. */
>>      int              controller_pause_count;
>>
>> -    /* Grant table map tracking. */
>> +    /*
>> +     * Grant table map tracking. The lock maptrack_freelist_lock protect
>
> Nit: protects

I will fix it.

>> +     * the access to maptrack_head and maptrack_tail.
>> +     */
>
> I'm inclined to suggest this doesn't need spelling out, considering ...
>
>>      spinlock_t       maptrack_freelist_lock;
>>      unsigned int     maptrack_head;
>>      unsigned int     maptrack_tail;
>
> ... both the name of the lock and its placement next to the two
> fields it protects. Also as per the docs change of the XSA-228 change,
> the lock protects more than just these two fields, so the comment may
> be misleading the way you have it now.

So I think it would be good to document above the lock what it actually
protects. I agree it is fairly clear that it protects maptrack_{head,
tail}, but it wasn't very clear to me that it would also protect the
contents of the freelist (so read_atomic()/write_atomic() could be
dropped). I will try to come up with a better comment.

Cheers,
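[For reference, the kind of comment being promised for the unlocked
maptrack_tail accesses might look like the sketch below. It merely
illustrates the "local vCPU only" argument made in the thread; it is
not wording from any later revision of the patch.]

/*
 * v->maptrack_tail is read here without maptrack_freelist_lock.
 * This is safe: a vCPU's free list can only be initially populated
 * by that vCPU itself, so while the list is still empty no remote
 * vCPU can race with this check.
 */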
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index ab30e2e8cfb6..cac9d1d73446 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -543,34 +543,26 @@ double_gt_unlock(struct grant_table *lgt, struct grant_table *rgt)
 static inline grant_handle_t
 _get_maptrack_handle(struct grant_table *t, struct vcpu *v)
 {
-    unsigned int head, next, prev_head;
+    unsigned int head, next;
 
     spin_lock(&v->maptrack_freelist_lock);
 
-    do {
-        /* No maptrack pages allocated for this VCPU yet? */
-        head = read_atomic(&v->maptrack_head);
-        if ( unlikely(head == MAPTRACK_TAIL) )
-        {
-            spin_unlock(&v->maptrack_freelist_lock);
-            return INVALID_MAPTRACK_HANDLE;
-        }
-
-        /*
-         * Always keep one entry in the free list to make it easier to
-         * add free entries to the tail.
-         */
-        next = read_atomic(&maptrack_entry(t, head).ref);
-        if ( unlikely(next == MAPTRACK_TAIL) )
-        {
-            spin_unlock(&v->maptrack_freelist_lock);
-            return INVALID_MAPTRACK_HANDLE;
-        }
+    /* No maptrack pages allocated for this VCPU yet? */
+    head = v->maptrack_head;
+    if ( unlikely(head == MAPTRACK_TAIL) )
+        goto out;
 
-        prev_head = head;
-        head = cmpxchg(&v->maptrack_head, prev_head, next);
-    } while ( head != prev_head );
+    /*
+     * Always keep one entry in the free list to make it easier to
+     * add free entries to the tail.
+     */
+    next = read_atomic(&maptrack_entry(t, head).ref);
+    if ( unlikely(next == MAPTRACK_TAIL) )
+        head = MAPTRACK_TAIL;
+    else
+        v->maptrack_head = next;
 
+out:
     spin_unlock(&v->maptrack_freelist_lock);
 
     return head;
@@ -623,7 +615,7 @@ put_maptrack_handle(
 {
     struct domain *currd = current->domain;
     struct vcpu *v;
-    unsigned int prev_tail, cur_tail;
+    unsigned int prev_tail;
 
     /* 1. Set entry to be a tail. */
     maptrack_entry(t, handle).ref = MAPTRACK_TAIL;
@@ -633,11 +625,8 @@ put_maptrack_handle(
 
     spin_lock(&v->maptrack_freelist_lock);
 
-    cur_tail = read_atomic(&v->maptrack_tail);
-    do {
-        prev_tail = cur_tail;
-        cur_tail = cmpxchg(&v->maptrack_tail, prev_tail, handle);
-    } while ( cur_tail != prev_tail );
+    prev_tail = v->maptrack_tail;
+    v->maptrack_tail = handle;
 
     /* 3. Update the old tail entry to point to the new entry. */
     write_atomic(&maptrack_entry(t, prev_tail).ref, handle);
@@ -650,7 +639,7 @@ get_maptrack_handle(
     struct grant_table *lgt)
 {
     struct vcpu *curr = current;
-    unsigned int i, head;
+    unsigned int i;
     grant_handle_t handle;
     struct grant_mapping *new_mt = NULL;
 
@@ -686,7 +675,7 @@ get_maptrack_handle(
             maptrack_entry(lgt, handle).ref = MAPTRACK_TAIL;
             curr->maptrack_tail = handle;
             if ( curr->maptrack_head == MAPTRACK_TAIL )
-                write_atomic(&curr->maptrack_head, handle);
+                curr->maptrack_head = handle;
             spin_unlock(&curr->maptrack_freelist_lock);
         }
         return steal_maptrack_handle(lgt, curr);
@@ -716,13 +705,10 @@ get_maptrack_handle(
     lgt->maptrack_limit += MAPTRACK_PER_PAGE;
     spin_unlock(&lgt->maptrack_lock);
 
-    spin_lock(&curr->maptrack_freelist_lock);
-
-    do {
-        new_mt[i - 1].ref = read_atomic(&curr->maptrack_head);
-        head = cmpxchg(&curr->maptrack_head, new_mt[i - 1].ref, handle + 1);
-    } while ( head != new_mt[i - 1].ref );
+    spin_lock(&curr->maptrack_freelist_lock);
+    new_mt[i - 1].ref = curr->maptrack_head;
+    curr->maptrack_head = handle + 1;
 
     spin_unlock(&curr->maptrack_freelist_lock);
 
     return handle;
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 3982167144c6..bd1cb08266d8 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -255,7 +255,10 @@ struct vcpu
     /* VCPU paused by system controller. */
     int              controller_pause_count;
 
-    /* Grant table map tracking. */
+    /*
+     * Grant table map tracking. The lock maptrack_freelist_lock protect
+     * the access to maptrack_head and maptrack_tail.
+     */
     spinlock_t       maptrack_freelist_lock;
     unsigned int     maptrack_head;
     unsigned int     maptrack_tail;
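[Following the closing exchange above, a fuller struct vcpu comment,
making explicit that the lock covers the whole free list and thereby
justifies dropping the atomic accessors, might read as follows. This
is only a sketch of the direction discussed in the thread, not the
wording that was eventually committed.]

/*
 * Grant table map tracking. maptrack_freelist_lock protects the
 * per-vCPU maptrack free list in its entirety: maptrack_head,
 * maptrack_tail, and the .ref fields chaining the free entries
 * together. All of them may therefore be accessed with plain
 * (non-atomic) reads and writes while the lock is held.
 */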