Message ID | 20201102131239.14134-3-jgross@suse.com
---|---
State | Superseded
Series | xen/locking: fix and enhance lock debugging
On 02.11.2020 14:12, Juergen Gross wrote:
> --- a/xen/include/xen/rwlock.h
> +++ b/xen/include/xen/rwlock.h
> @@ -56,6 +56,7 @@ static inline int _read_trylock(rwlock_t *lock)
>      u32 cnts;
>
>      preempt_disable();
> +    check_lock(&lock->lock.debug, true);
>      cnts = atomic_read(&lock->cnts);
>      if ( likely(_can_read_lock(cnts)) )
>      {

I'm sorry for being picky, but this still isn't matching
_spin_trylock(). Perhaps the difference is really benign, but
there the check sits ahead of preempt_disable(). (It has a
clear reason to be this way there, but without a clear reason
for things to be differently here, I think matching ordering
may help, at least to avoid questions.)

> @@ -66,6 +67,7 @@ static inline int _read_trylock(rwlock_t *lock)
>           */
>          if ( likely(_can_read_lock(cnts)) )
>              return 1;
> +
>          atomic_sub(_QR_BIAS, &lock->cnts);
>      }
>      preempt_enable();

Stray change?

> @@ -87,7 +89,10 @@ static inline void _read_lock(rwlock_t *lock)
>       * arch_lock_acquire_barrier().
>       */
>      if ( likely(_can_read_lock(cnts)) )
> +    {
> +        check_lock(&lock->lock.debug, false);
>          return;
> +    }
>
>      /* The slowpath will decrement the reader count, if necessary. */
>      queue_read_lock_slowpath(lock);
> @@ -162,7 +167,10 @@ static inline void _write_lock(rwlock_t *lock)
>       * arch_lock_acquire_barrier().
>       */
>      if ( atomic_cmpxchg(&lock->cnts, 0, _write_lock_val()) == 0 )
> +    {
> +        check_lock(&lock->lock.debug, false);
>          return;
> +    }
>
>      queue_write_lock_slowpath(lock);
>      /*

Maybe also for these two, but likely more importantly for ...

> @@ -328,6 +337,8 @@ static inline void _percpu_read_lock(percpu_rwlock_t **per_cpudata,
>          /* Drop the read lock because we don't need it anymore. */
>          read_unlock(&percpu_rwlock->rwlock);
>      }
> +    else
> +        check_lock(&percpu_rwlock->rwlock.lock.debug, false);
>  }

... this one a brief comment may be warranted to clarify why
the call sits here rather than at the top?

Jan
On 03.11.20 10:02, Jan Beulich wrote:
> On 02.11.2020 14:12, Juergen Gross wrote:
>> --- a/xen/include/xen/rwlock.h
>> +++ b/xen/include/xen/rwlock.h
>> @@ -56,6 +56,7 @@ static inline int _read_trylock(rwlock_t *lock)
>>      u32 cnts;
>>
>>      preempt_disable();
>> +    check_lock(&lock->lock.debug, true);
>>      cnts = atomic_read(&lock->cnts);
>>      if ( likely(_can_read_lock(cnts)) )
>>      {
>
> I'm sorry for being picky, but this still isn't matching
> _spin_trylock(). Perhaps the difference is really benign, but
> there the check sits ahead of preempt_disable(). (It has a
> clear reason to be this way there, but without a clear reason
> for things to be differently here, I think matching ordering
> may help, at least to avoid questions.)

I think this is more an optimization opportunity: I'd rather move the
preempt_disable() into the first if clause, as there is no need to
disable preemption in case the first read of the lock already leads
to failure acquiring the lock.

If you want I can prepend a patch doing that optimization.

>
>> @@ -66,6 +67,7 @@ static inline int _read_trylock(rwlock_t *lock)
>>           */
>>          if ( likely(_can_read_lock(cnts)) )
>>              return 1;
>> +
>>          atomic_sub(_QR_BIAS, &lock->cnts);
>>      }
>>      preempt_enable();
>
> Stray change?

Oh yes, a leftover from the old positioning of check_lock().

>
>> @@ -87,7 +89,10 @@ static inline void _read_lock(rwlock_t *lock)
>>       * arch_lock_acquire_barrier().
>>       */
>>      if ( likely(_can_read_lock(cnts)) )
>> +    {
>> +        check_lock(&lock->lock.debug, false);
>>          return;
>> +    }
>>
>>      /* The slowpath will decrement the reader count, if necessary. */
>>      queue_read_lock_slowpath(lock);
>> @@ -162,7 +167,10 @@ static inline void _write_lock(rwlock_t *lock)
>>       * arch_lock_acquire_barrier().
>>       */
>>      if ( atomic_cmpxchg(&lock->cnts, 0, _write_lock_val()) == 0 )
>> +    {
>> +        check_lock(&lock->lock.debug, false);
>>          return;
>> +    }
>>
>>      queue_write_lock_slowpath(lock);
>>      /*
>
> Maybe also for these two, but likely more importantly for ...
>
>> @@ -328,6 +337,8 @@ static inline void _percpu_read_lock(percpu_rwlock_t **per_cpudata,
>>          /* Drop the read lock because we don't need it anymore. */
>>          read_unlock(&percpu_rwlock->rwlock);
>>      }
>> +    else
>> +        check_lock(&percpu_rwlock->rwlock.lock.debug, false);
>>  }
>
> ... this one a brief comment may be warranted to clarify why
> the call sits here rather than at the top?

Okay.


Juergen
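For illustration only, the reordering Jürgen describes might look roughly like the following sketch of _read_trylock(), based on the function as quoted above. This is a hypothetical sketch, not code from the posted series:

```c
/*
 * Hypothetical sketch of the suggested optimization: do the debug check
 * and the first read of the lock word before disabling preemption, so a
 * lock that is already busy never toggles preemption at all.
 */
static inline int _read_trylock(rwlock_t *lock)
{
    u32 cnts;

    check_lock(&lock->lock.debug, true);
    cnts = atomic_read(&lock->cnts);
    if ( likely(_can_read_lock(cnts)) )
    {
        preempt_disable();
        cnts = (u32)atomic_add_return(_QR_BIAS, &lock->cnts);
        if ( likely(_can_read_lock(cnts)) )
            return 1;      /* Lock taken; preemption stays disabled. */
        atomic_sub(_QR_BIAS, &lock->cnts);
        preempt_enable();
    }
    return 0;
}
```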
On 03.11.2020 10:22, Jürgen Groß wrote:
> On 03.11.20 10:02, Jan Beulich wrote:
>> On 02.11.2020 14:12, Juergen Gross wrote:
>>> --- a/xen/include/xen/rwlock.h
>>> +++ b/xen/include/xen/rwlock.h
>>> @@ -56,6 +56,7 @@ static inline int _read_trylock(rwlock_t *lock)
>>>      u32 cnts;
>>>
>>>      preempt_disable();
>>> +    check_lock(&lock->lock.debug, true);
>>>      cnts = atomic_read(&lock->cnts);
>>>      if ( likely(_can_read_lock(cnts)) )
>>>      {
>>
>> I'm sorry for being picky, but this still isn't matching
>> _spin_trylock(). Perhaps the difference is really benign, but
>> there the check sits ahead of preempt_disable(). (It has a
>> clear reason to be this way there, but without a clear reason
>> for things to be differently here, I think matching ordering
>> may help, at least to avoid questions.)
>
> I think this is more an optimization opportunity: I'd rather move the
> preempt_disable() into the first if clause, as there is no need to
> disable preemption in case the first read of the lock already leads
> to failure acquiring the lock.
>
> If you want I can prepend a patch doing that optimization.

I'd appreciate you doing so, yet I'd like to point out that
then the same question remains for _write_trylock(). Perhaps
a similar transformation is possible there, but it'll at
least be more code churn. Which of course isn't a reason not
to go this route there too.

This said - wouldn't what you suggest be wrong if we had
actual preemption in the hypervisor? Preemption hitting
between e.g. these two lines

    cnts = atomic_read(&lock->cnts);
    if ( likely(_can_read_lock(cnts)) )

would not yield the intended result, would it? (It wouldn't
affect correctness afaics, because the caller has to be
prepared anyway that the attempt fails, but the amount of
effectively false negatives would grow, as would the number
of cases where the slower path is taken for no reason.)

Question therefore is how much we care about keeping code
ready for "real" preemption, when we have ample other places
that would need changing first, before such could be enabled.

Jan
On 03.11.20 11:04, Jan Beulich wrote:
> On 03.11.2020 10:22, Jürgen Groß wrote:
>> On 03.11.20 10:02, Jan Beulich wrote:
>>> On 02.11.2020 14:12, Juergen Gross wrote:
>>>> --- a/xen/include/xen/rwlock.h
>>>> +++ b/xen/include/xen/rwlock.h
>>>> @@ -56,6 +56,7 @@ static inline int _read_trylock(rwlock_t *lock)
>>>>      u32 cnts;
>>>>
>>>>      preempt_disable();
>>>> +    check_lock(&lock->lock.debug, true);
>>>>      cnts = atomic_read(&lock->cnts);
>>>>      if ( likely(_can_read_lock(cnts)) )
>>>>      {
>>>
>>> I'm sorry for being picky, but this still isn't matching
>>> _spin_trylock(). Perhaps the difference is really benign, but
>>> there the check sits ahead of preempt_disable(). (It has a
>>> clear reason to be this way there, but without a clear reason
>>> for things to be differently here, I think matching ordering
>>> may help, at least to avoid questions.)
>>
>> I think this is more an optimization opportunity: I'd rather move the
>> preempt_disable() into the first if clause, as there is no need to
>> disable preemption in case the first read of the lock already leads
>> to failure acquiring the lock.
>>
>> If you want I can prepend a patch doing that optimization.
>
> I'd appreciate you doing so, yet I'd like to point out that
> then the same question remains for _write_trylock(). Perhaps
> a similar transformation is possible there, but it'll at
> least be more code churn. Which of course isn't a reason not
> to go this route there too.

Shouldn't be very hard. It would just need to split the if clause
into two clauses.

> This said - wouldn't what you suggest be wrong if we had
> actual preemption in the hypervisor? Preemption hitting
> between e.g. these two lines
>
>     cnts = atomic_read(&lock->cnts);
>     if ( likely(_can_read_lock(cnts)) )
>
> would not yield the intended result, would it? (It wouldn't
> affect correctness afaics, because the caller has to be
> prepared anyway that the attempt fails, but the amount of
> effectively false negatives would grow, as would the number
> of cases where the slower path is taken for no reason.)

And this in turn would hit _spin_trylock() the same way.

IMO we should harmonize all the trylock variants in this regard:
either they have an as small as possible preemption-disabled
section, or they have the initial test for success being possible
at all in this section.

> Question therefore is how much we care about keeping code
> ready for "real" preemption, when we have ample other places
> that would need changing first, before such could be enabled.

Yes. And depending on the answer the route to go (wide or narrow
no-preemption section) either the rwlock or the spinlock trylock
variants should be adapted.


Juergen
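Similarly, the _write_trylock() split Jürgen mentions might look roughly like the sketch below. Again this is hypothetical and not code from the posted series:

```c
/*
 * Hypothetical sketch: split the combined condition so the cheap
 * "lock already held" test runs before preemption is disabled.
 */
static inline int _write_trylock(rwlock_t *lock)
{
    u32 cnts;

    check_lock(&lock->lock.debug, true);
    cnts = atomic_read(&lock->cnts);
    if ( unlikely(cnts) )
        return 0;          /* Busy: no preemption toggling needed. */

    preempt_disable();
    if ( unlikely(atomic_cmpxchg(&lock->cnts, 0, _write_lock_val()) != 0) )
    {
        preempt_enable();
        return 0;
    }

    return 1;              /* Lock taken; preemption stays disabled. */
}
```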
Hi Jan,

On 03/11/2020 10:04, Jan Beulich wrote:
> On 03.11.2020 10:22, Jürgen Groß wrote:
>> On 03.11.20 10:02, Jan Beulich wrote:
>>> On 02.11.2020 14:12, Juergen Gross wrote:
> Question therefore is how much we care about keeping code
> ready for "real" preemption, when we have ample other places
> that would need changing first, before such could be enabled

The question we should ask ourselves is whether we think one would
want to use preemption in Xen.

Some of the emulation in Xen on Arm (e.g. ITS, SMMUv3, set/way) would
have been easier to implement if the code were preemptible. I also
hear from time to time stakeholders asking for preemptible spin locks
(this is useful for RT).

Therefore, I think there is value in keeping the code as preempt-ready
as possible.

Cheers,
On 03.11.2020 11:17, Jürgen Groß wrote:
> On 03.11.20 11:04, Jan Beulich wrote:
>> This said - wouldn't what you suggest be wrong if we had
>> actual preemption in the hypervisor? Preemption hitting
>> between e.g. these two lines
>>
>>     cnts = atomic_read(&lock->cnts);
>>     if ( likely(_can_read_lock(cnts)) )
>>
>> would not yield the intended result, would it? (It wouldn't
>> affect correctness afaics, because the caller has to be
>> prepared anyway that the attempt fails, but the amount of
>> effectively false negatives would grow, as would the number
>> of cases where the slower path is taken for no reason.)
> And this in turn would hit _spin_trylock() the same way.

True.

> IMO we should harmonize all the trylock variants in this regard:
> either they have an as small as possible preemption-disabled
> section, or they have the initial test for success being possible
> at all in this section.
>
>> Question therefore is how much we care about keeping code
>> ready for "real" preemption, when we have ample other places
>> that would need changing first, before such could be enabled.
>
> Yes. And depending on the answer the route to go (wide or narrow
> no-preemption section) either the rwlock or the spinlock trylock
> variants should be adapted.

Well, personally I'd slightly prefer the adjustment as you suggested,
but Julien's subsequent reply points towards the other direction.

Jan
```diff
diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index b4aaf6bce6..405322c6b8 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -13,7 +13,7 @@
 
 static atomic_t spin_debug __read_mostly = ATOMIC_INIT(0);
 
-static void check_lock(union lock_debug *debug, bool try)
+void check_lock(union lock_debug *debug, bool try)
 {
     bool irq_safe = !local_irq_is_enabled();
 
@@ -108,7 +108,6 @@ void spin_debug_disable(void)
 
 #else /* CONFIG_DEBUG_LOCKS */
 
-#define check_lock(l, t) ((void)0)
 #define check_barrier(l) ((void)0)
 #define got_lock(l) ((void)0)
 #define rel_lock(l) ((void)0)
diff --git a/xen/include/xen/rwlock.h b/xen/include/xen/rwlock.h
index 427664037a..94496a0f53 100644
--- a/xen/include/xen/rwlock.h
+++ b/xen/include/xen/rwlock.h
@@ -56,6 +56,7 @@ static inline int _read_trylock(rwlock_t *lock)
     u32 cnts;
 
     preempt_disable();
+    check_lock(&lock->lock.debug, true);
     cnts = atomic_read(&lock->cnts);
     if ( likely(_can_read_lock(cnts)) )
     {
@@ -66,6 +67,7 @@ static inline int _read_trylock(rwlock_t *lock)
          */
         if ( likely(_can_read_lock(cnts)) )
             return 1;
+
         atomic_sub(_QR_BIAS, &lock->cnts);
     }
     preempt_enable();
@@ -87,7 +89,10 @@ static inline void _read_lock(rwlock_t *lock)
      * arch_lock_acquire_barrier().
      */
     if ( likely(_can_read_lock(cnts)) )
+    {
+        check_lock(&lock->lock.debug, false);
         return;
+    }
 
     /* The slowpath will decrement the reader count, if necessary. */
     queue_read_lock_slowpath(lock);
@@ -162,7 +167,10 @@ static inline void _write_lock(rwlock_t *lock)
      * arch_lock_acquire_barrier().
      */
     if ( atomic_cmpxchg(&lock->cnts, 0, _write_lock_val()) == 0 )
+    {
+        check_lock(&lock->lock.debug, false);
         return;
+    }
 
     queue_write_lock_slowpath(lock);
     /*
@@ -197,6 +205,7 @@ static inline int _write_trylock(rwlock_t *lock)
     u32 cnts;
 
     preempt_disable();
+    check_lock(&lock->lock.debug, true);
     cnts = atomic_read(&lock->cnts);
     if ( unlikely(cnts) ||
          unlikely(atomic_cmpxchg(&lock->cnts, 0, _write_lock_val()) != 0) )
@@ -328,6 +337,8 @@ static inline void _percpu_read_lock(percpu_rwlock_t **per_cpudata,
         /* Drop the read lock because we don't need it anymore. */
         read_unlock(&percpu_rwlock->rwlock);
     }
+    else
+        check_lock(&percpu_rwlock->rwlock.lock.debug, false);
 }
 
 static inline void _percpu_read_unlock(percpu_rwlock_t **per_cpudata,
diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h
index ca13b600a0..9fa4e600c1 100644
--- a/xen/include/xen/spinlock.h
+++ b/xen/include/xen/spinlock.h
@@ -21,11 +21,13 @@ union lock_debug { };
     };
 };
 #define _LOCK_DEBUG { LOCK_DEBUG_INITVAL }
+void check_lock(union lock_debug *debug, bool try);
 void spin_debug_enable(void);
 void spin_debug_disable(void);
 #else
 union lock_debug { };
 #define _LOCK_DEBUG { }
+#define check_lock(l, t) ((void)0)
 #define spin_debug_enable() ((void)0)
 #define spin_debug_disable() ((void)0)
 #endif
```
Checking that a lock is used consistently with regard to interrupts
being on or off is beneficial for rwlocks, too. So add check_lock()
calls to the rwlock functions. For this purpose make check_lock()
globally accessible.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- call check_lock() unconditionally in try_lock variants (Jan Beulich)
---
 xen/common/spinlock.c      |  3 +--
 xen/include/xen/rwlock.h   | 11 +++++++++++
 xen/include/xen/spinlock.h |  2 ++
 3 files changed, 14 insertions(+), 2 deletions(-)
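To make the commit message concrete: with CONFIG_DEBUG_LOCKS, check_lock() flags a lock that is taken both with interrupts enabled and with interrupts disabled. Below is a hypothetical illustration of the kind of inconsistency the rwlock paths would now report; example_lock and both functions are invented for this sketch and are not part of the patch:

```c
static DEFINE_RWLOCK(example_lock);    /* hypothetical lock, for illustration */

static void normal_context(void)
{
    read_lock(&example_lock);          /* IRQs on: recorded as not IRQ-safe */
    /* ... */
    read_unlock(&example_lock);
}

static void irqs_off_context(void)
{
    unsigned long flags;

    local_irq_save(flags);
    read_lock(&example_lock);          /* IRQs off: inconsistent use, now reported */
    /* ... */
    read_unlock(&example_lock);
    local_irq_restore(flags);
}
```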