
x86 spinlock: Fix memory corruption on completing completions

Message ID 54D4DBA1.1030905@oracle.com (mailing list archive)
State New, archived

Commit Message

Sasha Levin Feb. 6, 2015, 3:20 p.m. UTC
On 02/06/2015 09:49 AM, Raghavendra K T wrote:
>  static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
>  {
>  	if (TICKET_SLOWPATH_FLAG &&
> -	    static_key_false(&paravirt_ticketlocks_enabled)) {
> -		arch_spinlock_t prev;
> +		static_key_false(&paravirt_ticketlocks_enabled)) {
> +		__ticket_t prev_head;
>  
> -		prev = *lock;
> +		prev_head = lock->tickets.head;
>  		add_smp(&lock->tickets.head, TICKET_LOCK_INC);
>  
>  		/* add_smp() is a full mb() */
>  
> -		if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
> -			__ticket_unlock_slowpath(lock, prev);
> +		if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG)) {
> +			BUILD_BUG_ON(((__ticket_t)NR_CPUS) != NR_CPUS);
> +			__ticket_unlock_kick(lock, prev_head);

Can we modify it slightly to avoid potentially accessing invalid memory:

diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
index 5315887..cd22d73 100644
--- a/arch/x86/include/asm/spinlock.h
+++ b/arch/x86/include/asm/spinlock.h
@@ -144,13 +144,13 @@ static __always_inline void arch_spin_unlock(arch_spinlock_t *lock
        if (TICKET_SLOWPATH_FLAG &&
                static_key_false(&paravirt_ticketlocks_enabled)) {
                __ticket_t prev_head;
-
+               bool needs_kick = lock->tickets.tail & TICKET_SLOWPATH_FLAG;
                prev_head = lock->tickets.head;
                add_smp(&lock->tickets.head, TICKET_LOCK_INC);

                /* add_smp() is a full mb() */

-               if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG)) {
+               if (unlikely(needs_kick)) {
                        BUILD_BUG_ON(((__ticket_t)NR_CPUS) != NR_CPUS);
                        __ticket_unlock_kick(lock, prev_head);
                }

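To spell out the hazard being avoided (an illustrative interleaving; the
kfree() and the CPU numbering are hypothetical, not taken from real code):

	CPU0 (arch_spin_unlock)			CPU1

	add_smp(&lock->tickets.head, ...);
						/* lock is now free */
						spin_lock(lock);
						spin_unlock(lock);
						kfree(obj); /* obj holds lock */
	if (lock->tickets.tail &
	    TICKET_SLOWPATH_FLAG)		/* reads freed memory */

With the tail read hoisted above add_smp(), we never touch the lock after
the store that releases it.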
Thanks,
Sasha

Comments

Linus Torvalds Feb. 6, 2015, 4:15 p.m. UTC | #1
On Fri, Feb 6, 2015 at 7:20 AM, Sasha Levin <sasha.levin@oracle.com> wrote:
>
> Can we modify it slightly to avoid potentially accessing invalid memory:

So I think there's a race with that.

And I'll warn you: the kernel does do speculative reads of memory that
might be invalid, not just in places like this. See the comment in
get_user_huge_page() for example, where we knowingly do speculative
reads, but hide it if DEBUG_PAGEALLOC is set.
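The shape of that pattern, roughly (a sketch with made-up names -- the
wrapper, careful_copy() and fast_copy_words() are illustrative, this is
not the actual get_user_huge_page() code):

static void copy_page_speculative(void *dst, const void *src, size_t len)
{
	if (IS_ENABLED(CONFIG_DEBUG_PAGEALLOC)) {
		/*
		 * DEBUG_PAGEALLOC unmaps freed pages, so an over-read
		 * would be a hard fault: take the exact, byte-accurate
		 * path instead.
		 */
		careful_copy(dst, src, len);
	} else {
		/*
		 * Fast path: word-sized reads that may run a few bytes
		 * past the object.  Harmless as long as the reads stay
		 * within a mapped page.
		 */
		fast_copy_words(dst, src, len);
	}
}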

More commonly, CONFIG_DCACHE_WORD_ACCESS is very much about doing
speculative reads. Now, that access is hidden inside an asm, so KASan
won't see it, but there might well be others.
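For instance, the classic word-at-a-time zero-byte scan looks roughly
like this (a self-contained sketch of the technique, not the kernel's
actual fs/namei.c code; it assumes a little-endian machine and a
word-aligned string):

#include <stddef.h>

#define ONES	0x0101010101010101UL
#define HIGHS	0x8080808080808080UL

/* Nonzero iff some byte of x is zero. */
static inline unsigned long has_zero(unsigned long x)
{
	return (x - ONES) & ~x & HIGHS;
}

static size_t wordwise_strlen(const char *s)
{
	const unsigned long *p = (const unsigned long *)s;
	size_t len = 0;
	unsigned long v;

	/*
	 * Each pass reads a whole word, so the final read can extend up
	 * to 7 bytes past the terminating NUL -- possibly into memory
	 * that was never allocated.  That is exactly the speculative
	 * read KASan would flag if the access were visible to it.
	 */
	while (!has_zero(v = *p++))
		len += sizeof(v);

	/* Count the valid bytes in the final word (little-endian). */
	while (v & 0xff) {
		v >>= 8;
		len++;
	}
	return len;
}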

You probably don't see them very much just because they are so rarely
a problem, and most of the time the access is not to another process's
stack but to allocated structures, where freeing takes long enough to
basically hide any small race.

In other words: I suspect it would be good to instead just teach KASan
about "this is a speculative read" and suppress the warning for those
accesses.

                             Linus
Andrey Ryabinin Feb. 6, 2015, 5:03 p.m. UTC | #2
On 02/06/2015 07:15 PM, Linus Torvalds wrote:
> On Fri, Feb 6, 2015 at 7:20 AM, Sasha Levin <sasha.levin@oracle.com> wrote:
>>
>> Can we modify it slightly to avoid potentially accessing invalid memory:
> 
> So I think there's a race with that.
> 
> And I'll warn you: the kernel does do speculative reads of memory that
> might be invalid, not just in places like this. See the comment in
> get_user_huge_page() for example, where we knowingly do speculative
> reads, but hide it if DEBUG_PAGEALLOC is set.
> 
> More commonly, CONFIG_DCACHE_WORD_ACCESS is very much about doing
> speculative reads. Now, that access is hidden inside an asm, so KASan
> won't see it, but there might well be others.
> 
> You probably don't see them very much just because they are so rarely
> a problem, and most of the time the access is not to another process's
> stack but to allocated structures, where freeing takes long enough to
> basically hide any small race.
> 
> In other words: I suspect it would be good to instead just teach KASan
> about "this is a speculative read" and suppress the warning for those
> accesses.
> 

We can suppress warnings by wrapping such speculative reads with
kasan_disable_current()/kasan_enable_current() calls.
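
E.g. something like this (a sketch: the helper and its name are made up
for illustration, only the kasan_disable_current()/kasan_enable_current()
pair is the real annotation):

static __always_inline __ticket_t read_tail_speculative(arch_spinlock_t *lock)
{
	__ticket_t tail;

	kasan_disable_current();
	/* Deliberately racy: the lock may already have been freed. */
	tail = ACCESS_ONCE(lock->tickets.tail);
	kasan_enable_current();

	return tail;
}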

Oleg Nesterov Feb. 8, 2015, 5:14 p.m. UTC | #3
On 02/06, Sasha Levin wrote:
>
> Can we modify it slightly to avoid potentially accessing invalid memory:
> 
> diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
> index 5315887..cd22d73 100644
> --- a/arch/x86/include/asm/spinlock.h
> +++ b/arch/x86/include/asm/spinlock.h
> @@ -144,13 +144,13 @@ static __always_inline void arch_spin_unlock(arch_spinlock_t *lock
>         if (TICKET_SLOWPATH_FLAG &&
>                 static_key_false(&paravirt_ticketlocks_enabled)) {
>                 __ticket_t prev_head;
> -
> +               bool needs_kick = lock->tickets.tail & TICKET_SLOWPATH_FLAG;
>                 prev_head = lock->tickets.head;
>                 add_smp(&lock->tickets.head, TICKET_LOCK_INC);
> 
>                 /* add_smp() is a full mb() */
> 
> -               if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG)) {
> +               if (unlikely(needs_kick)) {

This doesn't look right either...

We need to guarantee that either unlock() sees TICKET_SLOWPATH_FLAG, or
the caller of __ticket_enter_slowpath() sees the result of add_smp().

Suppose that kvm_lock_spinning() is called right before add_smp() and it
sets SLOWPATH. It will then block because .head != want, and it needs
__ticket_unlock_kick().
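
Concretely (an illustrative interleaving, not actual code):

	CPU0 (unlock, tail read hoisted)	CPU1 (kvm_lock_spinning)

	needs_kick = tail & SLOWPATH; /* 0 */
						sets TICKET_SLOWPATH_FLAG
						sees lock->tickets.head != want
	add_smp(&lock->tickets.head, ...);
	if (unlikely(needs_kick)) /* false */
						blocks, waiting for a kick
						that never comes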

Oleg.


Patch

diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
index 5315887..cd22d73 100644
--- a/arch/x86/include/asm/spinlock.h
+++ b/arch/x86/include/asm/spinlock.h
@@ -144,13 +144,13 @@  static __always_inline void arch_spin_unlock(arch_spinlock_t *lock
        if (TICKET_SLOWPATH_FLAG &&
                static_key_false(&paravirt_ticketlocks_enabled)) {
                __ticket_t prev_head;
-
+               bool needs_kick = lock->tickets.tail & TICKET_SLOWPATH_FLAG;
                prev_head = lock->tickets.head;
                add_smp(&lock->tickets.head, TICKET_LOCK_INC);

                /* add_smp() is a full mb() */

-               if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG)) {
+               if (unlikely(needs_kick)) {
                        BUILD_BUG_ON(((__ticket_t)NR_CPUS) != NR_CPUS);
                        __ticket_unlock_kick(lock, prev_head);
                }