[V5] x86 spinlock: Fix memory corruption on completing completions

The paravirt spinlock clears the slowpath flag after doing the unlock.
As explained by Linus, it currently does:
                prev = *lock;
                add_smp(&lock->tickets.head, TICKET_LOCK_INC);

                /* add_smp() is a full mb() */

                if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
                        __ticket_unlock_slowpath(lock, prev);

which is *exactly* the kind of things you cannot do with spinlocks,
because after you've done the "add_smp()" and released the spinlock
for the fast-path, you can't access the spinlock any more.  Exactly
because a fast-path lock might come in, and release the whole data
structure.

Linus suggested that we should not do any writes to the lock after unlock(),
and that we can move the slowpath clearing to the fastpath lock.
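
As a rough illustration of clearing the flag on the lock side, here is a
userspace sketch using C11 atomics. It is not the kernel code: the packed
32-bit head/tail layout, the helper names (sketch_pack,
sketch_check_and_clear_slowpath) and the values TICKET_LOCK_INC = 2 and
TICKET_SLOWPATH_FLAG = 1 are assumptions made for the example. The new lock
holder tries a single compare-and-exchange that only succeeds when nobody
else is queued, so a stale flag is dropped without any write after unlock.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed values for the sketch: tickets advance by 2, bit 0 is the flag. */
#define TICKET_LOCK_INC      2u
#define TICKET_SLOWPATH_FLAG 1u

/* Hypothetical layout: head in the low 16 bits, tail in the high 16 bits. */
static uint32_t sketch_pack(uint16_t head, uint16_t tail)
{
    return (uint32_t)head | ((uint32_t)tail << 16);
}

/*
 * Called by the CPU that has just taken the lock, with the head value it
 * observed. If the slowpath flag is set and there are no other waiters
 * (tail == head + TICKET_LOCK_INC), clear the flag with one cmpxchg.
 * If a waiter has queued in the meantime, the cmpxchg fails and the flag
 * stays set, which is the safe outcome.
 */
static void sketch_check_and_clear_slowpath(_Atomic uint32_t *head_tail,
                                            uint16_t head)
{
    if (head & TICKET_SLOWPATH_FLAG) {
        uint16_t clean_head = (uint16_t)(head & ~TICKET_SLOWPATH_FLAG);
        uint16_t tail = (uint16_t)(clean_head + TICKET_LOCK_INC);
        uint32_t expected = sketch_pack(head, tail);
        uint32_t desired = sketch_pack(clean_head, tail);

        atomic_compare_exchange_strong(head_tail, &expected, desired);
    }
}
```

The compare-and-exchange covers head and tail together so that the flag is
never cleared while another CPU is still waiting for a kick.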

So this patch implements the fix with:
1. Moving slowpath flag to head (Oleg):
Unlocked locks don't care about the slowpath flag; therefore we can keep
it set after the last unlock, and clear it again on the first (try)lock.
This removes the write after unlock. Note that keeping the slowpath flag set
can result in unnecessary kicks.
By moving the slowpath flag from the tail to the head ticket we also avoid
the need to access both the head and tail tickets on unlock.

2. Use xadd to avoid the read/write after unlock that checks the need for
unlock_kick (Linus):
We further avoid the need for a read-after-release by using xadd;
the prev head value will include the slowpath flag and indicate if we
need to do PV kicking of suspended spinners -- on modern chips xadd
isn't (much) more expensive than an add + load.
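
The two points above can be sketched in userspace C with C11 atomics, where
atomic_fetch_add stands in for xadd. This is an illustration under stated
assumptions, not the patch itself: struct sketch_lock, sketch_unlock and
sketch_unlock_kick are hypothetical names, the kick is a stub that only
records its argument, and TICKET_LOCK_INC = 2 with TICKET_SLOWPATH_FLAG = 1
are assumed values.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for the sketch: tickets advance by 2, bit 0 is the flag. */
#define TICKET_LOCK_INC      2u
#define TICKET_SLOWPATH_FLAG 1u

typedef uint16_t ticket_t;

struct sketch_lock {
    _Atomic ticket_t head;   /* carries the slowpath flag in bit 0 */
    _Atomic ticket_t tail;
};

/* Compare two tickets while ignoring the slowpath flag. */
static bool tickets_equal(ticket_t one, ticket_t two)
{
    return !((one ^ two) & (ticket_t)~TICKET_SLOWPATH_FLAG);
}

/* Stub for the PV kick: records which ticket would be woken. */
static int kicks;
static ticket_t last_kicked;

static void sketch_unlock_kick(struct sketch_lock *lock, ticket_t next)
{
    (void)lock;              /* used as an identifier only, never dereferenced */
    kicks++;
    last_kicked = next;
}

static void sketch_unlock(struct sketch_lock *lock)
{
    /* One atomic op releases the lock and returns the previous head. */
    ticket_t head = atomic_fetch_add(&lock->head, TICKET_LOCK_INC);

    /* From here on only the local copy is inspected, not *lock. */
    if (head & TICKET_SLOWPATH_FLAG) {
        head &= (ticket_t)~TICKET_SLOWPATH_FLAG;
        sketch_unlock_kick(lock, (ticket_t)(head + TICKET_LOCK_INC));
    }
}
```

Because the decision to kick is made from the value returned by the atomic
add, the unlock path neither reads nor writes the lock word after the
release, and the flag simply stays set in head until a later lock clears it.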

Result:
 setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest)
 benchmark  overcommit  %improve
 kernbench  1x           -0.13
 kernbench  2x            0.02
 dbench     1x           -1.77
 dbench     2x           -0.63

[Jeremy: hinted missing TICKET_LOCK_INC for kick]
[Oleg: Moving slowpath flag to head, ticket_equals idea]
[PeterZ: Detailed changelog]

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
---
 arch/x86/include/asm/spinlock.h | 95 ++++++++++++++++++++---------------------
 arch/x86/kernel/kvm.c           | 10 +++--
 arch/x86/xen/spinlock.c         | 10 +++--
 3 files changed, 59 insertions(+), 56 deletions(-)

potential TODO:
 * Should the whole patch be split into two: 1. move slowpath flag,
     2. fix memory corruption in completion problem ??

Changes since V4:
  - one more usage of tickets_equal() (Oleg)
  - head > tail situation can lead to false contended check (Oleg)

Changes since V3:
  - Detailed changelog (PeterZ)
  - Replace ACCESS_ONCE with READ_ONCE (oleg)
  - Add xen changes (Oleg)
  - Correct break logic in unlock_wait() (Oleg)

Changes since V2:
  - Move the slowpath flag to head, this enables xadd usage in unlock code
    and in turn we can get rid of read/write after unlock (Oleg)
  - usage of ticket_equals (Oleg)

Changes since V1:
  - Add missing TICKET_LOCK_INC before unlock kick (fixes hang in overcommit: Jeremy).
  - Remove SLOWPATH_FLAG clearing in fast lock. (Jeremy)
  - clear SLOWPATH_FLAG in arch_spin_value_unlocked during comparison.

 Result:
 setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest)
base = 3_19_rc7

3_19_rc7_spinfix_v3
+-----------+-----------+-----------+------------+-----------+
     kernbench (Time taken in sec lower is better)
+-----------+-----------+-----------+------------+-----------+
     base       %stdev    patched      %stdev      %improve
+-----------+-----------+-----------+------------+-----------+
1x   54.2300     3.0652     54.3008     4.0366    -0.13056
2x   90.1883     5.5509     90.1650     6.4336     0.02583
+-----------+-----------+-----------+------------+-----------+
+-----------+-----------+-----------+------------+-----------+
    dbench (Throughput higher is better)
+-----------+-----------+-----------+------------+-----------+
     base       %stdev    patched      %stdev      %improve
+-----------+-----------+-----------+------------+-----------+
1x 7029.9188     2.5952   6905.0712     4.4737    -1.77595
2x 3254.2075    14.8291   3233.7137    26.8784    -0.62976
+-----------+-----------+-----------+------------+-----------+
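
For reference, the %improve column is consistent with a relative change
against base, with the sign chosen so that positive always means "patched is
better". The formulas below are inferred from the numbers in the tables, not
stated in the posting, and the function names are hypothetical.

```c
/* kernbench: time in seconds, lower is better. */
static double improve_lower_is_better(double base, double patched)
{
    return (base - patched) / base * 100.0;
}

/* dbench: throughput, higher is better. */
static double improve_higher_is_better(double base, double patched)
{
    return (patched - base) / base * 100.0;
}
```

For example, improve_lower_is_better(54.2300, 54.3008) matches the kernbench
1x row and improve_higher_is_better(7029.9188, 6905.0712) matches the dbench
1x row, to the precision shown in the tables.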

 (These are the results I got with the patch applied. I believe there may
 be some small overhead from xadd etc., but overall it looks fine. A more
 thorough test may be needed.)
