Message ID | 20230503090708.2524310-7-nmi@metaspace.dk |
---|---|
State | New, archived |
Series | Rust null block driver |
On Wed, 3 May 2023 11:07:03 +0200, Andreas Hindborg <a.hindborg@samsung.com> wrote:
> The kernel `struct spinlock` is 4 bytes on x86 when lockdep is not enabled. The
> structure is not padded to fit a cache line. The effect of this for `SpinLock`
> is that the lock variable and the value protected by the lock will share a cache
> line, depending on the alignment requirements of the protected value. Aligning
> the lock variable and the protected value to a cache line yields a 20%
> performance increase for the Rust null block driver for sequential reads to
> memory backed devices at 6 concurrent readers.
>
> Signed-off-by: Andreas Hindborg <a.hindborg@samsung.com>

This applies the cacheline padding to all spinlocks unconditionally.
It's not clear to me that we want to do that. Instead, I suggest using
`SpinLock<CachePadded<T>>` in the null block driver to opt in to the
cache padding there, and let other drivers choose whether or not they
want to cache pad their locks.

On Wed, 3 May 2023 11:07:03 +0200, Andreas Hindborg <a.hindborg@samsung.com> wrote:
> diff --git a/rust/kernel/cache_padded.rs b/rust/kernel/cache_padded.rs
> new file mode 100644
> index 000000000000..758678e71f50
> --- /dev/null
> +++ b/rust/kernel/cache_padded.rs
>
> +impl<T> CachePadded<T> {
> +    /// Pads and aligns a value to 64 bytes.
> +    #[inline(always)]
> +    pub(crate) const fn new(t: T) -> CachePadded<T> {
> +        CachePadded::<T> { value: t }
> +    }
> +}

Please make this `pub` instead of just `pub(crate)`. Other drivers might
want to use this directly.

On Wed, 3 May 2023 11:07:03 +0200, Andreas Hindborg <a.hindborg@samsung.com> wrote:
> diff --git a/rust/kernel/sync/lock/spinlock.rs b/rust/kernel/sync/lock/spinlock.rs
> index 979b56464a4e..e39142a8148c 100644
> --- a/rust/kernel/sync/lock/spinlock.rs
> +++ b/rust/kernel/sync/lock/spinlock.rs
> @@ -100,18 +103,20 @@ unsafe impl super::Backend for SpinLockBackend {
>      ) {
>          // SAFETY: The safety requirements ensure that `ptr` is valid for writes, and `name` and
>          // `key` are valid for read indefinitely.
> -        unsafe { bindings::__spin_lock_init(ptr, name, key) }
> +        unsafe { bindings::__spin_lock_init((&mut *ptr).deref_mut(), name, key) }
>      }
>
> +    #[inline(always)]
>      unsafe fn lock(ptr: *mut Self::State) -> Self::GuardState {
>          // SAFETY: The safety requirements of this function ensure that `ptr` points to valid
>          // memory, and that it has been initialised before.
> -        unsafe { bindings::spin_lock(ptr) }
> +        unsafe { bindings::spin_lock((&mut *ptr).deref_mut()) }
>      }
>
> +    #[inline(always)]
>      unsafe fn unlock(ptr: *mut Self::State, _guard_state: &Self::GuardState) {
>          // SAFETY: The safety requirements of this function ensure that `ptr` is valid and that the
>          // caller is the owner of the mutex.
> -        unsafe { bindings::spin_unlock(ptr) }
> +        unsafe { bindings::spin_unlock((&mut *ptr).deref_mut()) }
>      }
>  }

I would prefer to remain in pointer-land for the above operations. I
think that this leads to code that is more obviously correct.

For example:

```
impl<T> CachePadded<T> {
    pub const fn raw_get(ptr: *mut Self) -> *mut T {
        core::ptr::addr_of_mut!((*ptr).value)
    }
}

#[inline(always)]
unsafe fn unlock(ptr: *mut Self::State, _guard_state: &Self::GuardState) {
    unsafe { bindings::spin_unlock(CachePadded::raw_get(ptr)) }
}
```
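[Editorial note: for illustration, opting in at the driver level along the lines suggested above might look roughly like the sketch below. It assumes `CachePadded` is exported as `pub` from the `kernel` crate (as this review requests) and uses the usual `kernel::prelude` pin-init machinery; the struct and field names are made up and are not part of the posted patch.]

```rust
// Sketch only: a driver opting in to cache padding per the suggestion above.
// `DeviceStats` and its fields are hypothetical; only this lock pays the
// 64-byte padding cost, other `SpinLock` users keep the compact layout.
use kernel::prelude::*;
use kernel::sync::SpinLock;
use kernel::{new_spinlock, CachePadded};

#[pin_data]
struct DeviceStats {
    #[pin]
    completed: SpinLock<CachePadded<u64>>,
}

impl DeviceStats {
    fn new() -> impl PinInit<Self> {
        pin_init!(Self {
            // The protected value is padded/aligned to a cache line; the
            // lock word itself keeps its normal size.
            completed <- new_spinlock!(CachePadded::new(0)),
        })
    }

    fn complete(&self, n: u64) {
        let mut guard = self.completed.lock();
        // `CachePadded` implements `DerefMut`, so reaching the inner
        // counter only needs one extra deref through the guard.
        **guard += n;
    }
}
```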
Hi Alice,

Alice Ryhl <aliceryhl@google.com> writes:

> On Wed, 3 May 2023 11:07:03 +0200, Andreas Hindborg <a.hindborg@samsung.com> wrote:
>> The kernel `struct spinlock` is 4 bytes on x86 when lockdep is not enabled. The
>> structure is not padded to fit a cache line. The effect of this for `SpinLock`
>> is that the lock variable and the value protected by the lock will share a cache
>> line, depending on the alignment requirements of the protected value. Aligning
>> the lock variable and the protected value to a cache line yields a 20%
>> performance increase for the Rust null block driver for sequential reads to
>> memory backed devices at 6 concurrent readers.
>>
>> Signed-off-by: Andreas Hindborg <a.hindborg@samsung.com>
>
> This applies the cacheline padding to all spinlocks unconditionally.
> It's not clear to me that we want to do that. Instead, I suggest using
> `SpinLock<CachePadded<T>>` in the null block driver to opt in to the
> cache padding there, and let other drivers choose whether or not they
> want to cache pad their locks.

I was going to write that this is not going to work because the compiler
is going to reorder the fields of `Lock` and put the `data` field first,
followed by the `state` field. But I checked the layout, and it seems
that I actually get the `state` field first (with an alignment of 4), 60
bytes of padding, and then the `data` field (with alignment 64).

I am wondering why the compiler is not reordering these fields? Am I
guaranteed that the fields will not be reordered? Looking at the
definition of `Lock` there does not seem to be anything that prevents
rustc from swapping `state` and `data`.

>
> On Wed, 3 May 2023 11:07:03 +0200, Andreas Hindborg <a.hindborg@samsung.com> wrote:
>> diff --git a/rust/kernel/cache_padded.rs b/rust/kernel/cache_padded.rs
>> new file mode 100644
>> index 000000000000..758678e71f50
>> --- /dev/null
>> +++ b/rust/kernel/cache_padded.rs
>>
>> +impl<T> CachePadded<T> {
>> +    /// Pads and aligns a value to 64 bytes.
>> +    #[inline(always)]
>> +    pub(crate) const fn new(t: T) -> CachePadded<T> {
>> +        CachePadded::<T> { value: t }
>> +    }
>> +}
>
> Please make this `pub` instead of just `pub(crate)`. Other drivers might
> want to use this directly.

Alright.

>
> On Wed, 3 May 2023 11:07:03 +0200, Andreas Hindborg <a.hindborg@samsung.com> wrote:
>> diff --git a/rust/kernel/sync/lock/spinlock.rs b/rust/kernel/sync/lock/spinlock.rs
>> index 979b56464a4e..e39142a8148c 100644
>> --- a/rust/kernel/sync/lock/spinlock.rs
>> +++ b/rust/kernel/sync/lock/spinlock.rs
>> @@ -100,18 +103,20 @@ unsafe impl super::Backend for SpinLockBackend {
>>      ) {
>>          // SAFETY: The safety requirements ensure that `ptr` is valid for writes, and `name` and
>>          // `key` are valid for read indefinitely.
>> -        unsafe { bindings::__spin_lock_init(ptr, name, key) }
>> +        unsafe { bindings::__spin_lock_init((&mut *ptr).deref_mut(), name, key) }
>>      }
>>
>> +    #[inline(always)]
>>      unsafe fn lock(ptr: *mut Self::State) -> Self::GuardState {
>>          // SAFETY: The safety requirements of this function ensure that `ptr` points to valid
>>          // memory, and that it has been initialised before.
>> -        unsafe { bindings::spin_lock(ptr) }
>> +        unsafe { bindings::spin_lock((&mut *ptr).deref_mut()) }
>>      }
>>
>> +    #[inline(always)]
>>      unsafe fn unlock(ptr: *mut Self::State, _guard_state: &Self::GuardState) {
>>          // SAFETY: The safety requirements of this function ensure that `ptr` is valid and that the
>>          // caller is the owner of the mutex.
>> -        unsafe { bindings::spin_unlock(ptr) }
>> +        unsafe { bindings::spin_unlock((&mut *ptr).deref_mut()) }
>>      }
>>  }
>
> I would prefer to remain in pointer-land for the above operations. I
> think that this leads to code that is more obviously correct.
>
> For example:
>
> ```
> impl<T> CachePadded<T> {
>     pub const fn raw_get(ptr: *mut Self) -> *mut T {
>         core::ptr::addr_of_mut!((*ptr).value)
>     }
> }
>
> #[inline(always)]
> unsafe fn unlock(ptr: *mut Self::State, _guard_state: &Self::GuardState) {
>     unsafe { bindings::spin_unlock(CachePadded::raw_get(ptr)) }
> }
> ```

Got it.
On Mon, Feb 26, 2024 at 10:02 AM Andreas Hindborg (Samsung)
<nmi@metaspace.dk> wrote:
>
>
> Hi Alice,
>
> Alice Ryhl <aliceryhl@google.com> writes:
>
> > On Wed, 3 May 2023 11:07:03 +0200, Andreas Hindborg <a.hindborg@samsung.com> wrote:
> >> The kernel `struct spinlock` is 4 bytes on x86 when lockdep is not enabled. The
> >> structure is not padded to fit a cache line. The effect of this for `SpinLock`
> >> is that the lock variable and the value protected by the lock will share a cache
> >> line, depending on the alignment requirements of the protected value. Aligning
> >> the lock variable and the protected value to a cache line yields a 20%
> >> performance increase for the Rust null block driver for sequential reads to
> >> memory backed devices at 6 concurrent readers.
> >>
> >> Signed-off-by: Andreas Hindborg <a.hindborg@samsung.com>
> >
> > This applies the cacheline padding to all spinlocks unconditionally.
> > It's not clear to me that we want to do that. Instead, I suggest using
> > `SpinLock<CachePadded<T>>` in the null block driver to opt in to the
> > cache padding there, and let other drivers choose whether or not they
> > want to cache pad their locks.
>
> I was going to write that this is not going to work because the compiler
> is going to reorder the fields of `Lock` and put the `data` field first,
> followed by the `state` field. But I checked the layout, and it seems
> that I actually get the `state` field first (with an alignment of 4), 60
> bytes of padding, and then the `data` field (with alignment 64).
>
> I am wondering why the compiler is not reordering these fields? Am I
> guaranteed that the fields will not be reordered? Looking at the
> definition of `Lock` there does not seem to be anything that prevents
> rustc from swapping `state` and `data`.

It's because `Lock` has `: ?Sized` on the `T` generic. Fields that
might not be Sized must always be last.

Alice
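[Editorial note: the point above can be seen with a small standalone example. This is plain Rust that only mirrors the shape of `Lock`; it is not kernel code, and the "laid out last" behaviour for the sized case reflects current rustc rather than a documented layout guarantee.]

```rust
// When a struct is generic over `T: ?Sized`, the possibly-unsized field
// must be declared last, and current rustc also keeps it last in memory,
// so `state` cannot be moved after `data`.
struct MockLock<T: ?Sized> {
    state: u32, // stand-in for the 4-byte `spinlock_t`
    data: T,    // a `?Sized` field is required to be the last field
}

// This would not compile: "only the last field of a struct may have a
// dynamically sized type".
// struct Reordered<T: ?Sized> {
//     data: T,
//     state: u32,
// }

#[repr(align(64))]
struct Padded(u64);

fn main() {
    let lock = MockLock { state: 0u32, data: Padded(0) };
    let base = &lock as *const MockLock<Padded> as usize;
    let data = &lock.data as *const Padded as usize;
    // Prints 64: `state`, 60 bytes of padding, then `data` on its own
    // cache line, matching the layout observed in the message above.
    println!("data offset: {}", data - base);
    assert_eq!(core::mem::align_of::<MockLock<Padded>>(), 64);
    assert_eq!(core::mem::size_of::<MockLock<Padded>>(), 128);
}
```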
diff --git a/rust/kernel/cache_padded.rs b/rust/kernel/cache_padded.rs
new file mode 100644
index 000000000000..758678e71f50
--- /dev/null
+++ b/rust/kernel/cache_padded.rs
@@ -0,0 +1,33 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#[repr(align(64))]
+pub struct CachePadded<T: ?Sized> {
+    value: T,
+}
+
+unsafe impl<T: Send> Send for CachePadded<T> {}
+unsafe impl<T: Sync> Sync for CachePadded<T> {}
+
+impl<T> CachePadded<T> {
+    /// Pads and aligns a value to 64 bytes.
+    #[inline(always)]
+    pub(crate) const fn new(t: T) -> CachePadded<T> {
+        CachePadded::<T> { value: t }
+    }
+}
+
+impl<T: ?Sized> core::ops::Deref for CachePadded<T> {
+    type Target = T;
+
+    #[inline(always)]
+    fn deref(&self) -> &T {
+        &self.value
+    }
+}
+
+impl<T: ?Sized> core::ops::DerefMut for CachePadded<T> {
+    #[inline(always)]
+    fn deref_mut(&mut self) -> &mut T {
+        &mut self.value
+    }
+}
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index a0bd0b0e2aef..426e2dea0da6 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -37,6 +37,7 @@ extern crate self as kernel;
 mod allocator;
 pub mod block;
 mod build_assert;
+mod cache_padded;
 pub mod error;
 pub mod init;
 pub mod ioctl;
@@ -56,6 +57,7 @@ pub mod types;
 
 #[doc(hidden)]
 pub use bindings;
+pub(crate) use cache_padded::CachePadded;
 pub use macros;
 pub use uapi;
 
diff --git a/rust/kernel/sync/lock.rs b/rust/kernel/sync/lock.rs
index a2216325632d..1c584b1df30d 100644
--- a/rust/kernel/sync/lock.rs
+++ b/rust/kernel/sync/lock.rs
@@ -6,7 +6,9 @@
 //! spinlocks, raw spinlocks) to be provided with minimal effort.
 
 use super::LockClassKey;
-use crate::{bindings, init::PinInit, pin_init, str::CStr, types::Opaque, types::ScopeGuard};
+use crate::{
+    bindings, init::PinInit, pin_init, str::CStr, types::Opaque, types::ScopeGuard, CachePadded,
+};
 use core::{cell::UnsafeCell, marker::PhantomData, marker::PhantomPinned};
 use macros::pin_data;
 
@@ -87,7 +89,7 @@ pub struct Lock<T: ?Sized, B: Backend> {
     _pin: PhantomPinned,
 
     /// The data protected by the lock.
-    pub(crate) data: UnsafeCell<T>,
+    pub(crate) data: CachePadded<UnsafeCell<T>>,
 }
 
 // SAFETY: `Lock` can be transferred across thread boundaries iff the data it protects can.
@@ -102,7 +104,7 @@ impl<T, B: Backend> Lock<T, B> {
     #[allow(clippy::new_ret_no_self)]
     pub fn new(t: T, name: &'static CStr, key: &'static LockClassKey) -> impl PinInit<Self> {
         pin_init!(Self {
-            data: UnsafeCell::new(t),
+            data: CachePadded::new(UnsafeCell::new(t)),
             _pin: PhantomPinned,
             // SAFETY: `slot` is valid while the closure is called and both `name` and `key` have
             // static lifetimes so they live indefinitely.
@@ -115,6 +117,7 @@ impl<T, B: Backend> Lock<T, B> {
 
 impl<T: ?Sized, B: Backend> Lock<T, B> {
     /// Acquires the lock and gives the caller access to the data protected by it.
+    #[inline(always)]
     pub fn lock(&self) -> Guard<'_, T, B> {
         // SAFETY: The constructor of the type calls `init`, so the existence of the object proves
         // that `init` was called.
diff --git a/rust/kernel/sync/lock/spinlock.rs b/rust/kernel/sync/lock/spinlock.rs
index 979b56464a4e..e39142a8148c 100644
--- a/rust/kernel/sync/lock/spinlock.rs
+++ b/rust/kernel/sync/lock/spinlock.rs
@@ -4,7 +4,10 @@
 //!
 //! This module allows Rust code to use the kernel's `spinlock_t`.
 
+use core::ops::DerefMut;
+
 use crate::bindings;
+use crate::CachePadded;
 
 /// Creates a [`SpinLock`] initialiser with the given name and a newly-created lock class.
 ///
@@ -90,7 +93,7 @@ pub struct SpinLockBackend;
 
 // SAFETY: The underlying kernel `spinlock_t` object ensures mutual exclusion. `relock` uses the
 // default implementation that always calls the same locking method.
 unsafe impl super::Backend for SpinLockBackend {
-    type State = bindings::spinlock_t;
+    type State = CachePadded<bindings::spinlock_t>;
     type GuardState = ();
 
     unsafe fn init(
@@ -100,18 +103,20 @@ unsafe impl super::Backend for SpinLockBackend {
     ) {
         // SAFETY: The safety requirements ensure that `ptr` is valid for writes, and `name` and
         // `key` are valid for read indefinitely.
-        unsafe { bindings::__spin_lock_init(ptr, name, key) }
+        unsafe { bindings::__spin_lock_init((&mut *ptr).deref_mut(), name, key) }
     }
 
+    #[inline(always)]
     unsafe fn lock(ptr: *mut Self::State) -> Self::GuardState {
         // SAFETY: The safety requirements of this function ensure that `ptr` points to valid
         // memory, and that it has been initialised before.
-        unsafe { bindings::spin_lock(ptr) }
+        unsafe { bindings::spin_lock((&mut *ptr).deref_mut()) }
     }
 
+    #[inline(always)]
     unsafe fn unlock(ptr: *mut Self::State, _guard_state: &Self::GuardState) {
         // SAFETY: The safety requirements of this function ensure that `ptr` is valid and that the
         // caller is the owner of the mutex.
-        unsafe { bindings::spin_unlock(ptr) }
+        unsafe { bindings::spin_unlock((&mut *ptr).deref_mut()) }
     }
 }
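[Editorial note: for completeness, the backend hunks above reworked along the lines of the "pointer-land" suggestion in the review might look roughly like the sketch below. It assumes a `raw_get` accessor is added to `CachePadded` similar to the one outlined in the review (marked `unsafe` here because it projects through a raw pointer), and it has not been compile-tested against the kernel tree.]

```rust
// Sketch only: pass raw pointers through a `raw_get` projection instead of
// creating a `&mut` and calling `deref_mut`.
impl<T> CachePadded<T> {
    /// Returns a raw pointer to the wrapped value without creating a reference.
    ///
    /// # Safety
    ///
    /// `ptr` must point to a (possibly uninitialised) `CachePadded<T>`.
    pub unsafe fn raw_get(ptr: *mut Self) -> *mut T {
        // SAFETY: The caller guarantees that `ptr` is valid for this projection.
        unsafe { core::ptr::addr_of_mut!((*ptr).value) }
    }
}

unsafe impl super::Backend for SpinLockBackend {
    type State = CachePadded<bindings::spinlock_t>;
    type GuardState = ();

    unsafe fn init(
        ptr: *mut Self::State,
        name: *const core::ffi::c_char,
        key: *mut bindings::lock_class_key,
    ) {
        // SAFETY: The safety requirements ensure that `ptr` is valid for writes, and `name` and
        // `key` are valid for read indefinitely. `raw_get` does not create a reference.
        unsafe { bindings::__spin_lock_init(CachePadded::raw_get(ptr), name, key) }
    }

    #[inline(always)]
    unsafe fn lock(ptr: *mut Self::State) -> Self::GuardState {
        // SAFETY: The safety requirements of this function ensure that `ptr` points to valid
        // memory, and that it has been initialised before.
        unsafe { bindings::spin_lock(CachePadded::raw_get(ptr)) }
    }

    #[inline(always)]
    unsafe fn unlock(ptr: *mut Self::State, _guard_state: &Self::GuardState) {
        // SAFETY: The safety requirements of this function ensure that `ptr` is valid and that
        // the caller holds the lock.
        unsafe { bindings::spin_unlock(CachePadded::raw_get(ptr)) }
    }
}
```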