Message ID | 20241004155247.2210469-4-gary@garyguo.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | None | expand |
Am 04.10.24 um 17:52 schrieb Gary Guo: > Currently there's a custom reference counting in `block::mq`, which uses > `AtomicU64` Rust atomics, and this type doesn't exist on some 32-bit > architectures. We cannot just change it to use 32-bit atomics, because > doing so will make it vulnerable to refcount overflow. So switch it to > use the kernel refcount `kernel::sync::Refcount` instead. > > There is an operation needed by `block::mq`, atomically decreasing > refcount from 2 to 0, which is not available through refcount.h, so > I exposed `Refcount::as_atomic` which allows accessing the refcount > directly. > > Cc: Will Deacon <will@kernel.org> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Boqun Feng <boqun.feng@gmail.com> > Cc: Mark Rutland <mark.rutland@arm.com> > Signed-off-by: Gary Guo <gary@garyguo.net> > --- ... > diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs > index a0e22827f3f4..7b63c02bdce7 100644 > --- a/rust/kernel/block/mq/request.rs > +++ b/rust/kernel/block/mq/request.rs .... > @@ -37,6 +38,9 @@ > /// We need to track C and D to ensure that it is safe to end the request and hand > /// back ownership to the block layer. > /// > +/// Note that driver can still obtain new `ARef` even there is no `ARef`s in existence by using Do you like to check this sentence? Maybe: Note that "a" driver can still obtain "a" new `ARef` even "if" there "are" no `ARef`s in existence by using .... ? Best regards Dirk
Hi Gary, "Gary Guo" <gary@garyguo.net> writes: > Currently there's a custom reference counting in `block::mq`, which uses > `AtomicU64` Rust atomics, and this type doesn't exist on some 32-bit > architectures. We cannot just change it to use 32-bit atomics, because > doing so will make it vulnerable to refcount overflow. So switch it to > use the kernel refcount `kernel::sync::Refcount` instead. > > There is an operation needed by `block::mq`, atomically decreasing > refcount from 2 to 0, which is not available through refcount.h, so > I exposed `Refcount::as_atomic` which allows accessing the refcount > directly. I would rather wait with this patch until the helper LTO patches land upstream. Or at least let me run the benchmarks to see the effect of not inlining these refcount operations. Best regards, Andreas
Hi Gary, "Gary Guo" <gary@garyguo.net> writes: [...] > // SAFETY: All instances of `Request<T>` are reference counted. This > // implementation of `AlwaysRefCounted` ensure that increments to the ref count > // keeps the object alive in memory at least until a matching reference count > // decrement is executed. > unsafe impl<T: Operations> AlwaysRefCounted for Request<T> { > fn inc_ref(&self) { > - let refcount = &self.wrapper_ref().refcount(); > - > - #[cfg_attr(not(CONFIG_DEBUG_MISC), allow(unused_variables))] > - let updated = atomic_relaxed_op_unless(refcount, |x| x + 1, 0); > - > - #[cfg(CONFIG_DEBUG_MISC)] > - if !updated { > - panic!("Request refcount zero on clone") > - } > + self.wrapper_ref().refcount().inc(); What happened to the debug code? BR Andreas
On Fri, 04 Oct 2024 20:34:01 +0200 Andreas Hindborg <a.hindborg@kernel.org> wrote: > Hi Gary, > > "Gary Guo" <gary@garyguo.net> writes: > > [...] > > > // SAFETY: All instances of `Request<T>` are reference counted. This > > // implementation of `AlwaysRefCounted` ensure that increments to the ref count > > // keeps the object alive in memory at least until a matching reference count > > // decrement is executed. > > unsafe impl<T: Operations> AlwaysRefCounted for Request<T> { > > fn inc_ref(&self) { > > - let refcount = &self.wrapper_ref().refcount(); > > - > > - #[cfg_attr(not(CONFIG_DEBUG_MISC), allow(unused_variables))] > > - let updated = atomic_relaxed_op_unless(refcount, |x| x + 1, 0); > > - > > - #[cfg(CONFIG_DEBUG_MISC)] > > - if !updated { > > - panic!("Request refcount zero on clone") > > - } > > + self.wrapper_ref().refcount().inc(); > > What happened to the debug code? > > > BR Andreas > `refcount_inc` will WARN and saturate the refcount when trying to increment from zero. This is true regardless of config. I've already captured this in `Refcount::inc` documentation. Best, Gary
"Gary Guo" <gary@garyguo.net> writes: > On Fri, 04 Oct 2024 20:34:01 +0200 > Andreas Hindborg <a.hindborg@kernel.org> wrote: > >> Hi Gary, >> >> "Gary Guo" <gary@garyguo.net> writes: >> >> [...] >> >> > // SAFETY: All instances of `Request<T>` are reference counted. This >> > // implementation of `AlwaysRefCounted` ensure that increments to the ref count >> > // keeps the object alive in memory at least until a matching reference count >> > // decrement is executed. >> > unsafe impl<T: Operations> AlwaysRefCounted for Request<T> { >> > fn inc_ref(&self) { >> > - let refcount = &self.wrapper_ref().refcount(); >> > - >> > - #[cfg_attr(not(CONFIG_DEBUG_MISC), allow(unused_variables))] >> > - let updated = atomic_relaxed_op_unless(refcount, |x| x + 1, 0); >> > - >> > - #[cfg(CONFIG_DEBUG_MISC)] >> > - if !updated { >> > - panic!("Request refcount zero on clone") >> > - } >> > + self.wrapper_ref().refcount().inc(); >> >> What happened to the debug code? >> >> >> BR Andreas >> > > `refcount_inc` will WARN and saturate the refcount when trying to > increment from zero. This is true regardless of config. > > I've already captured this in `Refcount::inc` documentation. I did not know. Thanks! BR Andreas
On Fri, Oct 04, 2024 at 04:52:24PM +0100, Gary Guo wrote: > There is an operation needed by `block::mq`, atomically decreasing > refcount from 2 to 0, which is not available through refcount.h, so > I exposed `Refcount::as_atomic` which allows accessing the refcount > directly. That's scary, and of course feels wrong on many levels, but: > @@ -91,13 +95,17 @@ pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) { > /// C `struct request`. If the operation fails, `this` is returned in the > /// `Err` variant. > fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> { > - // We can race with `TagSet::tag_to_rq` > - if let Err(_old) = this.wrapper_ref().refcount().compare_exchange( > - 2, > - 0, > - Ordering::Relaxed, > - Ordering::Relaxed, > - ) { > + // To hand back the ownership, we need the current refcount to be 2. > + // Since we can race with `TagSet::tag_to_rq`, this needs to atomically reduce > + // refcount to 0. `Refcount` does not provide a way to do this, so use the underlying > + // atomics directly. > + if this > + .wrapper_ref() > + .refcount() > + .as_atomic() > + .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed) > + .is_err() Why not just call rust_helper_refcount_set()? Or is the issue that you think you might not be 2 here? And if you HAVE to be 2, why that magic value (i.e. why not just always be 1 and rely on normal increment/decrement?) I know some refcounts are odd in the kernel, but I don't see where the block layer is caring about 2 as a refcount anywhere, what am I missing? thanks, greg k-h
Hi Greg, "Greg KH" <gregkh@linuxfoundation.org> writes: > On Fri, Oct 04, 2024 at 04:52:24PM +0100, Gary Guo wrote: >> There is an operation needed by `block::mq`, atomically decreasing >> refcount from 2 to 0, which is not available through refcount.h, so >> I exposed `Refcount::as_atomic` which allows accessing the refcount >> directly. > > That's scary, and of course feels wrong on many levels, but: > > >> @@ -91,13 +95,17 @@ pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) { >> /// C `struct request`. If the operation fails, `this` is returned in the >> /// `Err` variant. >> fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> { >> - // We can race with `TagSet::tag_to_rq` >> - if let Err(_old) = this.wrapper_ref().refcount().compare_exchange( >> - 2, >> - 0, >> - Ordering::Relaxed, >> - Ordering::Relaxed, >> - ) { >> + // To hand back the ownership, we need the current refcount to be 2. >> + // Since we can race with `TagSet::tag_to_rq`, this needs to atomically reduce >> + // refcount to 0. `Refcount` does not provide a way to do this, so use the underlying >> + // atomics directly. >> + if this >> + .wrapper_ref() >> + .refcount() >> + .as_atomic() >> + .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed) >> + .is_err() > > Why not just call rust_helper_refcount_set()? Or is the issue that you > think you might not be 2 here? And if you HAVE to be 2, why that magic > value (i.e. why not just always be 1 and rely on normal > increment/decrement?) > > I know some refcounts are odd in the kernel, but I don't see where the > block layer is caring about 2 as a refcount anywhere, what am I missing? It is in the documentation, rendered version available here [1]. Let me know if it is still unclear, then I guess we need to update the docs. Also, my session from Recipes has a little bit of discussion regarding this refcount and it's use [2]. Best regards, Andreas [1] https://rust.docs.kernel.org/kernel/block/mq/struct.Request.html#implementation-details [2] https://youtu.be/1LEvgkhU-t4?si=B1XnJhzCCNnUtRsI&t=1685
On Sat, Oct 05, 2024 at 11:48:53AM +0200, Andreas Hindborg wrote: > Hi Greg, > > "Greg KH" <gregkh@linuxfoundation.org> writes: > > > On Fri, Oct 04, 2024 at 04:52:24PM +0100, Gary Guo wrote: > >> There is an operation needed by `block::mq`, atomically decreasing > >> refcount from 2 to 0, which is not available through refcount.h, so > >> I exposed `Refcount::as_atomic` which allows accessing the refcount > >> directly. > > > > That's scary, and of course feels wrong on many levels, but: > > > > > >> @@ -91,13 +95,17 @@ pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) { > >> /// C `struct request`. If the operation fails, `this` is returned in the > >> /// `Err` variant. > >> fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> { > >> - // We can race with `TagSet::tag_to_rq` > >> - if let Err(_old) = this.wrapper_ref().refcount().compare_exchange( > >> - 2, > >> - 0, > >> - Ordering::Relaxed, > >> - Ordering::Relaxed, > >> - ) { > >> + // To hand back the ownership, we need the current refcount to be 2. > >> + // Since we can race with `TagSet::tag_to_rq`, this needs to atomically reduce > >> + // refcount to 0. `Refcount` does not provide a way to do this, so use the underlying > >> + // atomics directly. > >> + if this > >> + .wrapper_ref() > >> + .refcount() > >> + .as_atomic() > >> + .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed) > >> + .is_err() > > > > Why not just call rust_helper_refcount_set()? Or is the issue that you > > think you might not be 2 here? And if you HAVE to be 2, why that magic > > value (i.e. why not just always be 1 and rely on normal > > increment/decrement?) > > > > I know some refcounts are odd in the kernel, but I don't see where the > > block layer is caring about 2 as a refcount anywhere, what am I missing? > > It is in the documentation, rendered version available here [1]. Let me > know if it is still unclear, then I guess we need to update the docs. > > Also, my session from Recipes has a little bit of discussion regarding > this refcount and it's use [2]. Ah, ick, that's crazy, ok, good luck!
On Sat, Oct 05, 2024 at 11:48:53AM +0200, Andreas Hindborg wrote: > It is in the documentation, rendered version available here [1]. Let me > know if it is still unclear, then I guess we need to update the docs. > > [1] https://rust.docs.kernel.org/kernel/block/mq/struct.Request.html#implementation-details So I clicked on the link for shits and giggles, and OMG that's unreadable garbage :/ Is there a plain text form that a normal person can read? There's just too much 'layout' and fonts and colours and URGH.
"Peter Zijlstra" <peterz@infradead.org> writes: > On Sat, Oct 05, 2024 at 11:48:53AM +0200, Andreas Hindborg wrote: > >> It is in the documentation, rendered version available here [1]. Let me >> know if it is still unclear, then I guess we need to update the docs. > >> >> [1] https://rust.docs.kernel.org/kernel/block/mq/struct.Request.html#implementation-details > > So I clicked on the link for shits and giggles, and OMG that's > unreadable garbage :/ Is there a plain text form that a normal person > can read? To each his own, I guess. It is rendered from the source code. There is a source link at the top of the page that will take you to a rendered version of the source code, line numbers, syntax highlighting and all. You can open up the source file in your favorite pager and read it there as well. This particular text is from rust/kernel/block/mq/request.rs. BR Andreas
On Sat, Oct 5, 2024 at 12:10 PM Peter Zijlstra <peterz@infradead.org> wrote: > > So I clicked on the link for shits and giggles, and OMG that's > unreadable garbage :/ Is there a plain text form that a normal person > can read? > > There's just too much 'layout' and fonts and colours and URGH. If fonts and colors are the only issue, then it can easily be fixed with a bit of CSS client-side or we can perhaps add it to a new theme. Otherwise, people have implemented other renderers and viewers in the past, including text / terminal-based ones. Nowadays there is unstable JSON output support that can be used for that without dealing with HTML: https://rust-lang.github.io/rfcs/2963-rustdoc-json.html If you want to use rust.docs.kernel.org, you can also use the "source" view at the top-right. It is still syntax highlighted a bit -- not sure if you like that, but you may find it "less busy". Having said that, there is some logic in the layout (in the non-source view, I mean) being the way it is in the HTML view -- it may take time to get used to, but it is quite useful when you know where to look / click. Cheers, Miguel
On Sat, Oct 5, 2024 at 11:49 AM Andreas Hindborg <a.hindborg@kernel.org> wrote: > > Hi Greg, > > "Greg KH" <gregkh@linuxfoundation.org> writes: > > > On Fri, Oct 04, 2024 at 04:52:24PM +0100, Gary Guo wrote: > >> There is an operation needed by `block::mq`, atomically decreasing > >> refcount from 2 to 0, which is not available through refcount.h, so > >> I exposed `Refcount::as_atomic` which allows accessing the refcount > >> directly. > > > > That's scary, and of course feels wrong on many levels, but: > > > > > >> @@ -91,13 +95,17 @@ pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) { > >> /// C `struct request`. If the operation fails, `this` is returned in the > >> /// `Err` variant. > >> fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> { > >> - // We can race with `TagSet::tag_to_rq` > >> - if let Err(_old) = this.wrapper_ref().refcount().compare_exchange( > >> - 2, > >> - 0, > >> - Ordering::Relaxed, > >> - Ordering::Relaxed, > >> - ) { > >> + // To hand back the ownership, we need the current refcount to be 2. > >> + // Since we can race with `TagSet::tag_to_rq`, this needs to atomically reduce > >> + // refcount to 0. `Refcount` does not provide a way to do this, so use the underlying > >> + // atomics directly. > >> + if this > >> + .wrapper_ref() > >> + .refcount() > >> + .as_atomic() > >> + .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed) > >> + .is_err() > > > > Why not just call rust_helper_refcount_set()? Or is the issue that you > > think you might not be 2 here? And if you HAVE to be 2, why that magic > > value (i.e. why not just always be 1 and rely on normal > > increment/decrement?) > > > > I know some refcounts are odd in the kernel, but I don't see where the > > block layer is caring about 2 as a refcount anywhere, what am I missing? > > It is in the documentation, rendered version available here [1]. Let me > know if it is still unclear, then I guess we need to update the docs. > > Also, my session from Recipes has a little bit of discussion regarding > this refcount and it's use [2]. > > Best regards, > Andreas > > > [1] https://rust.docs.kernel.org/kernel/block/mq/struct.Request.html#implementation-details > [2] https://youtu.be/1LEvgkhU-t4?si=B1XnJhzCCNnUtRsI&t=1685 So it sounds like there is one refcount from the C side, and some number of references from the Rust side. The function checks whether there's only one Rust reference left, and if so, takes ownership of the value, correct? In that case, the CAS should have an acquire ordering to synchronize with dropping the refcount 3->2 on another thread. Otherwise, you might have a data race with the operations that happened just before the 3->2 refcount drop. Alice On Sat, Oct 5, 2024 at 11:49 AM Andreas Hindborg <a.hindborg@kernel.org> wrote: > > Hi Greg, > > "Greg KH" <gregkh@linuxfoundation.org> writes: > > > On Fri, Oct 04, 2024 at 04:52:24PM +0100, Gary Guo wrote: > >> There is an operation needed by `block::mq`, atomically decreasing > >> refcount from 2 to 0, which is not available through refcount.h, so > >> I exposed `Refcount::as_atomic` which allows accessing the refcount > >> directly. > > > > That's scary, and of course feels wrong on many levels, but: > > > > > >> @@ -91,13 +95,17 @@ pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) { > >> /// C `struct request`. If the operation fails, `this` is returned in the > >> /// `Err` variant. > >> fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> { > >> - // We can race with `TagSet::tag_to_rq` > >> - if let Err(_old) = this.wrapper_ref().refcount().compare_exchange( > >> - 2, > >> - 0, > >> - Ordering::Relaxed, > >> - Ordering::Relaxed, > >> - ) { > >> + // To hand back the ownership, we need the current refcount to be 2. > >> + // Since we can race with `TagSet::tag_to_rq`, this needs to atomically reduce > >> + // refcount to 0. `Refcount` does not provide a way to do this, so use the underlying > >> + // atomics directly. > >> + if this > >> + .wrapper_ref() > >> + .refcount() > >> + .as_atomic() > >> + .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed) > >> + .is_err() > > > > Why not just call rust_helper_refcount_set()? Or is the issue that you > > think you might not be 2 here? And if you HAVE to be 2, why that magic > > value (i.e. why not just always be 1 and rely on normal > > increment/decrement?) > > > > I know some refcounts are odd in the kernel, but I don't see where the > > block layer is caring about 2 as a refcount anywhere, what am I missing? > > It is in the documentation, rendered version available here [1]. Let me > know if it is still unclear, then I guess we need to update the docs. > > Also, my session from Recipes has a little bit of discussion regarding > this refcount and it's use [2]. > > Best regards, > Andreas > > > [1] https://rust.docs.kernel.org/kernel/block/mq/struct.Request.html#implementation-details > [2] https://youtu.be/1LEvgkhU-t4?si=B1XnJhzCCNnUtRsI&t=1685 >
On Sat, 5 Oct 2024 13:59:44 +0200 Alice Ryhl <aliceryhl@google.com> wrote: > On Sat, Oct 5, 2024 at 11:49 AM Andreas Hindborg <a.hindborg@kernel.org> wrote: > > > > Hi Greg, > > > > "Greg KH" <gregkh@linuxfoundation.org> writes: > > > > > On Fri, Oct 04, 2024 at 04:52:24PM +0100, Gary Guo wrote: > > >> There is an operation needed by `block::mq`, atomically decreasing > > >> refcount from 2 to 0, which is not available through refcount.h, so > > >> I exposed `Refcount::as_atomic` which allows accessing the refcount > > >> directly. > > > > > > That's scary, and of course feels wrong on many levels, but: > > > > > > > > >> @@ -91,13 +95,17 @@ pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) { > > >> /// C `struct request`. If the operation fails, `this` is returned in the > > >> /// `Err` variant. > > >> fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> { > > >> - // We can race with `TagSet::tag_to_rq` > > >> - if let Err(_old) = this.wrapper_ref().refcount().compare_exchange( > > >> - 2, > > >> - 0, > > >> - Ordering::Relaxed, > > >> - Ordering::Relaxed, > > >> - ) { > > >> + // To hand back the ownership, we need the current refcount to be 2. > > >> + // Since we can race with `TagSet::tag_to_rq`, this needs to atomically reduce > > >> + // refcount to 0. `Refcount` does not provide a way to do this, so use the underlying > > >> + // atomics directly. > > >> + if this > > >> + .wrapper_ref() > > >> + .refcount() > > >> + .as_atomic() > > >> + .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed) > > >> + .is_err() > > > > > > Why not just call rust_helper_refcount_set()? Or is the issue that you > > > think you might not be 2 here? And if you HAVE to be 2, why that magic > > > value (i.e. why not just always be 1 and rely on normal > > > increment/decrement?) > > > > > > I know some refcounts are odd in the kernel, but I don't see where the > > > block layer is caring about 2 as a refcount anywhere, what am I missing? > > > > It is in the documentation, rendered version available here [1]. Let me > > know if it is still unclear, then I guess we need to update the docs. > > > > Also, my session from Recipes has a little bit of discussion regarding > > this refcount and it's use [2]. > > > > Best regards, > > Andreas > > > > > > [1] https://rust.docs.kernel.org/kernel/block/mq/struct.Request.html#implementation-details > > [2] https://youtu.be/1LEvgkhU-t4?si=B1XnJhzCCNnUtRsI&t=1685 > > So it sounds like there is one refcount from the C side, and some > number of references from the Rust side. The function checks whether > there's only one Rust reference left, and if so, takes ownership of > the value, correct? > > In that case, the CAS should have an acquire ordering to synchronize > with dropping the refcount 3->2 on another thread. Otherwise, you > might have a data race with the operations that happened just before > the 3->2 refcount drop. > > Alice The code as is is fine since there's no data protected in `RequestDataWrapper` yet (in fact it's not even generic yet). I know Andreas does want to introduce driver-specific data into that, so in the long term the acquire would be necessary. Andreas, please let me know if you want me to make the change now, or you'd rather change the ordering when you introduce data to `RequestDataWrapper`. Best, Gary
"Alice Ryhl" <aliceryhl@google.com> writes: > On Sat, Oct 5, 2024 at 11:49 AM Andreas Hindborg <a.hindborg@kernel.org> wrote: >> >> Hi Greg, >> >> "Greg KH" <gregkh@linuxfoundation.org> writes: >> >> > On Fri, Oct 04, 2024 at 04:52:24PM +0100, Gary Guo wrote: >> >> There is an operation needed by `block::mq`, atomically decreasing >> >> refcount from 2 to 0, which is not available through refcount.h, so >> >> I exposed `Refcount::as_atomic` which allows accessing the refcount >> >> directly. >> > >> > That's scary, and of course feels wrong on many levels, but: >> > >> > >> >> @@ -91,13 +95,17 @@ pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) { >> >> /// C `struct request`. If the operation fails, `this` is returned in the >> >> /// `Err` variant. >> >> fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> { >> >> - // We can race with `TagSet::tag_to_rq` >> >> - if let Err(_old) = this.wrapper_ref().refcount().compare_exchange( >> >> - 2, >> >> - 0, >> >> - Ordering::Relaxed, >> >> - Ordering::Relaxed, >> >> - ) { >> >> + // To hand back the ownership, we need the current refcount to be 2. >> >> + // Since we can race with `TagSet::tag_to_rq`, this needs to atomically reduce >> >> + // refcount to 0. `Refcount` does not provide a way to do this, so use the underlying >> >> + // atomics directly. >> >> + if this >> >> + .wrapper_ref() >> >> + .refcount() >> >> + .as_atomic() >> >> + .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed) >> >> + .is_err() >> > >> > Why not just call rust_helper_refcount_set()? Or is the issue that you >> > think you might not be 2 here? And if you HAVE to be 2, why that magic >> > value (i.e. why not just always be 1 and rely on normal >> > increment/decrement?) >> > >> > I know some refcounts are odd in the kernel, but I don't see where the >> > block layer is caring about 2 as a refcount anywhere, what am I missing? >> >> It is in the documentation, rendered version available here [1]. Let me >> know if it is still unclear, then I guess we need to update the docs. >> >> Also, my session from Recipes has a little bit of discussion regarding >> this refcount and it's use [2]. >> >> Best regards, >> Andreas >> >> >> [1] https://rust.docs.kernel.org/kernel/block/mq/struct.Request.html#implementation-details >> [2] https://youtu.be/1LEvgkhU-t4?si=B1XnJhzCCNnUtRsI&t=1685 > > So it sounds like there is one refcount from the C side, and some > number of references from the Rust side. C side uses a different refcount field. That refcount never read by Rust, but it is guaranteed to be greater or equal to one while the driver owns the request. Rust uses a different refcount field to keep track of how many Rust references there is to a request. There is an implicit count while the driver owns the request. This count is not materialized as an `ARef` instance. > The function checks whether there's only one Rust reference left, and > if so, takes ownership of the value, correct? It checks if the `ARef` passed to the function is the last one in existence. If it is, it takes ownership of the `Request` object. > In that case, the CAS should have an acquire ordering to synchronize > with dropping the refcount 3->2 on another thread. Otherwise, you > might have a data race with the operations that happened just before > the 3->2 refcount drop. I am not sure. I don't think that the thread that does the CAS 2 -> 0 has any data dependencies to any thread that does the atomic decrement 3 -> 2. Any data dependencies between operations on the underlying C `struct request` would be synchronized by operations on the `ref` field of `struct request`, which is entirely managed block layer C code and the C functions called by the Rust abstractions. BR Andreas
"Gary Guo" <gary@garyguo.net> writes: > On Sat, 5 Oct 2024 13:59:44 +0200 > Alice Ryhl <aliceryhl@google.com> wrote: > >> On Sat, Oct 5, 2024 at 11:49 AM Andreas Hindborg <a.hindborg@kernel.org> wrote: >> > >> > Hi Greg, >> > >> > "Greg KH" <gregkh@linuxfoundation.org> writes: >> > >> > > On Fri, Oct 04, 2024 at 04:52:24PM +0100, Gary Guo wrote: >> > >> There is an operation needed by `block::mq`, atomically decreasing >> > >> refcount from 2 to 0, which is not available through refcount.h, so >> > >> I exposed `Refcount::as_atomic` which allows accessing the refcount >> > >> directly. >> > > >> > > That's scary, and of course feels wrong on many levels, but: >> > > >> > > >> > >> @@ -91,13 +95,17 @@ pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) { >> > >> /// C `struct request`. If the operation fails, `this` is returned in the >> > >> /// `Err` variant. >> > >> fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> { >> > >> - // We can race with `TagSet::tag_to_rq` >> > >> - if let Err(_old) = this.wrapper_ref().refcount().compare_exchange( >> > >> - 2, >> > >> - 0, >> > >> - Ordering::Relaxed, >> > >> - Ordering::Relaxed, >> > >> - ) { >> > >> + // To hand back the ownership, we need the current refcount to be 2. >> > >> + // Since we can race with `TagSet::tag_to_rq`, this needs to atomically reduce >> > >> + // refcount to 0. `Refcount` does not provide a way to do this, so use the underlying >> > >> + // atomics directly. >> > >> + if this >> > >> + .wrapper_ref() >> > >> + .refcount() >> > >> + .as_atomic() >> > >> + .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed) >> > >> + .is_err() >> > > >> > > Why not just call rust_helper_refcount_set()? Or is the issue that you >> > > think you might not be 2 here? And if you HAVE to be 2, why that magic >> > > value (i.e. why not just always be 1 and rely on normal >> > > increment/decrement?) >> > > >> > > I know some refcounts are odd in the kernel, but I don't see where the >> > > block layer is caring about 2 as a refcount anywhere, what am I missing? >> > >> > It is in the documentation, rendered version available here [1]. Let me >> > know if it is still unclear, then I guess we need to update the docs. >> > >> > Also, my session from Recipes has a little bit of discussion regarding >> > this refcount and it's use [2]. >> > >> > Best regards, >> > Andreas >> > >> > >> > [1] https://rust.docs.kernel.org/kernel/block/mq/struct.Request.html#implementation-details >> > [2] https://youtu.be/1LEvgkhU-t4?si=B1XnJhzCCNnUtRsI&t=1685 >> >> So it sounds like there is one refcount from the C side, and some >> number of references from the Rust side. The function checks whether >> there's only one Rust reference left, and if so, takes ownership of >> the value, correct? >> >> In that case, the CAS should have an acquire ordering to synchronize >> with dropping the refcount 3->2 on another thread. Otherwise, you >> might have a data race with the operations that happened just before >> the 3->2 refcount drop. >> >> Alice > > The code as is is fine since there's no data protected in > `RequestDataWrapper` yet (in fact it's not even generic yet). I know > Andreas does want to introduce driver-specific data into that, so in > the long term the acquire would be necessary. > > Andreas, please let me know if you want me to make the change now, or > you'd rather change the ordering when you introduce data to > `RequestDataWrapper`. I guess we will have said data dependencies when we are going to run drop for fields in the private data area. Thanks for pointing that out. I will update the ordering when I submit that patch. As I mentioned before, I would rather we do not apply this patch before we get a way to inline helpers. BR Andreas
On Fri, 04 Oct 2024 20:05:41 +0200 Andreas Hindborg <a.hindborg@kernel.org> wrote: > Hi Gary, > > "Gary Guo" <gary@garyguo.net> writes: > > > Currently there's a custom reference counting in `block::mq`, which uses > > `AtomicU64` Rust atomics, and this type doesn't exist on some 32-bit > > architectures. We cannot just change it to use 32-bit atomics, because > > doing so will make it vulnerable to refcount overflow. So switch it to > > use the kernel refcount `kernel::sync::Refcount` instead. > > > > There is an operation needed by `block::mq`, atomically decreasing > > refcount from 2 to 0, which is not available through refcount.h, so > > I exposed `Refcount::as_atomic` which allows accessing the refcount > > directly. > > I would rather wait with this patch until the helper LTO patches land > upstream. Or at least let me run the benchmarks to see the effect of not > inlining these refcount operations. > > Best regards, > Andreas > The helper LTO patch series still need some time. I'd want to be able to test on 32-bit archs in the meantime. Best, Gary
"Gary Guo" <gary@garyguo.net> writes: > On Fri, 04 Oct 2024 20:05:41 +0200 > Andreas Hindborg <a.hindborg@kernel.org> wrote: > >> Hi Gary, >> >> "Gary Guo" <gary@garyguo.net> writes: >> >> > Currently there's a custom reference counting in `block::mq`, which uses >> > `AtomicU64` Rust atomics, and this type doesn't exist on some 32-bit >> > architectures. We cannot just change it to use 32-bit atomics, because >> > doing so will make it vulnerable to refcount overflow. So switch it to >> > use the kernel refcount `kernel::sync::Refcount` instead. >> > >> > There is an operation needed by `block::mq`, atomically decreasing >> > refcount from 2 to 0, which is not available through refcount.h, so >> > I exposed `Refcount::as_atomic` which allows accessing the refcount >> > directly. >> >> I would rather wait with this patch until the helper LTO patches land >> upstream. Or at least let me run the benchmarks to see the effect of not >> inlining these refcount operations. >> >> Best regards, >> Andreas >> > > The helper LTO patch series still need some time. I'd want to be able to > test on 32-bit archs in the meantime. In that case I would rather just conditionally gate the `block` module and the `rnull` driver on 64 bit arch. BR Andreas
Andreas Hindborg <a.hindborg@kernel.org> writes: > "Gary Guo" <gary@garyguo.net> writes: > >> On Sat, 5 Oct 2024 13:59:44 +0200 >> Alice Ryhl <aliceryhl@google.com> wrote: >> >>> On Sat, Oct 5, 2024 at 11:49 AM Andreas Hindborg <a.hindborg@kernel.org> wrote: >>> > >>> > Hi Greg, >>> > >>> > "Greg KH" <gregkh@linuxfoundation.org> writes: >>> > >>> > > On Fri, Oct 04, 2024 at 04:52:24PM +0100, Gary Guo wrote: >>> > >> There is an operation needed by `block::mq`, atomically decreasing >>> > >> refcount from 2 to 0, which is not available through refcount.h, so >>> > >> I exposed `Refcount::as_atomic` which allows accessing the refcount >>> > >> directly. >>> > > >>> > > That's scary, and of course feels wrong on many levels, but: >>> > > >>> > > >>> > >> @@ -91,13 +95,17 @@ pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) { >>> > >> /// C `struct request`. If the operation fails, `this` is returned in the >>> > >> /// `Err` variant. >>> > >> fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> { >>> > >> - // We can race with `TagSet::tag_to_rq` >>> > >> - if let Err(_old) = this.wrapper_ref().refcount().compare_exchange( >>> > >> - 2, >>> > >> - 0, >>> > >> - Ordering::Relaxed, >>> > >> - Ordering::Relaxed, >>> > >> - ) { >>> > >> + // To hand back the ownership, we need the current refcount to be 2. >>> > >> + // Since we can race with `TagSet::tag_to_rq`, this needs to atomically reduce >>> > >> + // refcount to 0. `Refcount` does not provide a way to do this, so use the underlying >>> > >> + // atomics directly. >>> > >> + if this >>> > >> + .wrapper_ref() >>> > >> + .refcount() >>> > >> + .as_atomic() >>> > >> + .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed) >>> > >> + .is_err() >>> > > >>> > > Why not just call rust_helper_refcount_set()? Or is the issue that you >>> > > think you might not be 2 here? And if you HAVE to be 2, why that magic >>> > > value (i.e. why not just always be 1 and rely on normal >>> > > increment/decrement?) >>> > > >>> > > I know some refcounts are odd in the kernel, but I don't see where the >>> > > block layer is caring about 2 as a refcount anywhere, what am I missing? >>> > >>> > It is in the documentation, rendered version available here [1]. Let me >>> > know if it is still unclear, then I guess we need to update the docs. >>> > >>> > Also, my session from Recipes has a little bit of discussion regarding >>> > this refcount and it's use [2]. >>> > >>> > Best regards, >>> > Andreas >>> > >>> > >>> > [1] https://rust.docs.kernel.org/kernel/block/mq/struct.Request.html#implementation-details >>> > [2] https://youtu.be/1LEvgkhU-t4?si=B1XnJhzCCNnUtRsI&t=1685 >>> >>> So it sounds like there is one refcount from the C side, and some >>> number of references from the Rust side. The function checks whether >>> there's only one Rust reference left, and if so, takes ownership of >>> the value, correct? >>> >>> In that case, the CAS should have an acquire ordering to synchronize >>> with dropping the refcount 3->2 on another thread. Otherwise, you >>> might have a data race with the operations that happened just before >>> the 3->2 refcount drop. >>> >>> Alice >> >> The code as is is fine since there's no data protected in >> `RequestDataWrapper` yet (in fact it's not even generic yet). I know >> Andreas does want to introduce driver-specific data into that, so in >> the long term the acquire would be necessary. >> >> Andreas, please let me know if you want me to make the change now, or >> you'd rather change the ordering when you introduce data to >> `RequestDataWrapper`. > > I guess we will have said data dependencies when we are going to run > drop for fields in the private data area. Thanks for pointing that out. > I will update the ordering when I submit that patch. > > As I mentioned before, I would rather we do not apply this patch before > we get a way to inline helpers. As discussed offline, the code that suffers the performance regression is downstream, and since this change seems to be important, I can apply the helper LTO patch downstream as well. Since the plan for the downstream code _is_ to move upstream, I really hope to see the helper LTO patch upstream, so we don't get a performance regression because of these refcounts. If we cannot figure out a way to get the LTO patches (or an alternative solution) upstream, we can always revert back to a more performant solution in block. BR Andreas
"Gary Guo" <gary@garyguo.net> writes: > Currently there's a custom reference counting in `block::mq`, which uses > `AtomicU64` Rust atomics, and this type doesn't exist on some 32-bit > architectures. We cannot just change it to use 32-bit atomics, because > doing so will make it vulnerable to refcount overflow. So switch it to > use the kernel refcount `kernel::sync::Refcount` instead. > > There is an operation needed by `block::mq`, atomically decreasing > refcount from 2 to 0, which is not available through refcount.h, so > I exposed `Refcount::as_atomic` which allows accessing the refcount > directly. > > Cc: Will Deacon <will@kernel.org> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Boqun Feng <boqun.feng@gmail.com> > Cc: Mark Rutland <mark.rutland@arm.com> > Signed-off-by: Gary Guo <gary@garyguo.net> Acked-by: Andreas Hindborg <a.hindborg@kernel.org>
Andreas Hindborg <a.hindborg@kernel.org> writes: > Andreas Hindborg <a.hindborg@kernel.org> writes: > >> "Gary Guo" <gary@garyguo.net> writes: >> >>> On Sat, 5 Oct 2024 13:59:44 +0200 >>> Alice Ryhl <aliceryhl@google.com> wrote: >>> >>>> On Sat, Oct 5, 2024 at 11:49 AM Andreas Hindborg <a.hindborg@kernel.org> wrote: >>>> > >>>> > Hi Greg, >>>> > >>>> > "Greg KH" <gregkh@linuxfoundation.org> writes: >>>> > >>>> > > On Fri, Oct 04, 2024 at 04:52:24PM +0100, Gary Guo wrote: >>>> > >> There is an operation needed by `block::mq`, atomically decreasing >>>> > >> refcount from 2 to 0, which is not available through refcount.h, so >>>> > >> I exposed `Refcount::as_atomic` which allows accessing the refcount >>>> > >> directly. >>>> > > >>>> > > That's scary, and of course feels wrong on many levels, but: >>>> > > >>>> > > >>>> > >> @@ -91,13 +95,17 @@ pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) { >>>> > >> /// C `struct request`. If the operation fails, `this` is returned in the >>>> > >> /// `Err` variant. >>>> > >> fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> { >>>> > >> - // We can race with `TagSet::tag_to_rq` >>>> > >> - if let Err(_old) = this.wrapper_ref().refcount().compare_exchange( >>>> > >> - 2, >>>> > >> - 0, >>>> > >> - Ordering::Relaxed, >>>> > >> - Ordering::Relaxed, >>>> > >> - ) { >>>> > >> + // To hand back the ownership, we need the current refcount to be 2. >>>> > >> + // Since we can race with `TagSet::tag_to_rq`, this needs to atomically reduce >>>> > >> + // refcount to 0. `Refcount` does not provide a way to do this, so use the underlying >>>> > >> + // atomics directly. >>>> > >> + if this >>>> > >> + .wrapper_ref() >>>> > >> + .refcount() >>>> > >> + .as_atomic() >>>> > >> + .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed) >>>> > >> + .is_err() >>>> > > >>>> > > Why not just call rust_helper_refcount_set()? Or is the issue that you >>>> > > think you might not be 2 here? And if you HAVE to be 2, why that magic >>>> > > value (i.e. why not just always be 1 and rely on normal >>>> > > increment/decrement?) >>>> > > >>>> > > I know some refcounts are odd in the kernel, but I don't see where the >>>> > > block layer is caring about 2 as a refcount anywhere, what am I missing? >>>> > >>>> > It is in the documentation, rendered version available here [1]. Let me >>>> > know if it is still unclear, then I guess we need to update the docs. >>>> > >>>> > Also, my session from Recipes has a little bit of discussion regarding >>>> > this refcount and it's use [2]. >>>> > >>>> > Best regards, >>>> > Andreas >>>> > >>>> > >>>> > [1] https://rust.docs.kernel.org/kernel/block/mq/struct.Request.html#implementation-details >>>> > [2] https://youtu.be/1LEvgkhU-t4?si=B1XnJhzCCNnUtRsI&t=1685 >>>> >>>> So it sounds like there is one refcount from the C side, and some >>>> number of references from the Rust side. The function checks whether >>>> there's only one Rust reference left, and if so, takes ownership of >>>> the value, correct? >>>> >>>> In that case, the CAS should have an acquire ordering to synchronize >>>> with dropping the refcount 3->2 on another thread. Otherwise, you >>>> might have a data race with the operations that happened just before >>>> the 3->2 refcount drop. >>>> >>>> Alice >>> >>> The code as is is fine since there's no data protected in >>> `RequestDataWrapper` yet (in fact it's not even generic yet). I know >>> Andreas does want to introduce driver-specific data into that, so in >>> the long term the acquire would be necessary. >>> >>> Andreas, please let me know if you want me to make the change now, or >>> you'd rather change the ordering when you introduce data to >>> `RequestDataWrapper`. >> >> I guess we will have said data dependencies when we are going to run >> drop for fields in the private data area. Thanks for pointing that out. >> I will update the ordering when I submit that patch. >> >> As I mentioned before, I would rather we do not apply this patch before >> we get a way to inline helpers. > > As discussed offline, the code that suffers the performance regression > is downstream, and since this change seems to be important, I can apply > the helper LTO patch downstream as well. > > Since the plan for the downstream code _is_ to move upstream, I really > hope to see the helper LTO patch upstream, so we don't get a performance > regression because of these refcounts. > > If we cannot figure out a way to get the LTO patches (or an alternative > solution) upstream, we can always revert back to a more performant > solution in block. I forgot to report the result of the benchmarks. Over the usual benchmark workload that I run for `rnull` I see an average 0.8 percent performance penalty with this patch. For some configurations I see 95% CI N=40 [-18%;-5%]. So it is not insignificant. BR Andreas
On 10.10.24 11:06, Andreas Hindborg wrote: > Andreas Hindborg <a.hindborg@kernel.org> writes: > >> Andreas Hindborg <a.hindborg@kernel.org> writes: >> >>> "Gary Guo" <gary@garyguo.net> writes: >>> >>>> On Sat, 5 Oct 2024 13:59:44 +0200 >>>> Alice Ryhl <aliceryhl@google.com> wrote: >>>> >>>>> On Sat, Oct 5, 2024 at 11:49 AM Andreas Hindborg <a.hindborg@kernel.org> wrote: >>>>>> >>>>>> Hi Greg, >>>>>> >>>>>> "Greg KH" <gregkh@linuxfoundation.org> writes: >>>>>> >>>>>>> On Fri, Oct 04, 2024 at 04:52:24PM +0100, Gary Guo wrote: >>>>>>>> There is an operation needed by `block::mq`, atomically decreasing >>>>>>>> refcount from 2 to 0, which is not available through refcount.h, so >>>>>>>> I exposed `Refcount::as_atomic` which allows accessing the refcount >>>>>>>> directly. >>>>>>> >>>>>>> That's scary, and of course feels wrong on many levels, but: >>>>>>> >>>>>>> >>>>>>>> @@ -91,13 +95,17 @@ pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) { >>>>>>>> /// C `struct request`. If the operation fails, `this` is returned in the >>>>>>>> /// `Err` variant. >>>>>>>> fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> { >>>>>>>> - // We can race with `TagSet::tag_to_rq` >>>>>>>> - if let Err(_old) = this.wrapper_ref().refcount().compare_exchange( >>>>>>>> - 2, >>>>>>>> - 0, >>>>>>>> - Ordering::Relaxed, >>>>>>>> - Ordering::Relaxed, >>>>>>>> - ) { >>>>>>>> + // To hand back the ownership, we need the current refcount to be 2. >>>>>>>> + // Since we can race with `TagSet::tag_to_rq`, this needs to atomically reduce >>>>>>>> + // refcount to 0. `Refcount` does not provide a way to do this, so use the underlying >>>>>>>> + // atomics directly. >>>>>>>> + if this >>>>>>>> + .wrapper_ref() >>>>>>>> + .refcount() >>>>>>>> + .as_atomic() >>>>>>>> + .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed) >>>>>>>> + .is_err() >>>>>>> >>>>>>> Why not just call rust_helper_refcount_set()? Or is the issue that you >>>>>>> think you might not be 2 here? And if you HAVE to be 2, why that magic >>>>>>> value (i.e. why not just always be 1 and rely on normal >>>>>>> increment/decrement?) >>>>>>> >>>>>>> I know some refcounts are odd in the kernel, but I don't see where the >>>>>>> block layer is caring about 2 as a refcount anywhere, what am I missing? >>>>>> >>>>>> It is in the documentation, rendered version available here [1]. Let me >>>>>> know if it is still unclear, then I guess we need to update the docs. >>>>>> >>>>>> Also, my session from Recipes has a little bit of discussion regarding >>>>>> this refcount and it's use [2]. >>>>>> >>>>>> Best regards, >>>>>> Andreas >>>>>> >>>>>> >>>>>> [1] https://rust.docs.kernel.org/kernel/block/mq/struct.Request.html#implementation-details >>>>>> [2] https://youtu.be/1LEvgkhU-t4?si=B1XnJhzCCNnUtRsI&t=1685 >>>>> >>>>> So it sounds like there is one refcount from the C side, and some >>>>> number of references from the Rust side. The function checks whether >>>>> there's only one Rust reference left, and if so, takes ownership of >>>>> the value, correct? >>>>> >>>>> In that case, the CAS should have an acquire ordering to synchronize >>>>> with dropping the refcount 3->2 on another thread. Otherwise, you >>>>> might have a data race with the operations that happened just before >>>>> the 3->2 refcount drop. >>>>> >>>>> Alice >>>> >>>> The code as is is fine since there's no data protected in >>>> `RequestDataWrapper` yet (in fact it's not even generic yet). I know >>>> Andreas does want to introduce driver-specific data into that, so in >>>> the long term the acquire would be necessary. >>>> >>>> Andreas, please let me know if you want me to make the change now, or >>>> you'd rather change the ordering when you introduce data to >>>> `RequestDataWrapper`. >>> >>> I guess we will have said data dependencies when we are going to run >>> drop for fields in the private data area. Thanks for pointing that out. >>> I will update the ordering when I submit that patch. >>> >>> As I mentioned before, I would rather we do not apply this patch before >>> we get a way to inline helpers. >> >> As discussed offline, the code that suffers the performance regression >> is downstream, and since this change seems to be important, I can apply >> the helper LTO patch downstream as well. >> >> Since the plan for the downstream code _is_ to move upstream, I really >> hope to see the helper LTO patch upstream, so we don't get a performance >> regression because of these refcounts. >> >> If we cannot figure out a way to get the LTO patches (or an alternative >> solution) upstream, we can always revert back to a more performant >> solution in block. > > I forgot to report the result of the benchmarks. Over the usual > benchmark workload that I run for `rnull` I see an average 0.8 percent > performance penalty with this patch. For some configurations > I see 95% CI N=40 [-18%;-5%]. So it is not insignificant. Was the benchmark run together with the LTO helper patches? --- Cheers, Benno
"Benno Lossin" <benno.lossin@proton.me> writes: > On 10.10.24 11:06, Andreas Hindborg wrote: >> Andreas Hindborg <a.hindborg@kernel.org> writes: >> >>> Andreas Hindborg <a.hindborg@kernel.org> writes: >>> >>>> "Gary Guo" <gary@garyguo.net> writes: >>>> >>>>> On Sat, 5 Oct 2024 13:59:44 +0200 >>>>> Alice Ryhl <aliceryhl@google.com> wrote: >>>>> >>>>>> On Sat, Oct 5, 2024 at 11:49 AM Andreas Hindborg <a.hindborg@kernel.org> wrote: >>>>>>> >>>>>>> Hi Greg, >>>>>>> >>>>>>> "Greg KH" <gregkh@linuxfoundation.org> writes: >>>>>>> >>>>>>>> On Fri, Oct 04, 2024 at 04:52:24PM +0100, Gary Guo wrote: >>>>>>>>> There is an operation needed by `block::mq`, atomically decreasing >>>>>>>>> refcount from 2 to 0, which is not available through refcount.h, so >>>>>>>>> I exposed `Refcount::as_atomic` which allows accessing the refcount >>>>>>>>> directly. >>>>>>>> >>>>>>>> That's scary, and of course feels wrong on many levels, but: >>>>>>>> >>>>>>>> >>>>>>>>> @@ -91,13 +95,17 @@ pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) { >>>>>>>>> /// C `struct request`. If the operation fails, `this` is returned in the >>>>>>>>> /// `Err` variant. >>>>>>>>> fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> { >>>>>>>>> - // We can race with `TagSet::tag_to_rq` >>>>>>>>> - if let Err(_old) = this.wrapper_ref().refcount().compare_exchange( >>>>>>>>> - 2, >>>>>>>>> - 0, >>>>>>>>> - Ordering::Relaxed, >>>>>>>>> - Ordering::Relaxed, >>>>>>>>> - ) { >>>>>>>>> + // To hand back the ownership, we need the current refcount to be 2. >>>>>>>>> + // Since we can race with `TagSet::tag_to_rq`, this needs to atomically reduce >>>>>>>>> + // refcount to 0. `Refcount` does not provide a way to do this, so use the underlying >>>>>>>>> + // atomics directly. >>>>>>>>> + if this >>>>>>>>> + .wrapper_ref() >>>>>>>>> + .refcount() >>>>>>>>> + .as_atomic() >>>>>>>>> + .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed) >>>>>>>>> + .is_err() >>>>>>>> >>>>>>>> Why not just call rust_helper_refcount_set()? Or is the issue that you >>>>>>>> think you might not be 2 here? And if you HAVE to be 2, why that magic >>>>>>>> value (i.e. why not just always be 1 and rely on normal >>>>>>>> increment/decrement?) >>>>>>>> >>>>>>>> I know some refcounts are odd in the kernel, but I don't see where the >>>>>>>> block layer is caring about 2 as a refcount anywhere, what am I missing? >>>>>>> >>>>>>> It is in the documentation, rendered version available here [1]. Let me >>>>>>> know if it is still unclear, then I guess we need to update the docs. >>>>>>> >>>>>>> Also, my session from Recipes has a little bit of discussion regarding >>>>>>> this refcount and it's use [2]. >>>>>>> >>>>>>> Best regards, >>>>>>> Andreas >>>>>>> >>>>>>> >>>>>>> [1] https://rust.docs.kernel.org/kernel/block/mq/struct.Request.html#implementation-details >>>>>>> [2] https://youtu.be/1LEvgkhU-t4?si=B1XnJhzCCNnUtRsI&t=1685 >>>>>> >>>>>> So it sounds like there is one refcount from the C side, and some >>>>>> number of references from the Rust side. The function checks whether >>>>>> there's only one Rust reference left, and if so, takes ownership of >>>>>> the value, correct? >>>>>> >>>>>> In that case, the CAS should have an acquire ordering to synchronize >>>>>> with dropping the refcount 3->2 on another thread. Otherwise, you >>>>>> might have a data race with the operations that happened just before >>>>>> the 3->2 refcount drop. >>>>>> >>>>>> Alice >>>>> >>>>> The code as is is fine since there's no data protected in >>>>> `RequestDataWrapper` yet (in fact it's not even generic yet). I know >>>>> Andreas does want to introduce driver-specific data into that, so in >>>>> the long term the acquire would be necessary. >>>>> >>>>> Andreas, please let me know if you want me to make the change now, or >>>>> you'd rather change the ordering when you introduce data to >>>>> `RequestDataWrapper`. >>>> >>>> I guess we will have said data dependencies when we are going to run >>>> drop for fields in the private data area. Thanks for pointing that out. >>>> I will update the ordering when I submit that patch. >>>> >>>> As I mentioned before, I would rather we do not apply this patch before >>>> we get a way to inline helpers. >>> >>> As discussed offline, the code that suffers the performance regression >>> is downstream, and since this change seems to be important, I can apply >>> the helper LTO patch downstream as well. >>> >>> Since the plan for the downstream code _is_ to move upstream, I really >>> hope to see the helper LTO patch upstream, so we don't get a performance >>> regression because of these refcounts. >>> >>> If we cannot figure out a way to get the LTO patches (or an alternative >>> solution) upstream, we can always revert back to a more performant >>> solution in block. >> >> I forgot to report the result of the benchmarks. Over the usual >> benchmark workload that I run for `rnull` I see an average 0.8 percent >> performance penalty with this patch. For some configurations >> I see 95% CI N=40 [-18%;-5%]. So it is not insignificant. > > Was the benchmark run together with the LTO helper patches? No, that the effect of applying only this patch set alone. I did apply the helper LTO patches downstream a few times, but I don't carry them in my default tree. But I guess I can start doing that now. Best regards, Andreas
diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs index 9ba7fdfeb4b2..36ee5f96c66d 100644 --- a/rust/kernel/block/mq/operations.rs +++ b/rust/kernel/block/mq/operations.rs @@ -9,9 +9,10 @@ block::mq::request::RequestDataWrapper, block::mq::Request, error::{from_result, Result}, + sync::Refcount, types::ARef, }; -use core::{marker::PhantomData, sync::atomic::AtomicU64, sync::atomic::Ordering}; +use core::marker::PhantomData; /// Implement this trait to interface blk-mq as block devices. /// @@ -77,7 +78,7 @@ impl<T: Operations> OperationsVTable<T> { let request = unsafe { &*(*bd).rq.cast::<Request<T>>() }; // One refcount for the ARef, one for being in flight - request.wrapper_ref().refcount().store(2, Ordering::Relaxed); + request.wrapper_ref().refcount().set(2); // SAFETY: // - We own a refcount that we took above. We pass that to `ARef`. @@ -186,7 +187,7 @@ impl<T: Operations> OperationsVTable<T> { // SAFETY: The refcount field is allocated but not initialized, so // it is valid for writes. - unsafe { RequestDataWrapper::refcount_ptr(pdu.as_ptr()).write(AtomicU64::new(0)) }; + unsafe { RequestDataWrapper::refcount_ptr(pdu.as_ptr()).write(Refcount::new(0)) }; Ok(0) }) diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs index a0e22827f3f4..7b63c02bdce7 100644 --- a/rust/kernel/block/mq/request.rs +++ b/rust/kernel/block/mq/request.rs @@ -8,12 +8,13 @@ bindings, block::mq::Operations, error::Result, + sync::Refcount, types::{ARef, AlwaysRefCounted, Opaque}, }; use core::{ marker::PhantomData, ptr::{addr_of_mut, NonNull}, - sync::atomic::{AtomicU64, Ordering}, + sync::atomic::Ordering, }; /// A wrapper around a blk-mq `struct request`. This represents an IO request. @@ -37,6 +38,9 @@ /// We need to track C and D to ensure that it is safe to end the request and hand /// back ownership to the block layer. /// +/// Note that driver can still obtain new `ARef` even there is no `ARef`s in existence by using +/// `tag_to_rq`, hence the need to distinct B and C. +/// /// The states are tracked through the private `refcount` field of /// `RequestDataWrapper`. This structure lives in the private data area of the C /// `struct request`. @@ -91,13 +95,17 @@ pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) { /// C `struct request`. If the operation fails, `this` is returned in the /// `Err` variant. fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> { - // We can race with `TagSet::tag_to_rq` - if let Err(_old) = this.wrapper_ref().refcount().compare_exchange( - 2, - 0, - Ordering::Relaxed, - Ordering::Relaxed, - ) { + // To hand back the ownership, we need the current refcount to be 2. + // Since we can race with `TagSet::tag_to_rq`, this needs to atomically reduce + // refcount to 0. `Refcount` does not provide a way to do this, so use the underlying + // atomics directly. + if this + .wrapper_ref() + .refcount() + .as_atomic() + .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed) + .is_err() + { return Err(this); } @@ -159,13 +167,13 @@ pub(crate) struct RequestDataWrapper { /// - 0: The request is owned by C block layer. /// - 1: The request is owned by Rust abstractions but there are no ARef references to it. /// - 2+: There are `ARef` references to the request. - refcount: AtomicU64, + refcount: Refcount, } impl RequestDataWrapper { /// Return a reference to the refcount of the request that is embedding /// `self`. - pub(crate) fn refcount(&self) -> &AtomicU64 { + pub(crate) fn refcount(&self) -> &Refcount { &self.refcount } @@ -175,7 +183,7 @@ pub(crate) fn refcount(&self) -> &AtomicU64 { /// # Safety /// /// - `this` must point to a live allocation of at least the size of `Self`. - pub(crate) unsafe fn refcount_ptr(this: *mut Self) -> *mut AtomicU64 { + pub(crate) unsafe fn refcount_ptr(this: *mut Self) -> *mut Refcount { // SAFETY: Because of the safety requirements of this function, the // field projection is safe. unsafe { addr_of_mut!((*this).refcount) } @@ -191,47 +199,13 @@ unsafe impl<T: Operations> Send for Request<T> {} // mutate `self` are internally synchronized` unsafe impl<T: Operations> Sync for Request<T> {} -/// Store the result of `op(target.load())` in target, returning new value of -/// target. -fn atomic_relaxed_op_return(target: &AtomicU64, op: impl Fn(u64) -> u64) -> u64 { - let old = target.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |x| Some(op(x))); - - // SAFETY: Because the operation passed to `fetch_update` above always - // return `Some`, `old` will always be `Ok`. - let old = unsafe { old.unwrap_unchecked() }; - - op(old) -} - -/// Store the result of `op(target.load)` in `target` if `target.load() != -/// pred`, returning true if the target was updated. -fn atomic_relaxed_op_unless(target: &AtomicU64, op: impl Fn(u64) -> u64, pred: u64) -> bool { - target - .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |x| { - if x == pred { - None - } else { - Some(op(x)) - } - }) - .is_ok() -} - // SAFETY: All instances of `Request<T>` are reference counted. This // implementation of `AlwaysRefCounted` ensure that increments to the ref count // keeps the object alive in memory at least until a matching reference count // decrement is executed. unsafe impl<T: Operations> AlwaysRefCounted for Request<T> { fn inc_ref(&self) { - let refcount = &self.wrapper_ref().refcount(); - - #[cfg_attr(not(CONFIG_DEBUG_MISC), allow(unused_variables))] - let updated = atomic_relaxed_op_unless(refcount, |x| x + 1, 0); - - #[cfg(CONFIG_DEBUG_MISC)] - if !updated { - panic!("Request refcount zero on clone") - } + self.wrapper_ref().refcount().inc(); } unsafe fn dec_ref(obj: core::ptr::NonNull<Self>) { @@ -243,10 +217,10 @@ unsafe fn dec_ref(obj: core::ptr::NonNull<Self>) { let refcount = unsafe { &*RequestDataWrapper::refcount_ptr(wrapper_ptr) }; #[cfg_attr(not(CONFIG_DEBUG_MISC), allow(unused_variables))] - let new_refcount = atomic_relaxed_op_return(refcount, |x| x - 1); + let is_zero = refcount.dec_and_test(); #[cfg(CONFIG_DEBUG_MISC)] - if new_refcount == 0 { + if is_zero { panic!("Request reached refcount zero in Rust abstractions"); } } diff --git a/rust/kernel/sync/refcount.rs b/rust/kernel/sync/refcount.rs index 4a5b815adc05..2cdcb20e8ee2 100644 --- a/rust/kernel/sync/refcount.rs +++ b/rust/kernel/sync/refcount.rs @@ -31,6 +31,18 @@ fn as_ptr(&self) -> *mut bindings::refcount_t { self.0.get() } + /// Get the underlying atomic counter that backs the refcount. + /// + /// NOTE: This will be changed to LKMM atomic in the future. + #[inline] + pub fn as_atomic(&self) -> &AtomicI32 { + let ptr = self.0.get() as *const AtomicI32; + // SAFETY: `refcount_t` is a transparent wrapper of `atomic_t`, which is an atomic 32-bit + // integer that is layout-wise compatible with `AtomicI32`. All values are valid for + // `refcount_t`, despite some of the values are considered saturated and "bad". + unsafe { &*ptr } + } + /// Get a refcount's value. #[inline] pub fn read(&self) -> i32 {
Currently there's a custom reference counting in `block::mq`, which uses `AtomicU64` Rust atomics, and this type doesn't exist on some 32-bit architectures. We cannot just change it to use 32-bit atomics, because doing so will make it vulnerable to refcount overflow. So switch it to use the kernel refcount `kernel::sync::Refcount` instead. There is an operation needed by `block::mq`, atomically decreasing refcount from 2 to 0, which is not available through refcount.h, so I exposed `Refcount::as_atomic` which allows accessing the refcount directly. Cc: Will Deacon <will@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Gary Guo <gary@garyguo.net> --- rust/kernel/block/mq/operations.rs | 7 +-- rust/kernel/block/mq/request.rs | 70 ++++++++++-------------------- rust/kernel/sync/refcount.rs | 12 +++++ 3 files changed, 38 insertions(+), 51 deletions(-)