diff mbox series

[net-next,v7,1/5] rust: core abstractions for network PHY drivers

Message ID 20231026001050.1720612-2-fujita.tomonori@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series Rust abstractions for network PHY drivers | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 9 this patch: 9
netdev/cc_maintainers warning 15 maintainers not CCed: pabeni@redhat.com linux@armlinux.org.uk edumazet@google.com aakashsensharma@gmail.com alex.gaynor@gmail.com kuba@kernel.org aliceryhl@google.com bjorn3_gh@protonmail.com gary@garyguo.net davem@davemloft.net yakoyoku@gmail.com vincenzopalazzodev@gmail.com hkallweit1@gmail.com a.hindborg@samsung.com boqun.feng@gmail.com
netdev/build_clang success Errors and warnings before: 9 this patch: 9
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 9 this patch: 9
netdev/checkpatch warning WARNING: added, moved or deleted file(s), does MAINTAINERS need updating? WARNING: line length of 110 exceeds 80 columns WARNING: line length of 81 exceeds 80 columns WARNING: line length of 82 exceeds 80 columns WARNING: line length of 85 exceeds 80 columns WARNING: line length of 86 exceeds 80 columns WARNING: line length of 87 exceeds 80 columns WARNING: line length of 88 exceeds 80 columns WARNING: line length of 89 exceeds 80 columns WARNING: line length of 90 exceeds 80 columns WARNING: line length of 91 exceeds 80 columns WARNING: line length of 92 exceeds 80 columns WARNING: line length of 94 exceeds 80 columns WARNING: line length of 95 exceeds 80 columns WARNING: line length of 96 exceeds 80 columns WARNING: line length of 97 exceeds 80 columns WARNING: line length of 98 exceeds 80 columns WARNING: line length of 99 exceeds 80 columns WARNING: please write a help paragraph that fully describes the config symbol
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

FUJITA Tomonori Oct. 26, 2023, 12:10 a.m. UTC
This patch adds abstractions to implement network PHY drivers; the
driver registration and bindings for some of callback functions in
struct phy_driver and many genphy_ functions.

This feature is enabled with CONFIG_RUST_PHYLIB_ABSTRACTIONS=y.

This patch enables unstable const_maybe_uninit_zeroed feature for
kernel crate to enable unsafe code to handle a constant value with
uninitialized data. With the feature, the abstractions can initialize
a phy_driver structure with zero easily; instead of initializing all
the members by hand. It's supposed to be stable in the not so distant
future.

Link: https://github.com/rust-lang/rust/pull/116218

Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
---
 drivers/net/phy/Kconfig         |   8 +
 rust/bindings/bindings_helper.h |   3 +
 rust/kernel/lib.rs              |   3 +
 rust/kernel/net.rs              |   6 +
 rust/kernel/net/phy.rs          | 725 ++++++++++++++++++++++++++++++++
 5 files changed, 745 insertions(+)
 create mode 100644 rust/kernel/net.rs
 create mode 100644 rust/kernel/net/phy.rs

Comments

Boqun Feng Oct. 27, 2023, 7:09 p.m. UTC | #1
On Thu, Oct 26, 2023 at 09:10:46AM +0900, FUJITA Tomonori wrote:
[...]
> +config RUST_PHYLIB_ABSTRACTIONS
> +        bool "PHYLIB abstractions support"

bool "Rust PHYLIB abstractions support"

maybe? Make it easier for menuconfig users to know this is the option
to enable Rust support.

Regards,
Boqun

> +        depends on RUST
> +        depends on PHYLIB=y
> +        help
> +          Adds support needed for PHY drivers written in Rust. It provides
> +          a wrapper around the C phylib core.
> +
[...]
Boqun Feng Oct. 27, 2023, 7:59 p.m. UTC | #2
On Thu, Oct 26, 2023 at 09:10:46AM +0900, FUJITA Tomonori wrote:
[...]
> +    /// Gets the current link state.
> +    ///
> +    /// It returns true if the link is up.
> +    pub fn is_link_up(&self) -> bool {
> +        const LINK_IS_UP: u32 = 1;
> +        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
> +        let phydev = unsafe { *self.0.get() };

Tomo, FWIW, the above line means *copying* the content pointed by
`self.0.get()` into `phydev`, i.e. `phydev` is the semantically a copy
of the `phy_device` instead of an alias. In C code, it means you did:

	struct phy_device phydev = *ptr;

Sure, both compilers can figure this out, therefore no extra copy is
done, but still it's better to avoid this copy semantics by doing:

	let phydev = unsafe { &*self.0.get() };

it's equal to C code:

	struct phy_device *phydev = ptr;

Ditto for is_autoneg_enabled() and is_autoneg_completed().

Regards,
Boqun

> +        phydev.link() == LINK_IS_UP
> +    }
> +
> +    /// Gets the current auto-negotiation configuration.
> +    ///
> +    /// It returns true if auto-negotiation is enabled.
> +    pub fn is_autoneg_enabled(&self) -> bool {
> +        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
> +        let phydev = unsafe { *self.0.get() };
> +        phydev.autoneg() == bindings::AUTONEG_ENABLE
> +    }
> +
> +    /// Gets the current auto-negotiation state.
> +    ///
> +    /// It returns true if auto-negotiation is completed.
> +    pub fn is_autoneg_completed(&self) -> bool {
> +        const AUTONEG_COMPLETED: u32 = 1;
> +        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
> +        let phydev = unsafe { *self.0.get() };
> +        phydev.autoneg_complete() == AUTONEG_COMPLETED
> +    }
[...]
Benno Lossin Oct. 27, 2023, 9:19 p.m. UTC | #3
On 10/27/23 21:59, Boqun Feng wrote:
> On Thu, Oct 26, 2023 at 09:10:46AM +0900, FUJITA Tomonori wrote:
> [...]
>> +    /// Gets the current link state.
>> +    ///
>> +    /// It returns true if the link is up.
>> +    pub fn is_link_up(&self) -> bool {
>> +        const LINK_IS_UP: u32 = 1;
>> +        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
>> +        let phydev = unsafe { *self.0.get() };
> 
> Tomo, FWIW, the above line means *copying* the content pointed by
> `self.0.get()` into `phydev`, i.e. `phydev` is the semantically a copy
> of the `phy_device` instead of an alias. In C code, it means you did:

Good catch. `phy_device` is rather large (did not look at the exact
size) and this will not be optimized on debug builds, so it could lead
to stackoverflows.

> 	struct phy_device phydev = *ptr;
> 
> Sure, both compilers can figure this out, therefore no extra copy is
> done, but still it's better to avoid this copy semantics by doing:
> 
> 	let phydev = unsafe { &*self.0.get() };

We need to be careful here, since doing this creates a reference
`&bindings::phy_device` which asserts that it is immutable. That is not
the case, since the C side might change it at any point (this is the
reason we wrap things in `Opaque`, since that allows mutatation even
through sharde references).

I did not notice this before, but this means we cannot use the `link`
function from bindgen, since that takes `&self`. We would need a
function that takes `*const Self` instead.
Boqun Feng Oct. 27, 2023, 10:21 p.m. UTC | #4
On Fri, Oct 27, 2023 at 09:19:38PM +0000, Benno Lossin wrote:
[...]
> 
> Good catch. `phy_device` is rather large (did not look at the exact
> size) and this will not be optimized on debug builds, so it could lead
> to stackoverflows.
> 

IIRC, kernel only supports O2 build, but yes, if we don't want the copy
here, we should avoid the copy semantics.

> > 	struct phy_device phydev = *ptr;
> > 
> > Sure, both compilers can figure this out, therefore no extra copy is
> > done, but still it's better to avoid this copy semantics by doing:
> > 
> > 	let phydev = unsafe { &*self.0.get() };
> 
> We need to be careful here, since doing this creates a reference
> `&bindings::phy_device` which asserts that it is immutable. That is not
> the case, since the C side might change it at any point (this is the
> reason we wrap things in `Opaque`, since that allows mutatation even
> through sharde references).
> 
> I did not notice this before, but this means we cannot use the `link`
> function from bindgen, since that takes `&self`. We would need a
> function that takes `*const Self` instead.
> 

Hmm... but does it mean even `set_speed()` has the similar issue?

	let phydev: *mut phy_device = self.0.get();
	unsafe { (*phydev).speed = ...; }

The `(*phydev)` creates a `&mut` IIUC. So we need the following maybe?

	let phydev: *mut phy_device = self.0.get();
	unsafe { *addr_mut_of!((*phydev).speed) = ...; }

because at least from phylib core guarantee, we know no one accessing
`speed` in the same time. However, yes, bit fields are tricky...

Regards,
Boqun

> -- 
> Cheers,
> Benno
>
Andrew Lunn Oct. 27, 2023, 10:36 p.m. UTC | #5
> Hmm... but does it mean even `set_speed()` has the similar issue?
> 
> 	let phydev: *mut phy_device = self.0.get();
> 	unsafe { (*phydev).speed = ...; }
> 
> The `(*phydev)` creates a `&mut` IIUC. So we need the following maybe?
> 
> 	let phydev: *mut phy_device = self.0.get();
> 	unsafe { *addr_mut_of!((*phydev).speed) = ...; }
> 
> because at least from phylib core guarantee, we know no one accessing
> `speed` in the same time. However, yes, bit fields are tricky...

Speed is not a bitfield. Its a plain boring int. link is however a bit
field.

	Andrew
Andrew Lunn Oct. 27, 2023, 10:40 p.m. UTC | #6
> I did not notice this before, but this means we cannot use the `link`
> function from bindgen, since that takes `&self`. We would need a
> function that takes `*const Self` instead.

After the discussion about mutability, i took a look at the C code,
and started adding const to functions which take phydev, but don't
modify it. Does bindgen look for such const attributes? Does it make a
difference to be binding?

	Andrew
Benno Lossin Oct. 27, 2023, 10:50 p.m. UTC | #7
On 10/28/23 00:21, Boqun Feng wrote:
> On Fri, Oct 27, 2023 at 09:19:38PM +0000, Benno Lossin wrote:
>> We need to be careful here, since doing this creates a reference
>> `&bindings::phy_device` which asserts that it is immutable. That is not
>> the case, since the C side might change it at any point (this is the
>> reason we wrap things in `Opaque`, since that allows mutatation even
>> through sharde references).
>>
>> I did not notice this before, but this means we cannot use the `link`
>> function from bindgen, since that takes `&self`. We would need a
>> function that takes `*const Self` instead.
>>
> 
> Hmm... but does it mean even `set_speed()` has the similar issue?
> 
> 	let phydev: *mut phy_device = self.0.get();
> 	unsafe { (*phydev).speed = ...; }

No that should be fine, take a look at the MIR output of the following 
code [1]:

    struct Foo {
        a: usize,
        b: usize,
    }
    
    fn foo(ptr: *mut Foo) {
        unsafe { (*ptr).b = 0; }
    }
    
    fn bar(ptr: *mut Foo) {
        unsafe { (&mut *ptr).b = 0; }
    }

Aside from some alignment checking, foo's MIR looks like this:

    bb1: {
        ((*_1).1: usize) = const 0_usize;
        return;
    }

And bar's MIR like this:

    bb1: {
        _2 = &mut (*_1);
        ((*_2).1: usize) = const 0_usize;
        return;
    }

[1]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f7c4d87bf29a64af0acc09ff75d3716d

So I think that is fine, but maybe Gary has something else to say about it.

> The `(*phydev)` creates a `&mut` IIUC. So we need the following maybe?
> 
> 	let phydev: *mut phy_device = self.0.get();
> 	unsafe { *addr_mut_of!((*phydev).speed) = ...; }
> 
> because at least from phylib core guarantee, we know no one accessing
> `speed` in the same time. However, yes, bit fields are tricky...
Boqun Feng Oct. 27, 2023, 11:26 p.m. UTC | #8
On Fri, Oct 27, 2023 at 10:50:45PM +0000, Benno Lossin wrote:
[...]
> > 
> > Hmm... but does it mean even `set_speed()` has the similar issue?
> > 
> > 	let phydev: *mut phy_device = self.0.get();
> > 	unsafe { (*phydev).speed = ...; }
> 
> No that should be fine, take a look at the MIR output of the following 
> code [1]:
> 
>     struct Foo {
>         a: usize,
>         b: usize,
>     }
>     
>     fn foo(ptr: *mut Foo) {
>         unsafe { (*ptr).b = 0; }
>     }
>     
>     fn bar(ptr: *mut Foo) {
>         unsafe { (&mut *ptr).b = 0; }
>     }
> 
> Aside from some alignment checking, foo's MIR looks like this:
> 
>     bb1: {
>         ((*_1).1: usize) = const 0_usize;
>         return;
>     }
> 
> And bar's MIR like this:
> 
>     bb1: {
>         _2 = &mut (*_1);
>         ((*_2).1: usize) = const 0_usize;
>         return;
>     }
> 
> [1]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f7c4d87bf29a64af0acc09ff75d3716d
> 
> So I think that is fine, but maybe Gary has something else to say about it.
> 

Well when `-C opt-level=2`, they are the same:

	https://godbolt.org/z/hxxo75YYh

Regards,
Boqun
Boqun Feng Oct. 27, 2023, 11:52 p.m. UTC | #9
[...]
> > [1]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f7c4d87bf29a64af0acc09ff75d3716d
> > 
> > So I think that is fine, but maybe Gary has something else to say about it.
> > 
> 
> Well when `-C opt-level=2`, they are the same:
> 
> 	https://godbolt.org/z/hxxo75YYh

But maybe you are right, no temporary reference in this case.

Regards,
Boqun
Benno Lossin Oct. 28, 2023, 8:35 a.m. UTC | #10
On 28.10.23 01:26, Boqun Feng wrote:
> On Fri, Oct 27, 2023 at 10:50:45PM +0000, Benno Lossin wrote:
> [...]
>>>
>>> Hmm... but does it mean even `set_speed()` has the similar issue?
>>>
>>> 	let phydev: *mut phy_device = self.0.get();
>>> 	unsafe { (*phydev).speed = ...; }
>>
>> No that should be fine, take a look at the MIR output of the following
>> code [1]:
>>
>>      struct Foo {
>>          a: usize,
>>          b: usize,
>>      }
>>
>>      fn foo(ptr: *mut Foo) {
>>          unsafe { (*ptr).b = 0; }
>>      }
>>
>>      fn bar(ptr: *mut Foo) {
>>          unsafe { (&mut *ptr).b = 0; }
>>      }
>>
>> Aside from some alignment checking, foo's MIR looks like this:
>>
>>      bb1: {
>>          ((*_1).1: usize) = const 0_usize;
>>          return;
>>      }
>>
>> And bar's MIR like this:
>>
>>      bb1: {
>>          _2 = &mut (*_1);
>>          ((*_2).1: usize) = const 0_usize;
>>          return;
>>      }
>>
>> [1]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f7c4d87bf29a64af0acc09ff75d3716d
>>
>> So I think that is fine, but maybe Gary has something else to say about it.
>>
> 
> Well when `-C opt-level=2`, they are the same:
> 
> 	https://godbolt.org/z/hxxo75YYh

It doesn't matter what the optimizer does, `bar` is unsound in our use-case
and `foo` is fine (I think).
FUJITA Tomonori Oct. 28, 2023, 9:27 a.m. UTC | #11
On Fri, 27 Oct 2023 21:19:38 +0000
Benno Lossin <benno.lossin@proton.me> wrote:

> On 10/27/23 21:59, Boqun Feng wrote:
>> On Thu, Oct 26, 2023 at 09:10:46AM +0900, FUJITA Tomonori wrote:
>> [...]
>>> +    /// Gets the current link state.
>>> +    ///
>>> +    /// It returns true if the link is up.
>>> +    pub fn is_link_up(&self) -> bool {
>>> +        const LINK_IS_UP: u32 = 1;
>>> +        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
>>> +        let phydev = unsafe { *self.0.get() };
>> 
>> Tomo, FWIW, the above line means *copying* the content pointed by
>> `self.0.get()` into `phydev`, i.e. `phydev` is the semantically a copy
>> of the `phy_device` instead of an alias. In C code, it means you did:
> 
> Good catch. `phy_device` is rather large (did not look at the exact
> size) and this will not be optimized on debug builds, so it could lead
> to stackoverflows.
> 
>> 	struct phy_device phydev = *ptr;
>> 
>> Sure, both compilers can figure this out, therefore no extra copy is
>> done, but still it's better to avoid this copy semantics by doing:
>> 
>> 	let phydev = unsafe { &*self.0.get() };
> 
> We need to be careful here, since doing this creates a reference
> `&bindings::phy_device` which asserts that it is immutable. That is not
> the case, since the C side might change it at any point (this is the
> reason we wrap things in `Opaque`, since that allows mutatation even
> through sharde references).

You meant that the C code might modify it independently anytime, not
the C code called the Rust abstractions might modify it, right?


> I did not notice this before, but this means we cannot use the `link`
> function from bindgen, since that takes `&self`. We would need a
> function that takes `*const Self` instead.

Implementing functions to access to a bitfield looks tricky so we need
to add such feature to bindgen or we add getters to the C side?
FUJITA Tomonori Oct. 28, 2023, 10 a.m. UTC | #12
On Fri, 27 Oct 2023 12:09:41 -0700
Boqun Feng <boqun.feng@gmail.com> wrote:

> On Thu, Oct 26, 2023 at 09:10:46AM +0900, FUJITA Tomonori wrote:
> [...]
>> +config RUST_PHYLIB_ABSTRACTIONS
>> +        bool "PHYLIB abstractions support"
> 
> bool "Rust PHYLIB abstractions support"
> 
> maybe? Make it easier for menuconfig users to know this is the option
> to enable Rust support.

Surely, updated.

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 11b18370a05b..1fa84a188bcc 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -61,7 +61,7 @@ config FIXED_PHY
 	  Currently tested with mpc866ads and mpc8349e-mitx.
 
 config RUST_PHYLIB_ABSTRACTIONS
-        bool "PHYLIB abstractions support"
+        bool "Rust PHYLIB abstractions support"
         depends on RUST
         depends on PHYLIB=y
         help
Andrew Lunn Oct. 28, 2023, 2:53 p.m. UTC | #13
> > We need to be careful here, since doing this creates a reference
> > `&bindings::phy_device` which asserts that it is immutable. That is not
> > the case, since the C side might change it at any point (this is the
> > reason we wrap things in `Opaque`, since that allows mutatation even
> > through sharde references).
> 
> You meant that the C code might modify it independently anytime, not
> the C code called the Rust abstractions might modify it, right?

The whole locking model is base around that not happening. Things
should only change with the lock held. I you make a call into the C
side, then yes, it can and will change it. So you should not cache a
value over a C call.
 
 	Andrew
Miguel Ojeda Oct. 28, 2023, 3:16 p.m. UTC | #14
On Sat, Oct 28, 2023 at 12:40 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> After the discussion about mutability, i took a look at the C code,
> and started adding const to functions which take phydev, but don't
> modify it. Does bindgen look for such const attributes? Does it make a
> difference to be binding?

I think you are referring to the `const` type qualifier, not the
attribute (which the kernel uses too).

But yes, it makes a difference in the output it generates (if we are
talking about the pointed-to), e.g.

    void f(struct S *);       // *mut S
    void f(const struct S *); // *const S

Being const-correct would be a good change for C in any case.

Cheers,
Miguel
FUJITA Tomonori Oct. 28, 2023, 4:09 p.m. UTC | #15
On Sat, 28 Oct 2023 16:53:30 +0200
Andrew Lunn <andrew@lunn.ch> wrote:

>> > We need to be careful here, since doing this creates a reference
>> > `&bindings::phy_device` which asserts that it is immutable. That is not
>> > the case, since the C side might change it at any point (this is the
>> > reason we wrap things in `Opaque`, since that allows mutatation even
>> > through sharde references).
>> 
>> You meant that the C code might modify it independently anytime, not
>> the C code called the Rust abstractions might modify it, right?
> 
> The whole locking model is base around that not happening. Things
> should only change with the lock held. I you make a call into the C
> side, then yes, it can and will change it. So you should not cache a
> value over a C call.

Yeah, I understand that. But if I understand Benno correctly, from
Rust perspective, such might happen.
Benno Lossin Oct. 28, 2023, 4:37 p.m. UTC | #16
On 28.10.23 11:27, FUJITA Tomonori wrote:
> On Fri, 27 Oct 2023 21:19:38 +0000
> Benno Lossin <benno.lossin@proton.me> wrote:
>> I did not notice this before, but this means we cannot use the `link`
>> function from bindgen, since that takes `&self`. We would need a
>> function that takes `*const Self` instead.
> 
> Implementing functions to access to a bitfield looks tricky so we need
> to add such feature to bindgen or we add getters to the C side?

Indeed, I just opened an issue [1] on the bindgen repo.

[1]: https://github.com/rust-lang/rust-bindgen/issues/2674
Benno Lossin Oct. 28, 2023, 4:39 p.m. UTC | #17
On 28.10.23 18:09, FUJITA Tomonori wrote:
> On Sat, 28 Oct 2023 16:53:30 +0200
> Andrew Lunn <andrew@lunn.ch> wrote:
> 
>>>> We need to be careful here, since doing this creates a reference
>>>> `&bindings::phy_device` which asserts that it is immutable. That is not
>>>> the case, since the C side might change it at any point (this is the
>>>> reason we wrap things in `Opaque`, since that allows mutatation even
>>>> through sharde references).
>>>
>>> You meant that the C code might modify it independently anytime, not
>>> the C code called the Rust abstractions might modify it, right?
>>
>> The whole locking model is base around that not happening. Things
>> should only change with the lock held. I you make a call into the C
>> side, then yes, it can and will change it. So you should not cache a
>> value over a C call.
> 
> Yeah, I understand that. But if I understand Benno correctly, from
> Rust perspective, such might happen.

Yes, that is what I meant. Sure the C side might never modify the
value, but this is not good enough for Rust. It must somehow be ensured
that it never is modified, in order for us to rely on it.
Andrew Lunn Oct. 28, 2023, 6:18 p.m. UTC | #18
> But yes, it makes a difference in the output it generates (if we are
> talking about the pointed-to), e.g.
> 
>     void f(struct S *);       // *mut S
>     void f(const struct S *); // *const S
> 
> Being const-correct would be a good change for C in any case.

I have a patchset which changes a few functions to use const struct
*phydev. I will post them next cycle. They are a bit invasive, so i
don't want to do it now, this close to the merge window.

      Andrew
Andrew Lunn Oct. 28, 2023, 6:23 p.m. UTC | #19
On Sat, Oct 28, 2023 at 04:37:53PM +0000, Benno Lossin wrote:
> On 28.10.23 11:27, FUJITA Tomonori wrote:
> > On Fri, 27 Oct 2023 21:19:38 +0000
> > Benno Lossin <benno.lossin@proton.me> wrote:
> >> I did not notice this before, but this means we cannot use the `link`
> >> function from bindgen, since that takes `&self`. We would need a
> >> function that takes `*const Self` instead.
> > 
> > Implementing functions to access to a bitfield looks tricky so we need
> > to add such feature to bindgen or we add getters to the C side?
> 
> Indeed, I just opened an issue [1] on the bindgen repo.
> 
> [1]: https://github.com/rust-lang/rust-bindgen/issues/2674

Please could you help me understand the consequences here. Are you
saying the rust toolchain is fatally broken here, it cannot generate
valid code at the moment? As a result we need to wait for a new
version of bindgen?

	Andrew
Benno Lossin Oct. 28, 2023, 6:45 p.m. UTC | #20
On 28.10.23 20:23, Andrew Lunn wrote:
> On Sat, Oct 28, 2023 at 04:37:53PM +0000, Benno Lossin wrote:
>> On 28.10.23 11:27, FUJITA Tomonori wrote:
>>> On Fri, 27 Oct 2023 21:19:38 +0000
>>> Benno Lossin <benno.lossin@proton.me> wrote:
>>>> I did not notice this before, but this means we cannot use the `link`
>>>> function from bindgen, since that takes `&self`. We would need a
>>>> function that takes `*const Self` instead.
>>>
>>> Implementing functions to access to a bitfield looks tricky so we need
>>> to add such feature to bindgen or we add getters to the C side?
>>
>> Indeed, I just opened an issue [1] on the bindgen repo.
>>
>> [1]: https://github.com/rust-lang/rust-bindgen/issues/2674
> 
> Please could you help me understand the consequences here. Are you
> saying the rust toolchain is fatally broken here, it cannot generate
> valid code at the moment? As a result we need to wait for a new
> version of bindgen?
This only affects bitfields, since they require special accessor functions
generated by bindgen, so I would not say that the toolchain is fatally broken.
It also is theoretically possible to manually access the bitfields in a correct
manner, but that is error prone (which is why we use the accessor functions
provided by bindgen).

In this particular case we have three options:
1. wait until bindgen provides a raw accessor function that allows to use
    only raw pointers.
2. create some C helper functions for the bitfield access that will be replaced
    by the bindgen functions once bindgen has updated.
3. Since for the `phy_device` bindings, we only ever call functions while holding
    the `phy_device.lock` lock (at least I think that this is correct) we might be
    able to get away with creating a reference to the object and use the current
    accessor functions anyway.

But for point 3 I will have to consult the others.
Boqun Feng Oct. 28, 2023, 7:06 p.m. UTC | #21
On Sat, Oct 28, 2023 at 04:39:08PM +0000, Benno Lossin wrote:
> On 28.10.23 18:09, FUJITA Tomonori wrote:
> > On Sat, 28 Oct 2023 16:53:30 +0200
> > Andrew Lunn <andrew@lunn.ch> wrote:
> > 
> >>>> We need to be careful here, since doing this creates a reference
> >>>> `&bindings::phy_device` which asserts that it is immutable. That is not
> >>>> the case, since the C side might change it at any point (this is the
> >>>> reason we wrap things in `Opaque`, since that allows mutatation even
> >>>> through sharde references).
> >>>
> >>> You meant that the C code might modify it independently anytime, not
> >>> the C code called the Rust abstractions might modify it, right?
> >>
> >> The whole locking model is base around that not happening. Things
> >> should only change with the lock held. I you make a call into the C

Here is my understanding, I treat references in Rust as a special
pointer, so having a `&bindings::phy_device` means the *entire* object
is immutable unless the fields are interior mutable, for example,
behind a `UnsafeCell` or `Opaque`, examples of interior mutable types
are atomic and locks.

Now since C doesn't have the "interior mutable" concept or these types,
so when bindgen generates the `bindings::phy_device`, none of the fields
of a lock or atomic is marked as `UnsafeCell` or `Opaque`. That's why
`Opaque` is needed for defining `Device`:

	pub struct Device(Opaque<bindings::phy_device);

`Opaque` basically does two things:

1)	tell compilers that the underlying content may be modified (of
	course in some sort of serialization) when there exists a
	`&Device` or `&mut Device`.

2)	only provide `*mut bindings::phy_device`, so accessing the
	underlying object has to use unsafe.

Now let's look back into struct phy_device, it does have a few locks
in it, and at least even with phydev->lock held, the content of
phydev->lock itself can be changed (e.g tick locks), hence it breaks the
requirement of the existence of a `&bindings::phy_device`.

Of course, if we can define our own `bindings::phy_device` or ask
bindgen to automatically add `Opaque` to certain types, then
`&bindings::phy_device` is still possible to use.

If we are OK to not use `&bindings::phy_device` then Benno's proposal in
bindgen is one way to work with this.

Regards,
Boqun

> >> side, then yes, it can and will change it. So you should not cache a
> >> value over a C call.
> > 
> > Yeah, I understand that. But if I understand Benno correctly, from
> > Rust perspective, such might happen.
> 
> Yes, that is what I meant. Sure the C side might never modify the
> value, but this is not good enough for Rust. It must somehow be ensured
> that it never is modified, in order for us to rely on it.
> 
> -- 
> Cheers,
> Benno
> 
>
Andrew Lunn Oct. 28, 2023, 7:23 p.m. UTC | #22
> Now let's look back into struct phy_device, it does have a few locks
> in it, and at least even with phydev->lock held, the content of
> phydev->lock itself can be changed (e.g tick locks), hence it breaks the
> requirement of the existence of a `&bindings::phy_device`.

tick locks appear to be a Rust thing, so are unlikely to appear in a C
structure. However, kernel C mutex does have a linked list of other
threads waiting for the mutex. So phydev->lock can change at any time,
even when held.

	Andrew
Boqun Feng Oct. 28, 2023, 11:26 p.m. UTC | #23
On Sat, Oct 28, 2023 at 09:23:25PM +0200, Andrew Lunn wrote:
> > Now let's look back into struct phy_device, it does have a few locks
> > in it, and at least even with phydev->lock held, the content of
> > phydev->lock itself can be changed (e.g tick locks), hence it breaks the
> > requirement of the existence of a `&bindings::phy_device`.
> 
> tick locks appear to be a Rust thing, so are unlikely to appear in a C

Ah, I meant ticket locks... same here a mutex has a wait_lock which can
be implemented by ticket locks or queued spin locks, so the u32 lock
field may change even with lock held.

Regards,
Boqun

> structure. However, kernel C mutex does have a linked list of other
> threads waiting for the mutex. So phydev->lock can change at any time,
> even when held.
> 
> 	Andrew
FUJITA Tomonori Oct. 29, 2023, 4:21 a.m. UTC | #24
On Sat, 28 Oct 2023 18:45:40 +0000
Benno Lossin <benno.lossin@proton.me> wrote:

> On 28.10.23 20:23, Andrew Lunn wrote:
>> On Sat, Oct 28, 2023 at 04:37:53PM +0000, Benno Lossin wrote:
>>> On 28.10.23 11:27, FUJITA Tomonori wrote:
>>>> On Fri, 27 Oct 2023 21:19:38 +0000
>>>> Benno Lossin <benno.lossin@proton.me> wrote:
>>>>> I did not notice this before, but this means we cannot use the `link`
>>>>> function from bindgen, since that takes `&self`. We would need a
>>>>> function that takes `*const Self` instead.
>>>>
>>>> Implementing functions to access to a bitfield looks tricky so we need
>>>> to add such feature to bindgen or we add getters to the C side?
>>>
>>> Indeed, I just opened an issue [1] on the bindgen repo.
>>>
>>> [1]: https://github.com/rust-lang/rust-bindgen/issues/2674
>> 
>> Please could you help me understand the consequences here. Are you
>> saying the rust toolchain is fatally broken here, it cannot generate
>> valid code at the moment? As a result we need to wait for a new
>> version of bindgen?
> This only affects bitfields, since they require special accessor functions
> generated by bindgen, so I would not say that the toolchain is fatally broken.
> It also is theoretically possible to manually access the bitfields in a correct
> manner, but that is error prone (which is why we use the accessor functions
> provided by bindgen).
> 
> In this particular case we have three options:
> 1. wait until bindgen provides a raw accessor function that allows to use
>     only raw pointers.
> 2. create some C helper functions for the bitfield access that will be replaced
>     by the bindgen functions once bindgen has updated.
> 3. Since for the `phy_device` bindings, we only ever call functions while holding
>     the `phy_device.lock` lock (at least I think that this is correct) we might be
>     able to get away with creating a reference to the object and use the current
>     accessor functions anyway.
> 
> But for point 3 I will have to consult the others.

The current code is fine from Rust perspective because the current
code copies phy_driver on stack and makes a reference to the copy, if
I undertand correctly.

It's not nice to create an 500-bytes object on stack. It turned out
that it's not so simple to avoid it.
Boqun Feng Oct. 29, 2023, 4:48 p.m. UTC | #25
On Sun, Oct 29, 2023 at 01:21:12PM +0900, FUJITA Tomonori wrote:
[...]
> 
> The current code is fine from Rust perspective because the current
> code copies phy_driver on stack and makes a reference to the copy, if
> I undertand correctly.
> 

I had the same thought Benno brought the issue on `&`, but unfortunately
it's not true ;-) In the following code:

	let phydev = unsafe { *self.0.get() };

, semantically the *whole* `bindings::phy_device` is being read, so if
there is any modification (i.e. write) that may happen in the meanwhile,
it's data race, and data races are UB (even in C).

So both implementations have the problem because of the same cause.

> It's not nice to create an 500-bytes object on stack. It turned out
> that it's not so simple to avoid it.

As you can see, copying is not the way to work around this.

Regards,
Boqun
Andrew Lunn Oct. 29, 2023, 5:32 p.m. UTC | #26
> The current code is fine from Rust perspective because the current
> code copies phy_driver on stack and makes a reference to the copy, if
> I undertand correctly.
> 
> It's not nice to create an 500-bytes object on stack. It turned out
> that it's not so simple to avoid it.

Does it also copy the stack version over the 'real' version before
exiting? If not, it is broken, since modifying state in phy_device is
often why the driver is called. But copying the stack version is also
broken, since another thread taking the phydev->lock is going to get
lost from the linked list of waiters.

Taking a copy of the C structure does seem very odd, to me as a C
programmer.

     Andrew
Boqun Feng Oct. 29, 2023, 6:09 p.m. UTC | #27
On Sun, Oct 29, 2023 at 09:48:41AM -0700, Boqun Feng wrote:
> On Sun, Oct 29, 2023 at 01:21:12PM +0900, FUJITA Tomonori wrote:
> [...]
> > 
> > The current code is fine from Rust perspective because the current
> > code copies phy_driver on stack and makes a reference to the copy, if
> > I undertand correctly.
> > 
> 
> I had the same thought Benno brought the issue on `&`, but unfortunately
> it's not true ;-) In the following code:
> 
> 	let phydev = unsafe { *self.0.get() };
> 
> , semantically the *whole* `bindings::phy_device` is being read, so if
> there is any modification (i.e. write) that may happen in the meanwhile,
> it's data race, and data races are UB (even in C).
> 
> So both implementations have the problem because of the same cause.
> 
> > It's not nice to create an 500-bytes object on stack. It turned out
> > that it's not so simple to avoid it.
> 
> As you can see, copying is not the way to work around this.
> 

An temporary solution is doing the #2 option from Benno, but in Rust and
open code it, like the following:

diff --git a/rust/kernel/net/phy.rs b/rust/kernel/net/phy.rs
index 145d0407fe31..f5230ac48014 100644
--- a/rust/kernel/net/phy.rs
+++ b/rust/kernel/net/phy.rs
@@ -121,10 +121,10 @@ pub fn state(&self) -> DeviceState {
     ///
     /// It returns true if the link is up.
     pub fn is_link_up(&self) -> bool {
-        const LINK_IS_UP: u32 = 1;
+        const LINK_IS_UP: u64 = 1;
         // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
-        let phydev = unsafe { *self.0.get() };
-        phydev.link() == LINK_IS_UP
+        let bit_field = unsafe { &(*self.0.get())._bitfield_1 };
+        bit_field.get(14, 1) == LINK_IS_UP
     }
 
     /// Gets the current auto-negotiation configuration.
@@ -132,18 +132,18 @@ pub fn is_link_up(&self) -> bool {
     /// It returns true if auto-negotiation is enabled.
     pub fn is_autoneg_enabled(&self) -> bool {
         // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
-        let phydev = unsafe { *self.0.get() };
-        phydev.autoneg() == bindings::AUTONEG_ENABLE
+        let bit_field = unsafe { &(*self.0.get())._bitfield_1 };
+        bit_field.get(13, 1) == bindings::AUTONEG_ENABLE as u64
     }
 
     /// Gets the current auto-negotiation state.
     ///
     /// It returns true if auto-negotiation is completed.
     pub fn is_autoneg_completed(&self) -> bool {
-        const AUTONEG_COMPLETED: u32 = 1;
+        const AUTONEG_COMPLETED: u64 = 1;
         // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
-        let phydev = unsafe { *self.0.get() };
-        phydev.autoneg_complete() == AUTONEG_COMPLETED
+        let bit_field = unsafe { &(*self.0.get())._bitfield_1 };
+        bit_field.get(15, 1) == AUTONEG_COMPLETED
     }
 
     /// Sets the speed of the PHY.


Of course, it's not maintainable in longer term since it relies on
hard-coding the bit offset of these bit fields. But I think it's best we
can do from Linux kernel side. It's up to Andrew and Miguel whether this
temporary solution is OK.

Regards,
Boqun
Boqun Feng Oct. 29, 2023, 6:26 p.m. UTC | #28
On Sun, Oct 29, 2023 at 11:09:17AM -0700, Boqun Feng wrote:
[...]
> Of course, it's not maintainable in longer term since it relies on
> hard-coding the bit offset of these bit fields. But I think it's best we
> can do from Linux kernel side.

Hmm.. I guess I should have added "other than creating EXPORTed C
accessors for these bit fields".

Regards,
Boqun
Andrew Lunn Oct. 29, 2023, 7:39 p.m. UTC | #29
> Of course, it's not maintainable in longer term since it relies on
> hard-coding the bit offset of these bit fields. But I think it's best we
> can do from Linux kernel side. It's up to Andrew and Miguel whether this
> temporary solution is OK.

What is the likely time scale for getting a new version of bindgen
with the necessary bit field support?

We have probably missed the merge window for v6.7. So there is around
9 weeks to the next merge window. If a working bindgen is likely to be
available by then, we should not bother with a workaround, just wait.

If you think bindgen is going to take longer than that, then we should
consider workarounds. But we have no rush, its still 9 weeks before we
need working code.

Zooming out a bit. Is there any other in mainline Rust code, or about
to be submitted for this merge window code, using bit fields? I assume
that is equally broken?

	Andrew
FUJITA Tomonori Oct. 29, 2023, 10:58 p.m. UTC | #30
On Sun, 29 Oct 2023 09:48:41 -0700
Boqun Feng <boqun.feng@gmail.com> wrote:

> On Sun, Oct 29, 2023 at 01:21:12PM +0900, FUJITA Tomonori wrote:
> [...]
>> 
>> The current code is fine from Rust perspective because the current
>> code copies phy_driver on stack and makes a reference to the copy, if
>> I undertand correctly.
>> 
> 
> I had the same thought Benno brought the issue on `&`, but unfortunately
> it's not true ;-) In the following code:
> 
> 	let phydev = unsafe { *self.0.get() };
> 
> , semantically the *whole* `bindings::phy_device` is being read, so if
> there is any modification (i.e. write) that may happen in the meanwhile,
> it's data race, and data races are UB (even in C).

Benno said so? I'm not sure about the logic (whole v.s. partial). Even
if you read partially, the part might be modified by the C side during
reading.

For me, the issue is that creating &T for an object that might be
modified.
Boqun Feng Oct. 30, 2023, 12:19 a.m. UTC | #31
On Mon, Oct 30, 2023 at 07:58:52AM +0900, FUJITA Tomonori wrote:
> On Sun, 29 Oct 2023 09:48:41 -0700
> Boqun Feng <boqun.feng@gmail.com> wrote:
> 
> > On Sun, Oct 29, 2023 at 01:21:12PM +0900, FUJITA Tomonori wrote:
> > [...]
> >> 
> >> The current code is fine from Rust perspective because the current
> >> code copies phy_driver on stack and makes a reference to the copy, if
> >> I undertand correctly.
> >> 
> > 
> > I had the same thought Benno brought the issue on `&`, but unfortunately
> > it's not true ;-) In the following code:
> > 
> > 	let phydev = unsafe { *self.0.get() };
> > 
> > , semantically the *whole* `bindings::phy_device` is being read, so if
> > there is any modification (i.e. write) that may happen in the meanwhile,
> > it's data race, and data races are UB (even in C).
> 
> Benno said so? I'm not sure about the logic (whole v.s. partial). Even

We can wait for Benno's response, but there is an example where Miri
says it's data race:

	https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=c7097644aa5f02a0a436e5b8b8624824

> if you read partially, the part might be modified by the C side during
> reading.

If you read the part protected by phy_device->lock, C side shouldn't
modify it, but the case here is not all fields in phy_device stay
unchanged when phy_device->lock (and Rust side doesn't mark them
interior mutable), see the discussion drom Andrew and me.

> 
> For me, the issue is that creating &T for an object that might be
> modified.

The reason a `&phy_device` cannot be created here is because concurrent
writes may cause a invalid phy_device (i.e. data race), the same applies
to a copy.

Regards,
Boqun
Benno Lossin Oct. 30, 2023, 8:34 a.m. UTC | #32
On 30.10.23 01:19, Boqun Feng wrote:
> On Mon, Oct 30, 2023 at 07:58:52AM +0900, FUJITA Tomonori wrote:
>> if you read partially, the part might be modified by the C side during
>> reading.
> 
> If you read the part protected by phy_device->lock, C side shouldn't
> modify it, but the case here is not all fields in phy_device stay
> unchanged when phy_device->lock (and Rust side doesn't mark them
> interior mutable), see the discussion drom Andrew and me.
> 
>>
>> For me, the issue is that creating &T for an object that might be
>> modified.
> 
> The reason a `&phy_device` cannot be created here is because concurrent
> writes may cause a invalid phy_device (i.e. data race), the same applies
> to a copy.

Both comments by Boqun above are correct. Additionally even if the write
would not have a data race with the read, it would still be UB. (For example
when the write is by the same thread)

If you just read the field itself then it should be fine, since it is
protected by a lock, see Boqun's patch for manually accessing the bitfields.

But I would wait until we see a response from the bindgen devs on the issue.
Benno Lossin Oct. 30, 2023, 8:37 a.m. UTC | #33
On 29.10.23 18:32, Andrew Lunn wrote:
>> The current code is fine from Rust perspective because the current
>> code copies phy_driver on stack and makes a reference to the copy, if
>> I undertand correctly.
>>
>> It's not nice to create an 500-bytes object on stack. It turned out
>> that it's not so simple to avoid it.
> 
> Does it also copy the stack version over the 'real' version before
> exiting? If not, it is broken, since modifying state in phy_device is
> often why the driver is called. But copying the stack version is also
> broken, since another thread taking the phydev->lock is going to get
> lost from the linked list of waiters.

It does not copy the stack version over the original. Since we only read
the `link` field in the discussed function and we hold the lock, it should
not get modified, right? So from a functional viewpoint there is no issue.
(Aside from increased stack size and the data race issue etc.)

> Taking a copy of the C structure does seem very odd, to me as a C
> programmer.

It is also odd in Rust.
Miguel Ojeda Oct. 30, 2023, 11:22 a.m. UTC | #34
On Sat, Oct 28, 2023 at 8:24 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> Please could you help me understand the consequences here. Are you
> saying the rust toolchain is fatally broken here, it cannot generate
> valid code at the moment? As a result we need to wait for a new
> version of bindgen?

Benno has already replied, but to be extra clear: no, it is not
"fatally broken".

`bindgen` is just a tool to automate writing some things you would
otherwise need to manually write. It currently generates some methods
that take a reference, but we should avoid creating references in this
case, so we would like methods that take a pointer instead. That's it.

In other words, we could simply write the methods ourselves. That
would be "Option 0" (which would be like Benno's 1, but manually; or
like Benno's 2, but in Rust).

Cheers,
Miguel
Miguel Ojeda Oct. 30, 2023, 12:07 p.m. UTC | #35
On Sun, Oct 29, 2023 at 8:39 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> We have probably missed the merge window for v6.7. So there is around

Definitely missed. We aim to send our PRs in advance of the merge
window, e.g. this cycle it has already been sent.

But even if it wasn't, for `rust-next`, feature patches should be sent
early in the cycle anyway.

> 9 weeks to the next merge window. If a working bindgen is likely to be
> available by then, we should not bother with a workaround, just wait.

In general, it is better to avoid workarounds if something is going to
be implemented properly. However, `bindgen` is a third-party project
which we do not control.

In the case of the exhaustiveness check, they looked interested in the
feature, which is why I wanted to avoid the workaround if possible.

In this instance though, we don't even know yet what they think about
the feature we requested. We will give them some time to see what they
think, and then decide based on that.

Cheers,
Miguel
Andrew Lunn Oct. 30, 2023, 12:32 p.m. UTC | #36
On Mon, Oct 30, 2023 at 01:07:07PM +0100, Miguel Ojeda wrote:
> On Sun, Oct 29, 2023 at 8:39 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > We have probably missed the merge window for v6.7. So there is around
> 
> Definitely missed. We aim to send our PRs in advance of the merge
> window, e.g. this cycle it has already been sent.

Netdev works a bit different. Patches can be merged into net-next even
a couple of days into the merge window. A lot depends on the riskiness
of a patch. A new driver has very low risk, since it is generally self
contained. Even if its broken, there is another 8 weeks to fix it up.

However, for this cycle, the PR for netdev has already been sent,
because a lot of the Maintainers are travelling to the netdev
conference.

	Andrew
FUJITA Tomonori Oct. 30, 2023, 12:49 p.m. UTC | #37
On Mon, 30 Oct 2023 08:34:46 +0000
Benno Lossin <benno.lossin@proton.me> wrote:

> On 30.10.23 01:19, Boqun Feng wrote:
>> On Mon, Oct 30, 2023 at 07:58:52AM +0900, FUJITA Tomonori wrote:
>>> if you read partially, the part might be modified by the C side during
>>> reading.
>> 
>> If you read the part protected by phy_device->lock, C side shouldn't
>> modify it, but the case here is not all fields in phy_device stay
>> unchanged when phy_device->lock (and Rust side doesn't mark them
>> interior mutable), see the discussion drom Andrew and me.
>> 
>>>
>>> For me, the issue is that creating &T for an object that might be
>>> modified.
>> 
>> The reason a `&phy_device` cannot be created here is because concurrent
>> writes may cause a invalid phy_device (i.e. data race), the same applies
>> to a copy.
> 
> Both comments by Boqun above are correct. Additionally even if the write
> would not have a data race with the read, it would still be UB. (For example
> when the write is by the same thread)
> 
> If you just read the field itself then it should be fine, since it is
> protected by a lock, see Boqun's patch for manually accessing the bitfields.

The rust code can access to only fields in phy_device that the C side
doesn't modify; these fields are protected by a lock or in other ways
(resume/suspend cases).

Right?

> But I would wait until we see a response from the bindgen devs on the issue.

You meant that they might have a different option on this?
Benno Lossin Oct. 30, 2023, 4:45 p.m. UTC | #38
On 30.10.23 13:49, FUJITA Tomonori wrote:
> On Mon, 30 Oct 2023 08:34:46 +0000
> Benno Lossin <benno.lossin@proton.me> wrote:
>> Both comments by Boqun above are correct. Additionally even if the write
>> would not have a data race with the read, it would still be UB. (For example
>> when the write is by the same thread)
>>
>> If you just read the field itself then it should be fine, since it is
>> protected by a lock, see Boqun's patch for manually accessing the bitfields.
> 
> The rust code can access to only fields in phy_device that the C side
> doesn't modify; these fields are protected by a lock or in other ways
> (resume/suspend cases).

No it could access all fields in `phy_device`, which means it could also
access `phy_device.lock`. Since that is a mutex, it might change at any
time even if we hold the lock.

>> But I would wait until we see a response from the bindgen devs on the issue.
> 
> You meant that they might have a different option on this?

No, before you implement the workaround that Boqun posted you
should wait until the bindgen devs say how long/if they will
implement it.
FUJITA Tomonori Nov. 8, 2023, 10:46 a.m. UTC | #39
On Mon, 30 Oct 2023 16:45:38 +0000
Benno Lossin <benno.lossin@proton.me> wrote:

>>> But I would wait until we see a response from the bindgen devs on the issue.
>> 
>> You meant that they might have a different option on this?
> 
> No, before you implement the workaround that Boqun posted you
> should wait until the bindgen devs say how long/if they will
> implement it.

It has been 10 days but no response from bindgen developpers. I guess
that unlikely bindgen will support the feature until the next merge
window.

I prefer adding accessors in the C side rather than the workaround if
it's fine by Andrew because we have no idea when bindgen will support
the feature.
Andrew Lunn Nov. 10, 2023, 1:26 p.m. UTC | #40
On Wed, Nov 08, 2023 at 07:46:47PM +0900, FUJITA Tomonori wrote:
> On Mon, 30 Oct 2023 16:45:38 +0000
> Benno Lossin <benno.lossin@proton.me> wrote:
> 
> >>> But I would wait until we see a response from the bindgen devs on the issue.
> >> 
> >> You meant that they might have a different option on this?
> > 
> > No, before you implement the workaround that Boqun posted you
> > should wait until the bindgen devs say how long/if they will
> > implement it.
> 
> It has been 10 days but no response from bindgen developpers. I guess
> that unlikely bindgen will support the feature until the next merge
> window.

I just took a look around the kernel includes. Here are a few examples
i found, there are many many more.

include/target/iscsi/iscsi_target_core.h:       unsigned int            conn_tx_reset_cpumask:1;
include/media/videobuf2-core.h: unsigned int            synced:1;
include/media/v4l2-ctrls.h:     unsigned int done:1;
include/drm/bridge/samsung-dsim.h:      unsigned int has_freqband:1;
include/net/sock_reuseport.h:   unsigned int            bind_inany:1;
include/net/sock.h:     unsigned char           skc_ipv6only:1;
include/sound/simple_card_utils.h:      unsigned int force_dpcm:1;
include/sound/soc.h:    unsigned int autodisable:1;
include/linux/pci.h:    unsigned int    imm_ready:1;    /* Supports Immediate Readiness */
include/linux/regmap.h: unsigned int use_ack:1;
include/linux/cpuidle.h:        unsigned int            registered:1;
include/linux/regulator/gpio-regulator.h:       unsigned enabled_at_boot:1;
include/linux/phy.h:    unsigned link:1;
include/linux/pm.h:     unsigned int            irq_safe:1;
include/linux/nfs_xdr.h:        unsigned char                   renew:1;
include/linux/iio/iio.h:        unsigned                output:1;
include/linux/drbd.h:           unsigned user_isp:1 ;
include/linux/sched.h:  unsigned                        sched_migrated:1;
include/linux/writeback.h:      unsigned no_cgroup_owner:1;
include/linux/tty_port.h:       unsigned char           console:1;
include/linux/mpi.h:                    unsigned int two_inv_p:1;
include/linux/mmc/host.h:       unsigned int            use_spi_crc:1;
include/linux/netdevice.h:      unsigned                wol_enabled:1;
include/linux/serial_8250.h:    unsigned int            tx_stopped:1;
include/linux/usb.h:    unsigned can_submit:1;
include/linux/firewire.h:       unsigned is_local:1;
include/linux/phylink.h:        unsigned int link:1;
include/linux/gpio_keys.h:      unsigned int rep:1;
include/linux/spi/spi.h:        unsigned        cs_change:1;
include/linux/i2c-mux.h:        unsigned int arbitrator:1;
include/linux/kobject.h:        unsigned int state_in_sysfs:1;
include/linux/kbd_kern.h:       unsigned char ledmode:1;
include/linux/bpf_verifier.h:   unsigned int fit_for_inline:1;
include/linux/rtc.h:    unsigned int uie_irq_active:1;
include/linux/usb/usbnet.h:     unsigned                can_dma_sg:1;
include/linux/usb/serial.h:     unsigned char                   attached:1;
include/linux/usb/gadget.h:     unsigned dir_out:1;
include/linux/comedi/comedidev.h:       unsigned int attached:1;
include/scsi/sg.h:    unsigned int twelve_byte:1;
include/scsi/scsi_host.h:       unsigned emulated:1;
include/scsi/scsi_device.h:     unsigned removable:1;
include/rdma/ib_verbs.h:        unsigned int            ip_gids:1;

And thats just bitfields used as binary values. There are more with
:2, :3, :4, :8, :16.

The point i'm trying to make is that they are used in many
subsystems. Not having bindgen support for them seems like its going
to be a problem.

Isn't this also an unsoundness problem? Is there existing Rust code
which people think is sound but is actually not?

> I prefer adding accessors in the C side rather than the workaround if
> it's fine by Andrew because we have no idea when bindgen will support
> the feature.

Maybe we need a better understanding of the kernel wide implications
of this. It could be this is actually a big issue, and rust-for-linux
can then apply either pressure, or human resources, to get it
implemented.

    Andrew
Alice Ryhl Nov. 17, 2023, 9:39 a.m. UTC | #41
FUJITA Tomonori <fujita.tomonori@gmail.com> writes:
> This patch adds abstractions to implement network PHY drivers; the
> driver registration and bindings for some of callback functions in
> struct phy_driver and many genphy_ functions.
> 
> This feature is enabled with CONFIG_RUST_PHYLIB_ABSTRACTIONS=y.
> 
> This patch enables unstable const_maybe_uninit_zeroed feature for
> kernel crate to enable unsafe code to handle a constant value with
> uninitialized data. With the feature, the abstractions can initialize
> a phy_driver structure with zero easily; instead of initializing all
> the members by hand. It's supposed to be stable in the not so distant
> future.
> 
> Link: https://github.com/rust-lang/rust/pull/116218
> 
> Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>

I promised Andrew to take a look at these patches at Plumbers. This
email contains the first part of my review.

In this email, I will bring up the question of how the safety comments
should be worded. I know that you've probably discussed this before, but
my opinion was asked for, and this is the main area where I think
there's room for improvement.

> +    /// # Safety
> +    ///
> +    /// This function must only be called from the callbacks in `phy_driver`.
> +    unsafe fn from_raw<'a>(ptr: *mut bindings::phy_device) -> &'a mut Self {

This kind of safety comment where you say "must only be used by internal
code and nothing else" isn't great. It doesn't really help with checking
the correctness. It's usually better to document what is actually
required here, even if it shouldn't be called by non-internal code. I
recommend something along the lines of:

	# Safety
	
	For the duration of 'a, the pointer must point at a valid `phy_device`,
	and the caller must hold the X mutex.

Then in methods like this: (which are missing justification for why
there's no data race!)
> +        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
> +        unsafe { (*phydev).phy_id }
you instead say:

	SAFETY: By the struct invariants, `phydev` points at a valid
	`phy_device`, and we hold the X lock, which gives us access to
	the `phy_id` field.

And you would also update the struct invariant accordingly:

/// # Invariants
///
/// Referencing a `phy_device` using this struct asserts that the X
/// mutex is held.
#[repr(transparent)]
pub struct Device(Opaque<bindings::phy_device>);





> +// During the calls to most functions in [`Driver`], the C side (`PHYLIB`) holds a lock that is
> +// unique for every instance of [`Device`]. `PHYLIB` uses a different serialization technique for
> +// [`Driver::resume`] and [`Driver::suspend`]: `PHYLIB` updates `phy_device`'s state with
> +// the lock held, thus guaranteeing that [`Driver::resume`] has exclusive access to the instance.
> +// [`Driver::resume`] and [`Driver::suspend`] also are called where only one thread can access
> +// to the instance.

I used "X mutex" as an example for the synchronization mechanism in the
above snippets, but it sounds like its more complicated than that? Here
are some possible alternatives I could come up with:

Maybe we don't need synchronization when some operations can't happen?

/// # Invariants
///
/// Referencing a `phy_device` using this struct asserts that the X
/// mutex is held, or that there are no concurrent operations of type Y.
#[repr(transparent)]
pub struct Device(Opaque<bindings::phy_device>);

Maybe we have a separate case for when the device is being initialized
and nobody else has access yet?

/// # Invariants
///
/// Referencing a `phy_device` using this struct asserts that the X
/// mutex is held, or that the reference has exclusive access to the
/// entire `phy_device`.
#[repr(transparent)]
pub struct Device(Opaque<bindings::phy_device>);

Maybe it is easier to just list the fields we need access to?

/// # Invariants
///
/// Referencing a `phy_device` using this struct asserts exclusive
/// access to the following fields: phy_id, state, speed, duplex. And
/// read access to the following fields: link, autoneg_complete,
/// autoneg.
#[repr(transparent)]
pub struct Device(Opaque<bindings::phy_device>);

Perhaps we want to avoid duplication with some existing C documentation?

/// # Invariants
///
/// Referencing a `phy_device` using this struct asserts that the user
/// is inside a Y scope as defined in Documentation/foo/bar.
#[repr(transparent)]
pub struct Device(Opaque<bindings::phy_device>);

But I don't know how these things are actually synchronized. Maybe
it is some sixth option. I would be happy to help draft these safety
comments once the actual synchronization mechanism is clear to me.

Or maybe you prefer to not do it this way, or to punt it for a later
patch series. I prefer to document these things in the above way, but
ultimately it is not up to me.

Alice
Alice Ryhl Nov. 17, 2023, 9:39 a.m. UTC | #42
FUJITA Tomonori <fujita.tomonori@gmail.com> writes:
> This patch adds abstractions to implement network PHY drivers; the
> driver registration and bindings for some of callback functions in
> struct phy_driver and many genphy_ functions.
> 
> This feature is enabled with CONFIG_RUST_PHYLIB_ABSTRACTIONS=y.
> 
> This patch enables unstable const_maybe_uninit_zeroed feature for
> kernel crate to enable unsafe code to handle a constant value with
> uninitialized data. With the feature, the abstractions can initialize
> a phy_driver structure with zero easily; instead of initializing all
> the members by hand. It's supposed to be stable in the not so distant
> future.
> 
> Link: https://github.com/rust-lang/rust/pull/116218
> 
> Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>

In this reply, I go through my minor nits:

> +use crate::{
> +    prelude::{vtable, Pin},
> +};

Normally, if you're importing specific prelude items by name instead of
using prelude::*, then you would import them using their non-prelude
path.

> +#[derive(PartialEq)]
> +pub enum DeviceState {

If you add PartialEq and you can add Eq, then also add Eq.

> +/// An adapter for the registration of a PHY driver.
> +struct Adapter<T: Driver> {
> +    _p: PhantomData<T>,
> +}

You don't need this struct. The methods can be top-level functions.

But I know that others have used the same style of defining a useless
struct, so it's fine with me.

> +    /// Defines certain other features this PHY supports.
> +    /// It is a combination of the flags in the [`flags`] module.
> +    const FLAGS: u32 = 0;

You need an empty line between the two lines if you intend for them to
be separate lines in rendered documentation.

> +#[vtable]
> +pub trait Driver {
> +    /// Issues a PHY software reset.
> +    fn soft_reset(_dev: &mut Device) -> Result {
> +        Err(code::ENOTSUPP)
> +    }
>      [...]
> +}

I believe that the guidance for what to put in optional vtable-trait
methods was changed in:

https://lore.kernel.org/all/20231026201855.1497680-1-benno.lossin@proton.me/

> +// SAFETY: `Registration` does not expose any of its state across threads.
> +unsafe impl Send for Registration {}

I would change this to "it's okay to call phy_drivers_unregister from a
different thread than the one in which phy_drivers_register was called".

> +// SAFETY: `Registration` does not expose any of its state across threads.
> +unsafe impl Sync for Registration {}

Here, you can say "Registration has no &self methods, so immutable
references to it are useless".

> +    // macro use only
> +    #[doc(hidden)]
> +    pub const fn mdio_device_id(&self) -> bindings::mdio_device_id {
> +        bindings::mdio_device_id {
> +            phy_id: self.id,
> +            phy_id_mask: self.mask.as_int(),
> +        }
> +    }

This is fine, but I probably would just expose it for everyone. It's not
like it hurts if non-macro code can call this method, even if it doesn't
have a reason to do so.

Alice
Andrew Lunn Nov. 17, 2023, 1:34 p.m. UTC | #43
> And you would also update the struct invariant accordingly:
> 
> /// # Invariants
> ///
> /// Referencing a `phy_device` using this struct asserts that the X
> /// mutex is held.
> #[repr(transparent)]
> pub struct Device(Opaque<bindings::phy_device>);
> 
> 
> 
> 
> 
> > +// During the calls to most functions in [`Driver`], the C side (`PHYLIB`) holds a lock that is
> > +// unique for every instance of [`Device`]. `PHYLIB` uses a different serialization technique for
> > +// [`Driver::resume`] and [`Driver::suspend`]: `PHYLIB` updates `phy_device`'s state with
> > +// the lock held, thus guaranteeing that [`Driver::resume`] has exclusive access to the instance.
> > +// [`Driver::resume`] and [`Driver::suspend`] also are called where only one thread can access
> > +// to the instance.
> 
> I used "X mutex" as an example for the synchronization mechanism in the
> above snippets, but it sounds like its more complicated than that? Here
> are some possible alternatives I could come up with:

So X would be phy_device->lock.

Suspend and resume don't have this lock held. I don't actually
remember the details, but there is an email discussion with Russell
King which does explain the problem we had, and why we think it is
safe to not hold the lock.

> /// # Invariants
> ///
> /// Referencing a `phy_device` using this struct asserts that the X
> /// mutex is held, or that the reference has exclusive access to the
> /// entire `phy_device`.
> #[repr(transparent)]
> pub struct Device(Opaque<bindings::phy_device>);

You can never have exclusive access to the entire phy_device, because
it contains a mutex. Other threads can block on that mutex, which
involves changing the linked list in the mutex.

But that is also a pretty common pattern, put the mutex inside the
structure it protects. So when you say 'exclusive access to the entire
`phy_device`' you actually mean excluding mutex, spinlocks, atomic
variables, etc?

> /// Referencing a `phy_device` using this struct asserts exclusive
> /// access to the following fields: phy_id, state, speed, duplex. And
> /// read access to the following fields: link, autoneg_complete,
> /// autoneg.
> #[repr(transparent)]
> pub struct Device(Opaque<bindings::phy_device>);

For the Rust code, you can maybe do this. But the Rust code calls into
C helpers to get the real work done, and they expect to have access to
pretty much everything in phy_device.

> /// # Invariants
> ///
> /// Referencing a `phy_device` using this struct asserts that the user
> /// is inside a Y scope as defined in Documentation/foo/bar.
> #[repr(transparent)]
> pub struct Device(Opaque<bindings::phy_device>);

There is no such documentation that i know of, except it does get
repeated again and again on the mailling lists. Its tribal knowledge.

> But I don't know how these things are actually synchronized. Maybe
> it is some sixth option. I would be happy to help draft these safety
> comments once the actual synchronization mechanism is clear to me.

The model is: synchronisation is not the drivers problem.

With a few years of experience reviewing drivers, you notice that
there are a number of driver writers who have no idea about
locking. They don't even think about it. So where possible, its best
to hide all the locking from them in the core. The core is in theory
developed by developers who do mostly understand locking and have a
better chance of getting it right. Its also exercised a lot more,
since its shared by all drivers.

My experience with this one Rust driver so far is that Rust developers
have problems accepting that its not the drivers problem.

	Andrew
Andrew Lunn Nov. 17, 2023, 1:53 p.m. UTC | #44
> I would change this to "it's okay to call phy_drivers_unregister from a
> different thread than the one in which phy_drivers_register was called".

This got me thinking about 'threads'. For register and unregister, we
are talking abut the kernel modules module_init() and module_exit()
function. module_init() can be called before user space is even
started, but it could also be called by insmod. module_exit() could be
called by rmmod, but it could also be the kernel, after user space has
gone away on shutdown. We are always in a context which can block, but
i never really think of this being threads. You can start a kernel
thread, and have some data structure exclusively used by that kernel
thread, but that is pretty unusual.

So i would probably turn this commenting around. Only comment like
this in the special case that a kernel thread exists, and it is
expected to have exclusive access.

	 Andrew
Alice Ryhl Nov. 17, 2023, 3:42 p.m. UTC | #45
Andrew Lunn <andrew@lunn.ch> writes:
>> I used "X mutex" as an example for the synchronization mechanism in the
>> above snippets, but it sounds like its more complicated than that? Here
>> are some possible alternatives I could come up with:
> 
> So X would be phy_device->lock.
> 
> Suspend and resume don't have this lock held. I don't actually
> remember the details, but there is an email discussion with Russell
> King which does explain the problem we had, and why we think it is
> safe to not hold the lock.

So, what I would prefer to see as the struct invariant would be:

 * Either phy_device->lock is held,
 * or, we are in whatever situation you are in when you call suspend and
   resume.

The five suggestions I gave were my guesses as to what that situation
might be.

>> /// # Invariants
>> ///
>> /// Referencing a `phy_device` using this struct asserts that the X
>> /// mutex is held, or that the reference has exclusive access to the
>> /// entire `phy_device`.
>> #[repr(transparent)]
>> pub struct Device(Opaque<bindings::phy_device>);
> 
> You can never have exclusive access to the entire phy_device, because
> it contains a mutex. Other threads can block on that mutex, which
> involves changing the linked list in the mutex.
> 
> But that is also a pretty common pattern, put the mutex inside the
> structure it protects. So when you say 'exclusive access to the entire
> `phy_device`' you actually mean excluding mutex, spinlocks, atomic
> variables, etc?

No, I really meant exclusive access to everything. This suggestion is
where I guessed that the situation might be "we just created the
phy_device, and haven't yet shared it with anyone, so it's okay to
access it without the lock". But it sounds like that's not the case.

>> /// Referencing a `phy_device` using this struct asserts exclusive
>> /// access to the following fields: phy_id, state, speed, duplex. And
>> /// read access to the following fields: link, autoneg_complete,
>> /// autoneg.
>> #[repr(transparent)]
>> pub struct Device(Opaque<bindings::phy_device>);
> 
> For the Rust code, you can maybe do this. But the Rust code calls into
> C helpers to get the real work done, and they expect to have access to
> pretty much everything in phy_device.

Yeah, agreed, this is probably the suggestion I disliked the most.

>> /// # Invariants
>> ///
>> /// Referencing a `phy_device` using this struct asserts that the user
>> /// is inside a Y scope as defined in Documentation/foo/bar.
>> #[repr(transparent)]
>> pub struct Device(Opaque<bindings::phy_device>);
> 
> There is no such documentation that i know of, except it does get
> repeated again and again on the mailling lists. Its tribal knowledge.

Then, my suggestion would be to write down that tribal knowledge in the
safety comments.

>> But I don't know how these things are actually synchronized. Maybe
>> it is some sixth option. I would be happy to help draft these safety
>> comments once the actual synchronization mechanism is clear to me.
> 
> The model is: synchronisation is not the drivers problem.
> 
> With a few years of experience reviewing drivers, you notice that
> there are a number of driver writers who have no idea about
> locking. They don't even think about it. So where possible, its best
> to hide all the locking from them in the core. The core is in theory
> developed by developers who do mostly understand locking and have a
> better chance of getting it right. Its also exercised a lot more,
> since its shared by all drivers.
> 
> My experience with this one Rust driver so far is that Rust developers
> have problems accepting that its not the drivers problem.

I agree that locking should not be the driver's problem. If there's any
comment about locking in patch 5 of this patchset, then something has
gone wrong.

However, I don't see this file as part of the driver. It's a wrapper
around the core, which makes it part of the core. Like the C core, the
Rust abstractions will be shared by all Rust drivers. The correctness of
the unsafe code here is what ensures that drivers are not able to mess
up the locking in safe code.


Anyway. If you don't want to write down the tribal knowledge here, then
I suggest this simpler wording:

/// # Invariants
///
/// Referencing a `phy_device` using this struct asserts that you are in
/// a context where all methods defined on this struct are safe to call.
#[repr(transparent)]
pub struct Device(Opaque<bindings::phy_device>);

This invariant is much less precise, but at least it is correct.

Other safety comments may then be:

	/// Gets the id of the PHY.
	pub fn phy_id(&self) -> u32 {
	    let phydev = self.0.get();
	    // SAFETY: The struct invariant ensures that we may access
	    // this field without additional synchronization.
	    unsafe { (*phydev).phy_id }
	}

And:

	unsafe extern "C" fn soft_reset_callback(
	    phydev: *mut bindings::phy_device,
	) -> core::ffi::c_int {
	    from_result(|| {
	        // SAFETY: This callback is called only in contexts
		// where we hold `phy_device->lock`, so the accessors on
		// `Device` are okay to call.
	        let dev = unsafe { Device::from_raw(phydev) };
	        T::soft_reset(dev)?;
	        Ok(0)
	    })
	}

And:

	unsafe extern "C" fn resume_callback(phydev: *mut bindings::phy_device) -> core::ffi::c_int {
	    from_result(|| {
	        // SAFETY: The C core code ensures that the accessors on
		// `Device` are okay to call even though `phy_device->lock`
		// might not be held.
	        let dev = unsafe { Device::from_raw(phydev) };
	        T::resume(dev)?;
	        Ok(0)
	    })
	}

Alice
Andrew Lunn Nov. 17, 2023, 4:28 p.m. UTC | #46
> >> /// # Invariants
> >> ///
> >> /// Referencing a `phy_device` using this struct asserts that the X
> >> /// mutex is held, or that the reference has exclusive access to the
> >> /// entire `phy_device`.
> >> #[repr(transparent)]
> >> pub struct Device(Opaque<bindings::phy_device>);
> > 
> > You can never have exclusive access to the entire phy_device, because
> > it contains a mutex. Other threads can block on that mutex, which
> > involves changing the linked list in the mutex.
> > 
> > But that is also a pretty common pattern, put the mutex inside the
> > structure it protects. So when you say 'exclusive access to the entire
> > `phy_device`' you actually mean excluding mutex, spinlocks, atomic
> > variables, etc?
> 
> No, I really meant exclusive access to everything. This suggestion is
> where I guessed that the situation might be "we just created the
> phy_device, and haven't yet shared it with anyone, so it's okay to
> access it without the lock". But it sounds like that's not the case.

It is pretty unusual for a linux driver to actually create a
device. Some level of core code generally creates a basic device
structure and passes it to the probe function. The probe can then
setup members in the device, maybe allocate memory and assign it to
the device->priv member etc.

However, in the probe method, it should be safe to assume its not
globally visible yet, so you can be more relaxed about locking.

> >> /// # Invariants
> >> ///
> >> /// Referencing a `phy_device` using this struct asserts that the user
> >> /// is inside a Y scope as defined in Documentation/foo/bar.
> >> #[repr(transparent)]
> >> pub struct Device(Opaque<bindings::phy_device>);
> > 
> > There is no such documentation that i know of, except it does get
> > repeated again and again on the mailling lists. Its tribal knowledge.
> 
> Then, my suggestion would be to write down that tribal knowledge in the
> safety comments.

O.K, we can do that.

     Andrew
Alice Ryhl Nov. 17, 2023, 6:27 p.m. UTC | #47
On 11/17/23 17:28, Andrew Lunn wrote:>>>> /// # Invariants
>>>> ///
>>>> /// Referencing a `phy_device` using this struct asserts that the user
>>>> /// is inside a Y scope as defined in Documentation/foo/bar.
>>>> #[repr(transparent)]
>>>> pub struct Device(Opaque<bindings::phy_device>);
>>>
>>> There is no such documentation that i know of, except it does get
>>> repeated again and again on the mailling lists. Its tribal knowledge.
>>
>> Then, my suggestion would be to write down that tribal knowledge in the
>> safety comments.
> 
> O.K, we can do that.

Do you have a link to one of those email threads that you mentioned?

Alice
Greg Kroah-Hartman Nov. 17, 2023, 7:50 p.m. UTC | #48
On Fri, Nov 17, 2023 at 02:53:44PM +0100, Andrew Lunn wrote:
> > I would change this to "it's okay to call phy_drivers_unregister from a
> > different thread than the one in which phy_drivers_register was called".
> 
> This got me thinking about 'threads'. For register and unregister, we
> are talking abut the kernel modules module_init() and module_exit()
> function. module_init() can be called before user space is even
> started, but it could also be called by insmod. module_exit() could be
> called by rmmod, but it could also be the kernel, after user space has
> gone away on shutdown.

The kernel will not call module_exit() on shutdown.  Or has something
changed recently?

> We are always in a context which can block, but
> i never really think of this being threads. You can start a kernel
> thread, and have some data structure exclusively used by that kernel
> thread, but that is pretty unusual.
> 
> So i would probably turn this commenting around. Only comment like
> this in the special case that a kernel thread exists, and it is
> expected to have exclusive access.

With the driver model, you can be sure that your probe/release functions
for the bus will never be called at the same time, and module_init/exit
can never be called at the same time, so perhaps this isn't an issue?

thanks,

greg k-h
Boqun Feng Nov. 17, 2023, 11:28 p.m. UTC | #49
On Fri, Nov 17, 2023 at 02:50:45PM -0500, Greg KH wrote:
> On Fri, Nov 17, 2023 at 02:53:44PM +0100, Andrew Lunn wrote:
> > > I would change this to "it's okay to call phy_drivers_unregister from a
> > > different thread than the one in which phy_drivers_register was called".
> > 
> > This got me thinking about 'threads'. For register and unregister, we

Just to make things clear for discussion, the "thread" here (when we are
talking about trait `Send` and `Sync`) means "contexts" (thread/process
contexts, irq contexts, etc).

When we say a type is `Send`, it means the object of that type can be
created in one context, passed to another context (could be the same
type of context, e.g. sending between two thread/process contexts) and
dropped there. For example, if you have a work_struct embedded type, you
can create it in the driver code and pass it to workqueue, and when the
work is done, you can free it in the workqueue context.

One example of not `Send` type (or `!Send`) is spinlock guard:

	let guard: Guard<..> = some_lock.lock();

creating a Guard means "spin_lock()" and dropping a Guard means
"spin_unlock()", since we cannot acquire a spinlock in one context and
release it in another context in kernel, so `Guard<..>` is `!Send`.

Back to the code here:

	unsafe impl Send for Registration {}

the safety comment needs to explain why the `Registration::drop` is safe
to call in a different context.

Hope this helps.

Regards,
Boqun

> > are talking abut the kernel modules module_init() and module_exit()
> > function. module_init() can be called before user space is even
> > started, but it could also be called by insmod. module_exit() could be
> > called by rmmod, but it could also be the kernel, after user space has
> > gone away on shutdown.
> 
> The kernel will not call module_exit() on shutdown.  Or has something
> changed recently?
> 
> > We are always in a context which can block, but
> > i never really think of this being threads. You can start a kernel
> > thread, and have some data structure exclusively used by that kernel
> > thread, but that is pretty unusual.
> > 
> > So i would probably turn this commenting around. Only comment like
> > this in the special case that a kernel thread exists, and it is
> > expected to have exclusive access.
> 
> With the driver model, you can be sure that your probe/release functions
> for the bus will never be called at the same time, and module_init/exit
> can never be called at the same time, so perhaps this isn't an issue?
> 
> thanks,
> 
> greg k-h
>
Andrew Lunn Nov. 18, 2023, 3:32 p.m. UTC | #50
> One example of not `Send` type (or `!Send`) is spinlock guard:
> 
> 	let guard: Guard<..> = some_lock.lock();
> 
> creating a Guard means "spin_lock()" and dropping a Guard means
> "spin_unlock()", since we cannot acquire a spinlock in one context and
> release it in another context in kernel, so `Guard<..>` is `!Send`.

Thanks for the explanation. Kernel people might have a different
meaning for context, especially in this example. We have process
context and atomic context. Process context you are allowed to sleep,
atomic context you cannot sleep. If you are in process context and
take a spinlock, you change into atomic context. And when you release
the spinlock you go back to process context. So with this meaning of
context, you do acquire the spinlock in one context, and release it in
another.

So we are going to have to think about the context the word context is
used in, and expect kernel and Rust people to maybe think of it
differently.

	Andrew
Boqun Feng Nov. 18, 2023, 3:54 p.m. UTC | #51
On Sat, Nov 18, 2023 at 04:32:26PM +0100, Andrew Lunn wrote:
> > One example of not `Send` type (or `!Send`) is spinlock guard:
> > 
> > 	let guard: Guard<..> = some_lock.lock();
> > 
> > creating a Guard means "spin_lock()" and dropping a Guard means
> > "spin_unlock()", since we cannot acquire a spinlock in one context and
> > release it in another context in kernel, so `Guard<..>` is `!Send`.
> 
> Thanks for the explanation. Kernel people might have a different

Surely *we* do, and looks like I created more confusion ;-) Maybe I
should say "execution context" as in include/linux/preempt.h: NMI, hard
IRQ, softirq, task.

> meaning for context, especially in this example. We have process
> context and atomic context. Process context you are allowed to sleep,
> atomic context you cannot sleep. If you are in process context and
> take a spinlock, you change into atomic context. And when you release
> the spinlock you go back to process context. So with this meaning of
> context, you do acquire the spinlock in one context, and release it in
> another.
> 

Also as I tried to explain previously, the type of contexts doesn't
matter. Yes, once you hold a spinlock, you enter atomic context, but you
are still in the same task execution context, so acquiring and releasing
in the same task execution doesn't count as "Sending". But if after
acquired one somehow passes the guard to another task, or an interrupt
handler, that's "Sending".

> So we are going to have to think about the context the word context is
> used in, and expect kernel and Rust people to maybe think of it
> differently.
> 

In Rust doc [1], `Send` means:

	Types that can be transferred across thread boundaries.

but of course, we have more "thread-like" things in kernel, so I think
"execution context" may be a better term?

[1]: https://doc.rust-lang.org/core/marker/trait.Send.html

Regards,
Boqun

> 	Andrew
Trevor Gross Nov. 19, 2023, 11:06 a.m. UTC | #52
On Sat, Nov 18, 2023 at 9:54 AM Boqun Feng <boqun.feng@gmail.com> wrote:
> In Rust doc [1], `Send` means:
>
>         Types that can be transferred across thread boundaries.
>
> but of course, we have more "thread-like" things in kernel, so I think
> "execution context" may be a better term?

The docs are pretty OS-focused here (I intend to update them at some point).

`Sync` means that if you have a >1 pointer/reference to a struct you
can access any of the fields, as allowed by the API, from any of the
references at any time (i.e. switching between any two instructions)
without causing data races. Atomic accesses or mutexes can add this
property to things that do not have it. It really doesn't matter
whether you're going between different user threads, kthreads,
interrupt/preemption contexts, or nothing at all. It's a bit more
intrinsic to the data type and it says how you _could_ use it rather
than how you do use it.

And then `Send` basically means that any pointers in your struct are
either exclusive or point to `Sync` things. Mutexes cannot add this
property to anything that does not have it.

Note that this still never lets you have more than one `&mut`
(`restrict`) reference to a piece of data at once, this mostly relates
to interior mutability (when things are allowed to be changed behind
shared `&` references - such as atomics).

The consumers of these markers are (1) the compiler knowing what can
live in statics, (2) APIs that make things potentially concurrent, and
(3) the compiler automatically marking new structs `Send`/`Sync` if
all member types follow these rules.

When trying to figure this out for C types, the question is just
whether usage of the type follows those rules. Or at least whether it
follows them whenever Rust has access to it.

---

FUJITA Tomonori <fujita.tomonori@gmail.com> writes:
> +pub struct Registration {
> +    drivers: Pin<&'static mut [DriverVTable]>,
> +}
>
> [...]
>
> +// SAFETY: `Registration` does not expose any of its state across threads.
> +unsafe impl Send for Registration {}
>
> +// SAFETY: `Registration` does not expose any of its state across threads.
> +unsafe impl Sync for Registration {}

I don't think the impl here actually makes sense. `Registration` is a
buffer of references to `DriverVTable`. That type isn't marked Sync so
by the above rules, its references should not be either.

Tomo, does this need to be Sync at all? Probably easiest to drop the
impls if not, otherwise I think it is more correct to move them to
`DriverVTable`.  You may have had this before, I'm not sure if
discussion made you change it at some point...

---

> [1]: https://doc.rust-lang.org/core/marker/trait.Send.html
>
> Regards,
> Boqun

Sorry Boqun, the lengthy explanation is just for context and not aimed
at you in particular :)

- Trevor
FUJITA Tomonori Nov. 19, 2023, 1:51 p.m. UTC | #53
On Fri, 17 Nov 2023 09:39:05 +0000
Alice Ryhl <aliceryhl@google.com> wrote:

> FUJITA Tomonori <fujita.tomonori@gmail.com> writes:
>> This patch adds abstractions to implement network PHY drivers; the
>> driver registration and bindings for some of callback functions in
>> struct phy_driver and many genphy_ functions.
>> 
>> This feature is enabled with CONFIG_RUST_PHYLIB_ABSTRACTIONS=y.
>> 
>> This patch enables unstable const_maybe_uninit_zeroed feature for
>> kernel crate to enable unsafe code to handle a constant value with
>> uninitialized data. With the feature, the abstractions can initialize
>> a phy_driver structure with zero easily; instead of initializing all
>> the members by hand. It's supposed to be stable in the not so distant
>> future.
>> 
>> Link: https://github.com/rust-lang/rust/pull/116218
>> 
>> Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
> 
> In this reply, I go through my minor nits:
> 
>> +use crate::{
>> +    prelude::{vtable, Pin},
>> +};
> 
> Normally, if you're importing specific prelude items by name instead of
> using prelude::*, then you would import them using their non-prelude
> path.

Understood. I have no reason to import specific prelude items so I use
`prelude::*` instead.


>> +#[derive(PartialEq)]
>> +pub enum DeviceState {
> 
> If you add PartialEq and you can add Eq, then also add Eq.

Added Eq.


>> +/// An adapter for the registration of a PHY driver.
>> +struct Adapter<T: Driver> {
>> +    _p: PhantomData<T>,
>> +}
> 
> You don't need this struct. The methods can be top-level functions.
> 
> But I know that others have used the same style of defining a useless
> struct, so it's fine with me.

You meant like the following?

unsafe extern "C" fn soft_reset_callback<T: Driver>(
    phydev: *mut bindings::phy_device,
) -> core::ffi::c_int {
    from_result(|| {
        // SAFETY: Preconditions ensure `phydev` is valid.
        let dev = unsafe { Device::from_raw(phydev) };
        T::soft_reset(dev)?;
        Ok(0)
    })
}

pub const fn create_phy_driver<T: Driver>() -> DriverVTable {
    DriverVTable(Opaque::new(bindings::phy_driver {
	soft_reset: if T::HAS_SOFT_RESET {
            Some(soft_reset_callback::<T>)
        } else {
            None
	}
    }
}

I thought that a struct is used to group callbacks. Either is fine by
me though.


>> +    /// Defines certain other features this PHY supports.
>> +    /// It is a combination of the flags in the [`flags`] module.
>> +    const FLAGS: u32 = 0;
> 
> You need an empty line between the two lines if you intend for them to
> be separate lines in rendered documentation.

I don't intend to make them separate lines. If I make thme one line,
it's too long (over 100) so I made them two lines.


>> +#[vtable]
>> +pub trait Driver {
>> +    /// Issues a PHY software reset.
>> +    fn soft_reset(_dev: &mut Device) -> Result {
>> +        Err(code::ENOTSUPP)
>> +    }
>>      [...]
>> +}
> 
> I believe that the guidance for what to put in optional vtable-trait
> methods was changed in:
> 
> https://lore.kernel.org/all/20231026201855.1497680-1-benno.lossin@proton.me/

But VTABLE_DEFAULT_ERROR is defined in that patch, which isn't
merged. I'll update the code once that patch is merged.


>> +// SAFETY: `Registration` does not expose any of its state across threads.
>> +unsafe impl Send for Registration {}
> 
> I would change this to "it's okay to call phy_drivers_unregister from a
> different thread than the one in which phy_drivers_register was called".
> 
>> +// SAFETY: `Registration` does not expose any of its state across threads.
>> +unsafe impl Sync for Registration {}
> 
> Here, you can say "Registration has no &self methods, so immutable
> references to it are useless".

I'll reply to Trevor mail on this issue.


>> +    // macro use only
>> +    #[doc(hidden)]
>> +    pub const fn mdio_device_id(&self) -> bindings::mdio_device_id {
>> +        bindings::mdio_device_id {
>> +            phy_id: self.id,
>> +            phy_id_mask: self.mask.as_int(),
>> +        }
>> +    }
> 
> This is fine, but I probably would just expose it for everyone. It's not
> like it hurts if non-macro code can call this method, even if it doesn't
> have a reason to do so.

IIRC, someone said that drivers should not use bindings directly so
this function that returns bindings::mdio_device_id isn't exported.

Thanks!
Andrew Lunn Nov. 19, 2023, 4:08 p.m. UTC | #54
> I don't intend to make them separate lines. If I make thme one line,
> it's too long (over 100) so I made them two lines.

Any networking prefers 80 anyway, so shorter is better.

    Andrew
FUJITA Tomonori Nov. 21, 2023, 2:13 a.m. UTC | #55
On Sun, 19 Nov 2023 05:06:23 -0600
Trevor Gross <tmgross@umich.edu> wrote:

>> +pub struct Registration {
>> +    drivers: Pin<&'static mut [DriverVTable]>,
>> +}
>>
>> [...]
>>
>> +// SAFETY: `Registration` does not expose any of its state across threads.
>> +unsafe impl Send for Registration {}
>>
>> +// SAFETY: `Registration` does not expose any of its state across threads.
>> +unsafe impl Sync for Registration {}
> 
> I don't think the impl here actually makes sense. `Registration` is a
> buffer of references to `DriverVTable`. That type isn't marked Sync so
> by the above rules, its references should not be either.
> 
> Tomo, does this need to be Sync at all? Probably easiest to drop the
> impls if not, otherwise I think it is more correct to move them to
> `DriverVTable`.  You may have had this before, I'm not sure if
> discussion made you change it at some point...

This needs to be Sync:

601 | pub struct Registration {
    |            ^^^^^^^^^^^^
note: required because it appears within the type `Module`
   --> drivers/net/phy/foo_rust.rs:5:1
    |
5   | / kernel::module_phy_driver! {
6   | |     drivers: [Foo],
7   | |     device_table: [
8   | |         DeviceId::new_with_driver::<Foo>()
...   |
13  | |     license: "GPL",
14  | | }
    | |_^
note: required by a bound in `kernel::Module`
   --> /home/ubuntu/git/linux/rust/kernel/lib.rs:69:27
    |
69  | pub trait Module: Sized + Sync {
    |                           ^^^^ required by this bound in `Module`
    = note: this error originates in the macro `kernel::module_phy_driver` (in Nightly builds, run with -Z macro-backtrace for more info)


I'm not sure we discussed but making DriverVTable Sync works.

#[repr(transparent)]
pub struct DriverVTable(Opaque<bindings::phy_driver>);

// SAFETY: DriverVTable has no &self methods, so immutable references to it are useless.
unsafe impl Sync for DriverVTable {}


looks correct?
FUJITA Tomonori Nov. 21, 2023, 12:47 p.m. UTC | #56
On Fri, 17 Nov 2023 15:42:46 +0000
Alice Ryhl <aliceryhl@google.com> wrote:

> Anyway. If you don't want to write down the tribal knowledge here, then
> I suggest this simpler wording:
> 
> /// # Invariants
> ///
> /// Referencing a `phy_device` using this struct asserts that you are in
> /// a context where all methods defined on this struct are safe to call.
> #[repr(transparent)]
> pub struct Device(Opaque<bindings::phy_device>);
> 
> This invariant is much less precise, but at least it is correct.
> 
> Other safety comments may then be:
> 
> 	/// Gets the id of the PHY.
> 	pub fn phy_id(&self) -> u32 {
> 	    let phydev = self.0.get();
> 	    // SAFETY: The struct invariant ensures that we may access
> 	    // this field without additional synchronization.
> 	    unsafe { (*phydev).phy_id }
> 	}
> 
> And:
> 
> 	unsafe extern "C" fn soft_reset_callback(
> 	    phydev: *mut bindings::phy_device,
> 	) -> core::ffi::c_int {
> 	    from_result(|| {
> 	        // SAFETY: This callback is called only in contexts
> 		// where we hold `phy_device->lock`, so the accessors on
> 		// `Device` are okay to call.
> 	        let dev = unsafe { Device::from_raw(phydev) };
> 	        T::soft_reset(dev)?;
> 	        Ok(0)
> 	    })
> 	}
> 
> And:
> 
> 	unsafe extern "C" fn resume_callback(phydev: *mut bindings::phy_device) -> core::ffi::c_int {
> 	    from_result(|| {
> 	        // SAFETY: The C core code ensures that the accessors on
> 		// `Device` are okay to call even though `phy_device->lock`
> 		// might not be held.
> 	        let dev = unsafe { Device::from_raw(phydev) };
> 	        T::resume(dev)?;
> 	        Ok(0)
> 	    })
> 	}

With these comments, What I should write on from_raw() function as a
safety comment?

/// # Safety
///
/// For the duration of 'a, the pointer must point at a valid
/// `phy_device`, and the caller must ensure that an user of this struct
/// in a context where all methods defined on this struct are safe to
/// call.
unsafe fn from_raw<'a>(ptr: *mut bindings::phy_device) -> &'a mut Self {
Boqun Feng Nov. 22, 2023, 6:16 p.m. UTC | #57
On Tue, Nov 21, 2023 at 11:13:06AM +0900, FUJITA Tomonori wrote:
[...]
> 
> I'm not sure we discussed but making DriverVTable Sync works.
> 
> #[repr(transparent)]
> pub struct DriverVTable(Opaque<bindings::phy_driver>);
> 
> // SAFETY: DriverVTable has no &self methods, so immutable references to it are useless.

Minor nitpicking, I would add one more sentense in the safety comment:

	therefore it's safe to share immutable references between
	threads.

or 
	therefore it's safe to share immutable references between
	execution contexts.

once we decide the term here ;-)

The reason is to match Sync definition [1]:

"""
Types for which it is safe to share references between threads.

This trait is automatically implemented when the compiler determines
it’s appropriate.

The precise definition is: a type T is Sync if and only if &T is Send.
In other words, if there is no possibility of undefined behavior
(including data races) when passing &T references between threads.
"""

[1]: https://doc.rust-lang.org/std/marker/trait.Sync.html

Regards,
Boqun

> unsafe impl Sync for DriverVTable {}
> 
> 
> looks correct?
>
diff mbox series

Patch

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 421d2b62918f..0faebdb184ca 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -60,6 +60,14 @@  config FIXED_PHY
 
 	  Currently tested with mpc866ads and mpc8349e-mitx.
 
+config RUST_PHYLIB_ABSTRACTIONS
+        bool "PHYLIB abstractions support"
+        depends on RUST
+        depends on PHYLIB=y
+        help
+          Adds support needed for PHY drivers written in Rust. It provides
+          a wrapper around the C phylib core.
+
 config SFP
 	tristate "SFP cage support"
 	depends on I2C && PHYLINK
diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index c91a3c24f607..ec4ee09a34ad 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -8,6 +8,9 @@ 
 
 #include <kunit/test.h>
 #include <linux/errname.h>
+#include <linux/ethtool.h>
+#include <linux/mdio.h>
+#include <linux/phy.h>
 #include <linux/slab.h>
 #include <linux/refcount.h>
 #include <linux/wait.h>
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index e8811700239a..0588422e273c 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -14,6 +14,7 @@ 
 #![no_std]
 #![feature(allocator_api)]
 #![feature(coerce_unsized)]
+#![feature(const_maybe_uninit_zeroed)]
 #![feature(dispatch_from_dyn)]
 #![feature(new_uninit)]
 #![feature(receiver_trait)]
@@ -36,6 +37,8 @@ 
 pub mod ioctl;
 #[cfg(CONFIG_KUNIT)]
 pub mod kunit;
+#[cfg(CONFIG_NET)]
+pub mod net;
 pub mod prelude;
 pub mod print;
 mod static_assert;
diff --git a/rust/kernel/net.rs b/rust/kernel/net.rs
new file mode 100644
index 000000000000..fe415cb369d3
--- /dev/null
+++ b/rust/kernel/net.rs
@@ -0,0 +1,6 @@ 
+// SPDX-License-Identifier: GPL-2.0
+
+//! Networking.
+
+#[cfg(CONFIG_RUST_PHYLIB_ABSTRACTIONS)]
+pub mod phy;
diff --git a/rust/kernel/net/phy.rs b/rust/kernel/net/phy.rs
new file mode 100644
index 000000000000..61223f638bc6
--- /dev/null
+++ b/rust/kernel/net/phy.rs
@@ -0,0 +1,725 @@ 
+// SPDX-License-Identifier: GPL-2.0
+
+// Copyright (C) 2023 FUJITA Tomonori <fujita.tomonori@gmail.com>
+
+//! Network PHY device.
+//!
+//! C headers: [`include/linux/phy.h`](../../../../../../../include/linux/phy.h).
+
+use crate::{
+    bindings,
+    error::*,
+    prelude::{vtable, Pin},
+    str::CStr,
+    types::Opaque,
+};
+use core::marker::PhantomData;
+
+/// PHY state machine states.
+///
+/// Corresponds to the kernel's [`enum phy_state`].
+///
+/// Some of PHY drivers access to the state of PHY's software state machine.
+///
+/// [`enum phy_state`]: ../../../../../../../include/linux/phy.h
+#[derive(PartialEq)]
+pub enum DeviceState {
+    /// PHY device and driver are not ready for anything.
+    Down,
+    /// PHY is ready to send and receive packets.
+    Ready,
+    /// PHY is up, but no polling or interrupts are done.
+    Halted,
+    /// PHY is up, but is in an error state.
+    Error,
+    /// PHY and attached device are ready to do work.
+    Up,
+    /// PHY is currently running.
+    Running,
+    /// PHY is up, but not currently plugged in.
+    NoLink,
+    /// PHY is performing a cable test.
+    CableTest,
+}
+
+/// A mode of Ethernet communication.
+///
+/// PHY drivers get duplex information from hardware and update the current state.
+pub enum DuplexMode {
+    /// PHY is in full-duplex mode.
+    Full,
+    /// PHY is in half-duplex mode.
+    Half,
+    /// PHY is in unknown duplex mode.
+    Unknown,
+}
+
+/// An instance of a PHY device.
+///
+/// Wraps the kernel's [`struct phy_device`].
+///
+/// A [`Device`] instance is created when a callback in [`Driver`] is executed. A PHY driver
+/// executes [`Driver`]'s methods during the callback.
+///
+/// # Invariants
+///
+/// `self.0` is always in a valid state.
+///
+/// [`struct phy_device`]: ../../../../../../../include/linux/phy.h
+// During the calls to most functions in [`Driver`], the C side (`PHYLIB`) holds a lock that is
+// unique for every instance of [`Device`]. `PHYLIB` uses a different serialization technique for
+// [`Driver::resume`] and [`Driver::suspend`]: `PHYLIB` updates `phy_device`'s state with
+// the lock held, thus guaranteeing that [`Driver::resume`] has exclusive access to the instance.
+// [`Driver::resume`] and [`Driver::suspend`] also are called where only one thread can access
+// to the instance.
+#[repr(transparent)]
+pub struct Device(Opaque<bindings::phy_device>);
+
+impl Device {
+    /// Creates a new [`Device`] instance from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// This function must only be called from the callbacks in `phy_driver`.
+    unsafe fn from_raw<'a>(ptr: *mut bindings::phy_device) -> &'a mut Self {
+        // CAST: `Self` is a `repr(transparent)` wrapper around `bindings::phy_device`.
+        let ptr = ptr.cast::<Self>();
+        // SAFETY: by the function requirements the pointer is valid and we have unique access for
+        // the duration of `'a`.
+        unsafe { &mut *ptr }
+    }
+
+    /// Gets the id of the PHY.
+    pub fn phy_id(&self) -> u32 {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        unsafe { (*phydev).phy_id }
+    }
+
+    /// Gets the state of PHY state machine states.
+    pub fn state(&self) -> DeviceState {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        let state = unsafe { (*phydev).state };
+        // TODO: this conversion code will be replaced with automatically generated code by bindgen
+        // when it becomes possible.
+        // better to call WARN_ONCE() when the state is out-of-range.
+        match state {
+            bindings::phy_state_PHY_DOWN => DeviceState::Down,
+            bindings::phy_state_PHY_READY => DeviceState::Ready,
+            bindings::phy_state_PHY_HALTED => DeviceState::Halted,
+            bindings::phy_state_PHY_ERROR => DeviceState::Error,
+            bindings::phy_state_PHY_UP => DeviceState::Up,
+            bindings::phy_state_PHY_RUNNING => DeviceState::Running,
+            bindings::phy_state_PHY_NOLINK => DeviceState::NoLink,
+            bindings::phy_state_PHY_CABLETEST => DeviceState::CableTest,
+            _ => DeviceState::Error,
+        }
+    }
+
+    /// Gets the current link state.
+    ///
+    /// It returns true if the link is up.
+    pub fn is_link_up(&self) -> bool {
+        const LINK_IS_UP: u32 = 1;
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        let phydev = unsafe { *self.0.get() };
+        phydev.link() == LINK_IS_UP
+    }
+
+    /// Gets the current auto-negotiation configuration.
+    ///
+    /// It returns true if auto-negotiation is enabled.
+    pub fn is_autoneg_enabled(&self) -> bool {
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        let phydev = unsafe { *self.0.get() };
+        phydev.autoneg() == bindings::AUTONEG_ENABLE
+    }
+
+    /// Gets the current auto-negotiation state.
+    ///
+    /// It returns true if auto-negotiation is completed.
+    pub fn is_autoneg_completed(&self) -> bool {
+        const AUTONEG_COMPLETED: u32 = 1;
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        let phydev = unsafe { *self.0.get() };
+        phydev.autoneg_complete() == AUTONEG_COMPLETED
+    }
+
+    /// Sets the speed of the PHY.
+    pub fn set_speed(&mut self, speed: u32) {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        unsafe { (*phydev).speed = speed as i32 };
+    }
+
+    /// Sets duplex mode.
+    pub fn set_duplex(&mut self, mode: DuplexMode) {
+        let phydev = self.0.get();
+        let v = match mode {
+            DuplexMode::Full => bindings::DUPLEX_FULL as i32,
+            DuplexMode::Half => bindings::DUPLEX_HALF as i32,
+            DuplexMode::Unknown => bindings::DUPLEX_UNKNOWN as i32,
+        };
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        unsafe { (*phydev).duplex = v };
+    }
+
+    /// Reads a given C22 PHY register.
+    // This function reads a hardware register and updates the stats so takes `&mut self`.
+    pub fn read(&mut self, regnum: u16) -> Result<u16> {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        // So an FFI call with a valid pointer.
+        let ret = unsafe {
+            bindings::mdiobus_read((*phydev).mdio.bus, (*phydev).mdio.addr, regnum.into())
+        };
+        if ret < 0 {
+            Err(Error::from_errno(ret))
+        } else {
+            Ok(ret as u16)
+        }
+    }
+
+    /// Writes a given C22 PHY register.
+    pub fn write(&mut self, regnum: u16, val: u16) -> Result {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        // So an FFI call with a valid pointer.
+        to_result(unsafe {
+            bindings::mdiobus_write((*phydev).mdio.bus, (*phydev).mdio.addr, regnum.into(), val)
+        })
+    }
+
+    /// Reads a paged register.
+    pub fn read_paged(&mut self, page: u16, regnum: u16) -> Result<u16> {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        // So an FFI call with a valid pointer.
+        let ret = unsafe { bindings::phy_read_paged(phydev, page.into(), regnum.into()) };
+        if ret < 0 {
+            Err(Error::from_errno(ret))
+        } else {
+            Ok(ret as u16)
+        }
+    }
+
+    /// Resolves the advertisements into PHY settings.
+    pub fn resolve_aneg_linkmode(&mut self) {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        // So an FFI call with a valid pointer.
+        unsafe { bindings::phy_resolve_aneg_linkmode(phydev) };
+    }
+
+    /// Executes software reset the PHY via `BMCR_RESET` bit.
+    pub fn genphy_soft_reset(&mut self) -> Result {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        // So an FFI call with a valid pointer.
+        to_result(unsafe { bindings::genphy_soft_reset(phydev) })
+    }
+
+    /// Initializes the PHY.
+    pub fn init_hw(&mut self) -> Result {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        // so an FFI call with a valid pointer.
+        to_result(unsafe { bindings::phy_init_hw(phydev) })
+    }
+
+    /// Starts auto-negotiation.
+    pub fn start_aneg(&mut self) -> Result {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        // So an FFI call with a valid pointer.
+        to_result(unsafe { bindings::phy_start_aneg(phydev) })
+    }
+
+    /// Resumes the PHY via `BMCR_PDOWN` bit.
+    pub fn genphy_resume(&mut self) -> Result {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        // So an FFI call with a valid pointer.
+        to_result(unsafe { bindings::genphy_resume(phydev) })
+    }
+
+    /// Suspends the PHY via `BMCR_PDOWN` bit.
+    pub fn genphy_suspend(&mut self) -> Result {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        // So an FFI call with a valid pointer.
+        to_result(unsafe { bindings::genphy_suspend(phydev) })
+    }
+
+    /// Checks the link status and updates current link state.
+    pub fn genphy_read_status(&mut self) -> Result<u16> {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        // So an FFI call with a valid pointer.
+        let ret = unsafe { bindings::genphy_read_status(phydev) };
+        if ret < 0 {
+            Err(Error::from_errno(ret))
+        } else {
+            Ok(ret as u16)
+        }
+    }
+
+    /// Updates the link status.
+    pub fn genphy_update_link(&mut self) -> Result {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        // So an FFI call with a valid pointer.
+        to_result(unsafe { bindings::genphy_update_link(phydev) })
+    }
+
+    /// Reads link partner ability.
+    pub fn genphy_read_lpa(&mut self) -> Result {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        // So an FFI call with a valid pointer.
+        to_result(unsafe { bindings::genphy_read_lpa(phydev) })
+    }
+
+    /// Reads PHY abilities.
+    pub fn genphy_read_abilities(&mut self) -> Result {
+        let phydev = self.0.get();
+        // SAFETY: `phydev` is pointing to a valid object by the type invariant of `Self`.
+        // So an FFI call with a valid pointer.
+        to_result(unsafe { bindings::genphy_read_abilities(phydev) })
+    }
+}
+
+/// Defines certain other features this PHY supports (like interrupts).
+///
+/// These flag values are used in [`Driver::FLAGS`].
+pub mod flags {
+    /// PHY is internal.
+    pub const IS_INTERNAL: u32 = bindings::PHY_IS_INTERNAL;
+    /// PHY needs to be reset after the refclk is enabled.
+    pub const RST_AFTER_CLK_EN: u32 = bindings::PHY_RST_AFTER_CLK_EN;
+    /// Polling is used to detect PHY status changes.
+    pub const POLL_CABLE_TEST: u32 = bindings::PHY_POLL_CABLE_TEST;
+    /// Don't suspend.
+    pub const ALWAYS_CALL_SUSPEND: u32 = bindings::PHY_ALWAYS_CALL_SUSPEND;
+}
+
+/// An adapter for the registration of a PHY driver.
+struct Adapter<T: Driver> {
+    _p: PhantomData<T>,
+}
+
+impl<T: Driver> Adapter<T> {
+    /// # Safety
+    ///
+    /// `phydev` must be passed by the corresponding callback in `phy_driver`.
+    unsafe extern "C" fn soft_reset_callback(
+        phydev: *mut bindings::phy_device,
+    ) -> core::ffi::c_int {
+        from_result(|| {
+            // SAFETY: Preconditions ensure `phydev` is valid.
+            let dev = unsafe { Device::from_raw(phydev) };
+            T::soft_reset(dev)?;
+            Ok(0)
+        })
+    }
+
+    /// # Safety
+    ///
+    /// `phydev` must be passed by the corresponding callback in `phy_driver`.
+    unsafe extern "C" fn get_features_callback(
+        phydev: *mut bindings::phy_device,
+    ) -> core::ffi::c_int {
+        from_result(|| {
+            // SAFETY: Preconditions ensure `phydev` is valid.
+            let dev = unsafe { Device::from_raw(phydev) };
+            T::get_features(dev)?;
+            Ok(0)
+        })
+    }
+
+    /// # Safety
+    ///
+    /// `phydev` must be passed by the corresponding callback in `phy_driver`.
+    unsafe extern "C" fn suspend_callback(phydev: *mut bindings::phy_device) -> core::ffi::c_int {
+        from_result(|| {
+            // SAFETY: Preconditions ensure `phydev` is valid.
+            let dev = unsafe { Device::from_raw(phydev) };
+            T::suspend(dev)?;
+            Ok(0)
+        })
+    }
+
+    /// # Safety
+    ///
+    /// `phydev` must be passed by the corresponding callback in `phy_driver`.
+    unsafe extern "C" fn resume_callback(phydev: *mut bindings::phy_device) -> core::ffi::c_int {
+        from_result(|| {
+            // SAFETY: Preconditions ensure `phydev` is valid.
+            let dev = unsafe { Device::from_raw(phydev) };
+            T::resume(dev)?;
+            Ok(0)
+        })
+    }
+
+    /// # Safety
+    ///
+    /// `phydev` must be passed by the corresponding callback in `phy_driver`.
+    unsafe extern "C" fn config_aneg_callback(
+        phydev: *mut bindings::phy_device,
+    ) -> core::ffi::c_int {
+        from_result(|| {
+            // SAFETY: Preconditions ensure `phydev` is valid.
+            let dev = unsafe { Device::from_raw(phydev) };
+            T::config_aneg(dev)?;
+            Ok(0)
+        })
+    }
+
+    /// # Safety
+    ///
+    /// `phydev` must be passed by the corresponding callback in `phy_driver`.
+    unsafe extern "C" fn read_status_callback(
+        phydev: *mut bindings::phy_device,
+    ) -> core::ffi::c_int {
+        from_result(|| {
+            // SAFETY: Preconditions ensure `phydev` is valid.
+            let dev = unsafe { Device::from_raw(phydev) };
+            T::read_status(dev)?;
+            Ok(0)
+        })
+    }
+
+    /// # Safety
+    ///
+    /// `phydev` must be passed by the corresponding callback in `phy_driver`.
+    unsafe extern "C" fn match_phy_device_callback(
+        phydev: *mut bindings::phy_device,
+    ) -> core::ffi::c_int {
+        // SAFETY: Preconditions ensure `phydev` is valid.
+        let dev = unsafe { Device::from_raw(phydev) };
+        T::match_phy_device(dev) as i32
+    }
+
+    /// # Safety
+    ///
+    /// `phydev` must be passed by the corresponding callback in `phy_driver`.
+    unsafe extern "C" fn read_mmd_callback(
+        phydev: *mut bindings::phy_device,
+        devnum: i32,
+        regnum: u16,
+    ) -> i32 {
+        from_result(|| {
+            // SAFETY: Preconditions ensure `phydev` is valid.
+            let dev = unsafe { Device::from_raw(phydev) };
+            // CAST: the C side verifies devnum < 32.
+            let ret = T::read_mmd(dev, devnum as u8, regnum)?;
+            Ok(ret.into())
+        })
+    }
+
+    /// # Safety
+    ///
+    /// `phydev` must be passed by the corresponding callback in `phy_driver`.
+    unsafe extern "C" fn write_mmd_callback(
+        phydev: *mut bindings::phy_device,
+        devnum: i32,
+        regnum: u16,
+        val: u16,
+    ) -> i32 {
+        from_result(|| {
+            // SAFETY: Preconditions ensure `phydev` is valid.
+            let dev = unsafe { Device::from_raw(phydev) };
+            T::write_mmd(dev, devnum as u8, regnum, val)?;
+            Ok(0)
+        })
+    }
+
+    /// # Safety
+    ///
+    /// `phydev` must be passed by the corresponding callback in `phy_driver`.
+    unsafe extern "C" fn link_change_notify_callback(phydev: *mut bindings::phy_device) {
+        // SAFETY: Preconditions ensure `phydev` is valid.
+        let dev = unsafe { Device::from_raw(phydev) };
+        T::link_change_notify(dev);
+    }
+}
+
+/// Driver structure for a particular PHY type.
+///
+/// Wraps the kernel's [`struct phy_driver`].
+/// This is used to register a driver for a particular PHY type with the kernel.
+///
+/// # Invariants
+///
+/// `self.0` is always in a valid state.
+///
+/// [`struct phy_driver`]: ../../../../../../../include/linux/phy.h
+#[repr(transparent)]
+pub struct DriverVTable(Opaque<bindings::phy_driver>);
+
+/// Creates a [`DriverVTable`] instance from [`Driver`].
+///
+/// This is used by [`module_phy_driver`] macro to create a static array of `phy_driver`.
+///
+/// [`module_phy_driver`]: crate::module_phy_driver
+pub const fn create_phy_driver<T: Driver>() -> DriverVTable {
+    // INVARIANT: All the fields of `struct phy_driver` are initialized properly.
+    DriverVTable(Opaque::new(bindings::phy_driver {
+        name: T::NAME.as_char_ptr().cast_mut(),
+        flags: T::FLAGS,
+        phy_id: T::PHY_DEVICE_ID.id,
+        phy_id_mask: T::PHY_DEVICE_ID.mask_as_int(),
+        soft_reset: if T::HAS_SOFT_RESET {
+            Some(Adapter::<T>::soft_reset_callback)
+        } else {
+            None
+        },
+        get_features: if T::HAS_GET_FEATURES {
+            Some(Adapter::<T>::get_features_callback)
+        } else {
+            None
+        },
+        match_phy_device: if T::HAS_MATCH_PHY_DEVICE {
+            Some(Adapter::<T>::match_phy_device_callback)
+        } else {
+            None
+        },
+        suspend: if T::HAS_SUSPEND {
+            Some(Adapter::<T>::suspend_callback)
+        } else {
+            None
+        },
+        resume: if T::HAS_RESUME {
+            Some(Adapter::<T>::resume_callback)
+        } else {
+            None
+        },
+        config_aneg: if T::HAS_CONFIG_ANEG {
+            Some(Adapter::<T>::config_aneg_callback)
+        } else {
+            None
+        },
+        read_status: if T::HAS_READ_STATUS {
+            Some(Adapter::<T>::read_status_callback)
+        } else {
+            None
+        },
+        read_mmd: if T::HAS_READ_MMD {
+            Some(Adapter::<T>::read_mmd_callback)
+        } else {
+            None
+        },
+        write_mmd: if T::HAS_WRITE_MMD {
+            Some(Adapter::<T>::write_mmd_callback)
+        } else {
+            None
+        },
+        link_change_notify: if T::HAS_LINK_CHANGE_NOTIFY {
+            Some(Adapter::<T>::link_change_notify_callback)
+        } else {
+            None
+        },
+        // SAFETY: The rest is zeroed out to initialize `struct phy_driver`,
+        // sets `Option<&F>` to be `None`.
+        ..unsafe { core::mem::MaybeUninit::<bindings::phy_driver>::zeroed().assume_init() }
+    }))
+}
+
+/// Driver implementation for a particular PHY type.
+///
+/// This trait is used to create a [`DriverVTable`].
+#[vtable]
+pub trait Driver {
+    /// Defines certain other features this PHY supports.
+    /// It is a combination of the flags in the [`flags`] module.
+    const FLAGS: u32 = 0;
+
+    /// The friendly name of this PHY type.
+    const NAME: &'static CStr;
+
+    /// This driver only works for PHYs with IDs which match this field.
+    /// The default id and mask are zero.
+    const PHY_DEVICE_ID: DeviceId = DeviceId::new_with_custom_mask(0, 0);
+
+    /// Issues a PHY software reset.
+    fn soft_reset(_dev: &mut Device) -> Result {
+        Err(code::ENOTSUPP)
+    }
+
+    /// Probes the hardware to determine what abilities it has.
+    fn get_features(_dev: &mut Device) -> Result {
+        Err(code::ENOTSUPP)
+    }
+
+    /// Returns true if this is a suitable driver for the given phydev.
+    /// If not implemented, matching is based on [`Driver::PHY_DEVICE_ID`].
+    fn match_phy_device(_dev: &Device) -> bool {
+        false
+    }
+
+    /// Configures the advertisement and resets auto-negotiation
+    /// if auto-negotiation is enabled.
+    fn config_aneg(_dev: &mut Device) -> Result {
+        Err(code::ENOTSUPP)
+    }
+
+    /// Determines the negotiated speed and duplex.
+    fn read_status(_dev: &mut Device) -> Result<u16> {
+        Err(code::ENOTSUPP)
+    }
+
+    /// Suspends the hardware, saving state if needed.
+    fn suspend(_dev: &mut Device) -> Result {
+        Err(code::ENOTSUPP)
+    }
+
+    /// Resumes the hardware, restoring state if needed.
+    fn resume(_dev: &mut Device) -> Result {
+        Err(code::ENOTSUPP)
+    }
+
+    /// Overrides the default MMD read function for reading a MMD register.
+    fn read_mmd(_dev: &mut Device, _devnum: u8, _regnum: u16) -> Result<u16> {
+        Err(code::ENOTSUPP)
+    }
+
+    /// Overrides the default MMD write function for writing a MMD register.
+    fn write_mmd(_dev: &mut Device, _devnum: u8, _regnum: u16, _val: u16) -> Result {
+        Err(code::ENOTSUPP)
+    }
+
+    /// Callback for notification of link change.
+    fn link_change_notify(_dev: &mut Device) {}
+}
+
+/// Registration structure for PHY drivers.
+///
+/// Registers [`DriverVTable`] instances with the kernel. They will be unregistered when dropped.
+///
+/// # Invariants
+///
+/// The `drivers` slice are currently registered to the kernel via `phy_drivers_register`.
+pub struct Registration {
+    drivers: Pin<&'static mut [DriverVTable]>,
+}
+
+impl Registration {
+    /// Registers a PHY driver.
+    pub fn register(
+        module: &'static crate::ThisModule,
+        drivers: Pin<&'static mut [DriverVTable]>,
+    ) -> Result<Self> {
+        if drivers.is_empty() {
+            return Err(code::EINVAL);
+        }
+        // SAFETY: The type invariants of [`DriverVTable`] ensure that all elements of
+        // the `drivers` slice are initialized properly. `drivers` will not be moved.
+        // So an FFI call with a valid pointer.
+        to_result(unsafe {
+            bindings::phy_drivers_register(drivers[0].0.get(), drivers.len().try_into()?, module.0)
+        })?;
+        // INVARIANT: The `drivers` slice is successfully registered to the kernel via `phy_drivers_register`.
+        Ok(Registration { drivers })
+    }
+}
+
+impl Drop for Registration {
+    fn drop(&mut self) {
+        // SAFETY: The type invariants guarantee that `self.drivers` is valid.
+        // So an FFI call with a valid pointer.
+        unsafe {
+            bindings::phy_drivers_unregister(self.drivers[0].0.get(), self.drivers.len() as i32)
+        };
+    }
+}
+
+// SAFETY: `Registration` does not expose any of its state across threads.
+unsafe impl Send for Registration {}
+
+// SAFETY: `Registration` does not expose any of its state across threads.
+unsafe impl Sync for Registration {}
+
+/// An identifier for PHY devices on an MDIO/MII bus.
+///
+/// Represents the kernel's `struct mdio_device_id`. This is used to find an appropriate
+/// PHY driver.
+pub struct DeviceId {
+    id: u32,
+    mask: DeviceMask,
+}
+
+impl DeviceId {
+    /// Creates a new instance with the exact match mask.
+    pub const fn new_with_exact_mask(id: u32) -> Self {
+        DeviceId {
+            id,
+            mask: DeviceMask::Exact,
+        }
+    }
+
+    /// Creates a new instance with the model match mask.
+    pub const fn new_with_model_mask(id: u32) -> Self {
+        DeviceId {
+            id,
+            mask: DeviceMask::Model,
+        }
+    }
+
+    /// Creates a new instance with the vendor match mask.
+    pub const fn new_with_vendor_mask(id: u32) -> Self {
+        DeviceId {
+            id,
+            mask: DeviceMask::Vendor,
+        }
+    }
+
+    /// Creates a new instance with a custom match mask.
+    pub const fn new_with_custom_mask(id: u32, mask: u32) -> Self {
+        DeviceId {
+            id,
+            mask: DeviceMask::Custom(mask),
+        }
+    }
+
+    /// Creates a new instance from [`Driver`].
+    pub const fn new_with_driver<T: Driver>() -> Self {
+        T::PHY_DEVICE_ID
+    }
+
+    /// Get a `mask` as u32.
+    pub const fn mask_as_int(&self) -> u32 {
+        self.mask.as_int()
+    }
+
+    // macro use only
+    #[doc(hidden)]
+    pub const fn mdio_device_id(&self) -> bindings::mdio_device_id {
+        bindings::mdio_device_id {
+            phy_id: self.id,
+            phy_id_mask: self.mask.as_int(),
+        }
+    }
+}
+
+enum DeviceMask {
+    Exact,
+    Model,
+    Vendor,
+    Custom(u32),
+}
+
+impl DeviceMask {
+    const MASK_EXACT: u32 = !0;
+    const MASK_MODEL: u32 = !0 << 4;
+    const MASK_VENDOR: u32 = !0 << 10;
+
+    const fn as_int(&self) -> u32 {
+        match self {
+            DeviceMask::Exact => Self::MASK_EXACT,
+            DeviceMask::Model => Self::MASK_MODEL,
+            DeviceMask::Vendor => Self::MASK_VENDOR,
+            DeviceMask::Custom(mask) => *mask,
+        }
+    }
+}