diff mbox series

[v2,01/10] rust: pass module name to `Module::init`

Message ID 20240618234025.15036-2-dakr@redhat.com (mailing list archive)
State Handled Elsewhere
Delegated to: Bjorn Helgaas
Headers show
Series Device / Driver and PCI Rust abstractions | expand

Commit Message

Danilo Krummrich June 18, 2024, 11:39 p.m. UTC
In a subsequent patch we introduce the `Registration` abstraction used
to register driver structures. Some subsystems require the module name on
driver registration (e.g. PCI in __pci_register_driver()), hence pass
the module name to `Module::init`.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
---
 rust/kernel/lib.rs           | 14 ++++++++++----
 rust/kernel/net/phy.rs       |  2 +-
 rust/macros/module.rs        |  3 ++-
 samples/rust/rust_minimal.rs |  2 +-
 samples/rust/rust_print.rs   |  2 +-
 5 files changed, 15 insertions(+), 8 deletions(-)

Comments

Greg KH June 20, 2024, 2:19 p.m. UTC | #1
On Wed, Jun 19, 2024 at 01:39:47AM +0200, Danilo Krummrich wrote:
> In a subsequent patch we introduce the `Registration` abstraction used
> to register driver structures. Some subsystems require the module name on
> driver registration (e.g. PCI in __pci_register_driver()), hence pass
> the module name to `Module::init`.

I understand the need/want here, but it feels odd that you have to
change anything to do it.

> 
> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> ---
>  rust/kernel/lib.rs           | 14 ++++++++++----
>  rust/kernel/net/phy.rs       |  2 +-
>  rust/macros/module.rs        |  3 ++-
>  samples/rust/rust_minimal.rs |  2 +-
>  samples/rust/rust_print.rs   |  2 +-
>  5 files changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index a791702b4fee..5af00e072a58 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -71,7 +71,7 @@ pub trait Module: Sized + Sync + Send {
>      /// should do.
>      ///
>      /// Equivalent to the `module_init` macro in the C API.
> -    fn init(module: &'static ThisModule) -> error::Result<Self>;
> +    fn init(name: &'static str::CStr, module: &'static ThisModule) -> error::Result<Self>;

Why can't the name come directly from the build system?  Why must it be
passed into the init function of the module "class"?  What is it going
to do with it?

A PCI, or other bus, driver "knows" it's name already by virtue of the
build system, so it can pass that string into whatever function needs
that, but the module init function itself does NOT need that.

So I fail to understand why we need to burden ALL module init functions
with this, when only a very very very tiny subset of all drivers will
ever need to know this, and even then, they don't need to know it at
init module time, they know it at build time and it will be a static
string at that point, it will not be coming in through an init call.

thanks,

greg k-h
Danilo Krummrich June 20, 2024, 4:10 p.m. UTC | #2
On Thu, Jun 20, 2024 at 04:19:48PM +0200, Greg KH wrote:
> On Wed, Jun 19, 2024 at 01:39:47AM +0200, Danilo Krummrich wrote:
> > In a subsequent patch we introduce the `Registration` abstraction used
> > to register driver structures. Some subsystems require the module name on
> > driver registration (e.g. PCI in __pci_register_driver()), hence pass
> > the module name to `Module::init`.
> 
> I understand the need/want here, but it feels odd that you have to
> change anything to do it.
> 
> > 
> > Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> > ---
> >  rust/kernel/lib.rs           | 14 ++++++++++----
> >  rust/kernel/net/phy.rs       |  2 +-
> >  rust/macros/module.rs        |  3 ++-
> >  samples/rust/rust_minimal.rs |  2 +-
> >  samples/rust/rust_print.rs   |  2 +-
> >  5 files changed, 15 insertions(+), 8 deletions(-)
> > 
> > diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> > index a791702b4fee..5af00e072a58 100644
> > --- a/rust/kernel/lib.rs
> > +++ b/rust/kernel/lib.rs
> > @@ -71,7 +71,7 @@ pub trait Module: Sized + Sync + Send {
> >      /// should do.
> >      ///
> >      /// Equivalent to the `module_init` macro in the C API.
> > -    fn init(module: &'static ThisModule) -> error::Result<Self>;
> > +    fn init(name: &'static str::CStr, module: &'static ThisModule) -> error::Result<Self>;
> 
> Why can't the name come directly from the build system?  Why must it be
> passed into the init function of the module "class"?  What is it going
> to do with it?

The name does come from the build system, that's where `Module::init` gets it
from.

> 
> A PCI, or other bus, driver "knows" it's name already by virtue of the
> build system, so it can pass that string into whatever function needs

Let's take pci_register_driver() as example.

```
#define pci_register_driver(driver)		\
	__pci_register_driver(driver, THIS_MODULE, KBUILD_MODNAME)
```

In C drivers this works because (1) it's a macro and (2) it's called directly
from the driver code.

In Rust, for very good reasons, we have abstractions for C APIs, hence the
actual call to __pci_register_driver() does not come from code within the
module, but from the PCI Rust abstraction `Module::init` calls into instead.

Consequently, we have to pass things through. This also isn't new, please note
that the current code already does the same thing: `Module::init` (without this
patch) is already declared as

`fn init(module: &'static ThisModule) -> error::Result<Self>`

passing through `ThisModule` for the exact same reason.

Please also note that in the most common case (one driver per module) we don't
see any of this anyway.

Just like the C macro module_pci_driver(), Rust drivers can use the
`module_pci_driver!` macro.

Example from Nova:

```
    kernel::module_pci_driver! {
        // The driver type that implements the corresponding probe() and
        // remove() driver callbacks.
        type: NovaDriver,
        name: "Nova",
        author: "Danilo Krummrich",
        description: "Nova GPU driver",
        license: "GPL v2",
    }
```

> that, but the module init function itself does NOT need that.
> 
> So I fail to understand why we need to burden ALL module init functions
> with this, when only a very very very tiny subset of all drivers will
> ever need to know this, and even then, they don't need to know it at
> init module time, they know it at build time and it will be a static
> string at that point, it will not be coming in through an init call.
> 
> thanks,
> 
> greg k-h
>
Greg KH June 20, 2024, 4:36 p.m. UTC | #3
On Thu, Jun 20, 2024 at 06:10:05PM +0200, Danilo Krummrich wrote:
> On Thu, Jun 20, 2024 at 04:19:48PM +0200, Greg KH wrote:
> > On Wed, Jun 19, 2024 at 01:39:47AM +0200, Danilo Krummrich wrote:
> > > In a subsequent patch we introduce the `Registration` abstraction used
> > > to register driver structures. Some subsystems require the module name on
> > > driver registration (e.g. PCI in __pci_register_driver()), hence pass
> > > the module name to `Module::init`.
> > 
> > I understand the need/want here, but it feels odd that you have to
> > change anything to do it.
> > 
> > > 
> > > Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> > > ---
> > >  rust/kernel/lib.rs           | 14 ++++++++++----
> > >  rust/kernel/net/phy.rs       |  2 +-
> > >  rust/macros/module.rs        |  3 ++-
> > >  samples/rust/rust_minimal.rs |  2 +-
> > >  samples/rust/rust_print.rs   |  2 +-
> > >  5 files changed, 15 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> > > index a791702b4fee..5af00e072a58 100644
> > > --- a/rust/kernel/lib.rs
> > > +++ b/rust/kernel/lib.rs
> > > @@ -71,7 +71,7 @@ pub trait Module: Sized + Sync + Send {
> > >      /// should do.
> > >      ///
> > >      /// Equivalent to the `module_init` macro in the C API.
> > > -    fn init(module: &'static ThisModule) -> error::Result<Self>;
> > > +    fn init(name: &'static str::CStr, module: &'static ThisModule) -> error::Result<Self>;
> > 
> > Why can't the name come directly from the build system?  Why must it be
> > passed into the init function of the module "class"?  What is it going
> > to do with it?
> 
> The name does come from the build system, that's where `Module::init` gets it
> from.
> 
> > 
> > A PCI, or other bus, driver "knows" it's name already by virtue of the
> > build system, so it can pass that string into whatever function needs
> 
> Let's take pci_register_driver() as example.
> 
> ```
> #define pci_register_driver(driver)		\
> 	__pci_register_driver(driver, THIS_MODULE, KBUILD_MODNAME)
> ```
> 
> In C drivers this works because (1) it's a macro and (2) it's called directly
> from the driver code.
> 
> In Rust, for very good reasons, we have abstractions for C APIs, hence the
> actual call to __pci_register_driver() does not come from code within the
> module, but from the PCI Rust abstraction `Module::init` calls into instead.

I don't understand those reasons, sorry.

> Consequently, we have to pass things through. This also isn't new, please note
> that the current code already does the same thing: `Module::init` (without this
> patch) is already declared as
> 
> `fn init(module: &'static ThisModule) -> error::Result<Self>`
> passing through `ThisModule` for the exact same reason.

Yeah, and I never quite understood that either :)

> Please also note that in the most common case (one driver per module) we don't
> see any of this anyway.

True, but someone has to review and most importantly, maintain, this
glue code.

> Just like the C macro module_pci_driver(), Rust drivers can use the
> `module_pci_driver!` macro.
> 
> Example from Nova:
> 
> ```
>     kernel::module_pci_driver! {
>         // The driver type that implements the corresponding probe() and
>         // remove() driver callbacks.
>         type: NovaDriver,
>         name: "Nova",
>         author: "Danilo Krummrich",
>         description: "Nova GPU driver",
>         license: "GPL v2",
>     }
> ```

As I said when you implemented this fun macro, don't do this yet.

We added these "helper" macros WAY late in the development cycle of the
driver model, AFTER we were sure we got other parts right.  There is no
need to attempt to implement all of what you can do in C today in Rust,
in fact, I would argue that we don't want to do that, just to make
things simpler to review the glue code, which is the most important part
here to get right!

Changing drivers later, to take advantage of code savings like this
macro can be done then, not just yet.  Let's get this understood and
right first, no need to be tricky or complex.

And, as I hinted at before, I don't think we should be doing this at all
just yet either.  I still think a small "C shim" layer to wrap the
struct pci driver up and pass the calls to the rust portion of the
module is better to start with, it both saves you time and energy so
that you can work on what you really want to do (i.e. a driver in rust)
and not have to worry about the c bindings as that's the "tricky" part
that is stopping you from getting your driver work done.

After all, it's not the pci driver model code that is usually the tricky
bits to verify in C, it's the whole rest of the mess.  Stick with a
small C file, with just the pci driver structure and device ids, and
then instantiate your rust stuff when probe() is called, and clean up
when release() is called.

I can provide an example if needed.

thanks,

greg k-h
Danilo Krummrich June 20, 2024, 9:24 p.m. UTC | #4
On Thu, Jun 20, 2024 at 06:36:08PM +0200, Greg KH wrote:
> On Thu, Jun 20, 2024 at 06:10:05PM +0200, Danilo Krummrich wrote:
> > On Thu, Jun 20, 2024 at 04:19:48PM +0200, Greg KH wrote:
> > > On Wed, Jun 19, 2024 at 01:39:47AM +0200, Danilo Krummrich wrote:
> > > > In a subsequent patch we introduce the `Registration` abstraction used
> > > > to register driver structures. Some subsystems require the module name on
> > > > driver registration (e.g. PCI in __pci_register_driver()), hence pass
> > > > the module name to `Module::init`.
> > > 
> > > I understand the need/want here, but it feels odd that you have to
> > > change anything to do it.
> > > 
> > > > 
> > > > Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> > > > ---
> > > >  rust/kernel/lib.rs           | 14 ++++++++++----
> > > >  rust/kernel/net/phy.rs       |  2 +-
> > > >  rust/macros/module.rs        |  3 ++-
> > > >  samples/rust/rust_minimal.rs |  2 +-
> > > >  samples/rust/rust_print.rs   |  2 +-
> > > >  5 files changed, 15 insertions(+), 8 deletions(-)
> > > > 
> > > > diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> > > > index a791702b4fee..5af00e072a58 100644
> > > > --- a/rust/kernel/lib.rs
> > > > +++ b/rust/kernel/lib.rs
> > > > @@ -71,7 +71,7 @@ pub trait Module: Sized + Sync + Send {
> > > >      /// should do.
> > > >      ///
> > > >      /// Equivalent to the `module_init` macro in the C API.
> > > > -    fn init(module: &'static ThisModule) -> error::Result<Self>;
> > > > +    fn init(name: &'static str::CStr, module: &'static ThisModule) -> error::Result<Self>;
> > > 
> > > Why can't the name come directly from the build system?  Why must it be
> > > passed into the init function of the module "class"?  What is it going
> > > to do with it?
> > 
> > The name does come from the build system, that's where `Module::init` gets it
> > from.
> > 
> > > 
> > > A PCI, or other bus, driver "knows" it's name already by virtue of the
> > > build system, so it can pass that string into whatever function needs
> > 
> > Let's take pci_register_driver() as example.
> > 
> > ```
> > #define pci_register_driver(driver)		\
> > 	__pci_register_driver(driver, THIS_MODULE, KBUILD_MODNAME)
> > ```
> > 
> > In C drivers this works because (1) it's a macro and (2) it's called directly
> > from the driver code.
> > 
> > In Rust, for very good reasons, we have abstractions for C APIs, hence the
> > actual call to __pci_register_driver() does not come from code within the
> > module, but from the PCI Rust abstraction `Module::init` calls into instead.
> 
> I don't understand those reasons, sorry.

Ok, good you point this out. We should definitely discuss this point then and
build some consensus around it.

I propose to focus on this point first and follow up with the discussion of the
rest of the series afterwards.

Let me explain why I am convinced that it's very important to have abstractions
in place in general and from the get-go.

In general, having abstractions for C APIs is the foundation of being able to
make use of a lot of advantages Rust has to offer.

The most obvious one are all the safety aspects. For instance, with an
abstraction we have to get exactly one piece of code right in terms of pointer
validity, lifetimes, type safety, API semantics, etc. and in all other places
(e.g. drivers) we get the compiler to check those things for us through the
abstraction.

Now, the abstraction can be buggy or insufficient and hence there is no absolute
safety guarantee. *But*, if we get this one thing right, there is nothing a
driver can mess up by itself trying to do stupid things anymore.

If we just call the C code instead we have to get it right everywhere instead.

Now, you could approach this top-down instead and argue that we could at least
benefit from Rust for the driver specific parts.

Unfortunately, this doesn't really work out either. Also driver specific code is
typically (very) closely connected to kernel APIs. If you want to use the safety
aspects of Rust for the driver specific parts you inevitably end up writing
abstractions for the C APIs in your driver.

There are basically three options you can end up with:

(1) An abstraction for the C API within your driver that is actually generic
    for every driver, and hence shouldn't be there.
(2) Your driver specific code is full of raw pointers and `unsafe {}` calls,
    which in the end just means that you ended up baking the abstraction into
    your driver specific code.
(3) You ignore everything, put everything in a huge `unsafe {}` block and
    compile C code with the Rust compiler. (Admittedly, maybe slightly
    overstated, but not that far off either.)

The latter is also the reason why it doesn't make sense to only have
abstractions for some things, but not for other.

If an abstraction for B is based on A, but we don't start with A, then B ends up
implementing (at least partially) the abstraction for A as well. For instance,
if we don't implement `driver::Registration` then the PCI abstractions (and
platform, usb, etc.) have to implement it.

It really comes down to the point that it just bubbles up. We really have to do
this bottom-up, otherwise we just end up moving those abstractions up, layer by
layer, where they don't belong to and we re-implement them over and over again.

> 
> > Consequently, we have to pass things through. This also isn't new, please note
> > that the current code already does the same thing: `Module::init` (without this
> > patch) is already declared as
> > 
> > `fn init(module: &'static ThisModule) -> error::Result<Self>`
> > passing through `ThisModule` for the exact same reason.
> 
> Yeah, and I never quite understood that either :)

Since commit 247b365dc8dc ("rust: add `kernel` crate") shows me your RB for
that, am I good to assume that this one isn't a blocker?

> 
> > Please also note that in the most common case (one driver per module) we don't
> > see any of this anyway.
> 
> True, but someone has to review and most importantly, maintain, this
> glue code.
> 
> > Just like the C macro module_pci_driver(), Rust drivers can use the
> > `module_pci_driver!` macro.
> > 
> > Example from Nova:
> > 
> > ```
> >     kernel::module_pci_driver! {
> >         // The driver type that implements the corresponding probe() and
> >         // remove() driver callbacks.
> >         type: NovaDriver,
> >         name: "Nova",
> >         author: "Danilo Krummrich",
> >         description: "Nova GPU driver",
> >         license: "GPL v2",
> >     }
> > ```
> 
> As I said when you implemented this fun macro, don't do this yet.
> 
> We added these "helper" macros WAY late in the development cycle of the
> driver model, AFTER we were sure we got other parts right.  There is no
> need to attempt to implement all of what you can do in C today in Rust,
> in fact, I would argue that we don't want to do that, just to make
> things simpler to review the glue code, which is the most important part
> here to get right!

We're not reinventing the wheel here, we stick to the architecture the kernel
already has.

However, I understand that not starting with this macro directly makes it easier
for you to see what's going on. I can introduce the macro in a separate patch to
make it more obvious what's going on.

> 
> Changing drivers later, to take advantage of code savings like this
> macro can be done then, not just yet.  Let's get this understood and
> right first, no need to be tricky or complex.
> 
> And, as I hinted at before, I don't think we should be doing this at all
> just yet either.  I still think a small "C shim" layer to wrap the
> struct pci driver up and pass the calls to the rust portion of the
> module is better to start with, it both saves you time and energy so
> that you can work on what you really want to do (i.e. a driver in rust)
> and not have to worry about the c bindings as that's the "tricky" part
> that is stopping you from getting your driver work done.

I strongly disagree here. As explained above, it just bubbles up, this approach
would just make me end up having to write a lot of the code from the
abstractions in the driver.

However, it would indeed safe me time and energy to do just that. But that isn't
really what I want. I also don't want to write a driver in Rust *only*.

And I also don't really think that you want people, who commit to work hard to
get things right, to just take the shortcut and create some random mess buried
in a driver. :)

What I actually want is to get to a solid foundation for Rust drivers in
general, since I'm convinced that this provides a lot of value beyond the scope
of a single driver.

Since you've brought the topic up a few times, I am also willing to maintain
those abstractions if this is a concern. Maybe one day the corresponding
maintainers are comfortable enough and this isn't needed anymore, but at least
until then, I'm happy to help out.

> 
> After all, it's not the pci driver model code that is usually the tricky
> bits to verify in C, it's the whole rest of the mess.  Stick with a
> small C file, with just the pci driver structure and device ids, and
> then instantiate your rust stuff when probe() is called, and clean up
> when release() is called.

Again, this really bubbles up.

What do we pass to Rust probe() function? A raw struct pci_dev pointer or the
proper PCI device abstraction?

How do we implement I/O memory mappings for PCI bars without PCI / I/O
abstraction?

How do we perform (boundary checked) I/O memory reads / writes without `Io`
abstraction?

How do we handle the lifetime of resources without `Devres` abstraction?

How do we (properly) implement a DRM device registration abstraction without
`Devres`?

Luckily we already got the `Firmware` and `Device` abstraction in place. :)

> 
> I can provide an example if needed.

If you do so, please don't stop at the probe() boundary, please continue to the
point where the Nova stub driver is. It really does just enough to show / proove
that the abstractions tie together nicely. But it should be enough to see that
you end up with either (1), (2) or (3) from above.

> 
> thanks,
> 
> greg k-h
>
Danilo Krummrich June 26, 2024, 10:29 a.m. UTC | #5
Hi Greg,

On 6/20/24 23:24, Danilo Krummrich wrote:

This is a polite reminder about the below discussion.

> On Thu, Jun 20, 2024 at 06:36:08PM +0200, Greg KH wrote:
>> On Thu, Jun 20, 2024 at 06:10:05PM +0200, Danilo Krummrich wrote:
>>> On Thu, Jun 20, 2024 at 04:19:48PM +0200, Greg KH wrote:
>>>> On Wed, Jun 19, 2024 at 01:39:47AM +0200, Danilo Krummrich wrote:
>>>>> In a subsequent patch we introduce the `Registration` abstraction used
>>>>> to register driver structures. Some subsystems require the module name on
>>>>> driver registration (e.g. PCI in __pci_register_driver()), hence pass
>>>>> the module name to `Module::init`.
>>>>
>>>> I understand the need/want here, but it feels odd that you have to
>>>> change anything to do it.
>>>>
>>>>>
>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>> ---
>>>>>   rust/kernel/lib.rs           | 14 ++++++++++----
>>>>>   rust/kernel/net/phy.rs       |  2 +-
>>>>>   rust/macros/module.rs        |  3 ++-
>>>>>   samples/rust/rust_minimal.rs |  2 +-
>>>>>   samples/rust/rust_print.rs   |  2 +-
>>>>>   5 files changed, 15 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
>>>>> index a791702b4fee..5af00e072a58 100644
>>>>> --- a/rust/kernel/lib.rs
>>>>> +++ b/rust/kernel/lib.rs
>>>>> @@ -71,7 +71,7 @@ pub trait Module: Sized + Sync + Send {
>>>>>       /// should do.
>>>>>       ///
>>>>>       /// Equivalent to the `module_init` macro in the C API.
>>>>> -    fn init(module: &'static ThisModule) -> error::Result<Self>;
>>>>> +    fn init(name: &'static str::CStr, module: &'static ThisModule) -> error::Result<Self>;
>>>>
>>>> Why can't the name come directly from the build system?  Why must it be
>>>> passed into the init function of the module "class"?  What is it going
>>>> to do with it?
>>>
>>> The name does come from the build system, that's where `Module::init` gets it
>>> from.
>>>
>>>>
>>>> A PCI, or other bus, driver "knows" it's name already by virtue of the
>>>> build system, so it can pass that string into whatever function needs
>>>
>>> Let's take pci_register_driver() as example.
>>>
>>> ```
>>> #define pci_register_driver(driver)		\
>>> 	__pci_register_driver(driver, THIS_MODULE, KBUILD_MODNAME)
>>> ```
>>>
>>> In C drivers this works because (1) it's a macro and (2) it's called directly
>>> from the driver code.
>>>
>>> In Rust, for very good reasons, we have abstractions for C APIs, hence the
>>> actual call to __pci_register_driver() does not come from code within the
>>> module, but from the PCI Rust abstraction `Module::init` calls into instead.
>>
>> I don't understand those reasons, sorry.
> 
> Ok, good you point this out. We should definitely discuss this point then and
> build some consensus around it.
> 
> I propose to focus on this point first and follow up with the discussion of the
> rest of the series afterwards.
> 
> Let me explain why I am convinced that it's very important to have abstractions
> in place in general and from the get-go.
> 
> In general, having abstractions for C APIs is the foundation of being able to
> make use of a lot of advantages Rust has to offer.
> 
> The most obvious one are all the safety aspects. For instance, with an
> abstraction we have to get exactly one piece of code right in terms of pointer
> validity, lifetimes, type safety, API semantics, etc. and in all other places
> (e.g. drivers) we get the compiler to check those things for us through the
> abstraction.
> 
> Now, the abstraction can be buggy or insufficient and hence there is no absolute
> safety guarantee. *But*, if we get this one thing right, there is nothing a
> driver can mess up by itself trying to do stupid things anymore.
> 
> If we just call the C code instead we have to get it right everywhere instead.
> 
> Now, you could approach this top-down instead and argue that we could at least
> benefit from Rust for the driver specific parts.
> 
> Unfortunately, this doesn't really work out either. Also driver specific code is
> typically (very) closely connected to kernel APIs. If you want to use the safety
> aspects of Rust for the driver specific parts you inevitably end up writing
> abstractions for the C APIs in your driver.
> 
> There are basically three options you can end up with:
> 
> (1) An abstraction for the C API within your driver that is actually generic
>      for every driver, and hence shouldn't be there.
> (2) Your driver specific code is full of raw pointers and `unsafe {}` calls,
>      which in the end just means that you ended up baking the abstraction into
>      your driver specific code.
> (3) You ignore everything, put everything in a huge `unsafe {}` block and
>      compile C code with the Rust compiler. (Admittedly, maybe slightly
>      overstated, but not that far off either.)
> 
> The latter is also the reason why it doesn't make sense to only have
> abstractions for some things, but not for other.
> 
> If an abstraction for B is based on A, but we don't start with A, then B ends up
> implementing (at least partially) the abstraction for A as well. For instance,
> if we don't implement `driver::Registration` then the PCI abstractions (and
> platform, usb, etc.) have to implement it.
> 
> It really comes down to the point that it just bubbles up. We really have to do
> this bottom-up, otherwise we just end up moving those abstractions up, layer by
> layer, where they don't belong to and we re-implement them over and over again.
> 
>>
>>> Consequently, we have to pass things through. This also isn't new, please note
>>> that the current code already does the same thing: `Module::init` (without this
>>> patch) is already declared as
>>>
>>> `fn init(module: &'static ThisModule) -> error::Result<Self>`
>>> passing through `ThisModule` for the exact same reason.
>>
>> Yeah, and I never quite understood that either :)
> 
> Since commit 247b365dc8dc ("rust: add `kernel` crate") shows me your RB for
> that, am I good to assume that this one isn't a blocker?
> 
>>
>>> Please also note that in the most common case (one driver per module) we don't
>>> see any of this anyway.
>>
>> True, but someone has to review and most importantly, maintain, this
>> glue code.
>>
>>> Just like the C macro module_pci_driver(), Rust drivers can use the
>>> `module_pci_driver!` macro.
>>>
>>> Example from Nova:
>>>
>>> ```
>>>      kernel::module_pci_driver! {
>>>          // The driver type that implements the corresponding probe() and
>>>          // remove() driver callbacks.
>>>          type: NovaDriver,
>>>          name: "Nova",
>>>          author: "Danilo Krummrich",
>>>          description: "Nova GPU driver",
>>>          license: "GPL v2",
>>>      }
>>> ```
>>
>> As I said when you implemented this fun macro, don't do this yet.
>>
>> We added these "helper" macros WAY late in the development cycle of the
>> driver model, AFTER we were sure we got other parts right.  There is no
>> need to attempt to implement all of what you can do in C today in Rust,
>> in fact, I would argue that we don't want to do that, just to make
>> things simpler to review the glue code, which is the most important part
>> here to get right!
> 
> We're not reinventing the wheel here, we stick to the architecture the kernel
> already has.
> 
> However, I understand that not starting with this macro directly makes it easier
> for you to see what's going on. I can introduce the macro in a separate patch to
> make it more obvious what's going on.
> 
>>
>> Changing drivers later, to take advantage of code savings like this
>> macro can be done then, not just yet.  Let's get this understood and
>> right first, no need to be tricky or complex.
>>
>> And, as I hinted at before, I don't think we should be doing this at all
>> just yet either.  I still think a small "C shim" layer to wrap the
>> struct pci driver up and pass the calls to the rust portion of the
>> module is better to start with, it both saves you time and energy so
>> that you can work on what you really want to do (i.e. a driver in rust)
>> and not have to worry about the c bindings as that's the "tricky" part
>> that is stopping you from getting your driver work done.
> 
> I strongly disagree here. As explained above, it just bubbles up, this approach
> would just make me end up having to write a lot of the code from the
> abstractions in the driver.
> 
> However, it would indeed safe me time and energy to do just that. But that isn't
> really what I want. I also don't want to write a driver in Rust *only*.
> 
> And I also don't really think that you want people, who commit to work hard to
> get things right, to just take the shortcut and create some random mess buried
> in a driver. :)
> 
> What I actually want is to get to a solid foundation for Rust drivers in
> general, since I'm convinced that this provides a lot of value beyond the scope
> of a single driver.
> 
> Since you've brought the topic up a few times, I am also willing to maintain
> those abstractions if this is a concern. Maybe one day the corresponding
> maintainers are comfortable enough and this isn't needed anymore, but at least
> until then, I'm happy to help out.
> 
>>
>> After all, it's not the pci driver model code that is usually the tricky
>> bits to verify in C, it's the whole rest of the mess.  Stick with a
>> small C file, with just the pci driver structure and device ids, and
>> then instantiate your rust stuff when probe() is called, and clean up
>> when release() is called.
> 
> Again, this really bubbles up.
> 
> What do we pass to Rust probe() function? A raw struct pci_dev pointer or the
> proper PCI device abstraction?
> 
> How do we implement I/O memory mappings for PCI bars without PCI / I/O
> abstraction?
> 
> How do we perform (boundary checked) I/O memory reads / writes without `Io`
> abstraction?
> 
> How do we handle the lifetime of resources without `Devres` abstraction?
> 
> How do we (properly) implement a DRM device registration abstraction without
> `Devres`?
> 
> Luckily we already got the `Firmware` and `Device` abstraction in place. :)
> 
>>
>> I can provide an example if needed.
> 
> If you do so, please don't stop at the probe() boundary, please continue to the
> point where the Nova stub driver is. It really does just enough to show / proove
> that the abstractions tie together nicely. But it should be enough to see that
> you end up with either (1), (2) or (3) from above.
> 
>>
>> thanks,
>>
>> greg k-h
>>
Greg KH June 27, 2024, 7:33 a.m. UTC | #6
On Wed, Jun 26, 2024 at 12:29:01PM +0200, Danilo Krummrich wrote:
> Hi Greg,
> 
> On 6/20/24 23:24, Danilo Krummrich wrote:
> 
> This is a polite reminder about the below discussion.

I know, it's on my TODO list, but I'm currently traveling for a
conference and will not have time until next week to even consider
looking at this...

greg k-h
Danilo Krummrich June 27, 2024, 7:41 a.m. UTC | #7
On Thu, Jun 27, 2024 at 09:33:39AM +0200, Greg KH wrote:
> On Wed, Jun 26, 2024 at 12:29:01PM +0200, Danilo Krummrich wrote:
> > Hi Greg,
> > 
> > On 6/20/24 23:24, Danilo Krummrich wrote:
> > 
> > This is a polite reminder about the below discussion.
> 
> I know, it's on my TODO list, but I'm currently traveling for a
> conference and will not have time until next week to even consider
> looking at this...

Thanks, Greg, for letting me know. Let's continue this discussion next week
then.

I may send replies on all other mails of this series in order to not leave them
unreplied for too long, but as mentioned, let's get this thread discussed first
once you're able to get back to it.

Have a good trip!

- Danilo

> 
> greg k-h
>
Danilo Krummrich July 9, 2024, 10:15 a.m. UTC | #8
On Thu, Jun 20, 2024 at 11:24:17PM +0200, Danilo Krummrich wrote:

Just another kind reminder on this one. :)

> On Thu, Jun 20, 2024 at 06:36:08PM +0200, Greg KH wrote:
> > On Thu, Jun 20, 2024 at 06:10:05PM +0200, Danilo Krummrich wrote:
> > > On Thu, Jun 20, 2024 at 04:19:48PM +0200, Greg KH wrote:
> > > > On Wed, Jun 19, 2024 at 01:39:47AM +0200, Danilo Krummrich wrote:
> > > > > In a subsequent patch we introduce the `Registration` abstraction used
> > > > > to register driver structures. Some subsystems require the module name on
> > > > > driver registration (e.g. PCI in __pci_register_driver()), hence pass
> > > > > the module name to `Module::init`.
> > > > 
> > > > I understand the need/want here, but it feels odd that you have to
> > > > change anything to do it.
> > > > 
> > > > > 
> > > > > Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> > > > > ---
> > > > >  rust/kernel/lib.rs           | 14 ++++++++++----
> > > > >  rust/kernel/net/phy.rs       |  2 +-
> > > > >  rust/macros/module.rs        |  3 ++-
> > > > >  samples/rust/rust_minimal.rs |  2 +-
> > > > >  samples/rust/rust_print.rs   |  2 +-
> > > > >  5 files changed, 15 insertions(+), 8 deletions(-)
> > > > > 
> > > > > diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> > > > > index a791702b4fee..5af00e072a58 100644
> > > > > --- a/rust/kernel/lib.rs
> > > > > +++ b/rust/kernel/lib.rs
> > > > > @@ -71,7 +71,7 @@ pub trait Module: Sized + Sync + Send {
> > > > >      /// should do.
> > > > >      ///
> > > > >      /// Equivalent to the `module_init` macro in the C API.
> > > > > -    fn init(module: &'static ThisModule) -> error::Result<Self>;
> > > > > +    fn init(name: &'static str::CStr, module: &'static ThisModule) -> error::Result<Self>;
> > > > 
> > > > Why can't the name come directly from the build system?  Why must it be
> > > > passed into the init function of the module "class"?  What is it going
> > > > to do with it?
> > > 
> > > The name does come from the build system, that's where `Module::init` gets it
> > > from.
> > > 
> > > > 
> > > > A PCI, or other bus, driver "knows" it's name already by virtue of the
> > > > build system, so it can pass that string into whatever function needs
> > > 
> > > Let's take pci_register_driver() as example.
> > > 
> > > ```
> > > #define pci_register_driver(driver)		\
> > > 	__pci_register_driver(driver, THIS_MODULE, KBUILD_MODNAME)
> > > ```
> > > 
> > > In C drivers this works because (1) it's a macro and (2) it's called directly
> > > from the driver code.
> > > 
> > > In Rust, for very good reasons, we have abstractions for C APIs, hence the
> > > actual call to __pci_register_driver() does not come from code within the
> > > module, but from the PCI Rust abstraction `Module::init` calls into instead.
> > 
> > I don't understand those reasons, sorry.
> 
> Ok, good you point this out. We should definitely discuss this point then and
> build some consensus around it.
> 
> I propose to focus on this point first and follow up with the discussion of the
> rest of the series afterwards.
> 
> Let me explain why I am convinced that it's very important to have abstractions
> in place in general and from the get-go.
> 
> In general, having abstractions for C APIs is the foundation of being able to
> make use of a lot of advantages Rust has to offer.
> 
> The most obvious one are all the safety aspects. For instance, with an
> abstraction we have to get exactly one piece of code right in terms of pointer
> validity, lifetimes, type safety, API semantics, etc. and in all other places
> (e.g. drivers) we get the compiler to check those things for us through the
> abstraction.
> 
> Now, the abstraction can be buggy or insufficient and hence there is no absolute
> safety guarantee. *But*, if we get this one thing right, there is nothing a
> driver can mess up by itself trying to do stupid things anymore.
> 
> If we just call the C code instead we have to get it right everywhere instead.
> 
> Now, you could approach this top-down instead and argue that we could at least
> benefit from Rust for the driver specific parts.
> 
> Unfortunately, this doesn't really work out either. Also driver specific code is
> typically (very) closely connected to kernel APIs. If you want to use the safety
> aspects of Rust for the driver specific parts you inevitably end up writing
> abstractions for the C APIs in your driver.
> 
> There are basically three options you can end up with:
> 
> (1) An abstraction for the C API within your driver that is actually generic
>     for every driver, and hence shouldn't be there.
> (2) Your driver specific code is full of raw pointers and `unsafe {}` calls,
>     which in the end just means that you ended up baking the abstraction into
>     your driver specific code.
> (3) You ignore everything, put everything in a huge `unsafe {}` block and
>     compile C code with the Rust compiler. (Admittedly, maybe slightly
>     overstated, but not that far off either.)
> 
> The latter is also the reason why it doesn't make sense to only have
> abstractions for some things, but not for other.
> 
> If an abstraction for B is based on A, but we don't start with A, then B ends up
> implementing (at least partially) the abstraction for A as well. For instance,
> if we don't implement `driver::Registration` then the PCI abstractions (and
> platform, usb, etc.) have to implement it.
> 
> It really comes down to the point that it just bubbles up. We really have to do
> this bottom-up, otherwise we just end up moving those abstractions up, layer by
> layer, where they don't belong to and we re-implement them over and over again.
> 
> > 
> > > Consequently, we have to pass things through. This also isn't new, please note
> > > that the current code already does the same thing: `Module::init` (without this
> > > patch) is already declared as
> > > 
> > > `fn init(module: &'static ThisModule) -> error::Result<Self>`
> > > passing through `ThisModule` for the exact same reason.
> > 
> > Yeah, and I never quite understood that either :)
> 
> Since commit 247b365dc8dc ("rust: add `kernel` crate") shows me your RB for
> that, am I good to assume that this one isn't a blocker?
> 
> > 
> > > Please also note that in the most common case (one driver per module) we don't
> > > see any of this anyway.
> > 
> > True, but someone has to review and most importantly, maintain, this
> > glue code.
> > 
> > > Just like the C macro module_pci_driver(), Rust drivers can use the
> > > `module_pci_driver!` macro.
> > > 
> > > Example from Nova:
> > > 
> > > ```
> > >     kernel::module_pci_driver! {
> > >         // The driver type that implements the corresponding probe() and
> > >         // remove() driver callbacks.
> > >         type: NovaDriver,
> > >         name: "Nova",
> > >         author: "Danilo Krummrich",
> > >         description: "Nova GPU driver",
> > >         license: "GPL v2",
> > >     }
> > > ```
> > 
> > As I said when you implemented this fun macro, don't do this yet.
> > 
> > We added these "helper" macros WAY late in the development cycle of the
> > driver model, AFTER we were sure we got other parts right.  There is no
> > need to attempt to implement all of what you can do in C today in Rust,
> > in fact, I would argue that we don't want to do that, just to make
> > things simpler to review the glue code, which is the most important part
> > here to get right!
> 
> We're not reinventing the wheel here, we stick to the architecture the kernel
> already has.
> 
> However, I understand that not starting with this macro directly makes it easier
> for you to see what's going on. I can introduce the macro in a separate patch to
> make it more obvious what's going on.
> 
> > 
> > Changing drivers later, to take advantage of code savings like this
> > macro can be done then, not just yet.  Let's get this understood and
> > right first, no need to be tricky or complex.
> > 
> > And, as I hinted at before, I don't think we should be doing this at all
> > just yet either.  I still think a small "C shim" layer to wrap the
> > struct pci driver up and pass the calls to the rust portion of the
> > module is better to start with, it both saves you time and energy so
> > that you can work on what you really want to do (i.e. a driver in rust)
> > and not have to worry about the c bindings as that's the "tricky" part
> > that is stopping you from getting your driver work done.
> 
> I strongly disagree here. As explained above, it just bubbles up, this approach
> would just make me end up having to write a lot of the code from the
> abstractions in the driver.
> 
> However, it would indeed safe me time and energy to do just that. But that isn't
> really what I want. I also don't want to write a driver in Rust *only*.
> 
> And I also don't really think that you want people, who commit to work hard to
> get things right, to just take the shortcut and create some random mess buried
> in a driver. :)
> 
> What I actually want is to get to a solid foundation for Rust drivers in
> general, since I'm convinced that this provides a lot of value beyond the scope
> of a single driver.
> 
> Since you've brought the topic up a few times, I am also willing to maintain
> those abstractions if this is a concern. Maybe one day the corresponding
> maintainers are comfortable enough and this isn't needed anymore, but at least
> until then, I'm happy to help out.
> 
> > 
> > After all, it's not the pci driver model code that is usually the tricky
> > bits to verify in C, it's the whole rest of the mess.  Stick with a
> > small C file, with just the pci driver structure and device ids, and
> > then instantiate your rust stuff when probe() is called, and clean up
> > when release() is called.
> 
> Again, this really bubbles up.
> 
> What do we pass to Rust probe() function? A raw struct pci_dev pointer or the
> proper PCI device abstraction?
> 
> How do we implement I/O memory mappings for PCI bars without PCI / I/O
> abstraction?
> 
> How do we perform (boundary checked) I/O memory reads / writes without `Io`
> abstraction?
> 
> How do we handle the lifetime of resources without `Devres` abstraction?
> 
> How do we (properly) implement a DRM device registration abstraction without
> `Devres`?
> 
> Luckily we already got the `Firmware` and `Device` abstraction in place. :)
> 
> > 
> > I can provide an example if needed.
> 
> If you do so, please don't stop at the probe() boundary, please continue to the
> point where the Nova stub driver is. It really does just enough to show / proove
> that the abstractions tie together nicely. But it should be enough to see that
> you end up with either (1), (2) or (3) from above.
> 
> > 
> > thanks,
> > 
> > greg k-h
> >
Greg KH July 10, 2024, 2:02 p.m. UTC | #9
On Thu, Jun 20, 2024 at 11:24:17PM +0200, Danilo Krummrich wrote:
> On Thu, Jun 20, 2024 at 06:36:08PM +0200, Greg KH wrote:
> > On Thu, Jun 20, 2024 at 06:10:05PM +0200, Danilo Krummrich wrote:
> > > On Thu, Jun 20, 2024 at 04:19:48PM +0200, Greg KH wrote:
> > > > On Wed, Jun 19, 2024 at 01:39:47AM +0200, Danilo Krummrich wrote:
> > > > > In a subsequent patch we introduce the `Registration` abstraction used
> > > > > to register driver structures. Some subsystems require the module name on
> > > > > driver registration (e.g. PCI in __pci_register_driver()), hence pass
> > > > > the module name to `Module::init`.
> > > > 
> > > > I understand the need/want here, but it feels odd that you have to
> > > > change anything to do it.
> > > > 
> > > > > 
> > > > > Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> > > > > ---
> > > > >  rust/kernel/lib.rs           | 14 ++++++++++----
> > > > >  rust/kernel/net/phy.rs       |  2 +-
> > > > >  rust/macros/module.rs        |  3 ++-
> > > > >  samples/rust/rust_minimal.rs |  2 +-
> > > > >  samples/rust/rust_print.rs   |  2 +-
> > > > >  5 files changed, 15 insertions(+), 8 deletions(-)
> > > > > 
> > > > > diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> > > > > index a791702b4fee..5af00e072a58 100644
> > > > > --- a/rust/kernel/lib.rs
> > > > > +++ b/rust/kernel/lib.rs
> > > > > @@ -71,7 +71,7 @@ pub trait Module: Sized + Sync + Send {
> > > > >      /// should do.
> > > > >      ///
> > > > >      /// Equivalent to the `module_init` macro in the C API.
> > > > > -    fn init(module: &'static ThisModule) -> error::Result<Self>;
> > > > > +    fn init(name: &'static str::CStr, module: &'static ThisModule) -> error::Result<Self>;
> > > > 
> > > > Why can't the name come directly from the build system?  Why must it be
> > > > passed into the init function of the module "class"?  What is it going
> > > > to do with it?
> > > 
> > > The name does come from the build system, that's where `Module::init` gets it
> > > from.
> > > 
> > > > 
> > > > A PCI, or other bus, driver "knows" it's name already by virtue of the
> > > > build system, so it can pass that string into whatever function needs
> > > 
> > > Let's take pci_register_driver() as example.
> > > 
> > > ```
> > > #define pci_register_driver(driver)		\
> > > 	__pci_register_driver(driver, THIS_MODULE, KBUILD_MODNAME)
> > > ```
> > > 
> > > In C drivers this works because (1) it's a macro and (2) it's called directly
> > > from the driver code.
> > > 
> > > In Rust, for very good reasons, we have abstractions for C APIs, hence the
> > > actual call to __pci_register_driver() does not come from code within the
> > > module, but from the PCI Rust abstraction `Module::init` calls into instead.
> > 
> > I don't understand those reasons, sorry.
> 
> Ok, good you point this out. We should definitely discuss this point then and
> build some consensus around it.
> 
> I propose to focus on this point first and follow up with the discussion of the
> rest of the series afterwards.
> 
> Let me explain why I am convinced that it's very important to have abstractions
> in place in general and from the get-go.
> 
> In general, having abstractions for C APIs is the foundation of being able to
> make use of a lot of advantages Rust has to offer.
> 
> The most obvious one are all the safety aspects. For instance, with an
> abstraction we have to get exactly one piece of code right in terms of pointer
> validity, lifetimes, type safety, API semantics, etc. and in all other places
> (e.g. drivers) we get the compiler to check those things for us through the
> abstraction.
> 
> Now, the abstraction can be buggy or insufficient and hence there is no absolute
> safety guarantee. *But*, if we get this one thing right, there is nothing a
> driver can mess up by itself trying to do stupid things anymore.
> 
> If we just call the C code instead we have to get it right everywhere instead.

I too want a pony.  But unfortunatly you are shaving a yak here instead :)

I'm not saying to call C code from rust, I'm saying call rust code from
C.

Have a "normal" pci_driver structure with C functions that THEN call
into the rust code that you want to implement them, with a pointer to
the proper structure.  That gives you everything that you really want
here, EXCEPT you don't have to build the whole tower of drivers and
busses and the like.

> Now, you could approach this top-down instead and argue that we could at least
> benefit from Rust for the driver specific parts.
> 
> Unfortunately, this doesn't really work out either. Also driver specific code is
> typically (very) closely connected to kernel APIs. If you want to use the safety
> aspects of Rust for the driver specific parts you inevitably end up writing
> abstractions for the C APIs in your driver.
> 
> There are basically three options you can end up with:
> 
> (1) An abstraction for the C API within your driver that is actually generic
>     for every driver, and hence shouldn't be there.
> (2) Your driver specific code is full of raw pointers and `unsafe {}` calls,
>     which in the end just means that you ended up baking the abstraction into
>     your driver specific code.
> (3) You ignore everything, put everything in a huge `unsafe {}` block and
>     compile C code with the Rust compiler. (Admittedly, maybe slightly
>     overstated, but not that far off either.)
> 
> The latter is also the reason why it doesn't make sense to only have
> abstractions for some things, but not for other.
> 
> If an abstraction for B is based on A, but we don't start with A, then B ends up
> implementing (at least partially) the abstraction for A as well. For instance,
> if we don't implement `driver::Registration` then the PCI abstractions (and
> platform, usb, etc.) have to implement it.
> 
> It really comes down to the point that it just bubbles up. We really have to do
> this bottom-up, otherwise we just end up moving those abstractions up, layer by
> layer, where they don't belong to and we re-implement them over and over again.

I think we are talking past each other.

Here is an example .c file that you can use today for your "implement a
PCI driver in rust" wish in a mere 34 lines of .c code:

------------------------------------------------------------------------
// SPDX-License-Identifier: GPL-2.0-only
#include <linux/module.h>
#include <linux/pci.h>
#include "my_pci_rust_bindings.h"

#define PCI_DEVICE_ID_FOO		0x0f00
#define PCI_VENDOR_ID_FOO		0x0f00

static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
{
	return my_rust_probe(pdev);
}

static void remove(struct pci_dev *pdev)
{
	my_rust_remove(pdev);
}

static const struct pci_device_id pci_ids[] = {
	{PCI_DEVICE(PCI_VENDOR_ID_FOO, PCI_DEVICE_ID_FOO)},
	{}
};
MODULE_DEVICE_TABLE(pci, pci_ids);

static struct pci_driver my_rust_pci_driver = {
	.name = "my_pci_rust_driver",
	.id_table = pci_ids,
	.probe = probe,
	.remove = remove,
};
module_pci_driver(my_rust_pci_driver);

MODULE_DESCRIPTION("Driver for my fancy PCI device");
MODULE_LICENSE("GPL v2");
------------------------------------------------------------------------

Now, all you have to do is provide a my_rust_probe() and
my_rust_remove() call in your rust code, and handle the conversion of a
struct pci_dev to whatever you want to use (which you had to do anyway),
and you are set!

That .c code above is "obviously" correct, and much simpler and worlds
easier for all of us to understand instead of the huge .rs files that
this patch series was attempting to implement.

You have to draw the c/rust line somewhere, you all seem to be wanting
to draw that line at "no .c in my module at all!" and I'm saying "put 34
lines of .c in your module and you will save a LOT of headache now."

All of the "real" work you want to do for your driver is behind the
probe/remove callbacks (ok, add a suspend/resume as well, forgot that),
and stop worrying about all of the bindings and other mess to tie into
the driver model for a specific bus type and the lunacy that the device
id mess was.

In short, KEEP IT SIMPLE!

Then, after we are comfortable with stuff like the above, slowly think
about moving the line back a bit more, if you really even need to.  But
why would you need to?  Only reason I can think of is if you wanted to
write an entire bus in rust, and for right now, I would strongly suggest
anyone not attempt that.

The "joy" of writing a driver in rust is that a driver consumes from
EVERYWHERE in the kernel (as you well know.)  Having to write bindings
and mappings for EVERYTHING all at once, when all you really want to do
is implement the logic between probe/remove is rough, I'm sorry.  I'm
suggesting that you stop at the above line for now, which should make
your life easier as the .c code is obviously correct, and anything you
do in the rust side is your problem, not mine as a bus maintainer :)

> > Changing drivers later, to take advantage of code savings like this
> > macro can be done then, not just yet.  Let's get this understood and
> > right first, no need to be tricky or complex.
> > 
> > And, as I hinted at before, I don't think we should be doing this at all
> > just yet either.  I still think a small "C shim" layer to wrap the
> > struct pci driver up and pass the calls to the rust portion of the
> > module is better to start with, it both saves you time and energy so
> > that you can work on what you really want to do (i.e. a driver in rust)
> > and not have to worry about the c bindings as that's the "tricky" part
> > that is stopping you from getting your driver work done.
> 
> I strongly disagree here. As explained above, it just bubbles up, this approach
> would just make me end up having to write a lot of the code from the
> abstractions in the driver.

See above, you cut out all of the .rs files you submitted in this series
and can probably just add one to map a pci_device and away you go.

> What I actually want is to get to a solid foundation for Rust drivers in
> general, since I'm convinced that this provides a lot of value beyond the scope
> of a single driver.

I don't see what not needing the above 35+ lines of .c code provides a
value for at all, EXCEPT we now need to maintain a lot more code
overall.

> What do we pass to Rust probe() function? A raw struct pci_dev pointer or the
> proper PCI device abstraction?

See above.

> How do we implement I/O memory mappings for PCI bars without PCI / I/O
> abstraction?

Call the .c functions you have today.

> How do we perform (boundary checked) I/O memory reads / writes without `Io`
> abstraction?

Do you really need/want that?  If so, ok, but I think you are going to
run into big issues with dynamic ranges that I didn't see answered on
those patches, but I could be totally wrong.

Again, one step at a time please, not the whole yak.

> How do we handle the lifetime of resources without `Devres` abstraction?

We have this question on the .c side as well.  Please work with the
people who are working to solve that there as it hits everyone.

> How do we (properly) implement a DRM device registration abstraction without
> `Devres`?

No idea, drm is huge :)

That being said, I think you can implement it above as the lifetime of
your device lives between probe and remove.  After remove is gone, all
of your resources HAVE to be freed, so do that in your driver.

> Luckily we already got the `Firmware` and `Device` abstraction in place. :)

Baby steps :)

thanks,

greg k-h
Danilo Krummrich July 11, 2024, 2:06 a.m. UTC | #10
On Wed, Jul 10, 2024 at 04:02:04PM +0200, Greg KH wrote:
> On Thu, Jun 20, 2024 at 11:24:17PM +0200, Danilo Krummrich wrote:
> > Ok, good you point this out. We should definitely discuss this point then and
> > build some consensus around it.
> > 
> > I propose to focus on this point first and follow up with the discussion of the
> > rest of the series afterwards.
> > 
> > Let me explain why I am convinced that it's very important to have abstractions
> > in place in general and from the get-go.
> > 
> > In general, having abstractions for C APIs is the foundation of being able to
> > make use of a lot of advantages Rust has to offer.
> > 
> > The most obvious one are all the safety aspects. For instance, with an
> > abstraction we have to get exactly one piece of code right in terms of pointer
> > validity, lifetimes, type safety, API semantics, etc. and in all other places
> > (e.g. drivers) we get the compiler to check those things for us through the
> > abstraction.
> > 
> > Now, the abstraction can be buggy or insufficient and hence there is no absolute
> > safety guarantee. *But*, if we get this one thing right, there is nothing a
> > driver can mess up by itself trying to do stupid things anymore.
> > 
> > If we just call the C code instead we have to get it right everywhere instead.
> 
> I too want a pony.  But unfortunatly you are shaving a yak here instead :)
> 
> I'm not saying to call C code from rust, I'm saying call rust code from
> C.
> 
> Have a "normal" pci_driver structure with C functions that THEN call
> into the rust code that you want to implement them, with a pointer to
> the proper structure.  That gives you everything that you really want
> here, EXCEPT you don't have to build the whole tower of drivers and
> busses and the like.

Please find my reply below where you expand this point.

> 
> > Now, you could approach this top-down instead and argue that we could at least
> > benefit from Rust for the driver specific parts.
> > 
> > Unfortunately, this doesn't really work out either. Also driver specific code is
> > typically (very) closely connected to kernel APIs. If you want to use the safety
> > aspects of Rust for the driver specific parts you inevitably end up writing
> > abstractions for the C APIs in your driver.
> > 
> > There are basically three options you can end up with:
> > 
> > (1) An abstraction for the C API within your driver that is actually generic
> >     for every driver, and hence shouldn't be there.
> > (2) Your driver specific code is full of raw pointers and `unsafe {}` calls,
> >     which in the end just means that you ended up baking the abstraction into
> >     your driver specific code.
> > (3) You ignore everything, put everything in a huge `unsafe {}` block and
> >     compile C code with the Rust compiler. (Admittedly, maybe slightly
> >     overstated, but not that far off either.)
> > 
> > The latter is also the reason why it doesn't make sense to only have
> > abstractions for some things, but not for other.
> > 
> > If an abstraction for B is based on A, but we don't start with A, then B ends up
> > implementing (at least partially) the abstraction for A as well. For instance,
> > if we don't implement `driver::Registration` then the PCI abstractions (and
> > platform, usb, etc.) have to implement it.
> > 
> > It really comes down to the point that it just bubbles up. We really have to do
> > this bottom-up, otherwise we just end up moving those abstractions up, layer by
> > layer, where they don't belong to and we re-implement them over and over again.
> 
> I think we are talking past each other.

Given your proposal below, that is correct.

I read your previous mail as if you question having abstractions for C APIs at
all. And some parts of your latest reply still read like that to me, but for the
rest of my reply I will assume that's not the case.

> 
> Here is an example .c file that you can use today for your "implement a
> PCI driver in rust" wish in a mere 34 lines of .c code:

Your proposal below actually only discards a rather small amount of the proposed
abstractions, in particular:

  (1) the code to create a struct pci_driver and the code to call
      pci_register_driver() and pci_unregister_driver() (implemented in Patch 2)
  (2) the generic code to create the struct pci_device_id table (or any other ID
      table) (implemented in Patch 3)

As I understand you, the concern here really seems to be about the complexity
vs. what we get from it and that someone has to maintain this.

I got the point and I take it seriously. As already mentioned, I'm also willing
to take responsibility for the code and offer to maintain it.

But back to the technical details.

Let's start with (2):

I agree that this seems a bit complicated. I'd propose to remove the
abstractions in device_id.rs (i.e. Patch 3) for now and let the PCI abstraction
simply create a struct pci_device_id table directly.

As for (1):

This just makes a pretty small part of the abstractions and, it's really only
about creating the struct pci_driver (or another driver type) instance and
call the corresponding register() / unregister() functions, since we already
have the Rust module support upstream.

As in C, we really just call register() from module_init() and unregister() from
module_exit() and the code for that, without comments, is around 50 lines of
code.

As mentioned this is implemented in Patch 2; Hence, please see my reply on Patch
2, where I put a rather long and detailed explanation which hopefully clarifies
things.

Now, what do we get from that compared to the proposal below?

  - The driver's probe() and remove() functions get called with the
    corresponding abstraction types directly instead of raw pointers; this moves
    the required  `unsafe {}` blocks to a cental place which otherwise *every*
    driver would need to implement itself (obviously it'd be the same for future
    suspend, resume, shutdown, etc. callbacks).

  - More complex drivers can do the work required to be done in module_init()
    and module_exit() in Rust directly, which allows them to attach the lifetime
    of structures to the lifetime of the `Module` structure in Rust which avoids
    the need for explicit cleanup in module_exit() since they can just implement
    in Rust's drop() trait.

The latter may sound a bit less important than it actually is, since it can break
the design, safety and soundness of Rust types. Let me explain:

In Rust a type instance should cleanup after itself when it goes out of scope.

Let's make up an example:

```
// DISCLAIMER: Obviously, this is not how we actually handle memory allocations
// in Rust, it's just an example.
struct Buffer {
	ptr: *const u8,
}

impl Buffer {
	fn alloc(size: u8) -> Result<Self> {
		let ptr = Kmalloc::alloc(size, GFP_KERNEL)?;

		// Return an instance of `Buffer` initialized with a pointer to
		// the above memory allocation.
		Ok(Self {
			ptr,
		})
	}
}

impl Drop for Buffer {
	fn drop(&mut self) {
		// SAFETY: `Self` is always holding a pointer to valid memory
		// allocated with `Kmalloc`.
		unsafe { Kmalloc::free(self.ptr) };
	}
}
```

(Side note: `Kmalloc::free` is unsafe since it has no control on whether the
pointer passed to it is valid and was allocated with `Kmalloc` previously.)

In Rust's module_init() you could now attach an instance of `Buffer` to the
`Module` instance, which lives until module_exit(), which looks like this:

```
struct MyModule {
	buffer: Buffer,
}

impl kernel::Moduke for MyModule {
	fn init(module: &'static ThisModule) -> Result<Self> {
		Ok(MyModule {
			buffer: Buffer::alloc(0x100)?,
		})
	}
}
```

Note that we don't need to implement module_exit() here, since `MyModule` lives
until module_exit() and hence `buffer` has the same lifetime. Once `buffer` goes
out of scope it frees itself due to the drop() trait. In fact, buffer can never
leak it's memory.

With the proposed approach below we can't do this, we'd need to play dirty and
unsafe tricks in order to not entirely break the design of `Buffer`, or any
other more complex data structure that the module needs.

> 
> ------------------------------------------------------------------------
> // SPDX-License-Identifier: GPL-2.0-only
> #include <linux/module.h>
> #include <linux/pci.h>
> #include "my_pci_rust_bindings.h"
> 
> #define PCI_DEVICE_ID_FOO		0x0f00
> #define PCI_VENDOR_ID_FOO		0x0f00
> 
> static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> {
> 	return my_rust_probe(pdev);
> }
> 
> static void remove(struct pci_dev *pdev)
> {
> 	my_rust_remove(pdev);
> }
> 
> static const struct pci_device_id pci_ids[] = {
> 	{PCI_DEVICE(PCI_VENDOR_ID_FOO, PCI_DEVICE_ID_FOO)},
> 	{}
> };
> MODULE_DEVICE_TABLE(pci, pci_ids);
> 
> static struct pci_driver my_rust_pci_driver = {
> 	.name = "my_pci_rust_driver",
> 	.id_table = pci_ids,
> 	.probe = probe,
> 	.remove = remove,
> };
> module_pci_driver(my_rust_pci_driver);
> 
> MODULE_DESCRIPTION("Driver for my fancy PCI device");
> MODULE_LICENSE("GPL v2");
> ------------------------------------------------------------------------
> 
> Now, all you have to do is provide a my_rust_probe() and
> my_rust_remove() call in your rust code, and handle the conversion of a
> struct pci_dev to whatever you want to use (which you had to do anyway),
> and you are set!
> 
> That .c code above is "obviously" correct, and much simpler and worlds
> easier for all of us to understand instead of the huge .rs files that
> this patch series was attempting to implement.
> 
> You have to draw the c/rust line somewhere, you all seem to be wanting
> to draw that line at "no .c in my module at all!" and I'm saying "put 34
> lines of .c in your module and you will save a LOT of headache now."
> 
> All of the "real" work you want to do for your driver is behind the
> probe/remove callbacks (ok, add a suspend/resume as well, forgot that),
> and stop worrying about all of the bindings and other mess to tie into
> the driver model for a specific bus type and the lunacy that the device
> id mess was.
> 
> In short, KEEP IT SIMPLE!
> 
> Then, after we are comfortable with stuff like the above, slowly think
> about moving the line back a bit more, if you really even need to.  But
> why would you need to?  Only reason I can think of is if you wanted to
> write an entire bus in rust, and for right now, I would strongly suggest
> anyone not attempt that.

We're not trying to do that. Also, please note that all the abstractions are
*only* making use of APIs that C drivers use directly, just abstracting them
in a way, such that we can actually use the strength of Rust.

> 
> The "joy" of writing a driver in rust is that a driver consumes from
> EVERYWHERE in the kernel (as you well know.)  Having to write bindings
> and mappings for EVERYTHING all at once, when all you really want to do
> is implement the logic between probe/remove is rough, I'm sorry.  I'm
> suggesting that you stop at the above line for now, which should make
> your life easier as the .c code is obviously correct, and anything you
> do in the rust side is your problem, not mine as a bus maintainer :)

As explained above, we already have the module abstraction in place. Really all
that we're left with is creating the struct pci_driver and call register() /
remove() from it.

Now, this could be fully done in the PCI abstraction. The only reason we have
the `Registration` and `DriverOps` (again there are a few misunderstandings
around that, that I try to clarify in the corresponding thread) in "driver.rs"
is to not duplicate code for maintainability.

What I'm trying to say, the stuff in "driver.rs" is not even abstracting things
from drivers/base/, but is Rust helper code for subsystems to abstract their
stuff (e.g. struct pci_driver).

And again, I'm willing to take responsibility for the code and offer to maintain
it - I really want to do this the proper way.

- Danilo

(cutting of the rest since it was based on the misunderstanding mentioned above)
Danilo Krummrich July 22, 2024, 11:23 a.m. UTC | #11
On Thu, Jul 11, 2024 at 04:06:25AM +0200, Danilo Krummrich wrote:

Gentle reminder.

> On Wed, Jul 10, 2024 at 04:02:04PM +0200, Greg KH wrote:
> > On Thu, Jun 20, 2024 at 11:24:17PM +0200, Danilo Krummrich wrote:
> > > Ok, good you point this out. We should definitely discuss this point then and
> > > build some consensus around it.
> > > 
> > > I propose to focus on this point first and follow up with the discussion of the
> > > rest of the series afterwards.
> > > 
> > > Let me explain why I am convinced that it's very important to have abstractions
> > > in place in general and from the get-go.
> > > 
> > > In general, having abstractions for C APIs is the foundation of being able to
> > > make use of a lot of advantages Rust has to offer.
> > > 
> > > The most obvious one are all the safety aspects. For instance, with an
> > > abstraction we have to get exactly one piece of code right in terms of pointer
> > > validity, lifetimes, type safety, API semantics, etc. and in all other places
> > > (e.g. drivers) we get the compiler to check those things for us through the
> > > abstraction.
> > > 
> > > Now, the abstraction can be buggy or insufficient and hence there is no absolute
> > > safety guarantee. *But*, if we get this one thing right, there is nothing a
> > > driver can mess up by itself trying to do stupid things anymore.
> > > 
> > > If we just call the C code instead we have to get it right everywhere instead.
> > 
> > I too want a pony.  But unfortunatly you are shaving a yak here instead :)
> > 
> > I'm not saying to call C code from rust, I'm saying call rust code from
> > C.
> > 
> > Have a "normal" pci_driver structure with C functions that THEN call
> > into the rust code that you want to implement them, with a pointer to
> > the proper structure.  That gives you everything that you really want
> > here, EXCEPT you don't have to build the whole tower of drivers and
> > busses and the like.
> 
> Please find my reply below where you expand this point.
> 
> > 
> > > Now, you could approach this top-down instead and argue that we could at least
> > > benefit from Rust for the driver specific parts.
> > > 
> > > Unfortunately, this doesn't really work out either. Also driver specific code is
> > > typically (very) closely connected to kernel APIs. If you want to use the safety
> > > aspects of Rust for the driver specific parts you inevitably end up writing
> > > abstractions for the C APIs in your driver.
> > > 
> > > There are basically three options you can end up with:
> > > 
> > > (1) An abstraction for the C API within your driver that is actually generic
> > >     for every driver, and hence shouldn't be there.
> > > (2) Your driver specific code is full of raw pointers and `unsafe {}` calls,
> > >     which in the end just means that you ended up baking the abstraction into
> > >     your driver specific code.
> > > (3) You ignore everything, put everything in a huge `unsafe {}` block and
> > >     compile C code with the Rust compiler. (Admittedly, maybe slightly
> > >     overstated, but not that far off either.)
> > > 
> > > The latter is also the reason why it doesn't make sense to only have
> > > abstractions for some things, but not for other.
> > > 
> > > If an abstraction for B is based on A, but we don't start with A, then B ends up
> > > implementing (at least partially) the abstraction for A as well. For instance,
> > > if we don't implement `driver::Registration` then the PCI abstractions (and
> > > platform, usb, etc.) have to implement it.
> > > 
> > > It really comes down to the point that it just bubbles up. We really have to do
> > > this bottom-up, otherwise we just end up moving those abstractions up, layer by
> > > layer, where they don't belong to and we re-implement them over and over again.
> > 
> > I think we are talking past each other.
> 
> Given your proposal below, that is correct.
> 
> I read your previous mail as if you question having abstractions for C APIs at
> all. And some parts of your latest reply still read like that to me, but for the
> rest of my reply I will assume that's not the case.
> 
> > 
> > Here is an example .c file that you can use today for your "implement a
> > PCI driver in rust" wish in a mere 34 lines of .c code:
> 
> Your proposal below actually only discards a rather small amount of the proposed
> abstractions, in particular:
> 
>   (1) the code to create a struct pci_driver and the code to call
>       pci_register_driver() and pci_unregister_driver() (implemented in Patch 2)
>   (2) the generic code to create the struct pci_device_id table (or any other ID
>       table) (implemented in Patch 3)
> 
> As I understand you, the concern here really seems to be about the complexity
> vs. what we get from it and that someone has to maintain this.
> 
> I got the point and I take it seriously. As already mentioned, I'm also willing
> to take responsibility for the code and offer to maintain it.
> 
> But back to the technical details.
> 
> Let's start with (2):
> 
> I agree that this seems a bit complicated. I'd propose to remove the
> abstractions in device_id.rs (i.e. Patch 3) for now and let the PCI abstraction
> simply create a struct pci_device_id table directly.
> 
> As for (1):
> 
> This just makes a pretty small part of the abstractions and, it's really only
> about creating the struct pci_driver (or another driver type) instance and
> call the corresponding register() / unregister() functions, since we already
> have the Rust module support upstream.
> 
> As in C, we really just call register() from module_init() and unregister() from
> module_exit() and the code for that, without comments, is around 50 lines of
> code.
> 
> As mentioned this is implemented in Patch 2; Hence, please see my reply on Patch
> 2, where I put a rather long and detailed explanation which hopefully clarifies
> things.
> 
> Now, what do we get from that compared to the proposal below?
> 
>   - The driver's probe() and remove() functions get called with the
>     corresponding abstraction types directly instead of raw pointers; this moves
>     the required  `unsafe {}` blocks to a cental place which otherwise *every*
>     driver would need to implement itself (obviously it'd be the same for future
>     suspend, resume, shutdown, etc. callbacks).
> 
>   - More complex drivers can do the work required to be done in module_init()
>     and module_exit() in Rust directly, which allows them to attach the lifetime
>     of structures to the lifetime of the `Module` structure in Rust which avoids
>     the need for explicit cleanup in module_exit() since they can just implement
>     in Rust's drop() trait.
> 
> The latter may sound a bit less important than it actually is, since it can break
> the design, safety and soundness of Rust types. Let me explain:
> 
> In Rust a type instance should cleanup after itself when it goes out of scope.
> 
> Let's make up an example:
> 
> ```
> // DISCLAIMER: Obviously, this is not how we actually handle memory allocations
> // in Rust, it's just an example.
> struct Buffer {
> 	ptr: *const u8,
> }
> 
> impl Buffer {
> 	fn alloc(size: u8) -> Result<Self> {
> 		let ptr = Kmalloc::alloc(size, GFP_KERNEL)?;
> 
> 		// Return an instance of `Buffer` initialized with a pointer to
> 		// the above memory allocation.
> 		Ok(Self {
> 			ptr,
> 		})
> 	}
> }
> 
> impl Drop for Buffer {
> 	fn drop(&mut self) {
> 		// SAFETY: `Self` is always holding a pointer to valid memory
> 		// allocated with `Kmalloc`.
> 		unsafe { Kmalloc::free(self.ptr) };
> 	}
> }
> ```
> 
> (Side note: `Kmalloc::free` is unsafe since it has no control on whether the
> pointer passed to it is valid and was allocated with `Kmalloc` previously.)
> 
> In Rust's module_init() you could now attach an instance of `Buffer` to the
> `Module` instance, which lives until module_exit(), which looks like this:
> 
> ```
> struct MyModule {
> 	buffer: Buffer,
> }
> 
> impl kernel::Moduke for MyModule {
> 	fn init(module: &'static ThisModule) -> Result<Self> {
> 		Ok(MyModule {
> 			buffer: Buffer::alloc(0x100)?,
> 		})
> 	}
> }
> ```
> 
> Note that we don't need to implement module_exit() here, since `MyModule` lives
> until module_exit() and hence `buffer` has the same lifetime. Once `buffer` goes
> out of scope it frees itself due to the drop() trait. In fact, buffer can never
> leak it's memory.
> 
> With the proposed approach below we can't do this, we'd need to play dirty and
> unsafe tricks in order to not entirely break the design of `Buffer`, or any
> other more complex data structure that the module needs.
> 
> > 
> > ------------------------------------------------------------------------
> > // SPDX-License-Identifier: GPL-2.0-only
> > #include <linux/module.h>
> > #include <linux/pci.h>
> > #include "my_pci_rust_bindings.h"
> > 
> > #define PCI_DEVICE_ID_FOO		0x0f00
> > #define PCI_VENDOR_ID_FOO		0x0f00
> > 
> > static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> > {
> > 	return my_rust_probe(pdev);
> > }
> > 
> > static void remove(struct pci_dev *pdev)
> > {
> > 	my_rust_remove(pdev);
> > }
> > 
> > static const struct pci_device_id pci_ids[] = {
> > 	{PCI_DEVICE(PCI_VENDOR_ID_FOO, PCI_DEVICE_ID_FOO)},
> > 	{}
> > };
> > MODULE_DEVICE_TABLE(pci, pci_ids);
> > 
> > static struct pci_driver my_rust_pci_driver = {
> > 	.name = "my_pci_rust_driver",
> > 	.id_table = pci_ids,
> > 	.probe = probe,
> > 	.remove = remove,
> > };
> > module_pci_driver(my_rust_pci_driver);
> > 
> > MODULE_DESCRIPTION("Driver for my fancy PCI device");
> > MODULE_LICENSE("GPL v2");
> > ------------------------------------------------------------------------
> > 
> > Now, all you have to do is provide a my_rust_probe() and
> > my_rust_remove() call in your rust code, and handle the conversion of a
> > struct pci_dev to whatever you want to use (which you had to do anyway),
> > and you are set!
> > 
> > That .c code above is "obviously" correct, and much simpler and worlds
> > easier for all of us to understand instead of the huge .rs files that
> > this patch series was attempting to implement.
> > 
> > You have to draw the c/rust line somewhere, you all seem to be wanting
> > to draw that line at "no .c in my module at all!" and I'm saying "put 34
> > lines of .c in your module and you will save a LOT of headache now."
> > 
> > All of the "real" work you want to do for your driver is behind the
> > probe/remove callbacks (ok, add a suspend/resume as well, forgot that),
> > and stop worrying about all of the bindings and other mess to tie into
> > the driver model for a specific bus type and the lunacy that the device
> > id mess was.
> > 
> > In short, KEEP IT SIMPLE!
> > 
> > Then, after we are comfortable with stuff like the above, slowly think
> > about moving the line back a bit more, if you really even need to.  But
> > why would you need to?  Only reason I can think of is if you wanted to
> > write an entire bus in rust, and for right now, I would strongly suggest
> > anyone not attempt that.
> 
> We're not trying to do that. Also, please note that all the abstractions are
> *only* making use of APIs that C drivers use directly, just abstracting them
> in a way, such that we can actually use the strength of Rust.
> 
> > 
> > The "joy" of writing a driver in rust is that a driver consumes from
> > EVERYWHERE in the kernel (as you well know.)  Having to write bindings
> > and mappings for EVERYTHING all at once, when all you really want to do
> > is implement the logic between probe/remove is rough, I'm sorry.  I'm
> > suggesting that you stop at the above line for now, which should make
> > your life easier as the .c code is obviously correct, and anything you
> > do in the rust side is your problem, not mine as a bus maintainer :)
> 
> As explained above, we already have the module abstraction in place. Really all
> that we're left with is creating the struct pci_driver and call register() /
> remove() from it.
> 
> Now, this could be fully done in the PCI abstraction. The only reason we have
> the `Registration` and `DriverOps` (again there are a few misunderstandings
> around that, that I try to clarify in the corresponding thread) in "driver.rs"
> is to not duplicate code for maintainability.
> 
> What I'm trying to say, the stuff in "driver.rs" is not even abstracting things
> from drivers/base/, but is Rust helper code for subsystems to abstract their
> stuff (e.g. struct pci_driver).
> 
> And again, I'm willing to take responsibility for the code and offer to maintain
> it - I really want to do this the proper way.
> 
> - Danilo
> 
> (cutting of the rest since it was based on the misunderstanding mentioned above)
Greg KH July 22, 2024, 11:35 a.m. UTC | #12
On Mon, Jul 22, 2024 at 01:23:25PM +0200, Danilo Krummrich wrote:
> On Thu, Jul 11, 2024 at 04:06:25AM +0200, Danilo Krummrich wrote:
> 
> Gentle reminder.

It's the merge window, sorry, will be back after -rc1 is out...
Danilo Krummrich Aug. 2, 2024, 12:06 p.m. UTC | #13
On Thu, Jul 11, 2024 at 04:06:25AM +0200, Danilo Krummrich wrote:

Gentle reminder on this and patch 2 of this series.

> On Wed, Jul 10, 2024 at 04:02:04PM +0200, Greg KH wrote:
> > On Thu, Jun 20, 2024 at 11:24:17PM +0200, Danilo Krummrich wrote:
> > > Ok, good you point this out. We should definitely discuss this point then and
> > > build some consensus around it.
> > > 
> > > I propose to focus on this point first and follow up with the discussion of the
> > > rest of the series afterwards.
> > > 
> > > Let me explain why I am convinced that it's very important to have abstractions
> > > in place in general and from the get-go.
> > > 
> > > In general, having abstractions for C APIs is the foundation of being able to
> > > make use of a lot of advantages Rust has to offer.
> > > 
> > > The most obvious one are all the safety aspects. For instance, with an
> > > abstraction we have to get exactly one piece of code right in terms of pointer
> > > validity, lifetimes, type safety, API semantics, etc. and in all other places
> > > (e.g. drivers) we get the compiler to check those things for us through the
> > > abstraction.
> > > 
> > > Now, the abstraction can be buggy or insufficient and hence there is no absolute
> > > safety guarantee. *But*, if we get this one thing right, there is nothing a
> > > driver can mess up by itself trying to do stupid things anymore.
> > > 
> > > If we just call the C code instead we have to get it right everywhere instead.
> > 
> > I too want a pony.  But unfortunatly you are shaving a yak here instead :)
> > 
> > I'm not saying to call C code from rust, I'm saying call rust code from
> > C.
> > 
> > Have a "normal" pci_driver structure with C functions that THEN call
> > into the rust code that you want to implement them, with a pointer to
> > the proper structure.  That gives you everything that you really want
> > here, EXCEPT you don't have to build the whole tower of drivers and
> > busses and the like.
> 
> Please find my reply below where you expand this point.
> 
> > 
> > > Now, you could approach this top-down instead and argue that we could at least
> > > benefit from Rust for the driver specific parts.
> > > 
> > > Unfortunately, this doesn't really work out either. Also driver specific code is
> > > typically (very) closely connected to kernel APIs. If you want to use the safety
> > > aspects of Rust for the driver specific parts you inevitably end up writing
> > > abstractions for the C APIs in your driver.
> > > 
> > > There are basically three options you can end up with:
> > > 
> > > (1) An abstraction for the C API within your driver that is actually generic
> > >     for every driver, and hence shouldn't be there.
> > > (2) Your driver specific code is full of raw pointers and `unsafe {}` calls,
> > >     which in the end just means that you ended up baking the abstraction into
> > >     your driver specific code.
> > > (3) You ignore everything, put everything in a huge `unsafe {}` block and
> > >     compile C code with the Rust compiler. (Admittedly, maybe slightly
> > >     overstated, but not that far off either.)
> > > 
> > > The latter is also the reason why it doesn't make sense to only have
> > > abstractions for some things, but not for other.
> > > 
> > > If an abstraction for B is based on A, but we don't start with A, then B ends up
> > > implementing (at least partially) the abstraction for A as well. For instance,
> > > if we don't implement `driver::Registration` then the PCI abstractions (and
> > > platform, usb, etc.) have to implement it.
> > > 
> > > It really comes down to the point that it just bubbles up. We really have to do
> > > this bottom-up, otherwise we just end up moving those abstractions up, layer by
> > > layer, where they don't belong to and we re-implement them over and over again.
> > 
> > I think we are talking past each other.
> 
> Given your proposal below, that is correct.
> 
> I read your previous mail as if you question having abstractions for C APIs at
> all. And some parts of your latest reply still read like that to me, but for the
> rest of my reply I will assume that's not the case.
> 
> > 
> > Here is an example .c file that you can use today for your "implement a
> > PCI driver in rust" wish in a mere 34 lines of .c code:
> 
> Your proposal below actually only discards a rather small amount of the proposed
> abstractions, in particular:
> 
>   (1) the code to create a struct pci_driver and the code to call
>       pci_register_driver() and pci_unregister_driver() (implemented in Patch 2)
>   (2) the generic code to create the struct pci_device_id table (or any other ID
>       table) (implemented in Patch 3)
> 
> As I understand you, the concern here really seems to be about the complexity
> vs. what we get from it and that someone has to maintain this.
> 
> I got the point and I take it seriously. As already mentioned, I'm also willing
> to take responsibility for the code and offer to maintain it.
> 
> But back to the technical details.
> 
> Let's start with (2):
> 
> I agree that this seems a bit complicated. I'd propose to remove the
> abstractions in device_id.rs (i.e. Patch 3) for now and let the PCI abstraction
> simply create a struct pci_device_id table directly.
> 
> As for (1):
> 
> This just makes a pretty small part of the abstractions and, it's really only
> about creating the struct pci_driver (or another driver type) instance and
> call the corresponding register() / unregister() functions, since we already
> have the Rust module support upstream.
> 
> As in C, we really just call register() from module_init() and unregister() from
> module_exit() and the code for that, without comments, is around 50 lines of
> code.
> 
> As mentioned this is implemented in Patch 2; Hence, please see my reply on Patch
> 2, where I put a rather long and detailed explanation which hopefully clarifies
> things.
> 
> Now, what do we get from that compared to the proposal below?
> 
>   - The driver's probe() and remove() functions get called with the
>     corresponding abstraction types directly instead of raw pointers; this moves
>     the required  `unsafe {}` blocks to a cental place which otherwise *every*
>     driver would need to implement itself (obviously it'd be the same for future
>     suspend, resume, shutdown, etc. callbacks).
> 
>   - More complex drivers can do the work required to be done in module_init()
>     and module_exit() in Rust directly, which allows them to attach the lifetime
>     of structures to the lifetime of the `Module` structure in Rust which avoids
>     the need for explicit cleanup in module_exit() since they can just implement
>     in Rust's drop() trait.
> 
> The latter may sound a bit less important than it actually is, since it can break
> the design, safety and soundness of Rust types. Let me explain:
> 
> In Rust a type instance should cleanup after itself when it goes out of scope.
> 
> Let's make up an example:
> 
> ```
> // DISCLAIMER: Obviously, this is not how we actually handle memory allocations
> // in Rust, it's just an example.
> struct Buffer {
> 	ptr: *const u8,
> }
> 
> impl Buffer {
> 	fn alloc(size: u8) -> Result<Self> {
> 		let ptr = Kmalloc::alloc(size, GFP_KERNEL)?;
> 
> 		// Return an instance of `Buffer` initialized with a pointer to
> 		// the above memory allocation.
> 		Ok(Self {
> 			ptr,
> 		})
> 	}
> }
> 
> impl Drop for Buffer {
> 	fn drop(&mut self) {
> 		// SAFETY: `Self` is always holding a pointer to valid memory
> 		// allocated with `Kmalloc`.
> 		unsafe { Kmalloc::free(self.ptr) };
> 	}
> }
> ```
> 
> (Side note: `Kmalloc::free` is unsafe since it has no control on whether the
> pointer passed to it is valid and was allocated with `Kmalloc` previously.)
> 
> In Rust's module_init() you could now attach an instance of `Buffer` to the
> `Module` instance, which lives until module_exit(), which looks like this:
> 
> ```
> struct MyModule {
> 	buffer: Buffer,
> }
> 
> impl kernel::Moduke for MyModule {
> 	fn init(module: &'static ThisModule) -> Result<Self> {
> 		Ok(MyModule {
> 			buffer: Buffer::alloc(0x100)?,
> 		})
> 	}
> }
> ```
> 
> Note that we don't need to implement module_exit() here, since `MyModule` lives
> until module_exit() and hence `buffer` has the same lifetime. Once `buffer` goes
> out of scope it frees itself due to the drop() trait. In fact, buffer can never
> leak it's memory.
> 
> With the proposed approach below we can't do this, we'd need to play dirty and
> unsafe tricks in order to not entirely break the design of `Buffer`, or any
> other more complex data structure that the module needs.
> 
> > 
> > ------------------------------------------------------------------------
> > // SPDX-License-Identifier: GPL-2.0-only
> > #include <linux/module.h>
> > #include <linux/pci.h>
> > #include "my_pci_rust_bindings.h"
> > 
> > #define PCI_DEVICE_ID_FOO		0x0f00
> > #define PCI_VENDOR_ID_FOO		0x0f00
> > 
> > static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> > {
> > 	return my_rust_probe(pdev);
> > }
> > 
> > static void remove(struct pci_dev *pdev)
> > {
> > 	my_rust_remove(pdev);
> > }
> > 
> > static const struct pci_device_id pci_ids[] = {
> > 	{PCI_DEVICE(PCI_VENDOR_ID_FOO, PCI_DEVICE_ID_FOO)},
> > 	{}
> > };
> > MODULE_DEVICE_TABLE(pci, pci_ids);
> > 
> > static struct pci_driver my_rust_pci_driver = {
> > 	.name = "my_pci_rust_driver",
> > 	.id_table = pci_ids,
> > 	.probe = probe,
> > 	.remove = remove,
> > };
> > module_pci_driver(my_rust_pci_driver);
> > 
> > MODULE_DESCRIPTION("Driver for my fancy PCI device");
> > MODULE_LICENSE("GPL v2");
> > ------------------------------------------------------------------------
> > 
> > Now, all you have to do is provide a my_rust_probe() and
> > my_rust_remove() call in your rust code, and handle the conversion of a
> > struct pci_dev to whatever you want to use (which you had to do anyway),
> > and you are set!
> > 
> > That .c code above is "obviously" correct, and much simpler and worlds
> > easier for all of us to understand instead of the huge .rs files that
> > this patch series was attempting to implement.
> > 
> > You have to draw the c/rust line somewhere, you all seem to be wanting
> > to draw that line at "no .c in my module at all!" and I'm saying "put 34
> > lines of .c in your module and you will save a LOT of headache now."
> > 
> > All of the "real" work you want to do for your driver is behind the
> > probe/remove callbacks (ok, add a suspend/resume as well, forgot that),
> > and stop worrying about all of the bindings and other mess to tie into
> > the driver model for a specific bus type and the lunacy that the device
> > id mess was.
> > 
> > In short, KEEP IT SIMPLE!
> > 
> > Then, after we are comfortable with stuff like the above, slowly think
> > about moving the line back a bit more, if you really even need to.  But
> > why would you need to?  Only reason I can think of is if you wanted to
> > write an entire bus in rust, and for right now, I would strongly suggest
> > anyone not attempt that.
> 
> We're not trying to do that. Also, please note that all the abstractions are
> *only* making use of APIs that C drivers use directly, just abstracting them
> in a way, such that we can actually use the strength of Rust.
> 
> > 
> > The "joy" of writing a driver in rust is that a driver consumes from
> > EVERYWHERE in the kernel (as you well know.)  Having to write bindings
> > and mappings for EVERYTHING all at once, when all you really want to do
> > is implement the logic between probe/remove is rough, I'm sorry.  I'm
> > suggesting that you stop at the above line for now, which should make
> > your life easier as the .c code is obviously correct, and anything you
> > do in the rust side is your problem, not mine as a bus maintainer :)
> 
> As explained above, we already have the module abstraction in place. Really all
> that we're left with is creating the struct pci_driver and call register() /
> remove() from it.
> 
> Now, this could be fully done in the PCI abstraction. The only reason we have
> the `Registration` and `DriverOps` (again there are a few misunderstandings
> around that, that I try to clarify in the corresponding thread) in "driver.rs"
> is to not duplicate code for maintainability.
> 
> What I'm trying to say, the stuff in "driver.rs" is not even abstracting things
> from drivers/base/, but is Rust helper code for subsystems to abstract their
> stuff (e.g. struct pci_driver).
> 
> And again, I'm willing to take responsibility for the code and offer to maintain
> it - I really want to do this the proper way.
> 
> - Danilo
> 
> (cutting of the rest since it was based on the misunderstanding mentioned above)
diff mbox series

Patch

diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index a791702b4fee..5af00e072a58 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -71,7 +71,7 @@  pub trait Module: Sized + Sync + Send {
     /// should do.
     ///
     /// Equivalent to the `module_init` macro in the C API.
-    fn init(module: &'static ThisModule) -> error::Result<Self>;
+    fn init(name: &'static str::CStr, module: &'static ThisModule) -> error::Result<Self>;
 }
 
 /// A module that is pinned and initialised in-place.
@@ -79,13 +79,19 @@  pub trait InPlaceModule: Sync + Send {
     /// Creates an initialiser for the module.
     ///
     /// It is called when the module is loaded.
-    fn init(module: &'static ThisModule) -> impl init::PinInit<Self, error::Error>;
+    fn init(
+        name: &'static str::CStr,
+        module: &'static ThisModule,
+    ) -> impl init::PinInit<Self, error::Error>;
 }
 
 impl<T: Module> InPlaceModule for T {
-    fn init(module: &'static ThisModule) -> impl init::PinInit<Self, error::Error> {
+    fn init(
+        name: &'static str::CStr,
+        module: &'static ThisModule,
+    ) -> impl init::PinInit<Self, error::Error> {
         let initer = move |slot: *mut Self| {
-            let m = <Self as Module>::init(module)?;
+            let m = <Self as Module>::init(name, module)?;
 
             // SAFETY: `slot` is valid for write per the contract with `pin_init_from_closure`.
             unsafe { slot.write(m) };
diff --git a/rust/kernel/net/phy.rs b/rust/kernel/net/phy.rs
index fd40b703d224..ccb2552dc107 100644
--- a/rust/kernel/net/phy.rs
+++ b/rust/kernel/net/phy.rs
@@ -887,7 +887,7 @@  struct Module {
                 [$($crate::net::phy::create_phy_driver::<$driver>()),+];
 
             impl $crate::Module for Module {
-                fn init(module: &'static ThisModule) -> Result<Self> {
+                fn init(_name: &'static CStr, module: &'static ThisModule) -> Result<Self> {
                     // SAFETY: The anonymous constant guarantees that nobody else can access
                     // the `DRIVERS` static. The array is used only in the C side.
                     let drivers = unsafe { &mut DRIVERS };
diff --git a/rust/macros/module.rs b/rust/macros/module.rs
index 105be4797f85..be03b2cf77a1 100644
--- a/rust/macros/module.rs
+++ b/rust/macros/module.rs
@@ -302,7 +302,8 @@  mod __module_init {{
                     ///
                     /// This function must only be called once.
                     unsafe fn __init() -> core::ffi::c_int {{
-                        let initer = <{type_} as kernel::InPlaceModule>::init(&super::super::THIS_MODULE);
+                        let initer = <{type_} as kernel::InPlaceModule>::init(kernel::c_str!(\"{name}\"),
+                                                                              &super::super::THIS_MODULE);
                         // SAFETY: No data race, since `__MOD` can only be accessed by this module
                         // and there only `__init` and `__exit` access it. These functions are only
                         // called once and `__exit` cannot be called before or during `__init`.
diff --git a/samples/rust/rust_minimal.rs b/samples/rust/rust_minimal.rs
index 2a9eaab62d1c..3b918ff5eebb 100644
--- a/samples/rust/rust_minimal.rs
+++ b/samples/rust/rust_minimal.rs
@@ -17,7 +17,7 @@  struct RustMinimal {
 }
 
 impl kernel::Module for RustMinimal {
-    fn init(_module: &'static ThisModule) -> Result<Self> {
+    fn init(_name: &'static CStr, _module: &'static ThisModule) -> Result<Self> {
         pr_info!("Rust minimal sample (init)\n");
         pr_info!("Am I built-in? {}\n", !cfg!(MODULE));
 
diff --git a/samples/rust/rust_print.rs b/samples/rust/rust_print.rs
index 6eabb0d79ea3..722275a735f1 100644
--- a/samples/rust/rust_print.rs
+++ b/samples/rust/rust_print.rs
@@ -40,7 +40,7 @@  fn arc_print() -> Result {
 }
 
 impl kernel::Module for RustPrint {
-    fn init(_module: &'static ThisModule) -> Result<Self> {
+    fn init(_name: &'static CStr, _module: &'static ThisModule) -> Result<Self> {
         pr_info!("Rust printing macros sample (init)\n");
 
         pr_emerg!("Emergency message (level 0) without args\n");