diff mbox series

[RFC,v2,09/20] PCI/CMA: Expose in sysfs whether devices are authenticated

Message ID 20250227030952.2319050-10-alistair@alistair23.me
State New
Headers show
Series lib: Rust implementation of SPDM | expand

Commit Message

Alistair Francis Feb. 27, 2025, 3:09 a.m. UTC
From: Lukas Wunner <lukas@wunner.de>

The PCI core has just been amended to authenticate CMA-capable devices
on enumeration and store the result in an "authenticated" bit in struct
pci_dev->spdm_state.

Expose the bit to user space through an eponymous sysfs attribute.

Allow user space to trigger reauthentication (e.g. after it has updated
the CMA keyring) by writing to the sysfs attribute.

Implement the attribute in the SPDM library so that other bus types
besides PCI may take advantage of it.  They just need to add
spdm_attr_group to the attribute groups of their devices and amend the
dev_to_spdm_state() helper which retrieves the spdm_state for a given
device.

The helper may return an ERR_PTR if it couldn't be determined whether
SPDM is supported by the device.  The sysfs attribute is visible in that
case but returns an error on access.  This prevents downgrade attacks
where an attacker disturbs memory allocation or DOE communication
in order to create the appearance that SPDM is unsupported.

Subject to further discussion, a future commit might add a user-defined
policy to forbid driver binding to devices which failed authentication,
similar to the "authorized" attribute for USB.

Alternatively, authentication success might be signaled to user space
through a uevent, whereupon it may bind a (blacklisted) driver.
A uevent signaling authentication failure might similarly cause user
space to unbind or outright remove the potentially malicious device.

Traffic from devices which failed authentication could also be filtered
through ACS I/O Request Blocking Enable (PCIe r6.2 sec 7.7.11.3) or
through Link Disable (PCIe r6.2 sec 7.5.3.7).  Unlike an IOMMU, that
will not only protect the host, but also prevent malicious peer-to-peer
traffic to other devices.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
[ Changes by AF:
 - Drop lib/spdm changes and replace with Rust implementation
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 Documentation/ABI/testing/sysfs-devices-spdm | 31 +++++++
 MAINTAINERS                                  |  3 +-
 drivers/pci/cma.c                            | 12 ++-
 drivers/pci/doe.c                            |  2 +
 drivers/pci/pci-sysfs.c                      |  3 +
 drivers/pci/pci.h                            |  5 +
 include/linux/pci.h                          | 12 +++
 lib/rspdm/Makefile                           |  1 +
 lib/rspdm/lib.rs                             |  1 +
 lib/rspdm/req-sysfs.c                        | 97 ++++++++++++++++++++
 lib/rspdm/sysfs.rs                           | 28 ++++++
 11 files changed, 190 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-devices-spdm
 create mode 100644 lib/rspdm/req-sysfs.c
 create mode 100644 lib/rspdm/sysfs.rs

Comments

Greg KH Feb. 27, 2025, 11:16 a.m. UTC | #1
On Thu, Feb 27, 2025 at 01:09:41PM +1000, Alistair Francis wrote:
> From: Lukas Wunner <lukas@wunner.de>
> 
> The PCI core has just been amended to authenticate CMA-capable devices
> on enumeration and store the result in an "authenticated" bit in struct
> pci_dev->spdm_state.
> 
> Expose the bit to user space through an eponymous sysfs attribute.
> 
> Allow user space to trigger reauthentication (e.g. after it has updated
> the CMA keyring) by writing to the sysfs attribute.
> 
> Implement the attribute in the SPDM library so that other bus types
> besides PCI may take advantage of it.  They just need to add
> spdm_attr_group to the attribute groups of their devices and amend the
> dev_to_spdm_state() helper which retrieves the spdm_state for a given
> device.
> 
> The helper may return an ERR_PTR if it couldn't be determined whether
> SPDM is supported by the device.  The sysfs attribute is visible in that
> case but returns an error on access.  This prevents downgrade attacks
> where an attacker disturbs memory allocation or DOE communication
> in order to create the appearance that SPDM is unsupported.

I don't like this "if it's present we still don't know if the device
supports this", as that is not normally the "sysfs way" here.  Why must
it be present in those situations?  What is going to happen to suddenly
allow it to come back to be "alive" and working while the device is
still present in the system?

I'd prefer it to be "if the file is there, this is supported by the
device.  If the file is not there, it is not supported", as that makes
things so much simpler for userspace (i.e. you don't want userspace to
have to both test if the file is there AND read all entries just to see
if the kernel knows what is going on or not.)

Also, how will userspace know if the state changes from "unknown" to
"now it might work, try it again"?

> Subject to further discussion, a future commit might add a user-defined
> policy to forbid driver binding to devices which failed authentication,
> similar to the "authorized" attribute for USB.

That user-defined policy belongs in userspace, just like USB has it.
Why would the kernel need this?  Or do you mean that the kernel provides
a default like USB does and then requires userspace to manually enable a
device before binding a driver to the device is allowed?

> Alternatively, authentication success might be signaled to user space
> through a uevent, whereupon it may bind a (blacklisted) driver.

How will that happen?

> A uevent signaling authentication failure might similarly cause user
> space to unbind or outright remove the potentially malicious device.

Again how?  Who will be sending nthose uevents?  Who will be listening
to them?  What in the kernel is going to change to know to send those
events?  Why is any of this needed at all?  :)

> Traffic from devices which failed authentication could also be filtered
> through ACS I/O Request Blocking Enable (PCIe r6.2 sec 7.7.11.3) or
> through Link Disable (PCIe r6.2 sec 7.5.3.7).  Unlike an IOMMU, that
> will not only protect the host, but also prevent malicious peer-to-peer
> traffic to other devices.
> 
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> [ Changes by AF:
>  - Drop lib/spdm changes and replace with Rust implementation
> ]
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> ---
>  Documentation/ABI/testing/sysfs-devices-spdm | 31 +++++++
>  MAINTAINERS                                  |  3 +-
>  drivers/pci/cma.c                            | 12 ++-
>  drivers/pci/doe.c                            |  2 +
>  drivers/pci/pci-sysfs.c                      |  3 +
>  drivers/pci/pci.h                            |  5 +
>  include/linux/pci.h                          | 12 +++
>  lib/rspdm/Makefile                           |  1 +
>  lib/rspdm/lib.rs                             |  1 +
>  lib/rspdm/req-sysfs.c                        | 97 ++++++++++++++++++++
>  lib/rspdm/sysfs.rs                           | 28 ++++++
>  11 files changed, 190 insertions(+), 5 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-devices-spdm
>  create mode 100644 lib/rspdm/req-sysfs.c
>  create mode 100644 lib/rspdm/sysfs.rs
> 
> diff --git a/Documentation/ABI/testing/sysfs-devices-spdm b/Documentation/ABI/testing/sysfs-devices-spdm
> new file mode 100644
> index 000000000000..2d6e5d513231
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-devices-spdm
> @@ -0,0 +1,31 @@
> +What:		/sys/devices/.../authenticated
> +Date:		June 2024
> +Contact:	Lukas Wunner <lukas@wunner.de>
> +Description:
> +		This file contains 1 if the device authenticated successfully
> +		with SPDM (Security Protocol and Data Model).  It contains 0 if
> +		the device failed authentication (and may thus be malicious).
> +
> +		Writing "re" to this file causes reauthentication.
> +		That may be opportune after updating the device keyring.
> +		The device keyring of the PCI bus is named ".cma"
> +		(Component Measurement and Authentication).
> +
> +		Reauthentication may also be necessary after device identity
> +		has mutated, e.g. after downloading firmware to an FPGA device.
> +
> +		The file is not visible if authentication is unsupported
> +		by the device.

Good.

> +		If the kernel could not determine whether authentication is
> +		supported because memory was low or communication with the
> +		device was not working, the file is visible but accessing it
> +		fails with error code ENOTTY.

Not good.  So this means that userspace can not trust that if the file
is there it actually works, so it has to do more work to determine if it
is working (as mentioned above).  Just either have the file if it works,
or not if it does not please.

> +		This prevents downgrade attacks where an attacker consumes
> +		memory or disturbs communication in order to create the
> +		appearance that a device does not support authentication.

If an attacker can consume kernel memory to cause this to happen you
have bigger problems.  That's not the kernel's issue here at all.

And "disable communication" means "we just don't support it as the
device doesn't say it does", so again, why does that matter?

> +		The reason why authentication support could not be determined
> +		is apparent from "dmesg".  To re-probe authentication support
> +		of PCI devices, exercise the "remove" and "rescan" attributes.

Don't make userspace parse kernel logs for this type of thing.  And
asking userspace to rely on remove and recan is a mess, either show it
works or not.

> diff --git a/MAINTAINERS b/MAINTAINERS
> index abb3b603299f..03e1076f915a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -21385,7 +21385,8 @@ L:	linux-cxl@vger.kernel.org
>  L:	linux-pci@vger.kernel.org
>  S:	Maintained
>  T:	git git://git.kernel.org/pub/scm/linux/kernel/git/devsec/spdm.git
> -F:	drivers/pci/cma.c
> +F:	Documentation/ABI/testing/sysfs-devices-spdm
> +F:	drivers/pci/cma*
>  F:	include/linux/spdm.h
>  F:	lib/rspdm/
>  
> diff --git a/drivers/pci/cma.c b/drivers/pci/cma.c
> index f2c435b04b92..59558714f143 100644
> --- a/drivers/pci/cma.c
> +++ b/drivers/pci/cma.c
> @@ -171,8 +171,10 @@ void pci_cma_init(struct pci_dev *pdev)
>  {
>  	struct pci_doe_mb *doe;
>  
> -	if (IS_ERR(pci_cma_keyring))
> +	if (IS_ERR(pci_cma_keyring)) {
> +		pdev->spdm_state = ERR_PTR(-ENOTTY);

Why ENOTTY?

>  		return;
> +	}
>  
>  	if (!pci_is_pcie(pdev))
>  		return;
> @@ -185,8 +187,10 @@ void pci_cma_init(struct pci_dev *pdev)
>  	pdev->spdm_state = spdm_create(&pdev->dev, pci_doe_transport, doe,
>  				       PCI_DOE_MAX_PAYLOAD, pci_cma_keyring,
>  				       pci_cma_validate);
> -	if (!pdev->spdm_state)
> +	if (!pdev->spdm_state) {
> +		pdev->spdm_state = ERR_PTR(-ENOTTY);

Again, why ENOTTY?  This isn't a tty device :)

>  		return;
> +	}
>  
>  	/*
>  	 * Keep spdm_state allocated even if initial authentication fails
> @@ -204,7 +208,7 @@ void pci_cma_init(struct pci_dev *pdev)
>   */
>  void pci_cma_reauthenticate(struct pci_dev *pdev)
>  {
> -	if (!pdev->spdm_state)
> +	if (IS_ERR_OR_NULL(pdev->spdm_state))

Why does this matter, it should either be NULL or not, adding an error
value just makes it more complex and if you make the changes I mention
above, I don't think it's needed, right?

> --- a/lib/rspdm/lib.rs
> +++ b/lib/rspdm/lib.rs
> @@ -24,6 +24,7 @@
>  
>  mod consts;
>  mod state;
> +pub mod sysfs;

Hey, a sysfs binding!

Well, not really, it's a hack as mentioned below, that I do not think
you really want to do just yet.


>  mod validator;
>  
>  /// spdm_create() - Allocate SPDM session
> diff --git a/lib/rspdm/req-sysfs.c b/lib/rspdm/req-sysfs.c
> new file mode 100644
> index 000000000000..11bacb04f08f
> --- /dev/null
> +++ b/lib/rspdm/req-sysfs.c
> @@ -0,0 +1,97 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * Rust implementation of the DMTF Security Protocol and Data Model (SPDM)
> + * https://www.dmtf.org/dsp/DSP0274
> + *
> + * Requester role: sysfs interface
> + *
> + * Copyright (C) 2023-24 Intel Corporation
> + * Copyright (C) 2024 Western Digital
> + */
> +
> +#include <linux/pci.h>
> +
> +int rust_authenticated_show(void *spdm_state, char *buf);

Don't put externs in a .c file, didn't checkpatch catch this?  More on
this below...

> +
> +/**
> + * dev_to_spdm_state() - Retrieve SPDM session state for given device
> + *
> + * @dev: Responder device
> + *
> + * Returns a pointer to the device's SPDM session state,
> + *	   %NULL if the device doesn't have one or
> + *	   %ERR_PTR if it couldn't be determined whether SPDM is supported.
> + *
> + * In the %ERR_PTR case, attributes are visible but return an error on access.
> + * This prevents downgrade attacks where an attacker disturbs memory allocation
> + * or communication with the device in order to create the appearance that SPDM
> + * is unsupported.  E.g. with PCI devices, the attacker may foil CMA or DOE
> + * initialization by simply hogging memory.
> + */
> +static void *dev_to_spdm_state(struct device *dev)
> +{
> +	if (dev_is_pci(dev))
> +		return pci_dev_to_spdm_state(to_pci_dev(dev));
> +
> +	/* Insert mappers for further bus types here. */
> +
> +	return NULL;
> +}
> +
> +/* authenticated attribute */
> +
> +static umode_t spdm_attrs_are_visible(struct kobject *kobj,
> +				      struct attribute *a, int n)
> +{
> +	struct device *dev = kobj_to_dev(kobj);
> +	void *spdm_state = dev_to_spdm_state(dev);
> +
> +	if (IS_ERR_OR_NULL(spdm_state))
> +		return SYSFS_GROUP_INVISIBLE;
> +
> +	return a->mode;
> +}
> +
> +static ssize_t authenticated_store(struct device *dev,
> +				   struct device_attribute *attr,
> +				   const char *buf, size_t count)
> +{
> +	void *spdm_state = dev_to_spdm_state(dev);
> +	int rc;
> +
> +	if (IS_ERR_OR_NULL(spdm_state))
> +		return PTR_ERR(spdm_state);
> +
> +	if (sysfs_streq(buf, "re")) {

I don't like sysfs files that when reading show a binary, but require a
"magic string" to be written to them to have them do something else.
that way lies madness.  What would you do if each sysfs file had a
custom language that you had to look up for each one?  Be consistant
here.  But again, I don't think you need a store function at all, either
the device supports this, or it doesn't.

> +		rc = spdm_authenticate(spdm_state);

Call into Rust code?  More on that below...

> +		if (rc)
> +			return rc;
> +	} else {
> +		return -EINVAL;
> +	}
> +
> +	return count;
> +}
> +
> +static ssize_t authenticated_show(struct device *dev,
> +				  struct device_attribute *attr, char *buf)
> +{
> +	void *spdm_state = dev_to_spdm_state(dev);
> +
> +	if (IS_ERR_OR_NULL(spdm_state))
> +		return PTR_ERR(spdm_state);

Again, this check can go away if the file just isn't there.


> +
> +	return rust_authenticated_show(spdm_state, buf);

Here you have C code calling into Rust code.  I'm not complaining about
it, but I think it will be the first use of this, and I didn't think
that the rust maintainers were willing to do that just yet.

Has that policy changed?

The issue here is that the C signature for this is not being
auto-generated, you have to manually keep it in sync (as you did above),
with the Rust side.  That's not going to scale over time at all, you
MUST have a .h file somewhere for C to know how to call into this and
for the compiler to check that all is sane on both sides.

And you are passing a random void * into the Rust side, what could go
wrong?  I think this needs more thought as this is fragile-as-f***.

thanks,

greg k-h
Alice Ryhl Feb. 27, 2025, 11:52 a.m. UTC | #2
On Thu, Feb 27, 2025 at 12:17 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Feb 27, 2025 at 01:09:41PM +1000, Alistair Francis wrote:
> > +     return rust_authenticated_show(spdm_state, buf);
>
> Here you have C code calling into Rust code.  I'm not complaining about
> it, but I think it will be the first use of this, and I didn't think
> that the rust maintainers were willing to do that just yet.
>
> Has that policy changed?
>
> The issue here is that the C signature for this is not being
> auto-generated, you have to manually keep it in sync (as you did above),
> with the Rust side.  That's not going to scale over time at all, you
> MUST have a .h file somewhere for C to know how to call into this and
> for the compiler to check that all is sane on both sides.
>
> And you are passing a random void * into the Rust side, what could go
> wrong?  I think this needs more thought as this is fragile-as-f***.

I don't think we have a policy against it? I'm pretty sure the QR code
thing does it.

Alice
Greg KH Feb. 27, 2025, noon UTC | #3
On Thu, Feb 27, 2025 at 12:52:02PM +0100, Alice Ryhl wrote:
> On Thu, Feb 27, 2025 at 12:17 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Feb 27, 2025 at 01:09:41PM +1000, Alistair Francis wrote:
> > > +     return rust_authenticated_show(spdm_state, buf);
> >
> > Here you have C code calling into Rust code.  I'm not complaining about
> > it, but I think it will be the first use of this, and I didn't think
> > that the rust maintainers were willing to do that just yet.
> >
> > Has that policy changed?
> >
> > The issue here is that the C signature for this is not being
> > auto-generated, you have to manually keep it in sync (as you did above),
> > with the Rust side.  That's not going to scale over time at all, you
> > MUST have a .h file somewhere for C to know how to call into this and
> > for the compiler to check that all is sane on both sides.
> >
> > And you are passing a random void * into the Rust side, what could go
> > wrong?  I think this needs more thought as this is fragile-as-f***.
> 
> I don't think we have a policy against it? I'm pretty sure the QR code
> thing does it.

Sorry, you are right, it does, and of course it happens (otherwise how
would bindings work), but for small functions like this, how is the C
code kept in sync with the rust side?  Where is the .h file that C
should include?

thanks,

greg "I need more coffee" k-h
Alice Ryhl Feb. 27, 2025, 12:11 p.m. UTC | #4
On Thu, Feb 27, 2025 at 1:01 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Feb 27, 2025 at 12:52:02PM +0100, Alice Ryhl wrote:
> > On Thu, Feb 27, 2025 at 12:17 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Thu, Feb 27, 2025 at 01:09:41PM +1000, Alistair Francis wrote:
> > > > +     return rust_authenticated_show(spdm_state, buf);
> > >
> > > Here you have C code calling into Rust code.  I'm not complaining about
> > > it, but I think it will be the first use of this, and I didn't think
> > > that the rust maintainers were willing to do that just yet.
> > >
> > > Has that policy changed?
> > >
> > > The issue here is that the C signature for this is not being
> > > auto-generated, you have to manually keep it in sync (as you did above),
> > > with the Rust side.  That's not going to scale over time at all, you
> > > MUST have a .h file somewhere for C to know how to call into this and
> > > for the compiler to check that all is sane on both sides.
> > >
> > > And you are passing a random void * into the Rust side, what could go
> > > wrong?  I think this needs more thought as this is fragile-as-f***.
> >
> > I don't think we have a policy against it? I'm pretty sure the QR code
> > thing does it.
>
> Sorry, you are right, it does, and of course it happens (otherwise how
> would bindings work), but for small functions like this, how is the C
> code kept in sync with the rust side?  Where is the .h file that C
> should include?

I don't think there is tooling for it today. We need the opposite of
bindgen, which does exist in a tool called cbindgen. Unfortunately,
cbindgen is written to only work in cargo-based build systems, so we
cannot use it.

One trick you could do is write the signature in a header file, and
then compare what bindgen generates to the real signature like this:

const _: () = {
    if true {
        bindings::my_function
    } else {
        my_function
    };
};

This would only compile if the two function pointers have the same signature.

Alice
Greg KH Feb. 27, 2025, 2:03 p.m. UTC | #5
On Thu, Feb 27, 2025 at 01:11:01PM +0100, Alice Ryhl wrote:
> On Thu, Feb 27, 2025 at 1:01 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Feb 27, 2025 at 12:52:02PM +0100, Alice Ryhl wrote:
> > > On Thu, Feb 27, 2025 at 12:17 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Thu, Feb 27, 2025 at 01:09:41PM +1000, Alistair Francis wrote:
> > > > > +     return rust_authenticated_show(spdm_state, buf);
> > > >
> > > > Here you have C code calling into Rust code.  I'm not complaining about
> > > > it, but I think it will be the first use of this, and I didn't think
> > > > that the rust maintainers were willing to do that just yet.
> > > >
> > > > Has that policy changed?
> > > >
> > > > The issue here is that the C signature for this is not being
> > > > auto-generated, you have to manually keep it in sync (as you did above),
> > > > with the Rust side.  That's not going to scale over time at all, you
> > > > MUST have a .h file somewhere for C to know how to call into this and
> > > > for the compiler to check that all is sane on both sides.
> > > >
> > > > And you are passing a random void * into the Rust side, what could go
> > > > wrong?  I think this needs more thought as this is fragile-as-f***.
> > >
> > > I don't think we have a policy against it? I'm pretty sure the QR code
> > > thing does it.
> >
> > Sorry, you are right, it does, and of course it happens (otherwise how
> > would bindings work), but for small functions like this, how is the C
> > code kept in sync with the rust side?  Where is the .h file that C
> > should include?
> 
> I don't think there is tooling for it today. We need the opposite of
> bindgen, which does exist in a tool called cbindgen. Unfortunately,
> cbindgen is written to only work in cargo-based build systems, so we
> cannot use it.
> 
> One trick you could do is write the signature in a header file, and
> then compare what bindgen generates to the real signature like this:
> 
> const _: () = {
>     if true {
>         bindings::my_function
>     } else {
>         my_function
>     };
> };
> 
> This would only compile if the two function pointers have the same signature.

That feels just wrong :(

As this seems like it's going to be a longer-term issue, has anyone
thought of how it's going to be handled?  Build time errors when
functions change is the key here, no one remembers to manually verify
each caller to verify the variables are correct anymore, that would be a
big step backwards.

thanks,

greg k-h
Miguel Ojeda Feb. 27, 2025, 4:45 p.m. UTC | #6
On Thu, Feb 27, 2025 at 1:01 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> Sorry, you are right, it does, and of course it happens (otherwise how
> would bindings work), but for small functions like this, how is the C
> code kept in sync with the rust side?  Where is the .h file that C
> should include?

What you were probably remembering is that it still needs to be
justified, i.e. we don't want to generally/freely start replacing
"individual functions" and doing FFI both ways everywhere, i.e. the
goal is to build safe abstractions wherever possible.

Cheers,
Miguel
Miguel Ojeda Feb. 27, 2025, 4:46 p.m. UTC | #7
On Thu, Feb 27, 2025 at 1:11 PM Alice Ryhl <aliceryhl@google.com> wrote:
>
> I don't think there is tooling for it today. We need the opposite of
> bindgen, which does exist in a tool called cbindgen. Unfortunately,
> cbindgen is written to only work in cargo-based build systems, so we
> cannot use it.

It doesn't require Cargo, e.g. see my reply to Daniel:

    https://rust-for-linux.zulipchat.com/#narrow/channel/288089-General/topic/What.20is.20the.20status.20on.20C.20APIs.20for.20Rust.20kernel.20code.3F/near/420712657

Even if it did require something extra for complex usage or similar,
we could ask the maintainers or I could perhaps come up with something
to generate whatever inputs they need.

Cheers,
Miguel
Miguel Ojeda Feb. 27, 2025, 4:47 p.m. UTC | #8
On Thu, Feb 27, 2025 at 3:04 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> As this seems like it's going to be a longer-term issue, has anyone
> thought of how it's going to be handled?  Build time errors when
> functions change is the key here, no one remembers to manually verify
> each caller to verify the variables are correct anymore, that would be a
> big step backwards.

I can look into it, after other build system things are done.

I talked to Emilio some months ago about this and he told me Firefox
solves the problem both ways, so we may be able to do something
similar.

Cheers,
Miguel
Greg KH Feb. 27, 2025, 7:31 p.m. UTC | #9
On Thu, Feb 27, 2025 at 05:47:01PM +0100, Miguel Ojeda wrote:
> On Thu, Feb 27, 2025 at 3:04 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > As this seems like it's going to be a longer-term issue, has anyone
> > thought of how it's going to be handled?  Build time errors when
> > functions change is the key here, no one remembers to manually verify
> > each caller to verify the variables are correct anymore, that would be a
> > big step backwards.
> 
> I can look into it, after other build system things are done.

Looks like Alice already sent a series to do this, so no need.
Greg KH Feb. 27, 2025, 7:32 p.m. UTC | #10
On Thu, Feb 27, 2025 at 05:45:02PM +0100, Miguel Ojeda wrote:
> On Thu, Feb 27, 2025 at 1:01 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > Sorry, you are right, it does, and of course it happens (otherwise how
> > would bindings work), but for small functions like this, how is the C
> > code kept in sync with the rust side?  Where is the .h file that C
> > should include?
> 
> What you were probably remembering is that it still needs to be
> justified, i.e. we don't want to generally/freely start replacing
> "individual functions" and doing FFI both ways everywhere, i.e. the
> goal is to build safe abstractions wherever possible.

Ah, ok, that's what I was remembering.

Anyway, the "pass a void blob from C into rust" that this patch is doing
feels really odd to me, and hard to verify it is "safe" at a simple
glance.

thanks,

greg k-h
Lukas Wunner Feb. 27, 2025, 10:42 p.m. UTC | #11
On Thu, Feb 27, 2025 at 03:16:40AM -0800, Greg KH wrote:
> On Thu, Feb 27, 2025 at 01:09:41PM +1000, Alistair Francis wrote:
> > The PCI core has just been amended to authenticate CMA-capable devices
> > on enumeration and store the result in an "authenticated" bit in struct
> > pci_dev->spdm_state.
> > 
> > Expose the bit to user space through an eponymous sysfs attribute.
> > 
> > Allow user space to trigger reauthentication (e.g. after it has updated
> > the CMA keyring) by writing to the sysfs attribute.
> > 
> > Implement the attribute in the SPDM library so that other bus types
> > besides PCI may take advantage of it.  They just need to add
> > spdm_attr_group to the attribute groups of their devices and amend the
> > dev_to_spdm_state() helper which retrieves the spdm_state for a given
> > device.
> > 
> > The helper may return an ERR_PTR if it couldn't be determined whether
> > SPDM is supported by the device.  The sysfs attribute is visible in that
> > case but returns an error on access.  This prevents downgrade attacks
> > where an attacker disturbs memory allocation or DOE communication
> > in order to create the appearance that SPDM is unsupported.
> 
> I don't like this "if it's present we still don't know if the device
> supports this", as that is not normally the "sysfs way" here.  Why must
> it be present in those situations?

That's explained above.

Unfortunately there is no (signed) bit in Config Space which tells us
whether authentication is supported by a PCI device.  Rather, it is
necessary to exchange several messages with the device through a
DOE mailbox in config space to determine that.  I'm worried that an
attacker deliberately "glitches" those DOE exchanges and thus creates
the appearance that the device does not support authentication.

Let's say the user's policy is to trust legacy devices which do not
support authentication, but require authentication for newer NVMe drives
from a certain vendor.  An attacker may manipulate an authentication-capable
NVMe drive from that vendor, whereupon it will fail authentication.
But the attacker can trick the user into trusting the device by glitching
the DOE exchanges.

Of course, this is an abnormal situation that users won't encounter
unless they're being attacked.  Normally the attribute is only present
if authentication is supported.

I disagree with your assessment that we have bigger problems.
For security protocols like this we have to be very careful
to prevent trivial circumvention.  We cannot just shrug this off
as unimportant.

The "authenticated" attribute tells user space whether the device
is authenticated.  User space needs to handle errors anyway when
reading the attribute.  Users will get an error if authentication
support could not be determined.  Simple.


> What is going to happen to suddenly
> allow it to come back to be "alive" and working while the device is
> still present in the system?

The device needs to be re-enumerated by the PCI core to retry
determining its authentication capability.  That's why the
sysfs documentation says the user may exercise the "remove"
and "rescan" attributes to retry authentication.


> I'd prefer it to be "if the file is there, this is supported by the
> device.  If the file is not there, it is not supported", as that makes
> things so much simpler for userspace (i.e. you don't want userspace to
> have to both test if the file is there AND read all entries just to see
> if the kernel knows what is going on or not.)

Huh?  Read all entries?  The attribute contains only 0 or 1!

Or you'll get an error reading it.


> Also, how will userspace know if the state changes from "unknown" to
> "now it might work, try it again"?

User space has to explicitly remove and rescan the device.
Otherwise its authentication capability remains unknown.

Again, this is in the abnormal situation when the user is being attacked.
There is no automatic resolution to this scenario, deliberately so.

The user needs to know that there may be an attack going on.
The user may then act on it at their own discretion.


> > Subject to further discussion, a future commit might add a user-defined
> > policy to forbid driver binding to devices which failed authentication,
> > similar to the "authorized" attribute for USB.
> 
> That user-defined policy belongs in userspace, just like USB has it.
> Why would the kernel need this?  Or do you mean that the kernel provides
> a default like USB does and then requires userspace to manually enable a
> device before binding a driver to the device is allowed?

The idea is to bring up PCI device authentication.
And as a first step, expose to user space whether authentication succeeded.

We can have a discussion what other functionality is desirable to make
this useful.  I am open to any ideas you might have.  The commit
message merely makes suggestions as to what might be interesting
going forward.


> > Alternatively, authentication success might be signaled to user space
> > through a uevent, whereupon it may bind a (blacklisted) driver.
> 
> How will that happen?

The SPDM library can be amended to signal a uevent when authentication
succeeds or fails and user space can then act on it.  I imagine systemd
or some other daemon might listen to such events and do interesting things,
such as binding a driver once authentication succeeds.


> > A uevent signaling authentication failure might similarly cause user
> > space to unbind or outright remove the potentially malicious device.
> 
> Again how?  Who will be sending nthose uevents?  Who will be listening
> to them?  What in the kernel is going to change to know to send those
> events?  Why is any of this needed at all?  :)

None of this is needed.  It's just a suggestion.

Maybe you have better ideas.  Be constructive!  Make suggestions!


> > +		If the kernel could not determine whether authentication is
> > +		supported because memory was low or communication with the
> > +		device was not working, the file is visible but accessing it
> > +		fails with error code ENOTTY.
> 
> Not good.  So this means that userspace can not trust that if the file
> is there it actually works, so it has to do more work to determine if it
> is working (as mentioned above).  Just either have the file if it works,
> or not if it does not please.

No no no.

Quoting Destiny's Child here. ;)


> > +		This prevents downgrade attacks where an attacker consumes
> > +		memory or disturbs communication in order to create the
> > +		appearance that a device does not support authentication.
> 
> If an attacker can consume kernel memory to cause this to happen you
> have bigger problems.  That's not the kernel's issue here at all.
> 
> And "disable communication" means "we just don't support it as the
> device doesn't say it does", so again, why does that matter?

Reacting to potential attacks sure is the kernel's business.


> > +		The reason why authentication support could not be determined
> > +		is apparent from "dmesg".  To re-probe authentication support
> > +		of PCI devices, exercise the "remove" and "rescan" attributes.
> 
> Don't make userspace parse kernel logs for this type of thing.  And
> asking userspace to rely on remove and recan is a mess, either show it
> works or not.

I'd say looking in dmesg to determine whether the user is being attacked
is perfectly fine, as is requiring the user to explicitly act on a
potential attack.


> > --- a/drivers/pci/cma.c
> > +++ b/drivers/pci/cma.c
> > @@ -171,8 +171,10 @@ void pci_cma_init(struct pci_dev *pdev)
> >  {
> >  	struct pci_doe_mb *doe;
> >  
> > -	if (IS_ERR(pci_cma_keyring))
> > +	if (IS_ERR(pci_cma_keyring)) {
> > +		pdev->spdm_state = ERR_PTR(-ENOTTY);
> 
> Why ENOTTY?

We use -ENOTTY as return value for unsupported reset methods in the
PCI core, see e.g. pcie_reset_flr(), pcie_af_flr(), pci_pm_reset(),
pci_parent_bus_reset(), pci_reset_hotplug_slot(), ...

We also use -ENOTTY in pci_bridge_wait_for_secondary_bus() and
pci_dev_wait().

It was used here to be consistent with those existing occurrences
in the PCI core.

If you'd prefer something else, please make a suggestion.


> > +static ssize_t authenticated_store(struct device *dev,
> > +				   struct device_attribute *attr,
> > +				   const char *buf, size_t count)
> > +{
> > +	void *spdm_state = dev_to_spdm_state(dev);
> > +	int rc;
> > +
> > +	if (IS_ERR_OR_NULL(spdm_state))
> > +		return PTR_ERR(spdm_state);
> > +
> > +	if (sysfs_streq(buf, "re")) {
> 
> I don't like sysfs files that when reading show a binary, but require a
> "magic string" to be written to them to have them do something else.
> that way lies madness.  What would you do if each sysfs file had a
> custom language that you had to look up for each one?  Be consistant
> here.  But again, I don't think you need a store function at all, either
> the device supports this, or it doesn't.

I'm not sure if you've even read the ABI documentation in full.

The store method is needed to reauthenticate the device,
e.g. after a new trusted root certificate was added to the
kernel's .cma keyring.

Originally I allowed any value to be written to the "authenticated"
attribute to trigger reauthentication.  I changed that to "re"
(actually any word that starts with "re") because I envision that
we may need additional verbs to force authentication with a specific
certificate slot or to force-trust a device despite missing its
root certificate in the .cma keyring.

Thanks,

Lukas
Greg KH Feb. 28, 2025, 1:39 a.m. UTC | #12
On Thu, Feb 27, 2025 at 11:42:46PM +0100, Lukas Wunner wrote:
> On Thu, Feb 27, 2025 at 03:16:40AM -0800, Greg KH wrote:
> > On Thu, Feb 27, 2025 at 01:09:41PM +1000, Alistair Francis wrote:
> > > The PCI core has just been amended to authenticate CMA-capable devices
> > > on enumeration and store the result in an "authenticated" bit in struct
> > > pci_dev->spdm_state.
> > > 
> > > Expose the bit to user space through an eponymous sysfs attribute.
> > > 
> > > Allow user space to trigger reauthentication (e.g. after it has updated
> > > the CMA keyring) by writing to the sysfs attribute.
> > > 
> > > Implement the attribute in the SPDM library so that other bus types
> > > besides PCI may take advantage of it.  They just need to add
> > > spdm_attr_group to the attribute groups of their devices and amend the
> > > dev_to_spdm_state() helper which retrieves the spdm_state for a given
> > > device.
> > > 
> > > The helper may return an ERR_PTR if it couldn't be determined whether
> > > SPDM is supported by the device.  The sysfs attribute is visible in that
> > > case but returns an error on access.  This prevents downgrade attacks
> > > where an attacker disturbs memory allocation or DOE communication
> > > in order to create the appearance that SPDM is unsupported.
> > 
> > I don't like this "if it's present we still don't know if the device
> > supports this", as that is not normally the "sysfs way" here.  Why must
> > it be present in those situations?
> 
> That's explained above.

Not really, you just say "downgrade attacks", which is not something
that we need to worry about, right?  If so, I think this bit is the
least of our worries.

> Unfortunately there is no (signed) bit in Config Space which tells us
> whether authentication is supported by a PCI device.  Rather, it is
> necessary to exchange several messages with the device through a
> DOE mailbox in config space to determine that.  I'm worried that an
> attacker deliberately "glitches" those DOE exchanges and thus creates
> the appearance that the device does not support authentication.

That's a hardware glitch, and if that happens, then it will show a 0 and
that's the same as not being present at all, right?  Otherwise you just
pound on the file to try to see if the glitch was not real?  That's not
going to go over well.

> Let's say the user's policy is to trust legacy devices which do not
> support authentication, but require authentication for newer NVMe drives
> from a certain vendor.  An attacker may manipulate an authentication-capable
> NVMe drive from that vendor, whereupon it will fail authentication.
> But the attacker can trick the user into trusting the device by glitching
> the DOE exchanges.

Again, are we now claiming that Linux needs to support "hardware
glitching"?  Is that required somewhere?  I think if the DOE exchanges
fail, we just trust the device as we have to trust something, right?

> Of course, this is an abnormal situation that users won't encounter
> unless they're being attacked.  Normally the attribute is only present
> if authentication is supported.
> 
> I disagree with your assessment that we have bigger problems.
> For security protocols like this we have to be very careful
> to prevent trivial circumvention.  We cannot just shrug this off
> as unimportant.

hardware glitching is not trivial.  Let's only worry about that if the
hardware people somehow require it, and if so, we can push back and say
"stop making us fix your broken designs" :)

> The "authenticated" attribute tells user space whether the device
> is authenticated.  User space needs to handle errors anyway when
> reading the attribute.  Users will get an error if authentication
> support could not be determined.  Simple.

No, if it's not determined, it shouldn't be present.

> > What is going to happen to suddenly
> > allow it to come back to be "alive" and working while the device is
> > still present in the system?
> 
> The device needs to be re-enumerated by the PCI core to retry
> determining its authentication capability.  That's why the
> sysfs documentation says the user may exercise the "remove"
> and "rescan" attributes to retry authentication.

But how does it know that?  remove and recan is a huge sledgehammer, and
an amazing one if it even works on most hardware.  Don't make it part of
any normal process please.

> > I'd prefer it to be "if the file is there, this is supported by the
> > device.  If the file is not there, it is not supported", as that makes
> > things so much simpler for userspace (i.e. you don't want userspace to
> > have to both test if the file is there AND read all entries just to see
> > if the kernel knows what is going on or not.)
> 
> Huh?  Read all entries?  The attribute contains only 0 or 1!
> 
> Or you'll get an error reading it.

It's the error, don't do that.  If an error is going to happen, then
don't have the file there.  That's the way sysfs works, it's not a
"let's add all possible files and then make userspace open them all and
see if an error happens to determine what really is present for this
device" model.  It's a "if a file is there, that attribute is there and
we can read it".

> > > Alternatively, authentication success might be signaled to user space
> > > through a uevent, whereupon it may bind a (blacklisted) driver.
> > 
> > How will that happen?
> 
> The SPDM library can be amended to signal a uevent when authentication
> succeeds or fails and user space can then act on it.  I imagine systemd
> or some other daemon might listen to such events and do interesting things,
> such as binding a driver once authentication succeeds.

That's a new user/kernel api and should be designed ONLY if you actually
need it and have a user.  Otherwise let's just wait for later for that.

> Maybe you have better ideas.  Be constructive!  Make suggestions!

Again, have the file there only if this is something that the hardware
supports.  Don't fail a read just because the hardware does not seem to
support it, but it might sometime in the future if you just happen to
unplug/plug it back in.

> > > +		This prevents downgrade attacks where an attacker consumes
> > > +		memory or disturbs communication in order to create the
> > > +		appearance that a device does not support authentication.
> > 
> > If an attacker can consume kernel memory to cause this to happen you
> > have bigger problems.  That's not the kernel's issue here at all.
> > 
> > And "disable communication" means "we just don't support it as the
> > device doesn't say it does", so again, why does that matter?
> 
> Reacting to potential attacks sure is the kernel's business.

Reacting to real, software attacks is the kernel's business.  Reacting
to possible hardware issues that are just theoretical is not.

> > > +		The reason why authentication support could not be determined
> > > +		is apparent from "dmesg".  To re-probe authentication support
> > > +		of PCI devices, exercise the "remove" and "rescan" attributes.
> > 
> > Don't make userspace parse kernel logs for this type of thing.  And
> > asking userspace to rely on remove and recan is a mess, either show it
> > works or not.
> 
> I'd say looking in dmesg to determine whether the user is being attacked
> is perfectly fine, as is requiring the user to explicitly act on a
> potential attack.
> 
> 
> > > --- a/drivers/pci/cma.c
> > > +++ b/drivers/pci/cma.c
> > > @@ -171,8 +171,10 @@ void pci_cma_init(struct pci_dev *pdev)
> > >  {
> > >  	struct pci_doe_mb *doe;
> > >  
> > > -	if (IS_ERR(pci_cma_keyring))
> > > +	if (IS_ERR(pci_cma_keyring)) {
> > > +		pdev->spdm_state = ERR_PTR(-ENOTTY);
> > 
> > Why ENOTTY?
> 
> We use -ENOTTY as return value for unsupported reset methods in the
> PCI core, see e.g. pcie_reset_flr(), pcie_af_flr(), pci_pm_reset(),
> pci_parent_bus_reset(), pci_reset_hotplug_slot(), ...
> 
> We also use -ENOTTY in pci_bridge_wait_for_secondary_bus() and
> pci_dev_wait().
> 
> It was used here to be consistent with those existing occurrences
> in the PCI core.
> 
> If you'd prefer something else, please make a suggestion.

Ah, didn't realize that was a pci thing, ok, nevermind.

> > > +static ssize_t authenticated_store(struct device *dev,
> > > +				   struct device_attribute *attr,
> > > +				   const char *buf, size_t count)
> > > +{
> > > +	void *spdm_state = dev_to_spdm_state(dev);
> > > +	int rc;
> > > +
> > > +	if (IS_ERR_OR_NULL(spdm_state))
> > > +		return PTR_ERR(spdm_state);
> > > +
> > > +	if (sysfs_streq(buf, "re")) {
> > 
> > I don't like sysfs files that when reading show a binary, but require a
> > "magic string" to be written to them to have them do something else.
> > that way lies madness.  What would you do if each sysfs file had a
> > custom language that you had to look up for each one?  Be consistant
> > here.  But again, I don't think you need a store function at all, either
> > the device supports this, or it doesn't.
> 
> I'm not sure if you've even read the ABI documentation in full.
> 
> The store method is needed to reauthenticate the device,
> e.g. after a new trusted root certificate was added to the
> kernel's .cma keyring.

Why not have a different file called "reauthentication" that only allows
a write to it of a 1/Y/y to do the reauthentication.  sysfs is a "one
file per thing" interface, not a "parse a command and do something but
when read from return a different value" interface.

Let's keep it dirt simple please.

thanks,

greg k-h
Alistair Francis Feb. 28, 2025, 2:27 a.m. UTC | #13
On Fri, Feb 28, 2025 at 5:33 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Feb 27, 2025 at 05:45:02PM +0100, Miguel Ojeda wrote:
> > On Thu, Feb 27, 2025 at 1:01 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > Sorry, you are right, it does, and of course it happens (otherwise how
> > > would bindings work), but for small functions like this, how is the C
> > > code kept in sync with the rust side?  Where is the .h file that C
> > > should include?

This I can address with something like Alice mentioned earlier to
ensure the C and Rust functions stay in sync.

> >
> > What you were probably remembering is that it still needs to be
> > justified, i.e. we don't want to generally/freely start replacing
> > "individual functions" and doing FFI both ways everywhere, i.e. the
> > goal is to build safe abstractions wherever possible.
>
> Ah, ok, that's what I was remembering.
>
> Anyway, the "pass a void blob from C into rust" that this patch is doing
> feels really odd to me, and hard to verify it is "safe" at a simple
> glance.

I agree, it's a bit odd. Ideally I would like to use a sysfs binding,
but there isn't one today.

I had a quick look and a Rust sysfs binding implementation seems like
a lot of work, which I wasn't convinced I wanted to invest in for this
series. This is only a single sysfs attribute and I didn't want to
slow down this series on a whole sysfs Rust implementation.

If this approach isn't ok for now, I will just drop the sysfs changes
from the series so the SPDM implementation doesn't stall on sysfs
changes. Then come back to the sysfs attributes in the future.

So the high level question, is "pass[ing] a void blob from C into
rust" ok or should I defer for a future safer implementation?

Alistair

>
> thanks,
>
> greg k-h
Alistair Francis Feb. 28, 2025, 2:55 a.m. UTC | #14
On Fri, Feb 28, 2025 at 11:41 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Feb 27, 2025 at 11:42:46PM +0100, Lukas Wunner wrote:
> > On Thu, Feb 27, 2025 at 03:16:40AM -0800, Greg KH wrote:
> > > On Thu, Feb 27, 2025 at 01:09:41PM +1000, Alistair Francis wrote:
> > > > The PCI core has just been amended to authenticate CMA-capable devices
> > > > on enumeration and store the result in an "authenticated" bit in struct
> > > > pci_dev->spdm_state.
> > > >
> > > > Expose the bit to user space through an eponymous sysfs attribute.
> > > >
> > > > Allow user space to trigger reauthentication (e.g. after it has updated
> > > > the CMA keyring) by writing to the sysfs attribute.
> > > >
> > > > Implement the attribute in the SPDM library so that other bus types
> > > > besides PCI may take advantage of it.  They just need to add
> > > > spdm_attr_group to the attribute groups of their devices and amend the
> > > > dev_to_spdm_state() helper which retrieves the spdm_state for a given
> > > > device.
> > > >
> > > > The helper may return an ERR_PTR if it couldn't be determined whether
> > > > SPDM is supported by the device.  The sysfs attribute is visible in that
> > > > case but returns an error on access.  This prevents downgrade attacks
> > > > where an attacker disturbs memory allocation or DOE communication
> > > > in order to create the appearance that SPDM is unsupported.
> > >
> > > I don't like this "if it's present we still don't know if the device
> > > supports this", as that is not normally the "sysfs way" here.  Why must
> > > it be present in those situations?

I do think there are 4 situations

 1. Device supports authentication and was authenticated
 2.
  a) Device supports authentication, but the certificate chain can't
be authenticated
  b) Device supports authentication, but the communication was interrupted
 3. Device doesn't support authentication

Case 1 and 3 are easy

Case 2 (a) means that the kernel doesn't trust the certificate. This
could be for a range of reasons, including the user just hasn't
installed the cert yet. So it makes sense to pass that information to
the user as they can act on it. I think it's a different case to 3
(where there is no action for the user to take).

> >
> > That's explained above.
>
> Not really, you just say "downgrade attacks", which is not something
> that we need to worry about, right?  If so, I think this bit is the
> least of our worries.
>
> > Unfortunately there is no (signed) bit in Config Space which tells us
> > whether authentication is supported by a PCI device.  Rather, it is
> > necessary to exchange several messages with the device through a
> > DOE mailbox in config space to determine that.  I'm worried that an
> > attacker deliberately "glitches" those DOE exchanges and thus creates
> > the appearance that the device does not support authentication.
>
> That's a hardware glitch, and if that happens, then it will show a 0 and
> that's the same as not being present at all, right?  Otherwise you just
> pound on the file to try to see if the glitch was not real?  That's not
> going to go over well.
>
> > Let's say the user's policy is to trust legacy devices which do not
> > support authentication, but require authentication for newer NVMe drives
> > from a certain vendor.  An attacker may manipulate an authentication-capable
> > NVMe drive from that vendor, whereupon it will fail authentication.
> > But the attacker can trick the user into trusting the device by glitching
> > the DOE exchanges.
>
> Again, are we now claiming that Linux needs to support "hardware
> glitching"?  Is that required somewhere?  I think if the DOE exchanges
> fail, we just trust the device as we have to trust something, right?

This is case 2 (b) above. If an attacker could flip a bit or drop a
byte in the communication path they could cause the authentication to
stop.

The process to authenticate a device involves sending a lot of data,
so it's not impossible that an attacker could interrupt some of the
traffic.

If the DOE exchange fails I don't think we want to trust the device,
as that's the point of SPDM. So being able to communicate that to
userspace does seem really useful.

>
> > Of course, this is an abnormal situation that users won't encounter
> > unless they're being attacked.  Normally the attribute is only present
> > if authentication is supported.
> >
> > I disagree with your assessment that we have bigger problems.
> > For security protocols like this we have to be very careful
> > to prevent trivial circumvention.  We cannot just shrug this off
> > as unimportant.
>
> hardware glitching is not trivial.  Let's only worry about that if the
> hardware people somehow require it, and if so, we can push back and say
> "stop making us fix your broken designs" :)

Hardware glitching is hard to do, but SPDM is designed to protect
against a range of hard attacks. The types of attacks that nation
states would do. So I think we should do our best to protect against
them.Obviously there is only so much that software can do to protect
against a hardware glitch, but this seems like a good in between.

I don't think the design is overall broken though. At some point when
all devices support SPDM it becomes a non issue as you just fail if
authentication fails, but in the mean time we need to handle devices
with and without authentication.

>
> > The "authenticated" attribute tells user space whether the device
> > is authenticated.  User space needs to handle errors anyway when
> > reading the attribute.  Users will get an error if authentication
> > support could not be determined.  Simple.
>
> No, if it's not determined, it shouldn't be present.

That doesn't allow us to differentiate the cases I mentioned above though.

Another option is we could create multiple attributes "spdm",
"authenticated" and "spdm_err" for example to convey the information
with just yes/no attributes?

>
> > > What is going to happen to suddenly
> > > allow it to come back to be "alive" and working while the device is
> > > still present in the system?
> >
> > The device needs to be re-enumerated by the PCI core to retry
> > determining its authentication capability.  That's why the
> > sysfs documentation says the user may exercise the "remove"
> > and "rescan" attributes to retry authentication.
>
> But how does it know that?  remove and recan is a huge sledgehammer, and
> an amazing one if it even works on most hardware.  Don't make it part of
> any normal process please.
>
> > > I'd prefer it to be "if the file is there, this is supported by the
> > > device.  If the file is not there, it is not supported", as that makes
> > > things so much simpler for userspace (i.e. you don't want userspace to
> > > have to both test if the file is there AND read all entries just to see
> > > if the kernel knows what is going on or not.)
> >
> > Huh?  Read all entries?  The attribute contains only 0 or 1!
> >
> > Or you'll get an error reading it.
>
> It's the error, don't do that.  If an error is going to happen, then
> don't have the file there.  That's the way sysfs works, it's not a
> "let's add all possible files and then make userspace open them all and
> see if an error happens to determine what really is present for this
> device" model.  It's a "if a file is there, that attribute is there and
> we can read it".
>
> > > > Alternatively, authentication success might be signaled to user space
> > > > through a uevent, whereupon it may bind a (blacklisted) driver.
> > >
> > > How will that happen?
> >
> > The SPDM library can be amended to signal a uevent when authentication
> > succeeds or fails and user space can then act on it.  I imagine systemd
> > or some other daemon might listen to such events and do interesting things,
> > such as binding a driver once authentication succeeds.
>
> That's a new user/kernel api and should be designed ONLY if you actually
> need it and have a user.  Otherwise let's just wait for later for that.
>
> > Maybe you have better ideas.  Be constructive!  Make suggestions!
>
> Again, have the file there only if this is something that the hardware
> supports.  Don't fail a read just because the hardware does not seem to
> support it, but it might sometime in the future if you just happen to
> unplug/plug it back in.
>
> > > > +         This prevents downgrade attacks where an attacker consumes
> > > > +         memory or disturbs communication in order to create the
> > > > +         appearance that a device does not support authentication.
> > >
> > > If an attacker can consume kernel memory to cause this to happen you
> > > have bigger problems.  That's not the kernel's issue here at all.
> > >
> > > And "disable communication" means "we just don't support it as the
> > > device doesn't say it does", so again, why does that matter?
> >
> > Reacting to potential attacks sure is the kernel's business.
>
> Reacting to real, software attacks is the kernel's business.  Reacting
> to possible hardware issues that are just theoretical is not.

Generally I would agree, but the idea of SPDM is to react to hardware
attacks, so it's something that the kernel should be conscious of

>
> > > > +         The reason why authentication support could not be determined
> > > > +         is apparent from "dmesg".  To re-probe authentication support
> > > > +         of PCI devices, exercise the "remove" and "rescan" attributes.
> > >
> > > Don't make userspace parse kernel logs for this type of thing.  And
> > > asking userspace to rely on remove and recan is a mess, either show it
> > > works or not.
> >
> > I'd say looking in dmesg to determine whether the user is being attacked
> > is perfectly fine, as is requiring the user to explicitly act on a
> > potential attack.
> >
> >
> > > > --- a/drivers/pci/cma.c
> > > > +++ b/drivers/pci/cma.c
> > > > @@ -171,8 +171,10 @@ void pci_cma_init(struct pci_dev *pdev)
> > > >  {
> > > >   struct pci_doe_mb *doe;
> > > >
> > > > - if (IS_ERR(pci_cma_keyring))
> > > > + if (IS_ERR(pci_cma_keyring)) {
> > > > +         pdev->spdm_state = ERR_PTR(-ENOTTY);
> > >
> > > Why ENOTTY?
> >
> > We use -ENOTTY as return value for unsupported reset methods in the
> > PCI core, see e.g. pcie_reset_flr(), pcie_af_flr(), pci_pm_reset(),
> > pci_parent_bus_reset(), pci_reset_hotplug_slot(), ...
> >
> > We also use -ENOTTY in pci_bridge_wait_for_secondary_bus() and
> > pci_dev_wait().
> >
> > It was used here to be consistent with those existing occurrences
> > in the PCI core.
> >
> > If you'd prefer something else, please make a suggestion.
>
> Ah, didn't realize that was a pci thing, ok, nevermind.
>
> > > > +static ssize_t authenticated_store(struct device *dev,
> > > > +                            struct device_attribute *attr,
> > > > +                            const char *buf, size_t count)
> > > > +{
> > > > + void *spdm_state = dev_to_spdm_state(dev);
> > > > + int rc;
> > > > +
> > > > + if (IS_ERR_OR_NULL(spdm_state))
> > > > +         return PTR_ERR(spdm_state);
> > > > +
> > > > + if (sysfs_streq(buf, "re")) {
> > >
> > > I don't like sysfs files that when reading show a binary, but require a
> > > "magic string" to be written to them to have them do something else.
> > > that way lies madness.  What would you do if each sysfs file had a
> > > custom language that you had to look up for each one?  Be consistant
> > > here.  But again, I don't think you need a store function at all, either
> > > the device supports this, or it doesn't.
> >
> > I'm not sure if you've even read the ABI documentation in full.
> >
> > The store method is needed to reauthenticate the device,
> > e.g. after a new trusted root certificate was added to the
> > kernel's .cma keyring.
>
> Why not have a different file called "reauthentication" that only allows
> a write to it of a 1/Y/y to do the reauthentication.  sysfs is a "one
> file per thing" interface, not a "parse a command and do something but
> when read from return a different value" interface.

That works for me as well

Alistair

>
> Let's keep it dirt simple please.
>
> thanks,
>
> greg k-h
Miguel Ojeda Feb. 28, 2025, 8:49 a.m. UTC | #15
On Thu, Feb 27, 2025 at 8:32 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Feb 27, 2025 at 05:47:01PM +0100, Miguel Ojeda wrote:
> >
> > I can look into it, after other build system things are done.
>
> Looks like Alice already sent a series to do this, so no need.

Above I meant the having bidirectional bindings as automated as
possible, which I suspect we could eventually want if Rust keeps
growing. For the time being checking is more than enough, yeah.

Cheers,
Miguel
Greg KH March 1, 2025, 4:27 a.m. UTC | #16
On Fri, Feb 28, 2025 at 12:27:36PM +1000, Alistair Francis wrote:
> On Fri, Feb 28, 2025 at 5:33 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Feb 27, 2025 at 05:45:02PM +0100, Miguel Ojeda wrote:
> > > On Thu, Feb 27, 2025 at 1:01 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > Sorry, you are right, it does, and of course it happens (otherwise how
> > > > would bindings work), but for small functions like this, how is the C
> > > > code kept in sync with the rust side?  Where is the .h file that C
> > > > should include?
> 
> This I can address with something like Alice mentioned earlier to
> ensure the C and Rust functions stay in sync.

Yes, that looks to be fixed up now and should not be an issue.

> > > What you were probably remembering is that it still needs to be
> > > justified, i.e. we don't want to generally/freely start replacing
> > > "individual functions" and doing FFI both ways everywhere, i.e. the
> > > goal is to build safe abstractions wherever possible.
> >
> > Ah, ok, that's what I was remembering.
> >
> > Anyway, the "pass a void blob from C into rust" that this patch is doing
> > feels really odd to me, and hard to verify it is "safe" at a simple
> > glance.
> 
> I agree, it's a bit odd. Ideally I would like to use a sysfs binding,
> but there isn't one today.
> 
> I had a quick look and a Rust sysfs binding implementation seems like
> a lot of work, which I wasn't convinced I wanted to invest in for this
> series. This is only a single sysfs attribute and I didn't want to
> slow down this series on a whole sysfs Rust implementation.
> 
> If this approach isn't ok for now, I will just drop the sysfs changes
> from the series so the SPDM implementation doesn't stall on sysfs
> changes. Then come back to the sysfs attributes in the future.

Please do that, we can revisit the sysfs stuff later.

> So the high level question, is "pass[ing] a void blob from C into
> rust" ok or should I defer for a future safer implementation?

I don't think we want random void * blobs being passed between C and
Rust like that as ensuring that both sides really know what is happening
and keep that in sync is going to be impossible over time.  Type safety
is our friend :)

thanks,

greg k-h
Greg KH March 1, 2025, 4:33 a.m. UTC | #17
On Fri, Feb 28, 2025 at 12:55:30PM +1000, Alistair Francis wrote:
> On Fri, Feb 28, 2025 at 11:41 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Feb 27, 2025 at 11:42:46PM +0100, Lukas Wunner wrote:
> > > On Thu, Feb 27, 2025 at 03:16:40AM -0800, Greg KH wrote:
> > > > On Thu, Feb 27, 2025 at 01:09:41PM +1000, Alistair Francis wrote:
> > > > > The PCI core has just been amended to authenticate CMA-capable devices
> > > > > on enumeration and store the result in an "authenticated" bit in struct
> > > > > pci_dev->spdm_state.
> > > > >
> > > > > Expose the bit to user space through an eponymous sysfs attribute.
> > > > >
> > > > > Allow user space to trigger reauthentication (e.g. after it has updated
> > > > > the CMA keyring) by writing to the sysfs attribute.
> > > > >
> > > > > Implement the attribute in the SPDM library so that other bus types
> > > > > besides PCI may take advantage of it.  They just need to add
> > > > > spdm_attr_group to the attribute groups of their devices and amend the
> > > > > dev_to_spdm_state() helper which retrieves the spdm_state for a given
> > > > > device.
> > > > >
> > > > > The helper may return an ERR_PTR if it couldn't be determined whether
> > > > > SPDM is supported by the device.  The sysfs attribute is visible in that
> > > > > case but returns an error on access.  This prevents downgrade attacks
> > > > > where an attacker disturbs memory allocation or DOE communication
> > > > > in order to create the appearance that SPDM is unsupported.
> > > >
> > > > I don't like this "if it's present we still don't know if the device
> > > > supports this", as that is not normally the "sysfs way" here.  Why must
> > > > it be present in those situations?
> 
> I do think there are 4 situations
> 
>  1. Device supports authentication and was authenticated
>  2.
>   a) Device supports authentication, but the certificate chain can't
> be authenticated
>   b) Device supports authentication, but the communication was interrupted
>  3. Device doesn't support authentication
> 
> Case 1 and 3 are easy
> 
> Case 2 (a) means that the kernel doesn't trust the certificate. This
> could be for a range of reasons, including the user just hasn't
> installed the cert yet. So it makes sense to pass that information to
> the user as they can act on it. I think it's a different case to 3
> (where there is no action for the user to take).

Ok, that's fine, but as a "default" you should fail such that the device
is not allowed to actually do anything if 2) happens.  I think you are
doing that, it's the whole use of sysfs as the user/kernel api here in
an odd way that I am objecting to.

> > Again, are we now claiming that Linux needs to support "hardware
> > glitching"?  Is that required somewhere?  I think if the DOE exchanges
> > fail, we just trust the device as we have to trust something, right?
> 
> This is case 2 (b) above. If an attacker could flip a bit or drop a
> byte in the communication path they could cause the authentication to
> stop.
> 
> The process to authenticate a device involves sending a lot of data,
> so it's not impossible that an attacker could interrupt some of the
> traffic.
> 
> If the DOE exchange fails I don't think we want to trust the device,
> as that's the point of SPDM. So being able to communicate that to
> userspace does seem really useful.

Agreed, but again, let's come up with a better api than "a random sysfs
file is present but reading it gets an error to convey this is an
unvalidated device" feels wrong to me.

Why not just report the "state" of the device?  You have 3 states, just
report that?  Should be much simpler overall.

> > > Of course, this is an abnormal situation that users won't encounter
> > > unless they're being attacked.  Normally the attribute is only present
> > > if authentication is supported.
> > >
> > > I disagree with your assessment that we have bigger problems.
> > > For security protocols like this we have to be very careful
> > > to prevent trivial circumvention.  We cannot just shrug this off
> > > as unimportant.
> >
> > hardware glitching is not trivial.  Let's only worry about that if the
> > hardware people somehow require it, and if so, we can push back and say
> > "stop making us fix your broken designs" :)
> 
> Hardware glitching is hard to do, but SPDM is designed to protect
> against a range of hard attacks. The types of attacks that nation
> states would do. So I think we should do our best to protect against
> them.Obviously there is only so much that software can do to protect
> against a hardware glitch, but this seems like a good in between.
> 
> I don't think the design is overall broken though. At some point when
> all devices support SPDM it becomes a non issue as you just fail if
> authentication fails, but in the mean time we need to handle devices
> with and without authentication.

Agreed, again it's the user/kernel api I am objecting to here the most.

> > > The "authenticated" attribute tells user space whether the device
> > > is authenticated.  User space needs to handle errors anyway when
> > > reading the attribute.  Users will get an error if authentication
> > > support could not be determined.  Simple.
> >
> > No, if it's not determined, it shouldn't be present.
> 
> That doesn't allow us to differentiate the cases I mentioned above though.
> 
> Another option is we could create multiple attributes "spdm",
> "authenticated" and "spdm_err" for example to convey the information
> with just yes/no attributes?

Yes!  Or a simple "spdm_state" file that has one of these values in it?
sysfs files do not need to only return "1 or 0", they can return
different strings, but only 1 at a time.

thanks,

greg k-h
Lukas Wunner March 1, 2025, 6:01 p.m. UTC | #18
On Thu, Feb 27, 2025 at 05:39:53PM -0800, Greg KH wrote:
> On Thu, Feb 27, 2025 at 11:42:46PM +0100, Lukas Wunner wrote:
> > On Thu, Feb 27, 2025 at 03:16:40AM -0800, Greg KH wrote:
> > > I don't like this "if it's present we still don't know if the device
> > > supports this", as that is not normally the "sysfs way" here.  Why must
> > > it be present in those situations?
> > 
> > That's explained above.
> 
> Not really, you just say "downgrade attacks", which is not something
> that we need to worry about, right?

A downgrade attack means duping the victim into believing that only
a weaker security mode is supported.  E.g. only sha1, but not sha256.

In this context, downgrade attack means duping the kernel or user
into believing that SPDM authentication is unsupported, even though it is.

https://en.wikipedia.org/wiki/Downgrade_attack

That's definitely something we need to be aware of and guard against,
otherwise what's the point of authenticating in the first place.


> > Unfortunately there is no (signed) bit in Config Space which tells us
> > whether authentication is supported by a PCI device.  Rather, it is
> > necessary to exchange several messages with the device through a
> > DOE mailbox in config space to determine that.  I'm worried that an
> > attacker deliberately "glitches" those DOE exchanges and thus creates
> > the appearance that the device does not support authentication.
> 
> That's a hardware glitch, and if that happens, then it will show a 0 and
> that's the same as not being present at all, right?

No, the "authenticated" attribute is not present in sysfs if authentication
is unsupported.

The downgrade attack protection comprises exposing the attribute if it
could not be determined whether authentication is supported or not,
and returning an error (ENOTTY) on read or write.

User space applications need to check anyway whether read() or write()
failed for some reason.  E.g. if the device is hot-removed concurrently,
the read() system call returns ENODEV.  So returning ENOTTY is just
another error that can occur on access to the attribute.

The idea is that user space typically wants to check whether the attribute
contains "1", signifying that the device was authenticated successfully.
Hence a return value of "0" or any error code signifies that the device
is not authenticated.

And if user space wants to check whether authentication is supported at all,
it checks for presence of the sysfs attribute.  Hence exposing the attribute
if support could not be determined is a safety net to not mislead user space
that the device does not support authentication.

For PCIe, glitching the hardware (the electric signals exchanged with
the device) is indeed one way to disrupt the DOE and SPDM exchanges.

However the SPDM protocol has not only been adopted by PCIe, but also
other buses, in particular SCSI and ATA.  And in those cases, glitching
the SPDM exchanges may be a pure software thing.  (Think iSCSI communication
with storage devices in a remote rack or data center.)

Damien Le Moal has explicitly requested that the user space ABI for SPDM
is consistent across buses.  So the downgrade attack protection can be
taken advantage of by those other buses as well.


> > Let's say the user's policy is to trust legacy devices which do not
> > support authentication, but require authentication for newer NVMe drives
> > from a certain vendor.  An attacker may manipulate an authentication-capable
> > NVMe drive from that vendor, whereupon it will fail authentication.
> > But the attacker can trick the user into trusting the device by glitching
> > the DOE exchanges.
> 
> Again, are we now claiming that Linux needs to support "hardware
> glitching"?  Is that required somewhere?

Required?  It's simply prudent to protect users from being duped into
thinking the device doesn't support authentication.


> I think if the DOE exchanges
> fail, we just trust the device as we have to trust something, right?

If the DOE exchanges fail, something fishy is going on.
Why should we hide that fact from the user?


> > The device needs to be re-enumerated by the PCI core to retry
> > determining its authentication capability.  That's why the
> > sysfs documentation says the user may exercise the "remove"
> > and "rescan" attributes to retry authentication.
> 
> But how does it know that?

Because reads and writes to the attribute return ENOTTY.

> remove and recan is a huge sledgehammer, and
> an amazing one if it even works on most hardware.  Don't make it part of
> any normal process please.

It's not a normal process.  It's manual recovery in case of a
potential attack.  The user can also choose to unplug the device
or reboot the machine.  That's arguably a bigger sledgehammer.


> It's the error, don't do that.  If an error is going to happen, then
> don't have the file there.  That's the way sysfs works, it's not a
> "let's add all possible files and then make userspace open them all and
> see if an error happens to determine what really is present for this
> device" model.  It's a "if a file is there, that attribute is there and
> we can read it".

The point is that if the file isn't there even though the device might
support authentication, we're creating a false and dangerous illusion.
This is different from other attributes which don't have that quality.


> > > > Alternatively, authentication success might be signaled to user space
> > > > through a uevent, whereupon it may bind a (blacklisted) driver.
> > > 
> > > How will that happen?
> > 
> > The SPDM library can be amended to signal a uevent when authentication
> > succeeds or fails and user space can then act on it.  I imagine systemd
> > or some other daemon might listen to such events and do interesting things,
> > such as binding a driver once authentication succeeds.
> 
> That's a new user/kernel api and should be designed ONLY if you actually
> need it and have a user.  Otherwise let's just wait for later for that.

Of course.  Again, the commit message makes suggestions for future
extensions to justify the change.  Those are just ideas.  Whether
and how they are implemented remains to be seen.  Signaling a uevent
on authentication success or failure seems like an obvious idea,
hence I included it in the commit message.

I fear if I don't include those ideas in the commit message, someone
will come along and ask "why do you need this at all?", thus putting
into question the whole set of authentication patches.


> > > If an attacker can consume kernel memory to cause this to happen you
> > > have bigger problems.  That's not the kernel's issue here at all.
> > > 
> > > And "disable communication" means "we just don't support it as the
> > > device doesn't say it does", so again, why does that matter?
> > 
> > Reacting to potential attacks sure is the kernel's business.
> 
> Reacting to real, software attacks is the kernel's business.  Reacting
> to possible hardware issues that are just theoretical is not.

We have fundamental disagreement whether certain attacks need to be taken
seriously.  Which reminds me of...

   "the final topic on the agenda was the corporate attempt at
    security consciousness raising; a shouting match ensued,
    in the course of which several and various reputations
    were sullied, certain paranoid reactions were taken less
    than seriously, and no great meeting of the minds was met."

   [minutes of the uucp-lovers interest group, 20 April 1983,
    from "A Quarter Century of UNIX" (1994) page 113]
    https://wiki.tuhs.org/lib/exe/fetch.php?media=publications:qcu.pdf

Looks like we're just upholding the time honored tradition of
UNIX security disagreements!

Thanks,

Lukas
diff mbox series

Patch

diff --git a/Documentation/ABI/testing/sysfs-devices-spdm b/Documentation/ABI/testing/sysfs-devices-spdm
new file mode 100644
index 000000000000..2d6e5d513231
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-devices-spdm
@@ -0,0 +1,31 @@ 
+What:		/sys/devices/.../authenticated
+Date:		June 2024
+Contact:	Lukas Wunner <lukas@wunner.de>
+Description:
+		This file contains 1 if the device authenticated successfully
+		with SPDM (Security Protocol and Data Model).  It contains 0 if
+		the device failed authentication (and may thus be malicious).
+
+		Writing "re" to this file causes reauthentication.
+		That may be opportune after updating the device keyring.
+		The device keyring of the PCI bus is named ".cma"
+		(Component Measurement and Authentication).
+
+		Reauthentication may also be necessary after device identity
+		has mutated, e.g. after downloading firmware to an FPGA device.
+
+		The file is not visible if authentication is unsupported
+		by the device.
+
+		If the kernel could not determine whether authentication is
+		supported because memory was low or communication with the
+		device was not working, the file is visible but accessing it
+		fails with error code ENOTTY.
+
+		This prevents downgrade attacks where an attacker consumes
+		memory or disturbs communication in order to create the
+		appearance that a device does not support authentication.
+
+		The reason why authentication support could not be determined
+		is apparent from "dmesg".  To re-probe authentication support
+		of PCI devices, exercise the "remove" and "rescan" attributes.
diff --git a/MAINTAINERS b/MAINTAINERS
index abb3b603299f..03e1076f915a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -21385,7 +21385,8 @@  L:	linux-cxl@vger.kernel.org
 L:	linux-pci@vger.kernel.org
 S:	Maintained
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/devsec/spdm.git
-F:	drivers/pci/cma.c
+F:	Documentation/ABI/testing/sysfs-devices-spdm
+F:	drivers/pci/cma*
 F:	include/linux/spdm.h
 F:	lib/rspdm/
 
diff --git a/drivers/pci/cma.c b/drivers/pci/cma.c
index f2c435b04b92..59558714f143 100644
--- a/drivers/pci/cma.c
+++ b/drivers/pci/cma.c
@@ -171,8 +171,10 @@  void pci_cma_init(struct pci_dev *pdev)
 {
 	struct pci_doe_mb *doe;
 
-	if (IS_ERR(pci_cma_keyring))
+	if (IS_ERR(pci_cma_keyring)) {
+		pdev->spdm_state = ERR_PTR(-ENOTTY);
 		return;
+	}
 
 	if (!pci_is_pcie(pdev))
 		return;
@@ -185,8 +187,10 @@  void pci_cma_init(struct pci_dev *pdev)
 	pdev->spdm_state = spdm_create(&pdev->dev, pci_doe_transport, doe,
 				       PCI_DOE_MAX_PAYLOAD, pci_cma_keyring,
 				       pci_cma_validate);
-	if (!pdev->spdm_state)
+	if (!pdev->spdm_state) {
+		pdev->spdm_state = ERR_PTR(-ENOTTY);
 		return;
+	}
 
 	/*
 	 * Keep spdm_state allocated even if initial authentication fails
@@ -204,7 +208,7 @@  void pci_cma_init(struct pci_dev *pdev)
  */
 void pci_cma_reauthenticate(struct pci_dev *pdev)
 {
-	if (!pdev->spdm_state)
+	if (IS_ERR_OR_NULL(pdev->spdm_state))
 		return;
 
 	spdm_authenticate(pdev->spdm_state);
@@ -212,7 +216,7 @@  void pci_cma_reauthenticate(struct pci_dev *pdev)
 
 void pci_cma_destroy(struct pci_dev *pdev)
 {
-	if (!pdev->spdm_state)
+	if (IS_ERR_OR_NULL(pdev->spdm_state))
 		return;
 
 	spdm_destroy(pdev->spdm_state);
diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c
index 8ea37698082a..e4b609f613da 100644
--- a/drivers/pci/doe.c
+++ b/drivers/pci/doe.c
@@ -706,6 +706,7 @@  void pci_doe_init(struct pci_dev *pdev)
 		if (IS_ERR(doe_mb)) {
 			pci_err(pdev, "[%x] failed to create mailbox: %ld\n",
 				offset, PTR_ERR(doe_mb));
+			pci_cma_disable(pdev);
 			continue;
 		}
 
@@ -714,6 +715,7 @@  void pci_doe_init(struct pci_dev *pdev)
 			pci_err(pdev, "[%x] failed to insert mailbox: %d\n",
 				offset, rc);
 			pci_doe_destroy_mb(doe_mb);
+			pci_cma_disable(pdev);
 		}
 	}
 }
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index b46ce1a2c554..0645935fbcaf 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1804,6 +1804,9 @@  const struct attribute_group *pci_dev_attr_groups[] = {
 #endif
 #ifdef CONFIG_PCIEASPM
 	&aspm_ctrl_attr_group,
+#endif
+#ifdef CONFIG_PCI_CMA
+	&spdm_attr_group,
 #endif
 	NULL,
 };
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index fa6e3ae10b67..ad646b69a9c7 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -452,10 +452,15 @@  static inline void pci_doe_disconnected(struct pci_dev *pdev) { }
 void pci_cma_init(struct pci_dev *pdev);
 void pci_cma_destroy(struct pci_dev *pdev);
 void pci_cma_reauthenticate(struct pci_dev *pdev);
+static inline void pci_cma_disable(struct pci_dev *pdev)
+{
+	pdev->spdm_state = ERR_PTR(-ENOTTY);
+}
 #else
 static inline void pci_cma_init(struct pci_dev *pdev) { }
 static inline void pci_cma_destroy(struct pci_dev *pdev) { }
 static inline void pci_cma_reauthenticate(struct pci_dev *pdev) { }
+static inline void pci_cma_disable(struct pci_dev *pdev) { }
 #endif
 
 #ifdef CONFIG_PCI_NPEM
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 50c43546a32b..696f5ce971f4 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2704,6 +2704,18 @@  static inline bool pci_is_thunderbolt_attached(struct pci_dev *pdev)
 void pci_uevent_ers(struct pci_dev *pdev, enum  pci_ers_result err_type);
 #endif
 
+#ifdef CONFIG_PCI_CMA
+static inline struct spdm_state *pci_dev_to_spdm_state(struct pci_dev *pdev)
+{
+	return pdev->spdm_state;
+}
+#else
+static inline struct spdm_state *pci_dev_to_spdm_state(struct pci_dev *pdev)
+{
+	return NULL;
+}
+#endif
+
 #include <linux/dma-mapping.h>
 
 #define pci_printk(level, pdev, fmt, arg...) \
diff --git a/lib/rspdm/Makefile b/lib/rspdm/Makefile
index 1f62ee2a882d..f15b1437196b 100644
--- a/lib/rspdm/Makefile
+++ b/lib/rspdm/Makefile
@@ -8,3 +8,4 @@ 
 obj-$(CONFIG_RSPDM) += spdm.o
 
 spdm-y := lib.o
+spdm-$(CONFIG_SYSFS) += req-sysfs.o
diff --git a/lib/rspdm/lib.rs b/lib/rspdm/lib.rs
index 2bb716140e0a..24064d3c4669 100644
--- a/lib/rspdm/lib.rs
+++ b/lib/rspdm/lib.rs
@@ -24,6 +24,7 @@ 
 
 mod consts;
 mod state;
+pub mod sysfs;
 mod validator;
 
 /// spdm_create() - Allocate SPDM session
diff --git a/lib/rspdm/req-sysfs.c b/lib/rspdm/req-sysfs.c
new file mode 100644
index 000000000000..11bacb04f08f
--- /dev/null
+++ b/lib/rspdm/req-sysfs.c
@@ -0,0 +1,97 @@ 
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Rust implementation of the DMTF Security Protocol and Data Model (SPDM)
+ * https://www.dmtf.org/dsp/DSP0274
+ *
+ * Requester role: sysfs interface
+ *
+ * Copyright (C) 2023-24 Intel Corporation
+ * Copyright (C) 2024 Western Digital
+ */
+
+#include <linux/pci.h>
+
+int rust_authenticated_show(void *spdm_state, char *buf);
+
+/**
+ * dev_to_spdm_state() - Retrieve SPDM session state for given device
+ *
+ * @dev: Responder device
+ *
+ * Returns a pointer to the device's SPDM session state,
+ *	   %NULL if the device doesn't have one or
+ *	   %ERR_PTR if it couldn't be determined whether SPDM is supported.
+ *
+ * In the %ERR_PTR case, attributes are visible but return an error on access.
+ * This prevents downgrade attacks where an attacker disturbs memory allocation
+ * or communication with the device in order to create the appearance that SPDM
+ * is unsupported.  E.g. with PCI devices, the attacker may foil CMA or DOE
+ * initialization by simply hogging memory.
+ */
+static void *dev_to_spdm_state(struct device *dev)
+{
+	if (dev_is_pci(dev))
+		return pci_dev_to_spdm_state(to_pci_dev(dev));
+
+	/* Insert mappers for further bus types here. */
+
+	return NULL;
+}
+
+/* authenticated attribute */
+
+static umode_t spdm_attrs_are_visible(struct kobject *kobj,
+				      struct attribute *a, int n)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	void *spdm_state = dev_to_spdm_state(dev);
+
+	if (IS_ERR_OR_NULL(spdm_state))
+		return SYSFS_GROUP_INVISIBLE;
+
+	return a->mode;
+}
+
+static ssize_t authenticated_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	void *spdm_state = dev_to_spdm_state(dev);
+	int rc;
+
+	if (IS_ERR_OR_NULL(spdm_state))
+		return PTR_ERR(spdm_state);
+
+	if (sysfs_streq(buf, "re")) {
+		rc = spdm_authenticate(spdm_state);
+		if (rc)
+			return rc;
+	} else {
+		return -EINVAL;
+	}
+
+	return count;
+}
+
+static ssize_t authenticated_show(struct device *dev,
+				  struct device_attribute *attr, char *buf)
+{
+	void *spdm_state = dev_to_spdm_state(dev);
+
+	if (IS_ERR_OR_NULL(spdm_state))
+		return PTR_ERR(spdm_state);
+
+	return rust_authenticated_show(spdm_state, buf);
+}
+static DEVICE_ATTR_RW(authenticated);
+
+static struct attribute *spdm_attrs[] = {
+	&dev_attr_authenticated.attr,
+	NULL
+};
+
+const struct attribute_group spdm_attr_group = {
+	.attrs = spdm_attrs,
+	.is_visible = spdm_attrs_are_visible,
+};
diff --git a/lib/rspdm/sysfs.rs b/lib/rspdm/sysfs.rs
new file mode 100644
index 000000000000..50901c55b8f7
--- /dev/null
+++ b/lib/rspdm/sysfs.rs
@@ -0,0 +1,28 @@ 
+// SPDX-License-Identifier: GPL-2.0
+
+// Copyright (C) 2024 Western Digital
+
+//! Rust sysfs helper functions
+//!
+//! Rust implementation of the DMTF Security Protocol and Data Model (SPDM)
+//! <https://www.dmtf.org/dsp/DSP0274>
+
+use crate::SpdmState;
+use kernel::prelude::*;
+use kernel::{bindings, fmt, str::CString};
+
+/// Helper function for the sysfs `authenticated_show()`.
+#[no_mangle]
+pub extern "C" fn rust_authenticated_show(spdm_state: *mut SpdmState, buf: *mut u8) -> isize {
+    // SAFETY: The opaque pointer will be directly from the `spdm_create()`
+    // function, so we can safely reconstruct it.
+    let state = unsafe { KBox::from_raw(spdm_state) };
+
+    let fmt = match CString::try_from_fmt(fmt!("{}\n", state.authenticated)) {
+        Ok(f) => f,
+        Err(_e) => return 0,
+    };
+
+    // SAFETY: Calling a kernel C function with valid arguments
+    unsafe { bindings::sysfs_emit(buf, fmt.as_char_ptr()) as isize }
+}