diff mbox series

[v3,12/16] PCI/DOE: Create mailboxes on device enumeration

Message ID c3f9e24fffa318a045f89664fb9545099cb0d603.1676043318.git.lukas@wunner.de
State Superseded
Headers show
Series Collection of DOE material | expand

Commit Message

Lukas Wunner Feb. 10, 2023, 8:25 p.m. UTC
Currently a DOE instance cannot be shared by multiple drivers because
each driver creates its own pci_doe_mb struct for a given DOE instance.
For the same reason a DOE instance cannot be shared between the PCI core
and a driver.

Overcome this limitation by creating mailboxes in the PCI core on device
enumeration.

Provide a pci_find_doe_mailbox() API call to allow drivers to get a
pci_doe_mb for a given (pci_dev, vendor, protocol) triple.  This API is
modeled after pci_find_capability() and can later be amended with a
pci_find_next_doe_mailbox() call to iterate over all mailboxes of a
given pci_dev which support a specific protocol.

On removal, destroy the mailboxes in pci_destroy_dev(), after the driver
is unbound.  This allows drivers to use DOE in their ->remove() hook.

On surprise removal, cancel ongoing DOE exchanges and prevent new ones
from being scheduled.  Thereby ensure that a hot-removed device doesn't
needlessly wait for a running exchange to time out.

Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 Changes v2 -> v3:
 * Don't cancel ongoing DOE exchanges in pci_stop_dev() so that
   drivers may perform DOE in their ->remove() hooks
 * Instead cancel ongoing DOE exchanges on surprise removal in
   pci_dev_set_disconnected()
 * Emit error message in pci_doe_init() if mailbox creation fails (Ira)
 * Explain in commit message that pci_find_doe_mailbox() can later
   be amended with pci_find_next_doe_mailbox() (Jonathan)

 drivers/pci/doe.c       | 73 +++++++++++++++++++++++++++++++++++++++++
 drivers/pci/pci.h       | 12 +++++++
 drivers/pci/probe.c     |  1 +
 drivers/pci/remove.c    |  1 +
 include/linux/pci-doe.h |  2 ++
 include/linux/pci.h     |  3 ++
 6 files changed, 92 insertions(+)

Comments

Li, Ming4 Feb. 15, 2023, 2:07 a.m. UTC | #1
On 2/11/2023 4:25 AM, Lukas Wunner wrote:
> Currently a DOE instance cannot be shared by multiple drivers because
> each driver creates its own pci_doe_mb struct for a given DOE instance.
> For the same reason a DOE instance cannot be shared between the PCI core
> and a driver.
> 
> Overcome this limitation by creating mailboxes in the PCI core on device
> enumeration.
> 
> Provide a pci_find_doe_mailbox() API call to allow drivers to get a
> pci_doe_mb for a given (pci_dev, vendor, protocol) triple.  This API is
> modeled after pci_find_capability() and can later be amended with a
> pci_find_next_doe_mailbox() call to iterate over all mailboxes of a
> given pci_dev which support a specific protocol.
> 
> On removal, destroy the mailboxes in pci_destroy_dev(), after the driver
> is unbound.  This allows drivers to use DOE in their ->remove() hook.
> 
> On surprise removal, cancel ongoing DOE exchanges and prevent new ones
> from being scheduled.  Thereby ensure that a hot-removed device doesn't
> needlessly wait for a running exchange to time out.
> 
> Tested-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Reviewed-by: Ming Li <ming4.li@intel.com>
Alexey Kardashevskiy Feb. 28, 2023, 1:18 a.m. UTC | #2
On 11/2/23 07:25, Lukas Wunner wrote:
> Currently a DOE instance cannot be shared by multiple drivers because
> each driver creates its own pci_doe_mb struct for a given DOE instance.

Sorry for my ignorance but why/how/when would a device have multiple 
drivers bound? Or it only one driver at the time but we also want DOE 
MBs to survive switching to another (different or newer) driver?


> For the same reason a DOE instance cannot be shared between the PCI core
> and a driver.

And we want this sharing why? Any example will do. Thanks,

> Overcome this limitation by creating mailboxes in the PCI core on device
> enumeration.
> 
> Provide a pci_find_doe_mailbox() API call to allow drivers to get a
> pci_doe_mb for a given (pci_dev, vendor, protocol) triple. 
 >
> This API is
> modeled after pci_find_capability() and can later be amended with a
> pci_find_next_doe_mailbox() call to iterate over all mailboxes of a
> given pci_dev which support a specific protocol.
> 
> On removal, destroy the mailboxes in pci_destroy_dev(), after the driver
> is unbound.  This allows drivers to use DOE in their ->remove() hook.
> 
> On surprise removal, cancel ongoing DOE exchanges and prevent new ones
> from being scheduled.  Thereby ensure that a hot-removed device doesn't
> needlessly wait for a running exchange to time out.
> 
> Tested-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>   Changes v2 -> v3:
>   * Don't cancel ongoing DOE exchanges in pci_stop_dev() so that
>     drivers may perform DOE in their ->remove() hooks
>   * Instead cancel ongoing DOE exchanges on surprise removal in
>     pci_dev_set_disconnected()
>   * Emit error message in pci_doe_init() if mailbox creation fails (Ira)
>   * Explain in commit message that pci_find_doe_mailbox() can later
>     be amended with pci_find_next_doe_mailbox() (Jonathan)
> 
>   drivers/pci/doe.c       | 73 +++++++++++++++++++++++++++++++++++++++++
>   drivers/pci/pci.h       | 12 +++++++
>   drivers/pci/probe.c     |  1 +
>   drivers/pci/remove.c    |  1 +
>   include/linux/pci-doe.h |  2 ++
>   include/linux/pci.h     |  3 ++
>   6 files changed, 92 insertions(+)
> 
> diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c
> index 2bc202b64b6a..bf32875d27da 100644
> --- a/drivers/pci/doe.c
> +++ b/drivers/pci/doe.c
> @@ -20,6 +20,8 @@
>   #include <linux/pci-doe.h>
>   #include <linux/workqueue.h>
>   
> +#include "pci.h"
> +
>   #define PCI_DOE_PROTOCOL_DISCOVERY 0
>   
>   /* Timeout of 1 second from 6.30.2 Operation, PCI Spec r6.0 */
> @@ -658,3 +660,74 @@ int pci_doe(struct pci_doe_mb *doe_mb, u16 vendor, u8 type,
>   	return task.rv;
>   }
>   EXPORT_SYMBOL_GPL(pci_doe);
> +
> +/**
> + * pci_find_doe_mailbox() - Find Data Object Exchange mailbox
> + *
> + * @pdev: PCI device
> + * @vendor: Vendor ID
> + * @type: Data Object Type
> + *
> + * Find first DOE mailbox of a PCI device which supports the given protocol.
> + *
> + * RETURNS: Pointer to the DOE mailbox or NULL if none was found.
> + */
> +struct pci_doe_mb *pci_find_doe_mailbox(struct pci_dev *pdev, u16 vendor,
> +					u8 type)
> +{
> +	struct pci_doe_mb *doe_mb;
> +	unsigned long index;
> +
> +	xa_for_each(&pdev->doe_mbs, index, doe_mb)
> +		if (pci_doe_supports_prot(doe_mb, vendor, type))
> +			return doe_mb;
> +
> +	return NULL;
> +}
> +EXPORT_SYMBOL_GPL(pci_find_doe_mailbox);
> +
> +void pci_doe_init(struct pci_dev *pdev)
> +{
> +	struct pci_doe_mb *doe_mb;
> +	u16 offset = 0;
> +	int rc;
> +
> +	xa_init(&pdev->doe_mbs);
> +
> +	while ((offset = pci_find_next_ext_capability(pdev, offset,
> +						      PCI_EXT_CAP_ID_DOE))) {
> +		doe_mb = pci_doe_create_mb(pdev, offset);
> +		if (IS_ERR(doe_mb)) {
> +			pci_err(pdev, "[%x] failed to create mailbox: %ld\n",
> +				offset, PTR_ERR(doe_mb));
> +			continue;
> +		}
> +
> +		rc = xa_insert(&pdev->doe_mbs, offset, doe_mb, GFP_KERNEL);
> +		if (rc) {
> +			pci_err(pdev, "[%x] failed to insert mailbox: %d\n",
> +				offset, rc);
> +			pci_doe_destroy_mb(doe_mb);
> +		}
> +	}
> +}
> +
> +void pci_doe_destroy(struct pci_dev *pdev)
> +{
> +	struct pci_doe_mb *doe_mb;
> +	unsigned long index;
> +
> +	xa_for_each(&pdev->doe_mbs, index, doe_mb)
> +		pci_doe_destroy_mb(doe_mb);
> +
> +	xa_destroy(&pdev->doe_mbs);
> +}
> +
> +void pci_doe_disconnected(struct pci_dev *pdev)
> +{
> +	struct pci_doe_mb *doe_mb;
> +	unsigned long index;
> +
> +	xa_for_each(&pdev->doe_mbs, index, doe_mb)
> +		pci_doe_cancel_tasks(doe_mb);
> +}
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 8f5d4bd5b410..065ca9743ec1 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -318,6 +318,16 @@ struct pci_sriov {
>   	bool		drivers_autoprobe; /* Auto probing of VFs by driver */
>   };
>   
> +#ifdef CONFIG_PCI_DOE
> +void pci_doe_init(struct pci_dev *pdev);
> +void pci_doe_destroy(struct pci_dev *pdev);
> +void pci_doe_disconnected(struct pci_dev *pdev);
> +#else
> +static inline void pci_doe_init(struct pci_dev *pdev) { }
> +static inline void pci_doe_destroy(struct pci_dev *pdev) { }
> +static inline void pci_doe_disconnected(struct pci_dev *pdev) { }
> +#endif
> +
>   /**
>    * pci_dev_set_io_state - Set the new error state if possible.
>    *
> @@ -372,6 +382,8 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
>   	pci_dev_set_io_state(dev, pci_channel_io_perm_failure);
>   	device_unlock(&dev->dev);
>   
> +	pci_doe_disconnected(dev);
> +
>   	return 0;
>   }
>   
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 1779582fb500..65e60ee50489 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2476,6 +2476,7 @@ static void pci_init_capabilities(struct pci_dev *dev)
>   	pci_aer_init(dev);		/* Advanced Error Reporting */
>   	pci_dpc_init(dev);		/* Downstream Port Containment */
>   	pci_rcec_init(dev);		/* Root Complex Event Collector */
> +	pci_doe_init(dev);		/* Data Object Exchange */
>   
>   	pcie_report_downtraining(dev);
>   	pci_init_reset_methods(dev);
> diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
> index 0145aef1b930..f25acf50879f 100644
> --- a/drivers/pci/remove.c
> +++ b/drivers/pci/remove.c
> @@ -39,6 +39,7 @@ static void pci_destroy_dev(struct pci_dev *dev)
>   	list_del(&dev->bus_list);
>   	up_write(&pci_bus_sem);
>   
> +	pci_doe_destroy(dev);
>   	pcie_aspm_exit_link_state(dev);
>   	pci_bridge_d3_update(dev);
>   	pci_free_resources(dev);
> diff --git a/include/linux/pci-doe.h b/include/linux/pci-doe.h
> index 7f16749c6aa3..d6192ee0ac07 100644
> --- a/include/linux/pci-doe.h
> +++ b/include/linux/pci-doe.h
> @@ -29,6 +29,8 @@ struct pci_doe_mb;
>   
>   struct pci_doe_mb *pcim_doe_create_mb(struct pci_dev *pdev, u16 cap_offset);
>   bool pci_doe_supports_prot(struct pci_doe_mb *doe_mb, u16 vid, u8 type);
> +struct pci_doe_mb *pci_find_doe_mailbox(struct pci_dev *pdev, u16 vendor,
> +					u8 type);
>   
>   int pci_doe(struct pci_doe_mb *doe_mb, u16 vendor, u8 type,
>   	    const void *request, size_t request_sz,
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 3c51cac3890b..b19c2965e384 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -511,6 +511,9 @@ struct pci_dev {
>   #endif
>   #ifdef CONFIG_PCI_P2PDMA
>   	struct pci_p2pdma __rcu *p2pdma;
> +#endif
> +#ifdef CONFIG_PCI_DOE
> +	struct xarray	doe_mbs;	/* Data Object Exchange mailboxes */
>   #endif
>   	u16		acs_cap;	/* ACS Capability offset */
>   	phys_addr_t	rom;		/* Physical address if not from BAR */
Dan Williams Feb. 28, 2023, 1:39 a.m. UTC | #3
Alexey Kardashevskiy wrote:
> On 11/2/23 07:25, Lukas Wunner wrote:
> > Currently a DOE instance cannot be shared by multiple drivers because
> > each driver creates its own pci_doe_mb struct for a given DOE instance.
> 
> Sorry for my ignorance but why/how/when would a device have multiple 
> drivers bound? Or it only one driver at the time but we also want DOE 
> MBs to survive switching to another (different or newer) driver?
> 
> 
> > For the same reason a DOE instance cannot be shared between the PCI core
> > and a driver.
> 
> And we want this sharing why? Any example will do. Thanks,

I understood "sharing" in this context to mean that the PCI core can use
the DOE instance and the device can reuse the same *through* the PCI
core. So for example an SPDM session can be established by the core and
used later by the driver.
Lukas Wunner Feb. 28, 2023, 5:43 a.m. UTC | #4
On Tue, Feb 28, 2023 at 12:18:07PM +1100, Alexey Kardashevskiy wrote:
> On 11/2/23 07:25, Lukas Wunner wrote:
> > For the same reason a DOE instance cannot be shared between the PCI core
> > and a driver.
> 
> And we want this sharing why? Any example will do. Thanks,

The PCI core is going to perform CMA/SPDM authentication when a device
gets enumerated (PCIe r6.0 sec 6.31).  That's the main motivation
to lift DOE mailbox creation into the PCI core.  It's not mentioned
here explicitly because I want the patch to stand on its own.
CMA/SPDM support will be submitted separately.

A driver that later on gets bound to the device should be allowed
to talk to it via DOE as well, possibly even sharing the same DOE
mailboxes used by the PCI core.

Patches for CMA/SPDM are under development on this branch:

https://github.com/l1k/linux/commits/doe


> > Currently a DOE instance cannot be shared by multiple drivers because
> > each driver creates its own pci_doe_mb struct for a given DOE instance.
> 
> Sorry for my ignorance but why/how/when would a device have multiple drivers
> bound? Or it only one driver at the time but we also want DOE MBs to survive
> switching to another (different or newer) driver?

Conceivably, a driver may have the need to talk to multiple devices
via DOE, even ones it's not bound to.  (E.g. devices in its ancestry
or children.)

Thanks,

Lukas
Alexey Kardashevskiy Feb. 28, 2023, 7:24 a.m. UTC | #5
On 28/2/23 16:43, Lukas Wunner wrote:
> On Tue, Feb 28, 2023 at 12:18:07PM +1100, Alexey Kardashevskiy wrote:
>> On 11/2/23 07:25, Lukas Wunner wrote:
>>> For the same reason a DOE instance cannot be shared between the PCI core
>>> and a driver.
>>
>> And we want this sharing why? Any example will do. Thanks,
> 
> The PCI core is going to perform CMA/SPDM authentication when a device
> gets enumerated (PCIe r6.0 sec 6.31).  That's the main motivation
> to lift DOE mailbox creation into the PCI core.  It's not mentioned
> here explicitly because I want the patch to stand on its own.
> CMA/SPDM support will be submitted separately.

I was going the opposite direction with avoiding adding this into the 
PCI core as 1) the pci_dev struct is already 2K  and 2) it is a niche 
feature and  3) I wanted this CMA/SPDM session setup to be platform 
specific as on our platform the SPDM support requires some devices to be 
probed before we can any SPDM.


> A driver that later on gets bound to the device should be allowed
> to talk to it via DOE as well, possibly even sharing the same DOE
> mailboxes used by the PCI core.
> 
> Patches for CMA/SPDM are under development on this branch:
> 
> https://github.com/l1k/linux/commits/doe

yes, thanks! Lots of reading :)


>>> Currently a DOE instance cannot be shared by multiple drivers because
>>> each driver creates its own pci_doe_mb struct for a given DOE instance.
>>
>> Sorry for my ignorance but why/how/when would a device have multiple drivers
>> bound? Or it only one driver at the time but we also want DOE MBs to survive
>> switching to another (different or newer) driver?
> 
> Conceivably, a driver may have the need to talk to multiple devices
> via DOE, even ones it's not bound to.  (E.g. devices in its ancestry
> or children.)

Ah ok. Well, a parent device could look for the DOE MB in a child using 
devres_find(), this requirement alone does not require moving things to 
the PCI core and potentially allows it to be a module which could be a 
better way as distros could have it always enabled but it would not 
waste any memory on my laptop when not loaded. Thanks,
Jonathan Cameron Feb. 28, 2023, 10:42 a.m. UTC | #6
On Tue, 28 Feb 2023 18:24:41 +1100
Alexey Kardashevskiy <aik@amd.com> wrote:

> On 28/2/23 16:43, Lukas Wunner wrote:
> > On Tue, Feb 28, 2023 at 12:18:07PM +1100, Alexey Kardashevskiy wrote:  
> >> On 11/2/23 07:25, Lukas Wunner wrote:  
> >>> For the same reason a DOE instance cannot be shared between the PCI core
> >>> and a driver.  
> >>
> >> And we want this sharing why? Any example will do. Thanks,  
> > 
> > The PCI core is going to perform CMA/SPDM authentication when a device
> > gets enumerated (PCIe r6.0 sec 6.31).  That's the main motivation
> > to lift DOE mailbox creation into the PCI core.  It's not mentioned
> > here explicitly because I want the patch to stand on its own.
> > CMA/SPDM support will be submitted separately.  
> 
> I was going the opposite direction with avoiding adding this into the 
> PCI core as 1) the pci_dev struct is already 2K  and 2) it is a niche 
> feature and  3) I wanted this CMA/SPDM session setup to be platform 
> specific as on our platform the SPDM support requires some devices to be 
> probed before we can any SPDM.

Is that happening over a DOE mailbox, or a different transport?
If it's a different transport then that should be fine, though we'll need
to have a slightly different security model and any part of early
driver load will need to be carefully hardened against a "malicious" device
if it is doing anything non trivial. If it's just "turning on the lights"
then shouldn't be a problem.

Will be interesting to see how niche DOE ends up.  My guess it it's worth
a struct xarray, but we could take it out of line if that saves enough
to bother.

> 
> 
> > A driver that later on gets bound to the device should be allowed
> > to talk to it via DOE as well, possibly even sharing the same DOE
> > mailboxes used by the PCI core.
> > 
> > Patches for CMA/SPDM are under development on this branch:
> > 
> > https://github.com/l1k/linux/commits/doe  
> 
> yes, thanks! Lots of reading :)
> 
> 
> >>> Currently a DOE instance cannot be shared by multiple drivers because
> >>> each driver creates its own pci_doe_mb struct for a given DOE instance.  
> >>
> >> Sorry for my ignorance but why/how/when would a device have multiple drivers
> >> bound? Or it only one driver at the time but we also want DOE MBs to survive
> >> switching to another (different or newer) driver?  
> > 
> > Conceivably, a driver may have the need to talk to multiple devices
> > via DOE, even ones it's not bound to.  (E.g. devices in its ancestry
> > or children.)  
> 
> Ah ok. Well, a parent device could look for the DOE MB in a child using 
> devres_find(), this requirement alone does not require moving things to 
> the PCI core and potentially allows it to be a module which could be a 
> better way as distros could have it always enabled but it would not 
> waste any memory on my laptop when not loaded. Thanks,

The DOE mailboxes have an "exciting" level of flexibility and discovering
supported protocols requires use of the DOE itself. So we need a single
entity to take control over concurrent access to each DOE instance.

Given the mix of protocols, I'd expect some of them to be potentially accessed
by a parent, and others to be accessed by driver attached to the child.

Whether it needs to chat to it's parent isn't totally clear to me yet, as
depends a bit on what entities end up getting created for management of
encryption etc + what other usecases we see in medium term.

Jonathan
Lukas Wunner March 2, 2023, 8:22 p.m. UTC | #7
On Tue, Feb 28, 2023 at 06:24:41PM +1100, Alexey Kardashevskiy wrote:
> On 28/2/23 16:43, Lukas Wunner wrote:
> > On Tue, Feb 28, 2023 at 12:18:07PM +1100, Alexey Kardashevskiy wrote:
> > > On 11/2/23 07:25, Lukas Wunner wrote:
> > > > For the same reason a DOE instance cannot be shared between the
> > > > PCI core and a driver.
> > > 
> > > And we want this sharing why? Any example will do. Thanks,
> > 
> > The PCI core is going to perform CMA/SPDM authentication when a device
> > gets enumerated (PCIe r6.0 sec 6.31).  That's the main motivation
> > to lift DOE mailbox creation into the PCI core.  It's not mentioned
> > here explicitly because I want the patch to stand on its own.
> > CMA/SPDM support will be submitted separately.
> 
> I was going the opposite direction with avoiding adding this into the PCI
> core as 1) the pci_dev struct is already 2K  and 2) it is a niche feature
> and  3) I wanted this CMA/SPDM session setup to be platform specific as on
> our platform the SPDM support requires some devices to be probed before we
> can any SPDM.

We had an open discussion at Plumbers with stakeholders from various
companies and the consensus was that initially we'll upstream a CMA/SPDM
implementation which authenticates devices on enumeration and exposes
the result to user space via sysfs.  Nothing more, nothing less:

https://lpc.events/event/16/contributions/1304/
https://lpc.events/event/16/contributions/1304/attachments/1029/1974/LPC2022-SPDM-BoF-v4.pdf

Thereby we seek to minimize friction in the upstreaming process
because we avoid depending on PCIe features layered on top of
CMA/SPDM (such as IDE) and we can postpone controversial discussions
about the consequences of failed authentication (such as forbidding
driver binding or firewalling the device off via ACS).

The patches under development follow the approach we agreed on back then.

I honestly think that the size of struct pci_dev is a non-issue because
even the lowest-end IoT devices come with gigabytes of memory these days
and as long as we do not exceed PAGE_SIZE (which is the limit when we'd
have to switch from kmalloc() to vmalloc() I believe), there's no
problem.

The claim that DOE and CMA/SPDM is going to be a niche feature is at
least debatable.  It seems rather likely to me that we'll see adoption
particularly in the mobile segment as iOS/Android/Chrome promulgators
will see an opportunity to harden their products further.

It's intriguing that you need some devices to be probed before you
can perform authentication.  What do you need this for?

Do you perhaps need to load firmware onto the device before you can
authenticate it?  Doesn't that defeat the whole point of authentication?

Or do you mean that certain *other* devices need to probe before a device
can be authenticated?  What are these other devices?

What happens if the device is suspended to D3cold or experiences a
Conventional Reset?  Do you need to do anything special before you
can re-authenticate the device?

On the one hand, performing authentication in the PCI core could be
considered a case of midlayer fallacy.  On the other hand, I think we
do want to afford CMA/SPDM and the features layered on top of it
(IDE, TDISP) regardless whether a driver is bound:  That way we can
protect the PCI core's communication with the device and with ACS
we can protect other PCI devices in the system from peer-to-peer
communication originating from a malicious device which failed to
authenticate.

I would like to make the implementation under development work for
your use case as well, but to do that I need a better understanding
what your requirements are.

Thanks,

Lukas
Alexey Kardashevskiy March 7, 2023, 1:55 a.m. UTC | #8
On 3/3/23 07:22, Lukas Wunner wrote:
> On Tue, Feb 28, 2023 at 06:24:41PM +1100, Alexey Kardashevskiy wrote:
>> On 28/2/23 16:43, Lukas Wunner wrote:
>>> On Tue, Feb 28, 2023 at 12:18:07PM +1100, Alexey Kardashevskiy wrote:
>>>> On 11/2/23 07:25, Lukas Wunner wrote:
>>>>> For the same reason a DOE instance cannot be shared between the
>>>>> PCI core and a driver.
>>>>
>>>> And we want this sharing why? Any example will do. Thanks,
>>>
>>> The PCI core is going to perform CMA/SPDM authentication when a device
>>> gets enumerated (PCIe r6.0 sec 6.31).  That's the main motivation
>>> to lift DOE mailbox creation into the PCI core.  It's not mentioned
>>> here explicitly because I want the patch to stand on its own.
>>> CMA/SPDM support will be submitted separately.
>>
>> I was going the opposite direction with avoiding adding this into the PCI
>> core as 1) the pci_dev struct is already 2K  and 2) it is a niche feature
>> and  3) I wanted this CMA/SPDM session setup to be platform specific as on
>> our platform the SPDM support requires some devices to be probed before we
>> can any SPDM.
> 
> We had an open discussion at Plumbers with stakeholders from various
> companies and the consensus was that initially we'll upstream a CMA/SPDM
> implementation which authenticates devices on enumeration and exposes
> the result to user space via sysfs.  Nothing more, nothing less:
> 
> https://lpc.events/event/16/contributions/1304/
> https://lpc.events/event/16/contributions/1304/attachments/1029/1974/LPC2022-SPDM-BoF-v4.pdf
> 
> Thereby we seek to minimize friction in the upstreaming process
> because we avoid depending on PCIe features layered on top of
> CMA/SPDM (such as IDE) and we can postpone controversial discussions
> about the consequences of failed authentication (such as forbidding
> driver binding or firewalling the device off via ACS).
> 
> The patches under development follow the approach we agreed on back then.
> 
> I honestly think that the size of struct pci_dev is a non-issue because
> even the lowest-end IoT devices come with gigabytes of memory these days
> and as long as we do not exceed PAGE_SIZE (which is the limit when we'd
> have to switch from kmalloc() to vmalloc() I believe), there's no
> problem.

Well, not sizeof(pci_dev) as much, it is more about DOE/SPDM code being 
compiled into vmlinux - it is hard for me to imagine this running on my 
laptop soon but distros dislike multiple kernel configs. Not a huge deal 
but still.

> The claim that DOE and CMA/SPDM is going to be a niche feature is at
> least debatable.  It seems rather likely to me that we'll see adoption
> particularly in the mobile segment as iOS/Android/Chrome promulgators
> will see an opportunity to harden their products further.

These guys all use custom configs (as was also suggested in that threat 
thread - "Linux guest kernel threat model for Confidential Computing").

> It's intriguing that you need some devices to be probed before you
> can perform authentication.  What do you need this for?
> 
> Do you perhaps need to load firmware onto the device before you can
> authenticate it?  Doesn't that defeat the whole point of authentication?
> > Or do you mean that certain *other* devices need to probe before a device
> can be authenticated?  What are these other devices?

The AMD CCP device (which represents PSP == a secure processor == TCB 
for SEV VMs) is a PCI device and when we add TDISP/etc to the TCB, we 
will need the host to speak to PSP to set this up. It does not matter 
for host-only IDE support though.


> What happens if the device is suspended to D3cold or experiences a
> Conventional Reset?  Do you need to do anything special before you
> can re-authenticate the device?
> 
> On the one hand, performing authentication in the PCI core could be
> considered a case of midlayer fallacy.  On the other hand, I think we
> do want to afford CMA/SPDM and the features layered on top of it
> (IDE, TDISP) regardless whether a driver is bound:


https://cdrdv2-public.intel.com/742542/software-enabling-for-tdx-tee-io-fixed.pdf 
suggests that TDX is going to have to reimplement SPDM/IDE/TDISP again 
so the common base here is DOE only, is not it?

And what is the value of TDISP without a device driver, how can the PCI 
core decide if the device is alright to use? I am sure there is value 
but my imagination fails me.


>  That way we can
> protect the PCI core's communication with the device and with ACS
> we can protect other PCI devices in the system from peer-to-peer
> communication originating from a malicious device which failed to
> authenticate.
> 
> I would like to make the implementation under development work for
> your use case as well, but to do that I need a better understanding
> what your requirements are.

While I am figuring this out, is it a bug? :)
https://github.com/l1k/linux/commit/c1b242d74d55d361934bdcff1bb4f881922b10ae#r102531215

Thanks,

> Thanks,
> 
> Lukas
Alexey Kardashevskiy April 3, 2023, 12:55 a.m. UTC | #9
On 3/3/23 07:22, Lukas Wunner wrote:

> I would like to make the implementation under development work for
> your use case as well, but to do that I need a better understanding
> what your requirements are.

Soooo here is what we are up to:

https://www.amd.com/content/dam/amd/en/documents/developer/sev-tio-whitepaper.pdf

tl;dr: it is secure VFIO == TDISP on AMD SEV-SNP so it is going to use 
ASP (== PSP == the secure processor) for SPDM/IDE.
diff mbox series

Patch

diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c
index 2bc202b64b6a..bf32875d27da 100644
--- a/drivers/pci/doe.c
+++ b/drivers/pci/doe.c
@@ -20,6 +20,8 @@ 
 #include <linux/pci-doe.h>
 #include <linux/workqueue.h>
 
+#include "pci.h"
+
 #define PCI_DOE_PROTOCOL_DISCOVERY 0
 
 /* Timeout of 1 second from 6.30.2 Operation, PCI Spec r6.0 */
@@ -658,3 +660,74 @@  int pci_doe(struct pci_doe_mb *doe_mb, u16 vendor, u8 type,
 	return task.rv;
 }
 EXPORT_SYMBOL_GPL(pci_doe);
+
+/**
+ * pci_find_doe_mailbox() - Find Data Object Exchange mailbox
+ *
+ * @pdev: PCI device
+ * @vendor: Vendor ID
+ * @type: Data Object Type
+ *
+ * Find first DOE mailbox of a PCI device which supports the given protocol.
+ *
+ * RETURNS: Pointer to the DOE mailbox or NULL if none was found.
+ */
+struct pci_doe_mb *pci_find_doe_mailbox(struct pci_dev *pdev, u16 vendor,
+					u8 type)
+{
+	struct pci_doe_mb *doe_mb;
+	unsigned long index;
+
+	xa_for_each(&pdev->doe_mbs, index, doe_mb)
+		if (pci_doe_supports_prot(doe_mb, vendor, type))
+			return doe_mb;
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(pci_find_doe_mailbox);
+
+void pci_doe_init(struct pci_dev *pdev)
+{
+	struct pci_doe_mb *doe_mb;
+	u16 offset = 0;
+	int rc;
+
+	xa_init(&pdev->doe_mbs);
+
+	while ((offset = pci_find_next_ext_capability(pdev, offset,
+						      PCI_EXT_CAP_ID_DOE))) {
+		doe_mb = pci_doe_create_mb(pdev, offset);
+		if (IS_ERR(doe_mb)) {
+			pci_err(pdev, "[%x] failed to create mailbox: %ld\n",
+				offset, PTR_ERR(doe_mb));
+			continue;
+		}
+
+		rc = xa_insert(&pdev->doe_mbs, offset, doe_mb, GFP_KERNEL);
+		if (rc) {
+			pci_err(pdev, "[%x] failed to insert mailbox: %d\n",
+				offset, rc);
+			pci_doe_destroy_mb(doe_mb);
+		}
+	}
+}
+
+void pci_doe_destroy(struct pci_dev *pdev)
+{
+	struct pci_doe_mb *doe_mb;
+	unsigned long index;
+
+	xa_for_each(&pdev->doe_mbs, index, doe_mb)
+		pci_doe_destroy_mb(doe_mb);
+
+	xa_destroy(&pdev->doe_mbs);
+}
+
+void pci_doe_disconnected(struct pci_dev *pdev)
+{
+	struct pci_doe_mb *doe_mb;
+	unsigned long index;
+
+	xa_for_each(&pdev->doe_mbs, index, doe_mb)
+		pci_doe_cancel_tasks(doe_mb);
+}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 8f5d4bd5b410..065ca9743ec1 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -318,6 +318,16 @@  struct pci_sriov {
 	bool		drivers_autoprobe; /* Auto probing of VFs by driver */
 };
 
+#ifdef CONFIG_PCI_DOE
+void pci_doe_init(struct pci_dev *pdev);
+void pci_doe_destroy(struct pci_dev *pdev);
+void pci_doe_disconnected(struct pci_dev *pdev);
+#else
+static inline void pci_doe_init(struct pci_dev *pdev) { }
+static inline void pci_doe_destroy(struct pci_dev *pdev) { }
+static inline void pci_doe_disconnected(struct pci_dev *pdev) { }
+#endif
+
 /**
  * pci_dev_set_io_state - Set the new error state if possible.
  *
@@ -372,6 +382,8 @@  static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
 	pci_dev_set_io_state(dev, pci_channel_io_perm_failure);
 	device_unlock(&dev->dev);
 
+	pci_doe_disconnected(dev);
+
 	return 0;
 }
 
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 1779582fb500..65e60ee50489 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2476,6 +2476,7 @@  static void pci_init_capabilities(struct pci_dev *dev)
 	pci_aer_init(dev);		/* Advanced Error Reporting */
 	pci_dpc_init(dev);		/* Downstream Port Containment */
 	pci_rcec_init(dev);		/* Root Complex Event Collector */
+	pci_doe_init(dev);		/* Data Object Exchange */
 
 	pcie_report_downtraining(dev);
 	pci_init_reset_methods(dev);
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index 0145aef1b930..f25acf50879f 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -39,6 +39,7 @@  static void pci_destroy_dev(struct pci_dev *dev)
 	list_del(&dev->bus_list);
 	up_write(&pci_bus_sem);
 
+	pci_doe_destroy(dev);
 	pcie_aspm_exit_link_state(dev);
 	pci_bridge_d3_update(dev);
 	pci_free_resources(dev);
diff --git a/include/linux/pci-doe.h b/include/linux/pci-doe.h
index 7f16749c6aa3..d6192ee0ac07 100644
--- a/include/linux/pci-doe.h
+++ b/include/linux/pci-doe.h
@@ -29,6 +29,8 @@  struct pci_doe_mb;
 
 struct pci_doe_mb *pcim_doe_create_mb(struct pci_dev *pdev, u16 cap_offset);
 bool pci_doe_supports_prot(struct pci_doe_mb *doe_mb, u16 vid, u8 type);
+struct pci_doe_mb *pci_find_doe_mailbox(struct pci_dev *pdev, u16 vendor,
+					u8 type);
 
 int pci_doe(struct pci_doe_mb *doe_mb, u16 vendor, u8 type,
 	    const void *request, size_t request_sz,
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 3c51cac3890b..b19c2965e384 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -511,6 +511,9 @@  struct pci_dev {
 #endif
 #ifdef CONFIG_PCI_P2PDMA
 	struct pci_p2pdma __rcu *p2pdma;
+#endif
+#ifdef CONFIG_PCI_DOE
+	struct xarray	doe_mbs;	/* Data Object Exchange mailboxes */
 #endif
 	u16		acs_cap;	/* ACS Capability offset */
 	phys_addr_t	rom;		/* Physical address if not from BAR */