diff mbox series

[v2,4/6] virtio: Initialize authorized attribute for confidential guest

Message ID 20210930010511.3387967-5-sathyanarayanan.kuppuswamy@linux.intel.com (mailing list archive)
State Not Applicable
Delegated to: Bjorn Helgaas
Headers show
Series Add device filter support | expand

Commit Message

Kuppuswamy Sathyanarayanan Sept. 30, 2021, 1:05 a.m. UTC
Confidential guest platforms like TDX have a requirement to allow
only trusted devices. By default the confidential-guest core will
arrange for all devices to default to unauthorized (via
dev_default_authorization) in device_initialize(). Since virtio
driver is already hardened against the attack from the un-trusted host,
override the confidential computing default unauthorized state

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 drivers/virtio/virtio.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Michael S. Tsirkin Sept. 30, 2021, 11:03 a.m. UTC | #1
On Wed, Sep 29, 2021 at 06:05:09PM -0700, Kuppuswamy Sathyanarayanan wrote:
> Confidential guest platforms like TDX have a requirement to allow
> only trusted devices. By default the confidential-guest core will
> arrange for all devices to default to unauthorized (via
> dev_default_authorization) in device_initialize(). Since virtio
> driver is already hardened against the attack from the un-trusted host,
> override the confidential computing default unauthorized state
> 
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

Architecturally this all looks backwards. IIUC nothing about virtio
makes it authorized or trusted. The driver is hardened,
true, but this should be set at the driver not the device level.
And in particular, not all virtio drivers are hardened -
I think at this point blk and scsi drivers have been hardened - so
treating them all the same looks wrong.

> ---
>  drivers/virtio/virtio.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index 588e02fb91d3..377b0ccdc503 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -5,6 +5,8 @@
>  #include <linux/module.h>
>  #include <linux/idr.h>
>  #include <linux/of.h>
> +#include <linux/cc_platform.h>
> +#include <linux/device.h>
>  #include <uapi/linux/virtio_ids.h>
>  
>  /* Unique numbering for virtio devices. */
> @@ -390,6 +392,13 @@ int register_virtio_device(struct virtio_device *dev)
>  	dev->config_enabled = false;
>  	dev->config_change_pending = false;
>  
> +	/*
> +	 * For Confidential guest (like TDX), virtio devices are
> +	 * trusted. So set authorized status as true.
> +	 */
> +	if (cc_platform_has(CC_ATTR_GUEST_DEVICE_FILTER))
> +		dev->dev.authorized = true;
> +
>  	/* We always start by resetting the device, in case a previous
>  	 * driver messed it up.  This also tests that code path a little. */
>  	dev->config->reset(dev);
> -- 
> 2.25.1
Dan Williams Sept. 30, 2021, 1:36 p.m. UTC | #2
On Thu, Sep 30, 2021 at 4:03 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Sep 29, 2021 at 06:05:09PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > Confidential guest platforms like TDX have a requirement to allow
> > only trusted devices. By default the confidential-guest core will
> > arrange for all devices to default to unauthorized (via
> > dev_default_authorization) in device_initialize(). Since virtio
> > driver is already hardened against the attack from the un-trusted host,
> > override the confidential computing default unauthorized state
> >
> > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>
> Architecturally this all looks backwards. IIUC nothing about virtio
> makes it authorized or trusted. The driver is hardened,
> true, but this should be set at the driver not the device level.

That's was my initial reaction to this proposal as well, and I ended
up leading Sathya astray from what Greg wanted. Greg rightly points
out that the "authorized" attribute from USB and Thunderbolt already
exists [1] [2]. So the choice is find an awkward way to mix driver
trust with existing bus-local "authorized" mechanisms, or promote the
authorized capability to the driver-core. This patch set implements
the latter to keep the momentum on the already shipping design scheme
to not add to the driver-core maintenance burden.

[1]: https://lore.kernel.org/all/YQuaJ78y8j1UmBoz@kroah.com/
[2]: https://lore.kernel.org/all/YQzF%2FutgrJfbZuHh@kroah.com/

> And in particular, not all virtio drivers are hardened -
> I think at this point blk and scsi drivers have been hardened - so
> treating them all the same looks wrong.

My understanding was that they have been audited, Sathya?
Greg Kroah-Hartman Sept. 30, 2021, 1:49 p.m. UTC | #3
On Thu, Sep 30, 2021 at 06:36:18AM -0700, Dan Williams wrote:
> On Thu, Sep 30, 2021 at 4:03 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Sep 29, 2021 at 06:05:09PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > Confidential guest platforms like TDX have a requirement to allow
> > > only trusted devices. By default the confidential-guest core will
> > > arrange for all devices to default to unauthorized (via
> > > dev_default_authorization) in device_initialize(). Since virtio
> > > driver is already hardened against the attack from the un-trusted host,
> > > override the confidential computing default unauthorized state
> > >
> > > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > > Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> >
> > Architecturally this all looks backwards. IIUC nothing about virtio
> > makes it authorized or trusted. The driver is hardened,
> > true, but this should be set at the driver not the device level.
> 
> That's was my initial reaction to this proposal as well, and I ended
> up leading Sathya astray from what Greg wanted. Greg rightly points
> out that the "authorized" attribute from USB and Thunderbolt already
> exists [1] [2]. So the choice is find an awkward way to mix driver
> trust with existing bus-local "authorized" mechanisms, or promote the
> authorized capability to the driver-core. This patch set implements
> the latter to keep the momentum on the already shipping design scheme
> to not add to the driver-core maintenance burden.
> 
> [1]: https://lore.kernel.org/all/YQuaJ78y8j1UmBoz@kroah.com/
> [2]: https://lore.kernel.org/all/YQzF%2FutgrJfbZuHh@kroah.com/
> 
> > And in particular, not all virtio drivers are hardened -
> > I think at this point blk and scsi drivers have been hardened - so
> > treating them all the same looks wrong.
> 
> My understanding was that they have been audited, Sathya?

Please define "audited" and "trusted" here.

thanks,

greg k-h
Kuppuswamy Sathyanarayanan Sept. 30, 2021, 3:18 p.m. UTC | #4
On 9/30/21 6:36 AM, Dan Williams wrote:
>> And in particular, not all virtio drivers are hardened -
>> I think at this point blk and scsi drivers have been hardened - so
>> treating them all the same looks wrong.
> My understanding was that they have been audited, Sathya?

Yes, AFAIK, it has been audited. Andi also submitted some patches
related to it. Andi, can you confirm.

We also authorize the virtio at PCI ID level. And currently we allow
console, block and net virtio PCI devices.

{ PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_NET) },
{ PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_BLOCK) },
{ PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_CONSOLE) },
Michael S. Tsirkin Sept. 30, 2021, 3:20 p.m. UTC | #5
On Thu, Sep 30, 2021 at 08:18:18AM -0700, Kuppuswamy, Sathyanarayanan wrote:
> 
> 
> On 9/30/21 6:36 AM, Dan Williams wrote:
> > > And in particular, not all virtio drivers are hardened -
> > > I think at this point blk and scsi drivers have been hardened - so
> > > treating them all the same looks wrong.
> > My understanding was that they have been audited, Sathya?
> 
> Yes, AFAIK, it has been audited. Andi also submitted some patches
> related to it. Andi, can you confirm.
> 
> We also authorize the virtio at PCI ID level. And currently we allow
> console, block and net virtio PCI devices.
> 
> { PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_NET) },
> { PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_BLOCK) },
> { PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_CONSOLE) },

Presumably modern IDs should be allowed too?
Greg Kroah-Hartman Sept. 30, 2021, 3:23 p.m. UTC | #6
On Thu, Sep 30, 2021 at 08:18:18AM -0700, Kuppuswamy, Sathyanarayanan wrote:
> 
> 
> On 9/30/21 6:36 AM, Dan Williams wrote:
> > > And in particular, not all virtio drivers are hardened -
> > > I think at this point blk and scsi drivers have been hardened - so
> > > treating them all the same looks wrong.
> > My understanding was that they have been audited, Sathya?
> 
> Yes, AFAIK, it has been audited. Andi also submitted some patches
> related to it. Andi, can you confirm.

What is the official definition of "audited"?

thanks,

greg k-h
Kuppuswamy Sathyanarayanan Sept. 30, 2021, 3:23 p.m. UTC | #7
On 9/30/21 8:20 AM, Michael S. Tsirkin wrote:
> On Thu, Sep 30, 2021 at 08:18:18AM -0700, Kuppuswamy, Sathyanarayanan wrote:
>>
>>
>> On 9/30/21 6:36 AM, Dan Williams wrote:
>>>> And in particular, not all virtio drivers are hardened -
>>>> I think at this point blk and scsi drivers have been hardened - so
>>>> treating them all the same looks wrong.
>>> My understanding was that they have been audited, Sathya?
>>
>> Yes, AFAIK, it has been audited. Andi also submitted some patches
>> related to it. Andi, can you confirm.
>>
>> We also authorize the virtio at PCI ID level. And currently we allow
>> console, block and net virtio PCI devices.
>>
>> { PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_NET) },
>> { PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_BLOCK) },
>> { PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_CONSOLE) },
> 
> Presumably modern IDs should be allowed too?

You mean IDs in range 104x right? If yes, we also allow them for above
mentioned types.

>
Kuppuswamy Sathyanarayanan Sept. 30, 2021, 7:04 p.m. UTC | #8
On 9/30/21 8:23 AM, Greg Kroah-Hartman wrote:
> On Thu, Sep 30, 2021 at 08:18:18AM -0700, Kuppuswamy, Sathyanarayanan wrote:
>>
>>
>> On 9/30/21 6:36 AM, Dan Williams wrote:
>>>> And in particular, not all virtio drivers are hardened -
>>>> I think at this point blk and scsi drivers have been hardened - so
>>>> treating them all the same looks wrong.
>>> My understanding was that they have been audited, Sathya?
>>
>> Yes, AFAIK, it has been audited. Andi also submitted some patches
>> related to it. Andi, can you confirm.
> 
> What is the official definition of "audited"?


In our case (Confidential Computing platform), the host is an un-trusted
entity. So any interaction with host from the drivers will have to be
protected against the possible attack from the host. For example, if we
are accessing a memory based on index value received from host, we have
to make sure it does not lead to out of bound access or when sharing the
memory with the host, we need to make sure only the required region is
shared with the host and the memory is un-shared after use properly.

Elena can share more details, but it was achieved with static analysis
and fuzzing. Here is a presentation she is sharing about the work at the
Linux Security Summit:
https://static.sched.com/hosted_files/lssna2021/b6/LSS-HardeningLinuxGuestForCCC.pdf

Andi, can talk more about the specific driver changes that came out of this
effort.

In our case the driver is considered "hardened" / "audited" if it can handle
the above scenarios.

> 
> thanks,
> 
> greg k-h
>
Kuppuswamy Sathyanarayanan Sept. 30, 2021, 7:16 p.m. UTC | #9
+Elena

On 9/30/21 12:04 PM, Kuppuswamy, Sathyanarayanan wrote:
> 
> 
> On 9/30/21 8:23 AM, Greg Kroah-Hartman wrote:
>> On Thu, Sep 30, 2021 at 08:18:18AM -0700, Kuppuswamy, Sathyanarayanan wrote:
>>>
>>>
>>> On 9/30/21 6:36 AM, Dan Williams wrote:
>>>>> And in particular, not all virtio drivers are hardened -
>>>>> I think at this point blk and scsi drivers have been hardened - so
>>>>> treating them all the same looks wrong.
>>>> My understanding was that they have been audited, Sathya?
>>>
>>> Yes, AFAIK, it has been audited. Andi also submitted some patches
>>> related to it. Andi, can you confirm.
>>
>> What is the official definition of "audited"?
> 
> 
> In our case (Confidential Computing platform), the host is an un-trusted
> entity. So any interaction with host from the drivers will have to be
> protected against the possible attack from the host. For example, if we
> are accessing a memory based on index value received from host, we have
> to make sure it does not lead to out of bound access or when sharing the
> memory with the host, we need to make sure only the required region is
> shared with the host and the memory is un-shared after use properly.
> 
> Elena can share more details, but it was achieved with static analysis
> and fuzzing. Here is a presentation she is sharing about the work at the
> Linux Security Summit:
> https://static.sched.com/hosted_files/lssna2021/b6/LSS-HardeningLinuxGuestForCCC.pdf
> 
> Andi, can talk more about the specific driver changes that came out of this
> effort.
> 
> In our case the driver is considered "hardened" / "audited" if it can handle
> the above scenarios.
> 
>>
>> thanks,
>>
>> greg k-h
>>
>
Andi Kleen Sept. 30, 2021, 7:25 p.m. UTC | #10
On 9/30/2021 8:18 AM, Kuppuswamy, Sathyanarayanan wrote:
>
>
> On 9/30/21 6:36 AM, Dan Williams wrote:
>>> And in particular, not all virtio drivers are hardened -
>>> I think at this point blk and scsi drivers have been hardened - so
>>> treating them all the same looks wrong.
>> My understanding was that they have been audited, Sathya?
>
> Yes, AFAIK, it has been audited. Andi also submitted some patches
> related to it. Andi, can you confirm.
>
> We also authorize the virtio at PCI ID level. And currently we allow
> console, block and net virtio PCI devices.
>
> { PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_NET) },
> { PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_BLOCK) },
> { PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, VIRTIO_TRANS_ID_CONSOLE) },
>
The only drivers that are being audited and fuzzed are these three 
virtio drivers (in addition to some other x86 code outside the driver model)

-Andi
Andi Kleen Sept. 30, 2021, 7:30 p.m. UTC | #11
On 9/30/2021 12:04 PM, Kuppuswamy, Sathyanarayanan wrote:
>
>
> On 9/30/21 8:23 AM, Greg Kroah-Hartman wrote:
>> On Thu, Sep 30, 2021 at 08:18:18AM -0700, Kuppuswamy, Sathyanarayanan 
>> wrote:
>>>
>>>
>>> On 9/30/21 6:36 AM, Dan Williams wrote:
>>>>> And in particular, not all virtio drivers are hardened -
>>>>> I think at this point blk and scsi drivers have been hardened - so
>>>>> treating them all the same looks wrong.
>>>> My understanding was that they have been audited, Sathya?
>>>
>>> Yes, AFAIK, it has been audited. Andi also submitted some patches
>>> related to it. Andi, can you confirm.
>>
>> What is the official definition of "audited"?
>
>
> In our case (Confidential Computing platform), the host is an un-trusted
> entity. So any interaction with host from the drivers will have to be
> protected against the possible attack from the host. For example, if we
> are accessing a memory based on index value received from host, we have
> to make sure it does not lead to out of bound access or when sharing the
> memory with the host, we need to make sure only the required region is
> shared with the host and the memory is un-shared after use properly.
>
> Elena can share more details, but it was achieved with static analysis
> and fuzzing. Here is a presentation she is sharing about the work at the
> Linux Security Summit:
> https://static.sched.com/hosted_files/lssna2021/b6/LSS-HardeningLinuxGuestForCCC.pdf 
>
>
> Andi, can talk more about the specific driver changes that came out of 
> this
> effort.

The original virtio was quite easy to exploit because it put its free 
list into the shared ring buffer.

We had a patchkit to harden virtio originally, but after some discussion 
we instead switched to Jason Wang's patchkit to move the virtio metadata 
into protected memory, which fixed near all of the issues. These patches 
have been already merged. There is one additional patch to limit the 
virtio modes.

There's an ongoing effort to audit (mostly finished I believe) and fuzz 
the three virtio drivers (fuzzing is still ongoing).

There was also a range of changes outside virtio for code outside the 
device model. Most of it was just disabling it though.

-Andi
Kuppuswamy Sathyanarayanan Sept. 30, 2021, 7:40 p.m. UTC | #12
+Elena

On 9/30/21 12:30 PM, Andi Kleen wrote:
> 
> On 9/30/2021 12:04 PM, Kuppuswamy, Sathyanarayanan wrote:
>>
>>
>> On 9/30/21 8:23 AM, Greg Kroah-Hartman wrote:
>>> On Thu, Sep 30, 2021 at 08:18:18AM -0700, Kuppuswamy, Sathyanarayanan wrote:
>>>>
>>>>
>>>> On 9/30/21 6:36 AM, Dan Williams wrote:
>>>>>> And in particular, not all virtio drivers are hardened -
>>>>>> I think at this point blk and scsi drivers have been hardened - so
>>>>>> treating them all the same looks wrong.
>>>>> My understanding was that they have been audited, Sathya?
>>>>
>>>> Yes, AFAIK, it has been audited. Andi also submitted some patches
>>>> related to it. Andi, can you confirm.
>>>
>>> What is the official definition of "audited"?
>>
>>
>> In our case (Confidential Computing platform), the host is an un-trusted
>> entity. So any interaction with host from the drivers will have to be
>> protected against the possible attack from the host. For example, if we
>> are accessing a memory based on index value received from host, we have
>> to make sure it does not lead to out of bound access or when sharing the
>> memory with the host, we need to make sure only the required region is
>> shared with the host and the memory is un-shared after use properly.
>>
>> Elena can share more details, but it was achieved with static analysis
>> and fuzzing. Here is a presentation she is sharing about the work at the
>> Linux Security Summit:
>> https://static.sched.com/hosted_files/lssna2021/b6/LSS-HardeningLinuxGuestForCCC.pdf
>>
>> Andi, can talk more about the specific driver changes that came out of this
>> effort.
> 
> The original virtio was quite easy to exploit because it put its free list into the shared ring buffer.
> 
> We had a patchkit to harden virtio originally, but after some discussion we instead switched to 
> Jason Wang's patchkit to move the virtio metadata into protected memory, which fixed near all of the 
> issues. These patches have been already merged. There is one additional patch to limit the virtio 
> modes.
> 
> There's an ongoing effort to audit (mostly finished I believe) and fuzz the three virtio drivers 
> (fuzzing is still ongoing).
> 
> There was also a range of changes outside virtio for code outside the device model. Most of it was 
> just disabling it though.
> 
> -Andi
>
Greg Kroah-Hartman Oct. 1, 2021, 7:03 a.m. UTC | #13
On Thu, Sep 30, 2021 at 12:04:05PM -0700, Kuppuswamy, Sathyanarayanan wrote:
> 
> 
> On 9/30/21 8:23 AM, Greg Kroah-Hartman wrote:
> > On Thu, Sep 30, 2021 at 08:18:18AM -0700, Kuppuswamy, Sathyanarayanan wrote:
> > > 
> > > 
> > > On 9/30/21 6:36 AM, Dan Williams wrote:
> > > > > And in particular, not all virtio drivers are hardened -
> > > > > I think at this point blk and scsi drivers have been hardened - so
> > > > > treating them all the same looks wrong.
> > > > My understanding was that they have been audited, Sathya?
> > > 
> > > Yes, AFAIK, it has been audited. Andi also submitted some patches
> > > related to it. Andi, can you confirm.
> > 
> > What is the official definition of "audited"?
> 
> 
> In our case (Confidential Computing platform), the host is an un-trusted
> entity. So any interaction with host from the drivers will have to be
> protected against the possible attack from the host. For example, if we
> are accessing a memory based on index value received from host, we have
> to make sure it does not lead to out of bound access or when sharing the
> memory with the host, we need to make sure only the required region is
> shared with the host and the memory is un-shared after use properly.

You have not defined the term "audited" here at all in any way that can
be reviewed or verified by anyone from what I can tell.

You have only described a new model that you wish the kernel to run in,
one in which it does not trust the hardware at all.  That is explicitly
NOT what the kernel has been designed for so far, and if you wish to
change that, lots of things need to be done outside of simply running
some fuzzers on a few random drivers.

For one example, how do you ensure that the memory you are reading from
hasn't been modified by the host between writing to it the last time you
did?  Do you have a list of specific drivers and kernel options that you
feel you now "trust"?  If so, how long does that trust last for?  Until
someonen else modifies that code?  What about modifications to functions
that your "audited" code touches?  Who is doing this auditing?  How do
you know the auditing has been done correctly?  Who has reviewed and
audited the tools that are doing the auditing?  Where is the
specification that has been agreed on how the auditing must be done?
And so on...

I feel like there are a lot of different things all being mixed up here
into one "oh we want this to happen!" type of thread.  Please let's just
stick to the one request that I had here, which was to move the way that
busses are allowed to authorize the devices they wish to control into a
generic way instead of being bus-specific logic.

Any requests outside of that type of functionality are just that,
outside the scope of this patchset and should get their own patch series
and discussion.

thanks,

greg k-h
Andi Kleen Oct. 1, 2021, 3:49 p.m. UTC | #14
On 10/1/2021 12:03 AM, Greg Kroah-Hartman wrote:
> On Thu, Sep 30, 2021 at 12:04:05PM -0700, Kuppuswamy, Sathyanarayanan wrote:
>>
>> On 9/30/21 8:23 AM, Greg Kroah-Hartman wrote:
>>> On Thu, Sep 30, 2021 at 08:18:18AM -0700, Kuppuswamy, Sathyanarayanan wrote:
>>>>
>>>> On 9/30/21 6:36 AM, Dan Williams wrote:
>>>>>> And in particular, not all virtio drivers are hardened -
>>>>>> I think at this point blk and scsi drivers have been hardened - so
>>>>>> treating them all the same looks wrong.
>>>>> My understanding was that they have been audited, Sathya?
>>>> Yes, AFAIK, it has been audited. Andi also submitted some patches
>>>> related to it. Andi, can you confirm.
>>> What is the official definition of "audited"?
>>
>> In our case (Confidential Computing platform), the host is an un-trusted
>> entity. So any interaction with host from the drivers will have to be
>> protected against the possible attack from the host. For example, if we
>> are accessing a memory based on index value received from host, we have
>> to make sure it does not lead to out of bound access or when sharing the
>> memory with the host, we need to make sure only the required region is
>> shared with the host and the memory is un-shared after use properly.
> You have not defined the term "audited" here at all in any way that can
> be reviewed or verified by anyone from what I can tell.
>
> You have only described a new model that you wish the kernel to run in,
> one in which it does not trust the hardware at all.  That is explicitly
> NOT what the kernel has been designed for so far,

It has been already done for a few USB/TB drivers, but yes not for the 
majority of the kernel.

>   and if you wish to
> change that, lots of things need to be done outside of simply running
> some fuzzers on a few random drivers.

The goal is to do similar work as USB/TB did, but do it for a small set 
of virtio drivers and use a custom allow list for those for the specific 
secure guest cases.

(there are some other goals, but let's not discuss them here for now)


>
> For one example, how do you ensure that the memory you are reading from
> hasn't been modified by the host between writing to it the last time you
> did?

It's similar techniques as we do on user space accesses. For example if 
you bound check some value the code needs to ensure it is cached in 
private memory, not reread from MMIO or shared memory. Of course that's 
a good idea anyways for performance because MMIO is slow.

In the concrete cases of virtio the main problem was the free list in 
shared memory, but that has been addressed now.



>   Do you have a list of specific drivers and kernel options that you
> feel you now "trust"?

For TDX it's currently only virtio net/block/console

But we expect this list to grow slightly over time, but not at a high 
rate (so hopefully <10)


> If so, how long does that trust last for?  Until
> someonen else modifies that code?  What about modifications to functions
> that your "audited" code touches?  Who is doing this auditing?  How do
> you know the auditing has been done correctly?  Who has reviewed and
> audited the tools that are doing the auditing?  Where is the
> specification that has been agreed on how the auditing must be done?
> And so on...

Well, I mean we already have a similar situation with user space APIs. 
So it's not a new problem. For those we've done it for many years, with 
audits and extra fuzzing.

There are people working on the audit and fuzzing today. How exactly it 
will be ensured long term is still be worked out, but I expect we can 
work out something.

>
> I feel like there are a lot of different things all being mixed up here
> into one "oh we want this to happen!" type of thread.



Agreed. The thread ended up about a lot of stuff which is outside the 
scope of the patches.

>    Please let's just
> stick to the one request that I had here, which was to move the way that
> busses are allowed to authorize the devices they wish to control into a
> generic way instead of being bus-specific logic.
>
> Any requests outside of that type of functionality are just that,
> outside the scope of this patchset and should get their own patch series
> and discussion.


Yes that's the intention. This patch kit is only about controlling what 
devices can enumerate.

Also please let's avoid the "trusted" term. It's really misleading and 
confusing in the context of confidential computing.


-Andi
Dan Williams Oct. 1, 2021, 4:13 p.m. UTC | #15
On Fri, Oct 1, 2021 at 12:03 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Thu, Sep 30, 2021 at 12:04:05PM -0700, Kuppuswamy, Sathyanarayanan wrote:
> >
> >
> > On 9/30/21 8:23 AM, Greg Kroah-Hartman wrote:
> > > On Thu, Sep 30, 2021 at 08:18:18AM -0700, Kuppuswamy, Sathyanarayanan wrote:
> > > >
> > > >
> > > > On 9/30/21 6:36 AM, Dan Williams wrote:
> > > > > > And in particular, not all virtio drivers are hardened -
> > > > > > I think at this point blk and scsi drivers have been hardened - so
> > > > > > treating them all the same looks wrong.
> > > > > My understanding was that they have been audited, Sathya?
> > > >
> > > > Yes, AFAIK, it has been audited. Andi also submitted some patches
> > > > related to it. Andi, can you confirm.
> > >
> > > What is the official definition of "audited"?
> >
> >
> > In our case (Confidential Computing platform), the host is an un-trusted
> > entity. So any interaction with host from the drivers will have to be
> > protected against the possible attack from the host. For example, if we
> > are accessing a memory based on index value received from host, we have
> > to make sure it does not lead to out of bound access or when sharing the
> > memory with the host, we need to make sure only the required region is
> > shared with the host and the memory is un-shared after use properly.
>
> You have not defined the term "audited" here at all in any way that can
> be reviewed or verified by anyone from what I can tell.

Agree.

>
> You have only described a new model that you wish the kernel to run in,
> one in which it does not trust the hardware at all.  That is explicitly
> NOT what the kernel has been designed for so far, and if you wish to
> change that, lots of things need to be done outside of simply running
> some fuzzers on a few random drivers.
>
> For one example, how do you ensure that the memory you are reading from
> hasn't been modified by the host between writing to it the last time you
> did?  Do you have a list of specific drivers and kernel options that you
> feel you now "trust"?  If so, how long does that trust last for?  Until
> someonen else modifies that code?  What about modifications to functions
> that your "audited" code touches?  Who is doing this auditing?  How do
> you know the auditing has been done correctly?  Who has reviewed and
> audited the tools that are doing the auditing?  Where is the
> specification that has been agreed on how the auditing must be done?
> And so on...
>
> I feel like there are a lot of different things all being mixed up here
> into one "oh we want this to happen!" type of thread.  Please let's just
> stick to the one request that I had here, which was to move the way that
> busses are allowed to authorize the devices they wish to control into a
> generic way instead of being bus-specific logic.

I want to continue to shave this proposal down, but that last sentence
reads as self-contradictory.

Bear with me, and perhaps it's a lack of imagination on my part, but I
don't see how to get to a globally generic "authorized" sysfs ABI
given that USB and Thunderbolt want to do bus specific actions on
authorization toggle events. Certainly a default generic authorized
attribute can be defined for all the other buses that don't have
legacy here, but Thunderbolt will still require support for '2' as an
authorized value, and USB will still want to base probe decisions on
the authorization state of both the usb_device and the usb_interface.

> Any requests outside of that type of functionality are just that,
> outside the scope of this patchset and should get their own patch series
> and discussion.

A way to shave this further would be to default authorize just enough
devices to do the rest of the authorization from initramfs. That would
seem to cut out 'virtio-net' and 'virtio-storage' from default
authorized list.

The generic authorized attribute would only need to appear when
'dev_default_authorization' is set to false.
Alan Stern Oct. 1, 2021, 4:45 p.m. UTC | #16
On Fri, Oct 01, 2021 at 09:13:54AM -0700, Dan Williams wrote:
> Bear with me, and perhaps it's a lack of imagination on my part, but I
> don't see how to get to a globally generic "authorized" sysfs ABI
> given that USB and Thunderbolt want to do bus specific actions on
> authorization toggle events. Certainly a default generic authorized
> attribute can be defined for all the other buses that don't have
> legacy here, but Thunderbolt will still require support for '2' as an
> authorized value, and USB will still want to base probe decisions on
> the authorization state of both the usb_device and the usb_interface.

The USB part isn't really accurate (I can't speak for Thunderbolt). 
When a usb_device is deauthorized, the device will be unconfigured, 
deleting all its interfaces and removing the need for any probe 
decisions about them.  In other words, the probe decision for a 
usb_device or usb_interface depends only on the device's/interface's 
own authorization state.

True, the interface binding code does contain a test of the device's 
authorization setting.  That test is redundant and can be removed.

The actions that USB wants to take on authorization toggle events for 
usb_devices are: for authorize, select and install a configuration; 
for deauthorize, unconfigure the device.  Each of these could be 
handled simply enough just by binding/unbinding the device.  (There 
is some special code for handling wireless USB devices, but wireless 
USB is now defunct.)

Alan Stern
Dan Williams Oct. 1, 2021, 6:09 p.m. UTC | #17
On Fri, Oct 1, 2021 at 9:47 AM Alan Stern <stern@rowland.harvard.edu> wrote:
>
> On Fri, Oct 01, 2021 at 09:13:54AM -0700, Dan Williams wrote:
> > Bear with me, and perhaps it's a lack of imagination on my part, but I
> > don't see how to get to a globally generic "authorized" sysfs ABI
> > given that USB and Thunderbolt want to do bus specific actions on
> > authorization toggle events. Certainly a default generic authorized
> > attribute can be defined for all the other buses that don't have
> > legacy here, but Thunderbolt will still require support for '2' as an
> > authorized value, and USB will still want to base probe decisions on
> > the authorization state of both the usb_device and the usb_interface.
>
> The USB part isn't really accurate (I can't speak for Thunderbolt).
> When a usb_device is deauthorized, the device will be unconfigured,
> deleting all its interfaces and removing the need for any probe
> decisions about them.  In other words, the probe decision for a
> usb_device or usb_interface depends only on the device's/interface's
> own authorization state.
>
> True, the interface binding code does contain a test of the device's
> authorization setting.  That test is redundant and can be removed.
>
> The actions that USB wants to take on authorization toggle events for
> usb_devices are: for authorize, select and install a configuration;
> for deauthorize, unconfigure the device.  Each of these could be
> handled simply enough just by binding/unbinding the device.  (There
> is some special code for handling wireless USB devices, but wireless
> USB is now defunct.)

Ah, so are you saying that it would be sufficient for USB if the
generic authorized implementation did something like:

dev->authorized = 1;
device_attach(dev);

...for the authorize case, and:

dev->authorize = 0;
device_release_driver(dev);

...for the deauthorize case?
Alan Stern Oct. 1, 2021, 7 p.m. UTC | #18
On Fri, Oct 01, 2021 at 11:09:52AM -0700, Dan Williams wrote:
> On Fri, Oct 1, 2021 at 9:47 AM Alan Stern <stern@rowland.harvard.edu> wrote:
> >
> > On Fri, Oct 01, 2021 at 09:13:54AM -0700, Dan Williams wrote:
> > > Bear with me, and perhaps it's a lack of imagination on my part, but I
> > > don't see how to get to a globally generic "authorized" sysfs ABI
> > > given that USB and Thunderbolt want to do bus specific actions on
> > > authorization toggle events. Certainly a default generic authorized
> > > attribute can be defined for all the other buses that don't have
> > > legacy here, but Thunderbolt will still require support for '2' as an
> > > authorized value, and USB will still want to base probe decisions on
> > > the authorization state of both the usb_device and the usb_interface.
> >
> > The USB part isn't really accurate (I can't speak for Thunderbolt).
> > When a usb_device is deauthorized, the device will be unconfigured,
> > deleting all its interfaces and removing the need for any probe
> > decisions about them.  In other words, the probe decision for a
> > usb_device or usb_interface depends only on the device's/interface's
> > own authorization state.
> >
> > True, the interface binding code does contain a test of the device's
> > authorization setting.  That test is redundant and can be removed.
> >
> > The actions that USB wants to take on authorization toggle events for
> > usb_devices are: for authorize, select and install a configuration;
> > for deauthorize, unconfigure the device.  Each of these could be
> > handled simply enough just by binding/unbinding the device.  (There
> > is some special code for handling wireless USB devices, but wireless
> > USB is now defunct.)
> 
> Ah, so are you saying that it would be sufficient for USB if the
> generic authorized implementation did something like:
> 
> dev->authorized = 1;
> device_attach(dev);
> 
> ...for the authorize case, and:
> 
> dev->authorize = 0;
> device_release_driver(dev);
> 
> ...for the deauthorize case?

Yes, I think so.  But I haven't tried making this change to test and 
see what really happens.

Alan Stern
Kuppuswamy Sathyanarayanan Oct. 1, 2021, 7:45 p.m. UTC | #19
On 10/1/21 12:00 PM, Alan Stern wrote:
> On Fri, Oct 01, 2021 at 11:09:52AM -0700, Dan Williams wrote:
>> On Fri, Oct 1, 2021 at 9:47 AM Alan Stern <stern@rowland.harvard.edu> wrote:
>>>
>>> On Fri, Oct 01, 2021 at 09:13:54AM -0700, Dan Williams wrote:
>>>> Bear with me, and perhaps it's a lack of imagination on my part, but I
>>>> don't see how to get to a globally generic "authorized" sysfs ABI
>>>> given that USB and Thunderbolt want to do bus specific actions on
>>>> authorization toggle events. Certainly a default generic authorized
>>>> attribute can be defined for all the other buses that don't have
>>>> legacy here, but Thunderbolt will still require support for '2' as an
>>>> authorized value, and USB will still want to base probe decisions on
>>>> the authorization state of both the usb_device and the usb_interface.
>>>
>>> The USB part isn't really accurate (I can't speak for Thunderbolt).
>>> When a usb_device is deauthorized, the device will be unconfigured,
>>> deleting all its interfaces and removing the need for any probe
>>> decisions about them.  In other words, the probe decision for a
>>> usb_device or usb_interface depends only on the device's/interface's
>>> own authorization state.
>>>
>>> True, the interface binding code does contain a test of the device's
>>> authorization setting.  That test is redundant and can be removed.
>>>
>>> The actions that USB wants to take on authorization toggle events for
>>> usb_devices are: for authorize, select and install a configuration;
>>> for deauthorize, unconfigure the device.  Each of these could be
>>> handled simply enough just by binding/unbinding the device.  (There
>>> is some special code for handling wireless USB devices, but wireless
>>> USB is now defunct.)
>>
>> Ah, so are you saying that it would be sufficient for USB if the
>> generic authorized implementation did something like:
>>
>> dev->authorized = 1;
>> device_attach(dev);
>>
>> ...for the authorize case, and:
>>
>> dev->authorize = 0;
>> device_release_driver(dev);
>>
>> ...for the deauthorize case?
> 
> Yes, I think so.  But I haven't tried making this change to test and
> see what really happens.
> 

For thunderbolt driver, it looks much more complicated. Unless you
define some callbacks in struct bus_type, we cannot easily generalize
it (but such callbacks are not recommended because it brings bus specific
operations to core layer).

sysfs_read()
-> simple read

sysfs_write()
-> tb_switch_set_authorized()
   -> disapprove_switch()
      -> tb_domain_disapprove_switch()
         -> tb->cm_ops->disapprove_switch() (product specific call)
   -> tb_domain_approve_switch_key()
      -> tb->cm_ops->add_switch_key
      -> tb->cm_ops->approve_switch() (product specific call)
   -> tb_domain_approve_switch()
      -> tb->cm_ops->approve_switch() (product specific call)
   -> tb_domain_challenge_switch_key()
      -> tb->cm_ops->challenge_switch_key()
      -> crypto_alloc_shash()
      -> crypto_shash_setkey()
      -> crypto_shash_digest()
      -> tb->cm_ops->approve_switch() (product specific call)


> Alan Stern
>
Dan Williams Oct. 1, 2021, 7:57 p.m. UTC | #20
On Fri, Oct 1, 2021 at 12:02 PM Alan Stern <stern@rowland.harvard.edu> wrote:
>
> On Fri, Oct 01, 2021 at 11:09:52AM -0700, Dan Williams wrote:
> > On Fri, Oct 1, 2021 at 9:47 AM Alan Stern <stern@rowland.harvard.edu> wrote:
> > >
> > > On Fri, Oct 01, 2021 at 09:13:54AM -0700, Dan Williams wrote:
> > > > Bear with me, and perhaps it's a lack of imagination on my part, but I
> > > > don't see how to get to a globally generic "authorized" sysfs ABI
> > > > given that USB and Thunderbolt want to do bus specific actions on
> > > > authorization toggle events. Certainly a default generic authorized
> > > > attribute can be defined for all the other buses that don't have
> > > > legacy here, but Thunderbolt will still require support for '2' as an
> > > > authorized value, and USB will still want to base probe decisions on
> > > > the authorization state of both the usb_device and the usb_interface.
> > >
> > > The USB part isn't really accurate (I can't speak for Thunderbolt).
> > > When a usb_device is deauthorized, the device will be unconfigured,
> > > deleting all its interfaces and removing the need for any probe
> > > decisions about them.  In other words, the probe decision for a
> > > usb_device or usb_interface depends only on the device's/interface's
> > > own authorization state.
> > >
> > > True, the interface binding code does contain a test of the device's
> > > authorization setting.  That test is redundant and can be removed.
> > >
> > > The actions that USB wants to take on authorization toggle events for
> > > usb_devices are: for authorize, select and install a configuration;
> > > for deauthorize, unconfigure the device.  Each of these could be
> > > handled simply enough just by binding/unbinding the device.  (There
> > > is some special code for handling wireless USB devices, but wireless
> > > USB is now defunct.)
> >
> > Ah, so are you saying that it would be sufficient for USB if the
> > generic authorized implementation did something like:
> >
> > dev->authorized = 1;
> > device_attach(dev);
> >
> > ...for the authorize case, and:
> >
> > dev->authorize = 0;
> > device_release_driver(dev);
> >
> > ...for the deauthorize case?
>
> Yes, I think so.  But I haven't tried making this change to test and
> see what really happens.

Sounds like a useful path for this effort to explore. Especially as
Greg seems to want the proposed "has_probe_authorization" flag in the
bus_type to disappear and make this all generic. It just seems that
Thunderbolt would need deeper surgery to move what it does in the
authorization toggle path into the probe and remove paths.

Mika, do you see a path for Thunderbolt to align its authorization
paths behind bus ->probe() ->remove() events similar to what USB might
be able to support for a generic authorization path?
Michael S. Tsirkin Oct. 2, 2021, 11:04 a.m. UTC | #21
On Fri, Oct 01, 2021 at 08:49:28AM -0700, Andi Kleen wrote:
> >   Do you have a list of specific drivers and kernel options that you
> > feel you now "trust"?
> 
> For TDX it's currently only virtio net/block/console
> 
> But we expect this list to grow slightly over time, but not at a high rate
> (so hopefully <10)

Well there are already >10 virtio drivers and I think it's reasonable
that all of these will be used with encrypted guests. The list will
grow.
Greg Kroah-Hartman Oct. 2, 2021, 11:14 a.m. UTC | #22
On Sat, Oct 02, 2021 at 07:04:28AM -0400, Michael S. Tsirkin wrote:
> On Fri, Oct 01, 2021 at 08:49:28AM -0700, Andi Kleen wrote:
> > >   Do you have a list of specific drivers and kernel options that you
> > > feel you now "trust"?
> > 
> > For TDX it's currently only virtio net/block/console
> > 
> > But we expect this list to grow slightly over time, but not at a high rate
> > (so hopefully <10)
> 
> Well there are already >10 virtio drivers and I think it's reasonable
> that all of these will be used with encrypted guests. The list will
> grow.

What is keeping "all" drivers from being on this list?  How exactly are
you determining what should, and should not, be allowed?  How can
drivers move on, or off, of it over time?

And why not just put all of that into userspace and have it pick and
choose?  That should be the end-goal here, you don't want to encode
policy like this in the kernel, right?

thanks,

greg k-h
Andi Kleen Oct. 2, 2021, 2:20 p.m. UTC | #23
On 10/2/2021 4:14 AM, Greg Kroah-Hartman wrote:
> On Sat, Oct 02, 2021 at 07:04:28AM -0400, Michael S. Tsirkin wrote:
>> On Fri, Oct 01, 2021 at 08:49:28AM -0700, Andi Kleen wrote:
>>>>    Do you have a list of specific drivers and kernel options that you
>>>> feel you now "trust"?
>>> For TDX it's currently only virtio net/block/console
>>>
>>> But we expect this list to grow slightly over time, but not at a high rate
>>> (so hopefully <10)
>> Well there are already >10 virtio drivers and I think it's reasonable
>> that all of these will be used with encrypted guests. The list will
>> grow.
> What is keeping "all" drivers from being on this list?

It would be too much work to harden them all, and it would be pointless 
because all these drivers are never legitimately needed in a virtualized 
environment which only virtualize a very small number of devices.

>   How exactly are
> you determining what should, and should not, be allowed?

Everything that has had reasonable effort at hardening can be added. But 
if someone proposes to add a driver that should trigger additional 
scrutiny in code review. We should also request them to do some fuzzing.

It's a bit similar to someone trying to add a new syscall interface. 
That also triggers much additional scrutiny for good reasons and people 
start fuzzing it.


>    How can
> drivers move on, or off, of it over time?

Adding something is submitting a patch to the allow list.

I'm not sure the "off" case would happen, unless the driver is 
completely removed, or maybe it has some unfixable security problem. But 
that is all rather unlikely.


>
> And why not just put all of that into userspace and have it pick and
> choose?  That should be the end-goal here, you don't want to encode
> policy like this in the kernel, right?

How would user space know what drivers have been hardened? This is 
really something that the kernel needs to determine. I don't think we 
can outsource it to anyone else.

Also BTW of course user space can still override it, but really the 
defaults should be a kernel policy.

There's also the additional problem that one of the goals of 
confidential guest is to just move existing guest virtual images into 
them without much changes. So it's better for such a case if as much as 
possible of the policy is in the kernel. But that's more a secondary 
consideration, the first point is really the important part.


-Andi
Greg Kroah-Hartman Oct. 2, 2021, 2:44 p.m. UTC | #24
On Sat, Oct 02, 2021 at 07:20:22AM -0700, Andi Kleen wrote:
> 
> On 10/2/2021 4:14 AM, Greg Kroah-Hartman wrote:
> > On Sat, Oct 02, 2021 at 07:04:28AM -0400, Michael S. Tsirkin wrote:
> > > On Fri, Oct 01, 2021 at 08:49:28AM -0700, Andi Kleen wrote:
> > > > >    Do you have a list of specific drivers and kernel options that you
> > > > > feel you now "trust"?
> > > > For TDX it's currently only virtio net/block/console
> > > > 
> > > > But we expect this list to grow slightly over time, but not at a high rate
> > > > (so hopefully <10)
> > > Well there are already >10 virtio drivers and I think it's reasonable
> > > that all of these will be used with encrypted guests. The list will
> > > grow.
> > What is keeping "all" drivers from being on this list?
> 
> It would be too much work to harden them all, and it would be pointless
> because all these drivers are never legitimately needed in a virtualized
> environment which only virtualize a very small number of devices.

Why would you not want to properly review and fix up all kernel drivers?
That feels like you are being lazy.

What exactly are you meaning by "harden"?  Why isn't it automated?  Who
is doing this work?  Where is it being done?

Come on, you have a small number of virtio drivers, to somehow say that
they are now divided up into trusted/untrusted feels very very odd.

Just do the real work here, everyone will benefit, including yourself.

> >   How exactly are
> > you determining what should, and should not, be allowed?
> 
> Everything that has had reasonable effort at hardening can be added. But if
> someone proposes to add a driver that should trigger additional scrutiny in
> code review. We should also request them to do some fuzzing.

You can provide that fuzzing right now, why isn't syzbot running on
these interfaces today?

And again, what _exactly_ do you all mean by "hardening" that has
happened?  Where is that documented and who did that work?

> > And why not just put all of that into userspace and have it pick and
> > choose?  That should be the end-goal here, you don't want to encode
> > policy like this in the kernel, right?
> 
> How would user space know what drivers have been hardened? This is really
> something that the kernel needs to determine. I don't think we can outsource
> it to anyone else.

It would "know" just as well as you know today in the kernel.  There is
no difference here.

Just do the real work here, and "harden" all of the virtio drivers
please.  What prevents that?

> Also BTW of course user space can still override it, but really the defaults
> should be a kernel policy.
> 
> There's also the additional problem that one of the goals of confidential
> guest is to just move existing guest virtual images into them without much
> changes. So it's better for such a case if as much as possible of the policy
> is in the kernel. But that's more a secondary consideration, the first point
> is really the important part.

Where exactly are all of these "goals" and "requirements" documented and
who is defining them and forcing them on us without actually doing any
of the work involved?

thanks,

greg k-h
Michael S. Tsirkin Oct. 2, 2021, 6:40 p.m. UTC | #25
On Sat, Oct 02, 2021 at 07:20:22AM -0700, Andi Kleen wrote:
> 
> On 10/2/2021 4:14 AM, Greg Kroah-Hartman wrote:
> > On Sat, Oct 02, 2021 at 07:04:28AM -0400, Michael S. Tsirkin wrote:
> > > On Fri, Oct 01, 2021 at 08:49:28AM -0700, Andi Kleen wrote:
> > > > >    Do you have a list of specific drivers and kernel options that you
> > > > > feel you now "trust"?
> > > > For TDX it's currently only virtio net/block/console
> > > > 
> > > > But we expect this list to grow slightly over time, but not at a high rate
> > > > (so hopefully <10)
> > > Well there are already >10 virtio drivers and I think it's reasonable
> > > that all of these will be used with encrypted guests. The list will
> > > grow.
> > What is keeping "all" drivers from being on this list?
> 
> It would be too much work to harden them all, and it would be pointless
> because all these drivers are never legitimately needed in a virtualized
> environment which only virtualize a very small number of devices.
> 
> >   How exactly are
> > you determining what should, and should not, be allowed?
> 
> Everything that has had reasonable effort at hardening can be added. But if
> someone proposes to add a driver that should trigger additional scrutiny in
> code review. We should also request them to do some fuzzing.

Looks like out of tree modules get a free pass then.
Which is exactly the reverse of what it should be,
people who spent the time to get their drivers into the kernel
expect that if kernel decides to change some API their
driver is automatically updated. This was always the social
contract, was it not?

> It's a bit similar to someone trying to add a new syscall interface. That
> also triggers much additional scrutiny for good reasons and people start
> fuzzing it.
> 
> 
> >    How can
> > drivers move on, or off, of it over time?
> 
> Adding something is submitting a patch to the allow list.
> 
> I'm not sure the "off" case would happen, unless the driver is completely
> removed, or maybe it has some unfixable security problem. But that is all
> rather unlikely.
> 
> 
> > 
> > And why not just put all of that into userspace and have it pick and
> > choose?  That should be the end-goal here, you don't want to encode
> > policy like this in the kernel, right?
> 
> How would user space know what drivers have been hardened? This is really
> something that the kernel needs to determine. I don't think we can outsource
> it to anyone else.

IIUC userspace is the distro. It can also do more than a binary on/off,
e.g. it can decide "only virtio", "no out of tree drivers".
A distro can also ship configs with a specific features
enabled/disabled. E.g. I can see where some GPU drivers will be
included by some distros since they are so useful, and excluded
by others since they are so big and hard to audit.
I don't see how the kernel can reasonably make a stand here.
Is "some audit and some fuzzing" a good policy? How much is enough?

> Also BTW of course user space can still override it, but really the defaults
> should be a kernel policy.

Well if userspace sets the policy then I'm not sure we also want
a kernel one ... but if yes I'd like it to be in a central
place so whoever is building the kernel can tweak it easily
and rebuild, without poking at individual drivers.

> There's also the additional problem that one of the goals of confidential
> guest is to just move existing guest virtual images into them without much
> changes. So it's better for such a case if as much as possible of the policy
> is in the kernel.

If it's e.g. the kernel command line then we can set that
when building the kernel right? No need for a dedicated interface
just for this ...

> But that's more a secondary consideration, the first point
> is really the important part.
> 
> 
> -Andi
Greg Kroah-Hartman Oct. 3, 2021, 6:40 a.m. UTC | #26
On Sat, Oct 02, 2021 at 02:40:55PM -0400, Michael S. Tsirkin wrote:
> On Sat, Oct 02, 2021 at 07:20:22AM -0700, Andi Kleen wrote:
> > 
> > On 10/2/2021 4:14 AM, Greg Kroah-Hartman wrote:
> > > On Sat, Oct 02, 2021 at 07:04:28AM -0400, Michael S. Tsirkin wrote:
> > > > On Fri, Oct 01, 2021 at 08:49:28AM -0700, Andi Kleen wrote:
> > > > > >    Do you have a list of specific drivers and kernel options that you
> > > > > > feel you now "trust"?
> > > > > For TDX it's currently only virtio net/block/console
> > > > > 
> > > > > But we expect this list to grow slightly over time, but not at a high rate
> > > > > (so hopefully <10)
> > > > Well there are already >10 virtio drivers and I think it's reasonable
> > > > that all of these will be used with encrypted guests. The list will
> > > > grow.
> > > What is keeping "all" drivers from being on this list?
> > 
> > It would be too much work to harden them all, and it would be pointless
> > because all these drivers are never legitimately needed in a virtualized
> > environment which only virtualize a very small number of devices.
> > 
> > >   How exactly are
> > > you determining what should, and should not, be allowed?
> > 
> > Everything that has had reasonable effort at hardening can be added. But if
> > someone proposes to add a driver that should trigger additional scrutiny in
> > code review. We should also request them to do some fuzzing.
> 
> Looks like out of tree modules get a free pass then.

That's not good.  As we already know if a module is in or out of the
tree, you all should be banning all out-of-tree modules if you care
about these things.  That should be very easy to do if you care.

> > How would user space know what drivers have been hardened? This is really
> > something that the kernel needs to determine. I don't think we can outsource
> > it to anyone else.
> 
> IIUC userspace is the distro. It can also do more than a binary on/off,
> e.g. it can decide "only virtio", "no out of tree drivers".
> A distro can also ship configs with a specific features
> enabled/disabled. E.g. I can see where some GPU drivers will be
> included by some distros since they are so useful, and excluded
> by others since they are so big and hard to audit.
> I don't see how the kernel can reasonably make a stand here.
> Is "some audit and some fuzzing" a good policy? How much is enough?

Agreed, that is why the policy for this should be in userspace.

> Well if userspace sets the policy then I'm not sure we also want
> a kernel one ... but if yes I'd like it to be in a central
> place so whoever is building the kernel can tweak it easily
> and rebuild, without poking at individual drivers.

And here I thought the requirement was that no one could rebuild their
kernel as it was provided by someone else.

Again, these requirements seem contradicting, but as no one has actually
pointed me at the real list of them, who knows what they are?

thanks,

greg k-h
Mika Westerberg Oct. 4, 2021, 5:16 a.m. UTC | #27
Hi,

On Fri, Oct 01, 2021 at 12:57:18PM -0700, Dan Williams wrote:
> > > Ah, so are you saying that it would be sufficient for USB if the
> > > generic authorized implementation did something like:
> > >
> > > dev->authorized = 1;
> > > device_attach(dev);
> > >
> > > ...for the authorize case, and:
> > >
> > > dev->authorize = 0;
> > > device_release_driver(dev);
> > >
> > > ...for the deauthorize case?
> >
> > Yes, I think so.  But I haven't tried making this change to test and
> > see what really happens.
> 
> Sounds like a useful path for this effort to explore. Especially as
> Greg seems to want the proposed "has_probe_authorization" flag in the
> bus_type to disappear and make this all generic. It just seems that
> Thunderbolt would need deeper surgery to move what it does in the
> authorization toggle path into the probe and remove paths.
> 
> Mika, do you see a path for Thunderbolt to align its authorization
> paths behind bus ->probe() ->remove() events similar to what USB might
> be able to support for a generic authorization path?

In Thunderbolt "authorization" actually means whether there is a PCIe
tunnel to the device or not. There is no driver bind/unbind happening
when authorization toggles (well on Thunderbolt bus, there can be on PCI
bus after the tunnel is established) so I'm not entirely sure how we
could use the bus ->probe() or ->remove for that to be honest.
Dan Williams Oct. 4, 2021, 9:04 p.m. UTC | #28
On Sat, Oct 2, 2021 at 7:20 AM Andi Kleen <ak@linux.intel.com> wrote:
>
>
> On 10/2/2021 4:14 AM, Greg Kroah-Hartman wrote:
> > On Sat, Oct 02, 2021 at 07:04:28AM -0400, Michael S. Tsirkin wrote:
> >> On Fri, Oct 01, 2021 at 08:49:28AM -0700, Andi Kleen wrote:
> >>>>    Do you have a list of specific drivers and kernel options that you
> >>>> feel you now "trust"?
> >>> For TDX it's currently only virtio net/block/console
> >>>
> >>> But we expect this list to grow slightly over time, but not at a high rate
> >>> (so hopefully <10)
> >> Well there are already >10 virtio drivers and I think it's reasonable
> >> that all of these will be used with encrypted guests. The list will
> >> grow.
> > What is keeping "all" drivers from being on this list?
>
> It would be too much work to harden them all, and it would be pointless
> because all these drivers are never legitimately needed in a virtualized
> environment which only virtualize a very small number of devices.
>
> >   How exactly are
> > you determining what should, and should not, be allowed?
>
> Everything that has had reasonable effort at hardening can be added. But
> if someone proposes to add a driver that should trigger additional
> scrutiny in code review. We should also request them to do some fuzzing.
>
> It's a bit similar to someone trying to add a new syscall interface.
> That also triggers much additional scrutiny for good reasons and people
> start fuzzing it.
>
>
> >    How can
> > drivers move on, or off, of it over time?
>
> Adding something is submitting a patch to the allow list.
>
> I'm not sure the "off" case would happen, unless the driver is
> completely removed, or maybe it has some unfixable security problem. But
> that is all rather unlikely.
>
>
> >
> > And why not just put all of that into userspace and have it pick and
> > choose?  That should be the end-goal here, you don't want to encode
> > policy like this in the kernel, right?
>
> How would user space know what drivers have been hardened? This is
> really something that the kernel needs to determine. I don't think we
> can outsource it to anyone else.

How it is outsourcing by moving that same allow list over the kernel /
user boundary. It can be maintained by the same engineers and get
deployed by something like:

dracut --authorize-device-list=confidential-computing-default $kernel-version

With that distributions can deploy kernel-specific authorizations and
admins can deploy site-specific authorizations. Then the kernel
implementation is minimized to authorize just enough drivers by
default to let userspace take over the policy.

> Also BTW of course user space can still override it, but really the
> defaults should be a kernel policy.

The default is secure, trust nothing but bootstrap devices.

> There's also the additional problem that one of the goals of
> confidential guest is to just move existing guest virtual images into
> them without much changes. So it's better for such a case if as much as
> possible of the policy is in the kernel. But that's more a secondary
> consideration, the first point is really the important part.

The same image can be used on host and guest in this "do it in
userspace" proposal.
Dan Williams Oct. 5, 2021, 10:33 p.m. UTC | #29
On Sun, Oct 3, 2021 at 10:16 PM Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
>
> Hi,
>
> On Fri, Oct 01, 2021 at 12:57:18PM -0700, Dan Williams wrote:
> > > > Ah, so are you saying that it would be sufficient for USB if the
> > > > generic authorized implementation did something like:
> > > >
> > > > dev->authorized = 1;
> > > > device_attach(dev);
> > > >
> > > > ...for the authorize case, and:
> > > >
> > > > dev->authorize = 0;
> > > > device_release_driver(dev);
> > > >
> > > > ...for the deauthorize case?
> > >
> > > Yes, I think so.  But I haven't tried making this change to test and
> > > see what really happens.
> >
> > Sounds like a useful path for this effort to explore. Especially as
> > Greg seems to want the proposed "has_probe_authorization" flag in the
> > bus_type to disappear and make this all generic. It just seems that
> > Thunderbolt would need deeper surgery to move what it does in the
> > authorization toggle path into the probe and remove paths.
> >
> > Mika, do you see a path for Thunderbolt to align its authorization
> > paths behind bus ->probe() ->remove() events similar to what USB might
> > be able to support for a generic authorization path?
>
> In Thunderbolt "authorization" actually means whether there is a PCIe
> tunnel to the device or not. There is no driver bind/unbind happening
> when authorization toggles (well on Thunderbolt bus, there can be on PCI
> bus after the tunnel is established) so I'm not entirely sure how we
> could use the bus ->probe() or ->remove for that to be honest.

Greg, per your comment:

"... which was to move the way that busses are allowed to authorize
the devices they wish to control into a generic way instead of being
bus-specific logic."

We have USB and TB that have already diverged on the ABI here. The USB
behavior is more in line with the "probe authorization" concept, while
TB is about tunnel establishment and not cleanly tied to probe
authorization. So while I see a path to a common authorization
implementation for USB and other buses (per the insight from Alan), TB
needs to retain the ability to record the authorization state as an
enum rather than a bool, and emit a uevent on authorization status
change.

So how about something like the following that moves the attribute
into the core, but still calls back to TB and USB to perform their
legacy authorization work. This new authorized attribute only shows up
when devices default to not authorized, i.e. when userspace owns the
allow list past critical-boot built-in drivers, or if the bus (USB /
TB) implements ->authorize().


diff --git a/drivers/base/core.c b/drivers/base/core.c
index e65dd803a453..8f8fbe2637d1 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2414,6 +2414,58 @@ static ssize_t online_store(struct device *dev,
struct device_attribute *attr,
 }
 static DEVICE_ATTR_RW(online);

+static ssize_t authorized_show(struct device *dev,
+                              struct device_attribute *attr, char *buf)
+{
+       return sysfs_emit(buf, "%u\n", dev->authorized);
+}
+
+static ssize_t authorized_store(struct device *dev,
+                               struct device_attribute *attr, const char *buf,
+                               size_t count)
+{
+       unsigned int val, save;
+       ssize_t rc;
+
+       rc = kstrtouint(buf, 0, &val);
+       if (rc < 0)
+               return rc;
+
+       /* some buses (Thunderbolt) support authorized values > 1 */
+       if (val > 1 && !dev->bus->authorize)
+               return -EINVAL;
+
+       device_lock(dev);
+       save = dev->authorized;
+       if (save == val) {
+               rc = count;
+               goto err;
+       }
+
+       dev->authorized = val;
+       if (dev->bus->authorize) {
+               /* notify bus about change in authorization state */
+               rc = dev->bus->authorize(dev);
+               if (rc) {
+                       dev->authorized = save;
+                       goto err;
+               }
+       }
+       device_unlock(dev);
+
+       if (dev->authorized) {
+               if (!device_attach(dev))
+                       dev_dbg(dev, "failed to probe after authorize\n");
+       } else
+               device_release_driver(dev);
+       return count;
+
+err:
+       device_unlock(dev);
+       return rc < 0 ? rc : count;
+}
+static DEVICE_ATTR_RW(authorized);
+
 static ssize_t removable_show(struct device *dev, struct
device_attribute *attr,
                              char *buf)
 {
@@ -2616,8 +2668,16 @@ static int device_add_attrs(struct device *dev)
                        goto err_remove_dev_waiting_for_supplier;
        }

+       if (dev_needs_authorization(dev)) {
+               error = device_create_file(dev, &dev_attr_authorized);
+               if (error)
+                       goto err_remove_dev_removable;
+       }
+
        return 0;

+ err_remove_dev_removable:
+       device_remove_file(dev, &dev_attr_removable);
  err_remove_dev_waiting_for_supplier:
        device_remove_file(dev, &dev_attr_waiting_for_supplier);
  err_remove_dev_online:
@@ -2639,6 +2699,7 @@ static void device_remove_attrs(struct device *dev)
        struct class *class = dev->class;
        const struct device_type *type = dev->type;

+       device_remove_file(dev, &dev_attr_authorized);
        device_remove_file(dev, &dev_attr_removable);
        device_remove_file(dev, &dev_attr_waiting_for_supplier);
        device_remove_file(dev, &dev_attr_online);
@@ -2805,6 +2866,8 @@ static void klist_children_put(struct klist_node *n)
        put_device(dev);
 }

+unsigned int dev_default_authorization;
+
 /**
  * device_initialize - init device structure.
  * @dev: device.
diff --git a/include/linux/device.h b/include/linux/device.h
index e270cb740b9e..fbb83e46af9d 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -561,6 +561,7 @@ struct device {
        struct dev_iommu        *iommu;

        enum device_removable   removable;
+       unsigned int            authorized;

        bool                    offline_disabled:1;
        bool                    offline:1;
@@ -814,6 +815,19 @@ static inline bool dev_removable_is_valid(struct
device *dev)
        return dev->removable != DEVICE_REMOVABLE_NOT_SUPPORTED;
 }

+extern unsigned int dev_default_authorization;
+
+/*
+ * If the bus has custom authorization, or if devices default to not
+ * authorized, register the 'authorized' attribute for @dev.
+ */
+static inline bool dev_needs_authorization(struct device *dev)
+{
+       if (dev->bus->authorize || dev_default_authorization == 0)
+               return true;
+       return false;
+}
+
 /*
  * High level routines for use by the bus drivers
  */
diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
index 062777a45a74..3202a2b13374 100644
--- a/include/linux/device/bus.h
+++ b/include/linux/device/bus.h
@@ -40,6 +40,11 @@ struct fwnode_handle;
  *             that generate uevents to add the environment variables.
  * @probe:     Called when a new device or driver add to this bus, and callback
  *             the specific driver's probe to initial the matched device.
+ * @authorize: Called after authorized_store() changes the
+ *             authorization state of the device. Do not use for new
+ *             bus implementations, revalidate dev->authorized in
+ *             @probe and @remove to take any bus specific
+ *             authorization actions.
  * @sync_state:        Called to sync device state to software state
after all the
  *             state tracking consumers linked to this device (present at
  *             the time of late_initcall) have successfully bound to a
@@ -90,6 +95,7 @@ struct bus_type {
        int (*match)(struct device *dev, struct device_driver *drv);
        int (*uevent)(struct device *dev, struct kobj_uevent_env *env);
        int (*probe)(struct device *dev);
+       int (*authorize)(struct device *dev);
        void (*sync_state)(struct device *dev);
        void (*remove)(struct device *dev);
        void (*shutdown)(struct device *dev);
Greg Kroah-Hartman Oct. 6, 2021, 5:45 a.m. UTC | #30
On Tue, Oct 05, 2021 at 03:33:29PM -0700, Dan Williams wrote:
> On Sun, Oct 3, 2021 at 10:16 PM Mika Westerberg
> <mika.westerberg@linux.intel.com> wrote:
> >
> > Hi,
> >
> > On Fri, Oct 01, 2021 at 12:57:18PM -0700, Dan Williams wrote:
> > > > > Ah, so are you saying that it would be sufficient for USB if the
> > > > > generic authorized implementation did something like:
> > > > >
> > > > > dev->authorized = 1;
> > > > > device_attach(dev);
> > > > >
> > > > > ...for the authorize case, and:
> > > > >
> > > > > dev->authorize = 0;
> > > > > device_release_driver(dev);
> > > > >
> > > > > ...for the deauthorize case?
> > > >
> > > > Yes, I think so.  But I haven't tried making this change to test and
> > > > see what really happens.
> > >
> > > Sounds like a useful path for this effort to explore. Especially as
> > > Greg seems to want the proposed "has_probe_authorization" flag in the
> > > bus_type to disappear and make this all generic. It just seems that
> > > Thunderbolt would need deeper surgery to move what it does in the
> > > authorization toggle path into the probe and remove paths.
> > >
> > > Mika, do you see a path for Thunderbolt to align its authorization
> > > paths behind bus ->probe() ->remove() events similar to what USB might
> > > be able to support for a generic authorization path?
> >
> > In Thunderbolt "authorization" actually means whether there is a PCIe
> > tunnel to the device or not. There is no driver bind/unbind happening
> > when authorization toggles (well on Thunderbolt bus, there can be on PCI
> > bus after the tunnel is established) so I'm not entirely sure how we
> > could use the bus ->probe() or ->remove for that to be honest.
> 
> Greg, per your comment:
> 
> "... which was to move the way that busses are allowed to authorize
> the devices they wish to control into a generic way instead of being
> bus-specific logic."
> 
> We have USB and TB that have already diverged on the ABI here. The USB
> behavior is more in line with the "probe authorization" concept, while
> TB is about tunnel establishment and not cleanly tied to probe
> authorization. So while I see a path to a common authorization
> implementation for USB and other buses (per the insight from Alan), TB
> needs to retain the ability to record the authorization state as an
> enum rather than a bool, and emit a uevent on authorization status
> change.
> 
> So how about something like the following that moves the attribute
> into the core, but still calls back to TB and USB to perform their
> legacy authorization work. This new authorized attribute only shows up
> when devices default to not authorized, i.e. when userspace owns the
> allow list past critical-boot built-in drivers, or if the bus (USB /
> TB) implements ->authorize().

At quick glance, this looks better, but it would be good to see someone
test it :)

thanks,

greg k-h
diff mbox series

Patch

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 588e02fb91d3..377b0ccdc503 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -5,6 +5,8 @@ 
 #include <linux/module.h>
 #include <linux/idr.h>
 #include <linux/of.h>
+#include <linux/cc_platform.h>
+#include <linux/device.h>
 #include <uapi/linux/virtio_ids.h>
 
 /* Unique numbering for virtio devices. */
@@ -390,6 +392,13 @@  int register_virtio_device(struct virtio_device *dev)
 	dev->config_enabled = false;
 	dev->config_change_pending = false;
 
+	/*
+	 * For Confidential guest (like TDX), virtio devices are
+	 * trusted. So set authorized status as true.
+	 */
+	if (cc_platform_has(CC_ATTR_GUEST_DEVICE_FILTER))
+		dev->dev.authorized = true;
+
 	/* We always start by resetting the device, in case a previous
 	 * driver messed it up.  This also tests that code path a little. */
 	dev->config->reset(dev);