diff mbox

tpm: add support for partial reads

Message ID 153201555276.20155.1352499992826895966.stgit@tstruk-mobl1.jf.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Tadeusz Struk July 19, 2018, 3:52 p.m. UTC
Currently to read a response from the TPM device an application needs
provide "big enough" buffer for the whole response and read it in one go.
The application doesn't know how big the response it beforehand so it
always needs to maintain a 4K buffer and read the max (4K).
In case if the user of the TSS library doesn't provide big enough buffer
the TCTI spec says that the library should set the required size and return
TSS2_TCTI_RC_INSUFFICIENT_BUFFER error code so that the application could
allocate a bigger buffer and call receive again.
To make it possible in the TSS library this requires being able to do
partial reads from the driver.
The library would read the header first to get the actual size of the
response from the header and then read the rest of the response.
This patch adds support for partial reads.

The usecase is implemented in this TSS commit:
https://github.com/tpm2-software/tpm2-tss/commit/ce982f67a67dc08e24683d30b05800648d8a264c

Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
---
 drivers/char/tpm/tpm-dev-common.c |   32 +++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

Comments

Tadeusz Struk July 19, 2018, 3:55 p.m. UTC | #1
On 07/19/2018 08:52 AM, Tadeusz Struk wrote:
> Currently to read a response from the TPM device an application needs
> provide "big enough" buffer for the whole response and read it in one go.
> The application doesn't know how big the response it beforehand so it
> always needs to maintain a 4K buffer and read the max (4K).
> In case if the user of the TSS library doesn't provide big enough buffer
> the TCTI spec says that the library should set the required size and return
> TSS2_TCTI_RC_INSUFFICIENT_BUFFER error code so that the application could
> allocate a bigger buffer and call receive again.
> To make it possible in the TSS library this requires being able to do
> partial reads from the driver.
> The library would read the header first to get the actual size of the
> response from the header and then read the rest of the response.
> This patch adds support for partial reads.
> 
> The usecase is implemented in this TSS commit:
> https://github.com/tpm2-software/tpm2-tss/commit/ce982f67a67dc08e24683d30b05800648d8a264c
> 
> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>

This needs to go on top of this:
https://patchwork.kernel.org/patch/10460863/

Thanks,
James Bottomley July 19, 2018, 5:19 p.m. UTC | #2
On Thu, 2018-07-19 at 08:52 -0700, Tadeusz Struk wrote:
> Currently to read a response from the TPM device an application needs
> provide "big enough" buffer for the whole response and read it in one
> go.  The application doesn't know how big the response it beforehand
> so it always needs to maintain a 4K buffer and read the max (4K).
> In case if the user of the TSS library doesn't provide big enough
> buffer the TCTI spec says that the library should set the required
> size and return TSS2_TCTI_RC_INSUFFICIENT_BUFFER error code so that
> the application could allocate a bigger buffer and call receive
> again. To make it possible in the TSS library this requires being
> able to do partial reads from the driver.
> The library would read the header first to get the actual size of the
> response from the header and then read the rest of the response.
> This patch adds support for partial reads.
> 
> The usecase is implemented in this TSS commit:
> https://github.com/tpm2-software/tpm2-tss/commit/ce982f67a67dc08e2468
> 3d30b05800648d8a264c

That's just an implementation, though, what's the use case?

I'm curious because all the TPM applications I've written need to be
aware of TPM2B_MAX_BUFFER_SIZE, which is related to MAX_RESPONSE_SIZE
because you can't go over that for big buffer commands (mostly sealing
and unsealing).

The TCG supporting routines define MAX_RESPONSE_SIZE to be 4096, so you
know absolutely how large a buffer you have to have ... and the value
is rather handy for us because if it were larger we'd have to do
scatter gather.

I think the point is that knowing the max buffer size allows us to
behave like UDP: if your packet is the wrong size it gets dropped and
relieves the applications from having to do fragmentation and
reassembly.  Since the max size is so low, what's the benefit of not
assuming the application has to know it?

James
Tadeusz Struk July 19, 2018, 5:54 p.m. UTC | #3
On 07/19/2018 10:19 AM, James Bottomley wrote:
> That's just an implementation, though, what's the use case?

Hi James,
The use case is described in the TCTI spec [1] in the
"3.2.5.2 receive" section.

Thanks,
James Bottomley July 19, 2018, 6:47 p.m. UTC | #4
On Thu, 2018-07-19 at 10:54 -0700, Tadeusz Struk wrote:
> On 07/19/2018 10:19 AM, James Bottomley wrote:
> > That's just an implementation, though, what's the use case?
> 
> Hi James,
> The use case is described in the TCTI spec [1] in the
> "3.2.5.2 receive" section.

Well, yes, but I think we've all agreed that the /dev/tpm and
/dev/tpmrmX aren't TCTI interfaces, although you can layer TCTI on top
of them, so why not simply do fragmentation on top if you need it?

The reason for not doing it in the interface is that it alters the ABI.
 Before this patch we had a hard packet boundary: one packet per read,
one per write and a -EFAULT if you fail to provide a correctly sized
buffer.  Now if you provide a buffer too small but don't know about the
fragmentation you're going to misprocess a packet (because you think
you got a whole reply but you didn't) and then you get a -EBUSY on your
next command which you don't know how to handle.  The only way out is
to update the applications to handle the new behaviour, which is a no-
no in Linux ABI terms.

It might be possible to layer the behaviour you want compatibly into
the current device structure (say an ioctl to switch to the fragment
behaviour) but I've got to ask why we'd go to the complexity without a
use case?

James
Tadeusz Struk July 19, 2018, 7:05 p.m. UTC | #5
On 07/19/2018 11:47 AM, James Bottomley wrote:
> On Thu, 2018-07-19 at 10:54 -0700, Tadeusz Struk wrote:
>> On 07/19/2018 10:19 AM, James Bottomley wrote:
>>> That's just an implementation, though, what's the use case?
>>
>> Hi James,
>> The use case is described in the TCTI spec [1] in the
>> "3.2.5.2 receive" section.
> 
> Well, yes, but I think we've all agreed that the /dev/tpm and
> /dev/tpmrmX aren't TCTI interfaces, although you can layer TCTI on top
> of them, so why not simply do fragmentation on top if you need it?
> 
> The reason for not doing it in the interface is that it alters the ABI.
>  Before this patch we had a hard packet boundary: one packet per read,
> one per write and a -EFAULT if you fail to provide a correctly sized
> buffer.  Now if you provide a buffer too small but don't know about the
> fragmentation you're going to misprocess a packet (because you think
> you got a whole reply but you didn't) and then you get a -EBUSY on your
> next command which you don't know how to handle.  The only way out is
> to update the applications to handle the new behaviour, which is a no-
> no in Linux ABI terms.

Don't all the existing applications that read a response in one go
do a 4K read now? So nothing will change for them. They will work
exactly the same with this change as they do without it.
This doesn't break the ABI.

> 
> It might be possible to layer the behaviour you want compatibly into
> the current device structure (say an ioctl to switch to the fragment
> behaviour) but I've got to ask why we'd go to the complexity without a
> use case?

New IOCTL would add extra complexity, which isn't necessary.

Thanks,
James Bottomley July 19, 2018, 7:52 p.m. UTC | #6
On Thu, 2018-07-19 at 12:05 -0700, Tadeusz Struk wrote:
> On 07/19/2018 11:47 AM, James Bottomley wrote:
> > On Thu, 2018-07-19 at 10:54 -0700, Tadeusz Struk wrote:
> > > On 07/19/2018 10:19 AM, James Bottomley wrote:
> > > > That's just an implementation, though, what's the use case?
> > > 
> > > Hi James,
> > > The use case is described in the TCTI spec [1] in the
> > > "3.2.5.2 receive" section.
> > 
> > Well, yes, but I think we've all agreed that the /dev/tpm and
> > /dev/tpmrmX aren't TCTI interfaces, although you can layer TCTI on
> > top of them, so why not simply do fragmentation on top if you need
> > it?
> > 
> > The reason for not doing it in the interface is that it alters the
> > ABI.  Before this patch we had a hard packet boundary: one packet
> > per read, one per write and a -EFAULT if you fail to provide a
> > correctly sized buffer.  Now if you provide a buffer too small but
> > don't know about the fragmentation you're going to misprocess a
> > packet (because you think you got a whole reply but you didn't) and
> > then you get a -EBUSY on your next command which you don't know how
> > to handle.  The only way out is to update the applications to
> > handle the new behaviour, which is a no-no in Linux ABI terms.
> 
> Don't all the existing applications that read a response in one go
> do a 4K read now? So nothing will change for them. They will work
> exactly the same with this change as they do without it.
> This doesn't break the ABI.

The ABI break is the error case as I outlined above.  We can't assume
everyone uses the current interface without getting an error and one
error and your hosed is a nasty failure case to change the interface
to.  Plus, if you assume everyone is passing 4k buffers, why would you
even need the fragmentation case?

> > It might be possible to layer the behaviour you want compatibly
> > into the current device structure (say an ioctl to switch to the
> > fragment behaviour) but I've got to ask why we'd go to the
> > complexity without a use case?
> 
> New IOCTL would add extra complexity, which isn't necessary.

So what's wrong with fragmenting in the layer above the device driver
(in userspace) and not actually changing the kernel?

James
Tadeusz Struk July 19, 2018, 8:12 p.m. UTC | #7
On 07/19/2018 12:52 PM, James Bottomley wrote:
> The ABI break is the error case as I outlined above.  We can't assume
> everyone uses the current interface without getting an error and one
> error and your hosed is a nasty failure case to change the interface
> to. 

Well, if there is a broken application out there that doesn't work today
it will not work after this change neither.

> Plus, if you assume everyone is passing 4k buffers, why would you
> even need the fragmentation case?

So that people don't need to do this anymore and we can run a
spec compliant TCTI on top of /dev/tpm<N>.

> 
>>> It might be possible to layer the behaviour you want compatibly
>>> into the current device structure (say an ioctl to switch to the
>>> fragment behaviour) but I've got to ask why we'd go to the
>>> complexity without a use case?
>> New IOCTL would add extra complexity, which isn't necessary.
> So what's wrong with fragmenting in the layer above the device driver
> (in userspace) and not actually changing the kernel?

Because it is much easier to implement in the driver, and
we can run a spec compliant TCTI on top of /dev/tpm<N>.

Thanks,
James Bottomley July 19, 2018, 8:27 p.m. UTC | #8
On Thu, 2018-07-19 at 13:12 -0700, Tadeusz Struk wrote:
> On 07/19/2018 12:52 PM, James Bottomley wrote:
> > The ABI break is the error case as I outlined above.  We can't
> > assume everyone uses the current interface without getting an error
> > and one error and your hosed is a nasty failure case to change the
> > interface to. 
> 
> Well, if there is a broken application out there that doesn't work
> today it will not work after this change neither.

It doesn't have to be broken ... it could be using EFAULT to probe the
buffer size for instance.  That's the point of not breaking the ABI:
you don't second guess what applications are doing.

James
Tadeusz Struk July 19, 2018, 9:01 p.m. UTC | #9
On 07/19/2018 01:27 PM, James Bottomley wrote:
> On Thu, 2018-07-19 at 13:12 -0700, Tadeusz Struk wrote:
>> On 07/19/2018 12:52 PM, James Bottomley wrote:
>>> The ABI break is the error case as I outlined above.  We can't
>>> assume everyone uses the current interface without getting an error
>>> and one error and your hosed is a nasty failure case to change the
>>> interface to. 
>>
>> Well, if there is a broken application out there that doesn't work
>> today it will not work after this change neither.
> 
> It doesn't have to be broken ... it could be using EFAULT to probe the
> buffer size for instance.  That's the point of not breaking the ABI:
> you don't second guess what applications are doing.
> 

Looking at the existing implementation again:
https://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git/tree/drivers/char/tpm/tpm-dev-common.c?h=next-tpm#n56

EFAULT is returned only if the copy_to_user() fails.
So today, if an application wants read 1 byte of a response, and provides
1 byte buffer for it, then only 1 byte of the response will be copied,
no error code will be returned, and the rest of the response will be gone.
I don't really see how and why would anyone use EFAULT err to probe for
the buffer size. That would really be a broken application.

Thanks,
Jarkko Sakkinen July 23, 2018, 8:19 p.m. UTC | #10
On Thu, Jul 19, 2018 at 08:52:32AM -0700, Tadeusz Struk wrote:
> Currently to read a response from the TPM device an application needs
> provide "big enough" buffer for the whole response and read it in one go.
> The application doesn't know how big the response it beforehand so it
> always needs to maintain a 4K buffer and read the max (4K).
> In case if the user of the TSS library doesn't provide big enough buffer
> the TCTI spec says that the library should set the required size and return
> TSS2_TCTI_RC_INSUFFICIENT_BUFFER error code so that the application could
> allocate a bigger buffer and call receive again.
> To make it possible in the TSS library this requires being able to do
> partial reads from the driver.
> The library would read the header first to get the actual size of the
> response from the header and then read the rest of the response.
> This patch adds support for partial reads.
> 
> The usecase is implemented in this TSS commit:
> https://github.com/tpm2-software/tpm2-tss/commit/ce982f67a67dc08e24683d30b05800648d8a264c
> 
> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>

For non-blocking operation I see the benefit because it does not break
the ABI and it really simplifies threading in the user space.

In this case I do not have any major evidence of any major benefit *and*
the change breaks the ABI.

Linux does not *have* to implement in kernel level every tidbit of the
TCG spec but it *can* provide support in places where it makes sense
and things do not break.

/Jarkko
Tadeusz Struk July 23, 2018, 8:53 p.m. UTC | #11
On 07/23/2018 01:19 PM, Jarkko Sakkinen wrote:
> In this case I do not have any major evidence of any major benefit *and*
> the change breaks the ABI.

As I said before - this does not break the ABI.
As for the benefits - it help user space in how they implement the receive
path. Application does not need to provide a 4K buffer for every read even
if the response is, for instance 8 bytes long.
Thanks,
James Bottomley July 23, 2018, 9:13 p.m. UTC | #12
On Mon, 2018-07-23 at 13:53 -0700, Tadeusz Struk wrote:
> On 07/23/2018 01:19 PM, Jarkko Sakkinen wrote:
> > In this case I do not have any major evidence of any major benefit
> > *and* the change breaks the ABI.
> 
> As I said before - this does not break the ABI.

The current patch does, you even provided a use case in your last email
 (it's do command to get sizing followed by do command with correctly
sized buffer). 

However, if you tie it to O_NONBLOCK, it won't because no-one currently
opens the TPM device non blocking so it's an ABI conformant
discriminator of the uses.  Tying to O_NONBLOCK should be simple
because it's in file->f_flags.

James
Tadeusz Struk July 23, 2018, 9:38 p.m. UTC | #13
On 07/23/2018 02:13 PM, James Bottomley wrote:
> The current patch does, you even provided a use case in your last email
>  (it's do command to get sizing followed by do command with correctly
> sized buffer). 

The example I provided was: #1 send a command, #2 read the response header
(10 bytes), get the actual response size from the header and then #3 read
the full response (response size - size of the header bytes).

> 
> However, if you tie it to O_NONBLOCK, it won't because no-one currently
> opens the TPM device non blocking so it's an ABI conformant
> discriminator of the uses.  Tying to O_NONBLOCK should be simple
> because it's in file->f_flags.

I think that it might be an option. Especially that I have this on top of
the async patch. Let's discuss this when Jarkko is back.

Thanks,
Jason Gunthorpe July 23, 2018, 9:48 p.m. UTC | #14
On Thu, Jul 19, 2018 at 08:52:32AM -0700, Tadeusz Struk wrote:
> Currently to read a response from the TPM device an application needs
> provide "big enough" buffer for the whole response and read it in one go.
> The application doesn't know how big the response it beforehand so it
> always needs to maintain a 4K buffer and read the max (4K).
> In case if the user of the TSS library doesn't provide big enough buffer
> the TCTI spec says that the library should set the required size and return
> TSS2_TCTI_RC_INSUFFICIENT_BUFFER error code so that the application could
> allocate a bigger buffer and call receive again.
> To make it possible in the TSS library this requires being able to do
> partial reads from the driver.
> The library would read the header first to get the actual size of the
> response from the header and then read the rest of the response.
> This patch adds support for partial reads.

You should solve this in user space, the kernel API requires a full
sized buffer here, the tss library should always provide such an
internal buffer and then implement whatever scheme TSS wants by
memcpying from that buffer..

Jason
Jason Gunthorpe July 23, 2018, 9:56 p.m. UTC | #15
On Mon, Jul 23, 2018 at 02:38:08PM -0700, Tadeusz Struk wrote:
> On 07/23/2018 02:13 PM, James Bottomley wrote:
> > The current patch does, you even provided a use case in your last email
> >  (it's do command to get sizing followed by do command with correctly
> > sized buffer). 
> 
> The example I provided was: #1 send a command, #2 read the response header
> (10 bytes), get the actual response size from the header and then #3 read
> the full response (response size - size of the header bytes).

The proposed patch doesn't clear the data_pending if the entire buffer
is not consumed, so of course it is ABI breaking, that really isn't OK.

> > However, if you tie it to O_NONBLOCK, it won't because no-one currently
> > opens the TPM device non blocking so it's an ABI conformant
> > discriminator of the uses.  Tying to O_NONBLOCK should be simple
> > because it's in file->f_flags.
> 
> I think that it might be an option. Especially that I have this on top of
> the async patch. Let's discuss this when Jarkko is back.

Maybe you could do this by requiring the userspace to call pread()
with a non-zero offset to get the trailing segment of the last
executed command and leave normal read/pread(off=0) with the semantics
as they have today.

Jason
Tadeusz Struk July 23, 2018, 10 p.m. UTC | #16
On 07/23/2018 02:56 PM, Jason Gunthorpe wrote:
> The proposed patch doesn't clear the data_pending if the entire buffer
> is not consumed, so of course it is ABI breaking, that really isn't OK.

The data_pending will be cleared by the timeout handler if the user doesn't
read the response fully before the timeout expires. The is the same situation
if the user would not read the response at all.
Thanks,
Jason Gunthorpe July 23, 2018, 10:08 p.m. UTC | #17
On Mon, Jul 23, 2018 at 03:00:20PM -0700, Tadeusz Struk wrote:
> On 07/23/2018 02:56 PM, Jason Gunthorpe wrote:
> > The proposed patch doesn't clear the data_pending if the entire buffer
> > is not consumed, so of course it is ABI breaking, that really isn't OK.
> 
> The data_pending will be cleared by the timeout handler if the user doesn't
> read the response fully before the timeout expires. The is the same situation
> if the user would not read the response at all.

That causes write() to fail with EBUSY

NAK from me on breaking the ABI like this

Jason
Tadeusz Struk July 23, 2018, 11:42 p.m. UTC | #18
On 07/23/2018 03:08 PM, Jason Gunthorpe wrote:
> On Mon, Jul 23, 2018 at 03:00:20PM -0700, Tadeusz Struk wrote:
>> On 07/23/2018 02:56 PM, Jason Gunthorpe wrote:
>>> The proposed patch doesn't clear the data_pending if the entire buffer
>>> is not consumed, so of course it is ABI breaking, that really isn't OK.
>> The data_pending will be cleared by the timeout handler if the user doesn't
>> read the response fully before the timeout expires. The is the same situation
>> if the user would not read the response at all.
> That causes write() to fail with EBUSY
> 
> NAK from me on breaking the ABI like this

What if we introduce this new behavior only for the non-blocking mode
as James suggested? Or do you have some other suggestions?

Thanks,
Jason Gunthorpe July 24, 2018, 2:05 a.m. UTC | #19
On Mon, Jul 23, 2018 at 04:42:38PM -0700, Tadeusz Struk wrote:
> On 07/23/2018 03:08 PM, Jason Gunthorpe wrote:
> > On Mon, Jul 23, 2018 at 03:00:20PM -0700, Tadeusz Struk wrote:
> >> On 07/23/2018 02:56 PM, Jason Gunthorpe wrote:
> >>> The proposed patch doesn't clear the data_pending if the entire buffer
> >>> is not consumed, so of course it is ABI breaking, that really isn't OK.
> >> The data_pending will be cleared by the timeout handler if the user doesn't
> >> read the response fully before the timeout expires. The is the same situation
> >> if the user would not read the response at all.
> > That causes write() to fail with EBUSY
> > 
> > NAK from me on breaking the ABI like this
> 
> What if we introduce this new behavior only for the non-blocking mode
> as James suggested? Or do you have some other suggestions?

I think you should do it entirely in userspace.

But something sensible linked to O_NONBLOCK could be OK.

Jason
diff mbox

Patch

diff --git a/drivers/char/tpm/tpm-dev-common.c b/drivers/char/tpm/tpm-dev-common.c
index 3f2089f75c30..f5b614984dbc 100644
--- a/drivers/char/tpm/tpm-dev-common.c
+++ b/drivers/char/tpm/tpm-dev-common.c
@@ -99,20 +99,34 @@  ssize_t tpm_common_read(struct file *file, char __user *buf,
 	ssize_t ret_size = 0;
 	int rc;
 
-	del_singleshot_timer_sync(&priv->user_read_timer);
-	flush_work(&priv->timeout_work);
 	mutex_lock(&priv->buffer_mutex);
 
 	if (priv->data_pending) {
-		ret_size = min_t(ssize_t, size, priv->data_pending);
-		if (ret_size > 0) {
-			rc = copy_to_user(buf, priv->data_buffer, ret_size);
-			memset(priv->data_buffer, 0, priv->data_pending);
-			if (rc)
-				ret_size = -EFAULT;
+		ret_size = min_t(ssize_t, size + *off, priv->data_pending);
+		if (ret_size <= 0) {
+			ret_size = 0;
+			priv->data_pending = 0;
+			*off = 0;
+			goto out;
 		}
 
-		priv->data_pending = 0;
+		rc = copy_to_user(buf, priv->data_buffer + *off, ret_size);
+		if (rc) {
+			memset(priv->data_buffer, 0, priv->data_pending);
+			ret_size = -EFAULT;
+			priv->data_pending = 0;
+			*off = 0;
+		} else {
+			memset(priv->data_buffer + *off, 0, ret_size);
+			priv->data_pending -= ret_size;
+			*off += ret_size;
+		}
+	}
+out:
+	if (!priv->data_pending) {
+		del_singleshot_timer_sync(&priv->user_read_timer);
+		flush_work(&priv->timeout_work);
+		*off = 0;
 	}
 
 	mutex_unlock(&priv->buffer_mutex);