Message ID | 153201555276.20155.1352499992826895966.stgit@tstruk-mobl1.jf.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 07/19/2018 08:52 AM, Tadeusz Struk wrote: > Currently to read a response from the TPM device an application needs > provide "big enough" buffer for the whole response and read it in one go. > The application doesn't know how big the response it beforehand so it > always needs to maintain a 4K buffer and read the max (4K). > In case if the user of the TSS library doesn't provide big enough buffer > the TCTI spec says that the library should set the required size and return > TSS2_TCTI_RC_INSUFFICIENT_BUFFER error code so that the application could > allocate a bigger buffer and call receive again. > To make it possible in the TSS library this requires being able to do > partial reads from the driver. > The library would read the header first to get the actual size of the > response from the header and then read the rest of the response. > This patch adds support for partial reads. > > The usecase is implemented in this TSS commit: > https://github.com/tpm2-software/tpm2-tss/commit/ce982f67a67dc08e24683d30b05800648d8a264c > > Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> This needs to go on top of this: https://patchwork.kernel.org/patch/10460863/ Thanks,
On Thu, 2018-07-19 at 08:52 -0700, Tadeusz Struk wrote: > Currently to read a response from the TPM device an application needs > provide "big enough" buffer for the whole response and read it in one > go. The application doesn't know how big the response it beforehand > so it always needs to maintain a 4K buffer and read the max (4K). > In case if the user of the TSS library doesn't provide big enough > buffer the TCTI spec says that the library should set the required > size and return TSS2_TCTI_RC_INSUFFICIENT_BUFFER error code so that > the application could allocate a bigger buffer and call receive > again. To make it possible in the TSS library this requires being > able to do partial reads from the driver. > The library would read the header first to get the actual size of the > response from the header and then read the rest of the response. > This patch adds support for partial reads. > > The usecase is implemented in this TSS commit: > https://github.com/tpm2-software/tpm2-tss/commit/ce982f67a67dc08e2468 > 3d30b05800648d8a264c That's just an implementation, though, what's the use case? I'm curious because all the TPM applications I've written need to be aware of TPM2B_MAX_BUFFER_SIZE, which is related to MAX_RESPONSE_SIZE because you can't go over that for big buffer commands (mostly sealing and unsealing). The TCG supporting routines define MAX_RESPONSE_SIZE to be 4096, so you know absolutely how large a buffer you have to have ... and the value is rather handy for us because if it were larger we'd have to do scatter gather. I think the point is that knowing the max buffer size allows us to behave like UDP: if your packet is the wrong size it gets dropped and relieves the applications from having to do fragmentation and reassembly. Since the max size is so low, what's the benefit of not assuming the application has to know it? James
On 07/19/2018 10:19 AM, James Bottomley wrote:
> That's just an implementation, though, what's the use case?
Hi James,
The use case is described in the TCTI spec [1] in the
"3.2.5.2 receive" section.
Thanks,
On Thu, 2018-07-19 at 10:54 -0700, Tadeusz Struk wrote: > On 07/19/2018 10:19 AM, James Bottomley wrote: > > That's just an implementation, though, what's the use case? > > Hi James, > The use case is described in the TCTI spec [1] in the > "3.2.5.2 receive" section. Well, yes, but I think we've all agreed that the /dev/tpm and /dev/tpmrmX aren't TCTI interfaces, although you can layer TCTI on top of them, so why not simply do fragmentation on top if you need it? The reason for not doing it in the interface is that it alters the ABI. Before this patch we had a hard packet boundary: one packet per read, one per write and a -EFAULT if you fail to provide a correctly sized buffer. Now if you provide a buffer too small but don't know about the fragmentation you're going to misprocess a packet (because you think you got a whole reply but you didn't) and then you get a -EBUSY on your next command which you don't know how to handle. The only way out is to update the applications to handle the new behaviour, which is a no- no in Linux ABI terms. It might be possible to layer the behaviour you want compatibly into the current device structure (say an ioctl to switch to the fragment behaviour) but I've got to ask why we'd go to the complexity without a use case? James
On 07/19/2018 11:47 AM, James Bottomley wrote: > On Thu, 2018-07-19 at 10:54 -0700, Tadeusz Struk wrote: >> On 07/19/2018 10:19 AM, James Bottomley wrote: >>> That's just an implementation, though, what's the use case? >> >> Hi James, >> The use case is described in the TCTI spec [1] in the >> "3.2.5.2 receive" section. > > Well, yes, but I think we've all agreed that the /dev/tpm and > /dev/tpmrmX aren't TCTI interfaces, although you can layer TCTI on top > of them, so why not simply do fragmentation on top if you need it? > > The reason for not doing it in the interface is that it alters the ABI. > Before this patch we had a hard packet boundary: one packet per read, > one per write and a -EFAULT if you fail to provide a correctly sized > buffer. Now if you provide a buffer too small but don't know about the > fragmentation you're going to misprocess a packet (because you think > you got a whole reply but you didn't) and then you get a -EBUSY on your > next command which you don't know how to handle. The only way out is > to update the applications to handle the new behaviour, which is a no- > no in Linux ABI terms. Don't all the existing applications that read a response in one go do a 4K read now? So nothing will change for them. They will work exactly the same with this change as they do without it. This doesn't break the ABI. > > It might be possible to layer the behaviour you want compatibly into > the current device structure (say an ioctl to switch to the fragment > behaviour) but I've got to ask why we'd go to the complexity without a > use case? New IOCTL would add extra complexity, which isn't necessary. Thanks,
On Thu, 2018-07-19 at 12:05 -0700, Tadeusz Struk wrote: > On 07/19/2018 11:47 AM, James Bottomley wrote: > > On Thu, 2018-07-19 at 10:54 -0700, Tadeusz Struk wrote: > > > On 07/19/2018 10:19 AM, James Bottomley wrote: > > > > That's just an implementation, though, what's the use case? > > > > > > Hi James, > > > The use case is described in the TCTI spec [1] in the > > > "3.2.5.2 receive" section. > > > > Well, yes, but I think we've all agreed that the /dev/tpm and > > /dev/tpmrmX aren't TCTI interfaces, although you can layer TCTI on > > top of them, so why not simply do fragmentation on top if you need > > it? > > > > The reason for not doing it in the interface is that it alters the > > ABI. Before this patch we had a hard packet boundary: one packet > > per read, one per write and a -EFAULT if you fail to provide a > > correctly sized buffer. Now if you provide a buffer too small but > > don't know about the fragmentation you're going to misprocess a > > packet (because you think you got a whole reply but you didn't) and > > then you get a -EBUSY on your next command which you don't know how > > to handle. The only way out is to update the applications to > > handle the new behaviour, which is a no-no in Linux ABI terms. > > Don't all the existing applications that read a response in one go > do a 4K read now? So nothing will change for them. They will work > exactly the same with this change as they do without it. > This doesn't break the ABI. The ABI break is the error case as I outlined above. We can't assume everyone uses the current interface without getting an error and one error and your hosed is a nasty failure case to change the interface to. Plus, if you assume everyone is passing 4k buffers, why would you even need the fragmentation case? > > It might be possible to layer the behaviour you want compatibly > > into the current device structure (say an ioctl to switch to the > > fragment behaviour) but I've got to ask why we'd go to the > > complexity without a use case? > > New IOCTL would add extra complexity, which isn't necessary. So what's wrong with fragmenting in the layer above the device driver (in userspace) and not actually changing the kernel? James
On 07/19/2018 12:52 PM, James Bottomley wrote: > The ABI break is the error case as I outlined above. We can't assume > everyone uses the current interface without getting an error and one > error and your hosed is a nasty failure case to change the interface > to. Well, if there is a broken application out there that doesn't work today it will not work after this change neither. > Plus, if you assume everyone is passing 4k buffers, why would you > even need the fragmentation case? So that people don't need to do this anymore and we can run a spec compliant TCTI on top of /dev/tpm<N>. > >>> It might be possible to layer the behaviour you want compatibly >>> into the current device structure (say an ioctl to switch to the >>> fragment behaviour) but I've got to ask why we'd go to the >>> complexity without a use case? >> New IOCTL would add extra complexity, which isn't necessary. > So what's wrong with fragmenting in the layer above the device driver > (in userspace) and not actually changing the kernel? Because it is much easier to implement in the driver, and we can run a spec compliant TCTI on top of /dev/tpm<N>. Thanks,
On Thu, 2018-07-19 at 13:12 -0700, Tadeusz Struk wrote: > On 07/19/2018 12:52 PM, James Bottomley wrote: > > The ABI break is the error case as I outlined above. We can't > > assume everyone uses the current interface without getting an error > > and one error and your hosed is a nasty failure case to change the > > interface to. > > Well, if there is a broken application out there that doesn't work > today it will not work after this change neither. It doesn't have to be broken ... it could be using EFAULT to probe the buffer size for instance. That's the point of not breaking the ABI: you don't second guess what applications are doing. James
On 07/19/2018 01:27 PM, James Bottomley wrote: > On Thu, 2018-07-19 at 13:12 -0700, Tadeusz Struk wrote: >> On 07/19/2018 12:52 PM, James Bottomley wrote: >>> The ABI break is the error case as I outlined above. We can't >>> assume everyone uses the current interface without getting an error >>> and one error and your hosed is a nasty failure case to change the >>> interface to. >> >> Well, if there is a broken application out there that doesn't work >> today it will not work after this change neither. > > It doesn't have to be broken ... it could be using EFAULT to probe the > buffer size for instance. That's the point of not breaking the ABI: > you don't second guess what applications are doing. > Looking at the existing implementation again: https://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git/tree/drivers/char/tpm/tpm-dev-common.c?h=next-tpm#n56 EFAULT is returned only if the copy_to_user() fails. So today, if an application wants read 1 byte of a response, and provides 1 byte buffer for it, then only 1 byte of the response will be copied, no error code will be returned, and the rest of the response will be gone. I don't really see how and why would anyone use EFAULT err to probe for the buffer size. That would really be a broken application. Thanks,
On Thu, Jul 19, 2018 at 08:52:32AM -0700, Tadeusz Struk wrote: > Currently to read a response from the TPM device an application needs > provide "big enough" buffer for the whole response and read it in one go. > The application doesn't know how big the response it beforehand so it > always needs to maintain a 4K buffer and read the max (4K). > In case if the user of the TSS library doesn't provide big enough buffer > the TCTI spec says that the library should set the required size and return > TSS2_TCTI_RC_INSUFFICIENT_BUFFER error code so that the application could > allocate a bigger buffer and call receive again. > To make it possible in the TSS library this requires being able to do > partial reads from the driver. > The library would read the header first to get the actual size of the > response from the header and then read the rest of the response. > This patch adds support for partial reads. > > The usecase is implemented in this TSS commit: > https://github.com/tpm2-software/tpm2-tss/commit/ce982f67a67dc08e24683d30b05800648d8a264c > > Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> For non-blocking operation I see the benefit because it does not break the ABI and it really simplifies threading in the user space. In this case I do not have any major evidence of any major benefit *and* the change breaks the ABI. Linux does not *have* to implement in kernel level every tidbit of the TCG spec but it *can* provide support in places where it makes sense and things do not break. /Jarkko
On 07/23/2018 01:19 PM, Jarkko Sakkinen wrote: > In this case I do not have any major evidence of any major benefit *and* > the change breaks the ABI. As I said before - this does not break the ABI. As for the benefits - it help user space in how they implement the receive path. Application does not need to provide a 4K buffer for every read even if the response is, for instance 8 bytes long. Thanks,
On Mon, 2018-07-23 at 13:53 -0700, Tadeusz Struk wrote: > On 07/23/2018 01:19 PM, Jarkko Sakkinen wrote: > > In this case I do not have any major evidence of any major benefit > > *and* the change breaks the ABI. > > As I said before - this does not break the ABI. The current patch does, you even provided a use case in your last email (it's do command to get sizing followed by do command with correctly sized buffer). However, if you tie it to O_NONBLOCK, it won't because no-one currently opens the TPM device non blocking so it's an ABI conformant discriminator of the uses. Tying to O_NONBLOCK should be simple because it's in file->f_flags. James
On 07/23/2018 02:13 PM, James Bottomley wrote: > The current patch does, you even provided a use case in your last email > (it's do command to get sizing followed by do command with correctly > sized buffer). The example I provided was: #1 send a command, #2 read the response header (10 bytes), get the actual response size from the header and then #3 read the full response (response size - size of the header bytes). > > However, if you tie it to O_NONBLOCK, it won't because no-one currently > opens the TPM device non blocking so it's an ABI conformant > discriminator of the uses. Tying to O_NONBLOCK should be simple > because it's in file->f_flags. I think that it might be an option. Especially that I have this on top of the async patch. Let's discuss this when Jarkko is back. Thanks,
On Thu, Jul 19, 2018 at 08:52:32AM -0700, Tadeusz Struk wrote: > Currently to read a response from the TPM device an application needs > provide "big enough" buffer for the whole response and read it in one go. > The application doesn't know how big the response it beforehand so it > always needs to maintain a 4K buffer and read the max (4K). > In case if the user of the TSS library doesn't provide big enough buffer > the TCTI spec says that the library should set the required size and return > TSS2_TCTI_RC_INSUFFICIENT_BUFFER error code so that the application could > allocate a bigger buffer and call receive again. > To make it possible in the TSS library this requires being able to do > partial reads from the driver. > The library would read the header first to get the actual size of the > response from the header and then read the rest of the response. > This patch adds support for partial reads. You should solve this in user space, the kernel API requires a full sized buffer here, the tss library should always provide such an internal buffer and then implement whatever scheme TSS wants by memcpying from that buffer.. Jason
On Mon, Jul 23, 2018 at 02:38:08PM -0700, Tadeusz Struk wrote: > On 07/23/2018 02:13 PM, James Bottomley wrote: > > The current patch does, you even provided a use case in your last email > > (it's do command to get sizing followed by do command with correctly > > sized buffer). > > The example I provided was: #1 send a command, #2 read the response header > (10 bytes), get the actual response size from the header and then #3 read > the full response (response size - size of the header bytes). The proposed patch doesn't clear the data_pending if the entire buffer is not consumed, so of course it is ABI breaking, that really isn't OK. > > However, if you tie it to O_NONBLOCK, it won't because no-one currently > > opens the TPM device non blocking so it's an ABI conformant > > discriminator of the uses. Tying to O_NONBLOCK should be simple > > because it's in file->f_flags. > > I think that it might be an option. Especially that I have this on top of > the async patch. Let's discuss this when Jarkko is back. Maybe you could do this by requiring the userspace to call pread() with a non-zero offset to get the trailing segment of the last executed command and leave normal read/pread(off=0) with the semantics as they have today. Jason
On 07/23/2018 02:56 PM, Jason Gunthorpe wrote: > The proposed patch doesn't clear the data_pending if the entire buffer > is not consumed, so of course it is ABI breaking, that really isn't OK. The data_pending will be cleared by the timeout handler if the user doesn't read the response fully before the timeout expires. The is the same situation if the user would not read the response at all. Thanks,
On Mon, Jul 23, 2018 at 03:00:20PM -0700, Tadeusz Struk wrote: > On 07/23/2018 02:56 PM, Jason Gunthorpe wrote: > > The proposed patch doesn't clear the data_pending if the entire buffer > > is not consumed, so of course it is ABI breaking, that really isn't OK. > > The data_pending will be cleared by the timeout handler if the user doesn't > read the response fully before the timeout expires. The is the same situation > if the user would not read the response at all. That causes write() to fail with EBUSY NAK from me on breaking the ABI like this Jason
On 07/23/2018 03:08 PM, Jason Gunthorpe wrote: > On Mon, Jul 23, 2018 at 03:00:20PM -0700, Tadeusz Struk wrote: >> On 07/23/2018 02:56 PM, Jason Gunthorpe wrote: >>> The proposed patch doesn't clear the data_pending if the entire buffer >>> is not consumed, so of course it is ABI breaking, that really isn't OK. >> The data_pending will be cleared by the timeout handler if the user doesn't >> read the response fully before the timeout expires. The is the same situation >> if the user would not read the response at all. > That causes write() to fail with EBUSY > > NAK from me on breaking the ABI like this What if we introduce this new behavior only for the non-blocking mode as James suggested? Or do you have some other suggestions? Thanks,
On Mon, Jul 23, 2018 at 04:42:38PM -0700, Tadeusz Struk wrote: > On 07/23/2018 03:08 PM, Jason Gunthorpe wrote: > > On Mon, Jul 23, 2018 at 03:00:20PM -0700, Tadeusz Struk wrote: > >> On 07/23/2018 02:56 PM, Jason Gunthorpe wrote: > >>> The proposed patch doesn't clear the data_pending if the entire buffer > >>> is not consumed, so of course it is ABI breaking, that really isn't OK. > >> The data_pending will be cleared by the timeout handler if the user doesn't > >> read the response fully before the timeout expires. The is the same situation > >> if the user would not read the response at all. > > That causes write() to fail with EBUSY > > > > NAK from me on breaking the ABI like this > > What if we introduce this new behavior only for the non-blocking mode > as James suggested? Or do you have some other suggestions? I think you should do it entirely in userspace. But something sensible linked to O_NONBLOCK could be OK. Jason
diff --git a/drivers/char/tpm/tpm-dev-common.c b/drivers/char/tpm/tpm-dev-common.c index 3f2089f75c30..f5b614984dbc 100644 --- a/drivers/char/tpm/tpm-dev-common.c +++ b/drivers/char/tpm/tpm-dev-common.c @@ -99,20 +99,34 @@ ssize_t tpm_common_read(struct file *file, char __user *buf, ssize_t ret_size = 0; int rc; - del_singleshot_timer_sync(&priv->user_read_timer); - flush_work(&priv->timeout_work); mutex_lock(&priv->buffer_mutex); if (priv->data_pending) { - ret_size = min_t(ssize_t, size, priv->data_pending); - if (ret_size > 0) { - rc = copy_to_user(buf, priv->data_buffer, ret_size); - memset(priv->data_buffer, 0, priv->data_pending); - if (rc) - ret_size = -EFAULT; + ret_size = min_t(ssize_t, size + *off, priv->data_pending); + if (ret_size <= 0) { + ret_size = 0; + priv->data_pending = 0; + *off = 0; + goto out; } - priv->data_pending = 0; + rc = copy_to_user(buf, priv->data_buffer + *off, ret_size); + if (rc) { + memset(priv->data_buffer, 0, priv->data_pending); + ret_size = -EFAULT; + priv->data_pending = 0; + *off = 0; + } else { + memset(priv->data_buffer + *off, 0, ret_size); + priv->data_pending -= ret_size; + *off += ret_size; + } + } +out: + if (!priv->data_pending) { + del_singleshot_timer_sync(&priv->user_read_timer); + flush_work(&priv->timeout_work); + *off = 0; } mutex_unlock(&priv->buffer_mutex);
Currently to read a response from the TPM device an application needs provide "big enough" buffer for the whole response and read it in one go. The application doesn't know how big the response it beforehand so it always needs to maintain a 4K buffer and read the max (4K). In case if the user of the TSS library doesn't provide big enough buffer the TCTI spec says that the library should set the required size and return TSS2_TCTI_RC_INSUFFICIENT_BUFFER error code so that the application could allocate a bigger buffer and call receive again. To make it possible in the TSS library this requires being able to do partial reads from the driver. The library would read the header first to get the actual size of the response from the header and then read the rest of the response. This patch adds support for partial reads. The usecase is implemented in this TSS commit: https://github.com/tpm2-software/tpm2-tss/commit/ce982f67a67dc08e24683d30b05800648d8a264c Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> --- drivers/char/tpm/tpm-dev-common.c | 32 +++++++++++++++++++++++--------- 1 file changed, 23 insertions(+), 9 deletions(-)