Message ID | 20171017203232.2262-3-nayna@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Accepted |
On Tue, Oct 17, 2017 at 04:32:30PM -0400, Nayna Jain wrote:
> The TPM burstcount status indicates the number of bytes that can
> be sent to the TPM without causing bus wait states. Effectively,
> it is the number of empty bytes in the command FIFO.
>
> This patch optimizes the tpm_tis_send_data() function by checking
> the burstcount only once. And if the burstcount is valid, it writes
> all the bytes at once, permitting wait state.
>
> After this change, performance on a TPM 1.2 with an 8 byte
> burstcount for 1000 extends improved from ~41sec to ~14sec.
>
> Suggested-by: Ken Goldman <kgold@linux.vnet.ibm.com> in
> conjunction with the TPM Device Driver work group.
> Signed-off-by: Nayna Jain <nayna@linux.vnet.ibm.com>
> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>

Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>

/Jarkko

> ---
>  drivers/char/tpm/tpm_tis_core.c | 42 +++++++++++++++---------------------------
>  1 file changed, 15 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> index b33126a35694..993328ae988c 100644
> --- a/drivers/char/tpm/tpm_tis_core.c
> +++ b/drivers/char/tpm/tpm_tis_core.c
> @@ -316,7 +316,6 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len)
>  {
>  	struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
>  	int rc, status, burstcnt;
> -	size_t count = 0;
>  	bool itpm = priv->flags & TPM_TIS_ITPM_WORKAROUND;
>
>  	status = tpm_tis_status(chip);
> @@ -330,35 +329,24 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len)
>  		}
>  	}
>
> -	while (count < len - 1) {
> -		burstcnt = get_burstcount(chip);
> -		if (burstcnt < 0) {
> -			dev_err(&chip->dev, "Unable to read burstcount\n");
> -			rc = burstcnt;
> -			goto out_err;
> -		}
> -		burstcnt = min_t(int, burstcnt, len - count - 1);
> -		rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv->locality),
> -					 burstcnt, buf + count);
> -		if (rc < 0)
> -			goto out_err;
> -
> -		count += burstcnt;
> -
> -		if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip->timeout_c,
> -					&priv->int_queue, false) < 0) {
> -			rc = -ETIME;
> -			goto out_err;
> -		}
> -		status = tpm_tis_status(chip);
> -		if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
> -			rc = -EIO;
> -			goto out_err;
> -		}
> +	/*
> +	 * Get the initial burstcount to ensure TPM is ready to
> +	 * accept data.
> +	 */
> +	burstcnt = get_burstcount(chip);
> +	if (burstcnt < 0) {
> +		dev_err(&chip->dev, "Unable to read burstcount\n");
> +		rc = burstcnt;
> +		goto out_err;
>  	}
>
> +	rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv->locality),
> +			len - 1, buf);
> +	if (rc < 0)
> +		goto out_err;
> +
>  	/* write last byte */
> -	rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality), buf[count]);
> +	rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality), buf[len-1]);
>  	if (rc < 0)
>  		goto out_err;
>
> --
> 2.13.3
>
On Tue, Oct 17, 2017 at 04:32:30PM -0400, Nayna Jain wrote:
> [...]

Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>

/Jarkko
> [...]

This seems to fail reliably with my SPI TPM 2.0. I get EIO when trying
to send large amounts of data, e.g. with TPM2_Hash, and subsequent tests
seem to take an unusual amount of time. More analysis probably has to
wait until November, since I am going to be in Prague next week.

Alexander
On 10/20/2017 10:42 AM, Alexander.Steffen@infineon.com wrote:
>
> This seems to fail reliably with my SPI TPM 2.0. I get EIO when
> trying to send large amounts of data, e.g. with TPM2_Hash, and
> subsequent tests seem to take an unusual amount of time. More
> analysis probably has to wait until November, since I am going to be
> in Prague next week.

I have a guess as to the cause of the failure. Would it be possible for
you to test it?

1 - My guess is that EIO is coming from here:

static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len)
...
	/* write last byte */
	rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality), buf[count]);
	if (rc < 0)
		goto out_err;

	if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip->timeout_c,
			      &priv->int_queue, false) < 0) {
		rc = -ETIME;
		goto out_err;
	}
	status = tpm_tis_status(chip);
	if (!itpm && (status & TPM_STS_DATA_EXPECT) != 0) {
		rc = -EIO;
		goto out_err;
	}
...

Can you verify that this is the cause?

2 - If that's the cause, I believe that there is a latent bug. Expect
is not guaranteed to become false immediately. That only happens after
the TPM firmware has emptied the FIFO. Thus, the tpm_tis_status() call
really should be something like "wait_for_tpm_expect_false()", with a
sleep loop.

This missing wait has been in the code for a while. It may just surface
now because the patch causes data to be written faster, and thus it
takes longer for the TPM to empty the FIFO and clear Expect. It also
makes sense that it would occur more often on long commands.
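
To make the suggested "wait_for_tpm_expect_false()" idea concrete, here is a
minimal userspace sketch, not driver code. Everything in it is an assumption
made for illustration: the simulated device (struct fake_tpm), the helper
names, the local status-bit constants, and the poll budget that stands in
for a real timeout and sleep. It only contrasts a single status check with
a bounded polling loop.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the TIS status bits; the values are assumptions of this sketch. */
#define STS_VALID       0x80u
#define STS_DATA_EXPECT 0x08u

/* Hypothetical simulated TPM: Expect stays set for a number of status reads. */
struct fake_tpm {
	unsigned int reads_until_idle;
};

/* Return the simulated status byte; Expect clears once the FIFO is "drained". */
static uint8_t fake_status(struct fake_tpm *tpm)
{
	if (tpm->reads_until_idle > 0) {
		tpm->reads_until_idle--;
		return STS_VALID | STS_DATA_EXPECT;
	}
	return STS_VALID;
}

/* Single-shot check: fails whenever Expect is still set on the first read. */
static int check_expect_once(struct fake_tpm *tpm)
{
	return (fake_status(tpm) & STS_DATA_EXPECT) ? -1 : 0;
}

/*
 * Poll until Expect is clear or the poll budget is used up.
 * Returns 0 on success and -1 on timeout.
 */
static int wait_for_expect_false(struct fake_tpm *tpm, unsigned int max_polls)
{
	assert(max_polls > 0);

	for (unsigned int i = 0; i < max_polls; i++) {
		if ((fake_status(tpm) & STS_DATA_EXPECT) == 0)
			return 0;
		/* A real driver would sleep here before polling again. */
	}
	return -1;
}
```

With a device that keeps Expect set for three reads, check_expect_once()
reports an error while wait_for_expect_false() with a budget of ten polls
succeeds; a device that never drains within the budget still times out.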
On Fri, Oct 20, 2017 at 02:42:54PM +0000, Alexander.Steffen@infineon.com wrote:
> > [...]
>
> This seems to fail reliably with my SPI TPM 2.0. I get EIO when trying
> to send large amounts of data, e.g. with TPM2_Hash, and subsequent
> tests seem to take an unusual amount of time. More analysis probably
> has to wait until November, since I am going to be in Prague next
> week.
>
> Alexander

For the time being I'll drop this patch from the next branch and keep
it in the master branch.

/Jarkko
On 10/20/2017 08:12 PM, Alexander.Steffen@infineon.com wrote:
>> [...]
> This seems to fail reliably with my SPI TPM 2.0. I get EIO when trying
> to send large amounts of data, e.g. with TPM2_Hash, and subsequent
> tests seem to take an unusual amount of time. More analysis probably
> has to wait until November, since I am going to be in Prague next week.

Thanks Alex for testing these. Did you get the chance to do any further
analysis?

Thanks & Regards,
- Nayna

> Alexander
>
> On 10/20/2017 08:12 PM, Alexander.Steffen@infineon.com wrote:
> >> [...]
> > This seems to fail reliably with my SPI TPM 2.0. I get EIO when
> > trying to send large amounts of data, e.g. with TPM2_Hash, and
> > subsequent tests seem to take an unusual amount of time. More
> > analysis probably has to wait until November, since I am going to be
> > in Prague next week.
>
> Thanks Alex for testing these. Did you get the chance to do any further
> analysis?

I am working on that now. Ken's suggestion seems reasonable, so I am
going to test whether correctly waiting for the flags to change fixes
the problem. If it does, I'll send the patches.

Alexander
> > On 10/20/2017 08:12 PM, Alexander.Steffen@infineon.com wrote:
> > >> [...]
> > > This seems to fail reliably with my SPI TPM 2.0. I get EIO when
> > > trying to send large amounts of data, e.g. with TPM2_Hash, and
> > > subsequent tests seem to take an unusual amount of time. More
> > > analysis probably has to wait until November, since I am going to
> > > be in Prague next week.
> >
> > Thanks Alex for testing these. Did you get the chance to do any
> > further analysis?
>
> I am working on that now. Ken's suggestion seems reasonable, so I am
> going to test whether correctly waiting for the flags to change fixes
> the problem. If it does, I'll send the patches.

Sorry for the delay, I had to take care of some device tree changes in
v4.14 that broke my ARM test machines.

I've implemented some patches that fix the issue that Ken pointed out
and rebased your patch 2/4 ("ignore burstcount") on top. While doing
this I noticed that your original patch does not, as the commit message
says, write all the bytes at once, but still unnecessarily splits all
commands into at least two transfers (as did the original code). I've
fixed this as well in my patches, so that all bytes are indeed sent in
a single call, without special handling for the last byte. This should
speed up things further, especially for small commands and drivers like
tpm_tis_spi, where writing a single byte translates into additional SPI
transfers.

Unfortunately, even with those changes the problem persists. But I've
got more detailed logs now and will try to understand and hopefully fix
the issue. I'll follow up with more details and/or patches once I know
more.

Alexander
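
The difference between the three sending strategies discussed here can be
illustrated by counting FIFO write calls in a small model. This is only a
sketch under stated assumptions: the function names are hypothetical, the
burstcount is treated as constant for the whole command, and only non-empty
writes are counted. It does not model SPI framing or wait states.

```c
#include <stddef.h>

/*
 * Original loop: one write per burstcount-sized chunk of the first
 * len - 1 bytes, followed by a separate write for the last byte.
 */
static size_t writes_burst_limited(size_t len, size_t burst)
{
	size_t count = 0, writes = 0;

	if (len == 0 || burst == 0)
		return 0;

	while (count < len - 1) {
		size_t chunk = len - 1 - count;

		if (chunk > burst)
			chunk = burst;
		count += chunk;
		writes++;
	}
	return writes + 1; /* plus the last byte */
}

/* Patch as posted: one write for len - 1 bytes, one for the last byte. */
static size_t writes_two_step(size_t len)
{
	if (len == 0)
		return 0;
	return len > 1 ? 2 : 1;
}

/* Single-call variant: the whole command in one write. */
static size_t writes_single_call(size_t len)
{
	return len > 0 ? 1 : 0;
}
```

For an illustrative 34-byte command and a constant burstcount of 8, this
model counts 6 writes for the original loop, 2 for the patch as posted, and
1 for the single-call variant.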
> > > On 10/20/2017 08:12 PM, Alexander.Steffen@infineon.com wrote:
> > > >> [...]
> > > > This seems to fail reliably with my SPI TPM 2.0. I get EIO when
> > > > trying to send large amounts of data, e.g. with TPM2_Hash, and
> > > > subsequent tests seem to take an unusual amount of time. More
> > > > analysis probably has to wait until November, since I am going
> > > > to be in Prague next week.
> > >
> > > Thanks Alex for testing these. Did you get the chance to do any
> > > further analysis?
> >
> > I am working on that now. Ken's suggestion seems reasonable, so I am
> > going to test whether correctly waiting for the flags to change
> > fixes the problem. If it does, I'll send the patches.
>
> Sorry for the delay, I had to take care of some device tree changes in
> v4.14 that broke my ARM test machines.
>
> I've implemented some patches that fix the issue that Ken pointed out
> and rebased your patch 2/4 ("ignore burstcount") on top. While doing
> this I noticed that your original patch does not, as the commit
> message says, write all the bytes at once, but still unnecessarily
> splits all commands into at least two transfers (as did the original
> code). I've fixed this as well in my patches, so that all bytes are
> indeed sent in a single call, without special handling for the last
> byte. This should speed up things further, especially for small
> commands and drivers like tpm_tis_spi, where writing a single byte
> translates into additional SPI transfers.
>
> Unfortunately, even with those changes the problem persists. But I've
> got more detailed logs now and will try to understand and hopefully
> fix the issue. I'll follow up with more details and/or patches once I
> know more.

Okay, so the problem seems to be that at some point the TPM starts
inserting wait states for the FIFO access. The driver tries to handle
this, but fails since even the 50 retries that are currently used do
not seem to be enough. Adding small (millisecond) delays between the
attempts did not help so far.

Is there any limit in the specification for how many wait states the
TPM may generate or for how long it may do so? I could not find
anything, but we need to use something there to prevent a faulty TPM
from blocking the kernel forever.

Alexander
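
One way to picture a bound on wait states is a retry loop limited by a time
budget rather than by a fixed attempt count. The sketch below is purely
illustrative and runs against a simulated clock: struct fake_bus, both
function names, and all numbers are assumptions of this example, and none
of the values come from the specification or from the driver.

```c
#include <stdbool.h>

/* Hypothetical simulated device: it inserts wait states until ready_at_ms. */
struct fake_bus {
	unsigned long now_ms;      /* simulated clock */
	unsigned long ready_at_ms; /* accesses before this time see a wait state */
};

/* One FIFO access attempt; false means the device answered with a wait state. */
static bool fifo_try_access(const struct fake_bus *bus)
{
	return bus->now_ms >= bus->ready_at_ms;
}

/*
 * Retry until the access succeeds or budget_ms of simulated time has passed.
 * Returns 0 on success and -1 once the time budget is exhausted, so a device
 * that never becomes ready cannot stall the caller forever.
 */
static int fifo_access_bounded(struct fake_bus *bus, unsigned long budget_ms,
			       unsigned long delay_ms)
{
	unsigned long deadline = bus->now_ms + budget_ms;

	if (delay_ms == 0)
		delay_ms = 1; /* always make progress towards the deadline */

	for (;;) {
		if (fifo_try_access(bus))
			return 0;
		if (bus->now_ms >= deadline)
			return -1;
		bus->now_ms += delay_ms; /* stands in for sleeping between attempts */
	}
}
```

In this model a device that becomes ready after 30 ms is served within a
100 ms budget, while one that needs 500 ms makes the call give up. Which
budget is appropriate for real hardware is exactly the open question above.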
On Wed, Nov 22, 2017 at 06:52:03AM +0000, Alexander.Steffen@infineon.com wrote: > > > > > This seems to fail reliably with my SPI TPM 2.0. I get EIO when trying to > > > > send large amounts of data, e.g. with TPM2_Hash, and subsequent tests > > > > seem to take an unusual amount of time. More analysis probably has to > > > wait > > > > until November, since I am going to be in Prague next week. > > > > > > > > Thanks Alex for testing these.. Did you get the chance to do any further > > > > analysis ? > > > > > > I am working on that now. Ken's suggestion seems reasonable, so I am > > going > > > to test whether correctly waiting for the flags to change fixes the problem. > > If > > > it does, I'll send the patches. > > > > Sorry for the delay, I had to take care of some device tree changes in v4.14 > > that broke my ARM test machines. > > > > I've implemented some patches that fix the issue that Ken pointed out and > > rebased your patch 2/4 ("ignore burstcount") on top. While doing this I > > noticed that your original patch does not, as the commit message says, write > > all the bytes at once, but still unnecessarily splits all commands into at least > > two transfers (as did the original code). I've fixed this as well in my patches, > > so that all bytes are indeed sent in a single call, without special handling for > > the last byte. This should speed up things further, especially for small > > commands and drivers like tpm_tis_spi, where writing a single byte > > translates into additional SPI transfers. Thanks Alex, for digging into. Yeah, you are right, the first version of this patch sent all the bytes together, but after hearing ddwg inputs, i.e. "The last byte was introduced for error checking purposes (history).", I reverted back to original to be safe. It seems that the last byte was sent from the beginning (27084ef [PATCH] tpm: driver for next generation TPM chips,), does anyone remember the reason ? 
> >
> > Unfortunately, even with those changes the problem persists. But I've got
> > more detailed logs now and will try to understand and hopefully fix the issue.
> > I'll follow up with more details and/or patches once I know more.
>
> Okay, so the problem seems to be that at some point the TPM starts inserting
> wait states for the FIFO access. The driver tries to handle this, but fails
> since even the 50 retries that are currently used do not seem to be enough.
> Adding small (millisecond) delays between the attempts did not help so far.
>
> Is there any limit in the specification for how many wait states the TPM may
> generate or for how long it may do so? I could not find anything, but we need
> to use something there to prevent a faulty TPM from blocking the kernel forever.
>

I have been thinking on this, so was wondering:

1. As you said the problem started while sending large amounts of data for TPM2_Hash, how large is "large" ? I mean did it work for some specific large values before failing.

2. Are these wait states limited to SPI, or does it happen on LPC as well?

Thanks & Regards,
- Nayna

> Alexander
>
> On Wed, Nov 22, 2017 at 06:52:03AM +0000, Alexander.Steffen@infineon.com wrote:
> > > > > > This seems to fail reliably with my SPI TPM 2.0. I get EIO when trying to
> > > > > > send large amounts of data, e.g. with TPM2_Hash, and subsequent tests
> > > > > > seem to take an unusual amount of time. More analysis probably has to wait
> > > > > > until November, since I am going to be in Prague next week.
> > > > >
> > > > > Thanks Alex for testing these.. Did you get the chance to do any further
> > > > > analysis ?
> > > >
> > > > I am working on that now. Ken's suggestion seems reasonable, so I am going
> > > > to test whether correctly waiting for the flags to change fixes the problem. If
> > > > it does, I'll send the patches.
> > >
> > > Sorry for the delay, I had to take care of some device tree changes in v4.14
> > > that broke my ARM test machines.
> > >
> > > I've implemented some patches that fix the issue that Ken pointed out and
> > > rebased your patch 2/4 ("ignore burstcount") on top. While doing this I
> > > noticed that your original patch does not, as the commit message says, write
> > > all the bytes at once, but still unnecessarily splits all commands into at least
> > > two transfers (as did the original code). I've fixed this as well in my patches,
> > > so that all bytes are indeed sent in a single call, without special handling for
> > > the last byte. This should speed up things further, especially for small
> > > commands and drivers like tpm_tis_spi, where writing a single byte
> > > translates into additional SPI transfers.
>
> Thanks Alex, for digging into.
>
> Yeah, you are right, the first version of this patch sent all the bytes together,
> but after hearing ddwg inputs,
> i.e. "The last byte was introduced for error checking purposes (history).", I
> reverted back to original to be safe.
>
> It seems that the last byte was sent from the beginning (27084ef [PATCH]
> tpm: driver for next generation TPM chips,),
> does anyone remember the reason ?

The intention seems to be to make extra sure that the TPM has correctly understood the command by observing the Expect flag flipping from 1 to 0 when writing the last byte. But following Ken's arguments, this does not work as intended, because the Expect flag will change not when writing the last byte to the FIFO, but when the TPM reads the last byte from the FIFO. Since there is no "FIFO empty" indication, just observing the Expect flag to be 1 before writing the last byte cannot reliably tell us anything (there might be enough data left in the FIFO for the Expect flag to flip to 0 without writing the last byte).

Also, I'd argue that this check is not necessary, because if the Expect flag is 0 after all bytes have been written to the FIFO, then the TPM has correctly received the command and is ready to execute it. According to TIS/PTP the TPM is required to throw away all extra bytes that were not announced in the header, and in addition the kernel driver already ensures not to send more data. Those are enough safeguards, I'd say.

>
> > >
> > > Unfortunately, even with those changes the problem persists. But I've got
> > > more detailed logs now and will try to understand and hopefully fix the issue.
> > > I'll follow up with more details and/or patches once I know more.
> >
> > Okay, so the problem seems to be that at some point the TPM starts
> > inserting wait states for the FIFO access. The driver tries to handle this, but
> > fails since even the 50 retries that are currently used do not seem to be
> > enough. Adding small (millisecond) delays between the attempts did not
> > help so far.
> >
> > Is there any limit in the specification for how many wait states the TPM may
> > generate or for how long it may do so? I could not find anything, but we need
> > to use something there to prevent a faulty TPM from blocking the kernel
> > forever.
> >
>
> I have been thinking on this, so was wondering:
>
> 1. As you said the problem started while sending large amounts of data for
> TPM2_Hash, how large is "large" ? I mean did it work for some specific large
> values before failing.

Around 1k of data (the exact values are chosen randomly, and it failed many times), but I did not try to find a specific boundary. The interesting thing was that for this long command all SPI frames with the maximum payload of 64 bytes were accepted without wait states, but the last frame (with less than 64 bytes) caused the wait states.

> 2. Are these wait states limited to SPI, or does it happen on LPC as well?

I do not know for LPC because there the wait states are handled in hardware and I cannot trace the LPC signals.

> Thanks & Regards,
> - Nayna
>
> > Alexander
> >
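The argument about the Expect flag can be illustrated with a toy model. This is only a sketch under stated assumptions: `fake_tpm_rx`, `host_write`, `tpm_consume` and `expect_flag` are hypothetical names, and the model assumes that Expect depends solely on how many announced bytes the TPM has read out of the FIFO, as described in the message above. It is not taken from the specification or the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical receive-side model of a TPM command FIFO. */
struct fake_tpm_rx {
	size_t announced;	/* command size announced in the header */
	size_t queued;		/* bytes the host has written to the FIFO */
	size_t consumed;	/* bytes the TPM has read out of the FIFO */
};

/* Host side: writing only queues bytes, it does not touch Expect. */
static void host_write(struct fake_tpm_rx *t, size_t n)
{
	assert(t != NULL);
	t->queued += n;
}

/* TPM side: read up to n of the bytes currently queued in the FIFO. */
static void tpm_consume(struct fake_tpm_rx *t, size_t n)
{
	size_t avail;

	assert(t != NULL);
	avail = t->queued - t->consumed;
	if (n > avail)
		n = avail;
	t->consumed += n;
}

/* Expect stays 1 until the TPM itself has read all announced bytes. */
static int expect_flag(const struct fake_tpm_rx *t)
{
	return t->consumed < t->announced;
}
```

In this model, writing all 10 bytes of a 10-byte command leaves Expect at 1 until `tpm_consume()` has drained them, which is why sampling the flag around the last write says little, while Expect being 0 after everything was written and consumed indicates a complete command.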
On Thu, Nov 23, 2017 at 08:17:42PM +0530, Nayna Jain wrote:
> Yeah, you are right, the first version of this patch sent all the
> bytes together, but after hearing ddwg inputs, i.e. "The last byte was
> introduced for error checking purposes (history).", I reverted back to
> original to be safe.

What does that mean, i.e. "error checking purposes"?

> It seems that the last byte was sent from the beginning (27084ef
> [PATCH] tpm: driver for next generation TPM chips,), does anyone
> remember the reason ?

Sent from the beginning?

/Jarkko
[Cc'ing Dave and Leendert]

Hi Jarkko,

> > It seems that the last byte was sent from the beginning (27084ef
> > [PATCH] tpm: driver for next generation TPM chips,), does anyone
> > remember the reason ?
>
> Sent from the beginning?

I went through the commit logs to see if any of the patch descriptions have an explanation for sending the last byte separately. Based on commit 27084efee0c3 "[PATCH] tpm: driver for next generation TPM chips", it seems it's been there since the beginning.

Dave, Leendert, do either of you remember the reason for tpm_tis_send_data() sending the last byte separately?

thanks,

Mimi
Hmm, this is almost 20 years old code (

I think the original code did a burst write and didn't check for error conditions until the very last byte write. I seem to remember that there was some text in the original standard to that effect (this may have gone back as far as IBM's ESS spec).

The current code does check for error conditions after each write byte(s) so I don't think there is any reason for this anymore. Changing the while condition to count < len and setting burstcnt = min_t(int, burstcnt, len - count) and remove the tpm_tis_write8/wait_for_tpm_stat/tpm_tis_status clauses after the while loop should be sufficient.

Leendert

On 11/26/17, 8:37 AM, "Mimi Zohar" <zohar@linux.vnet.ibm.com> wrote:

    [Cc'ing Dave and Leendert]

    Hi Jarkko,

    > > It seems that the last byte was sent from the beginning (27084ef
    > > [PATCH] tpm: driver for next generation TPM chips,), does anyone
    > > remember the reason ?
    >
    > Sent from the beginning?

    I went through the commit logs to see if any of the patch descriptions
    have an explanation for sending the last byte separately. Based on commit
    27084efee0c3 "[PATCH] tpm: driver for next generation TPM chips", it
    seems it's been there since the beginning.

    Dave, Leendert, Do either of you remember the reason for
    tpm_tis_send_data() sending the last byte separately?

    thanks,

    Mimi
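The loop shape described in the message above (condition `count < len`, chunk size capped at `len - count`, no separate write for the last byte) can be sketched outside the kernel as follows. This is a hedged illustration only: `fake_tpm`, `fake_get_burstcount` and `fake_write_bytes` are hypothetical stand-ins for the driver's helpers, and the per-chunk status checks (`wait_for_tpm_stat`, `TPM_STS_DATA_EXPECT`) that the real loop performs are left out to keep the sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the TPM command FIFO; not a kernel structure. */
struct fake_tpm {
	unsigned char fifo[256];
	size_t received;	/* bytes accepted so far */
	int burstcount;		/* value reported on every burstcount read */
	int writes;		/* number of write transfers issued */
};

static int fake_get_burstcount(struct fake_tpm *tpm)
{
	return tpm->burstcount;
}

static int fake_write_bytes(struct fake_tpm *tpm, const unsigned char *buf,
			    size_t len)
{
	if (tpm->received + len > sizeof(tpm->fifo))
		return -1;
	memcpy(tpm->fifo + tpm->received, buf, len);
	tpm->received += len;
	tpm->writes++;
	return 0;
}

/* Send the whole buffer in burstcount-sized chunks, last byte included. */
static int send_data_sketch(struct fake_tpm *tpm, const unsigned char *buf,
			    size_t len)
{
	size_t count = 0;

	assert(tpm != NULL && buf != NULL);
	while (count < len) {
		int burstcnt = fake_get_burstcount(tpm);
		size_t chunk;
		int rc;

		if (burstcnt <= 0)
			return -1;	/* sketch gives up instead of waiting */
		chunk = (size_t)burstcnt;
		if (chunk > len - count)
			chunk = len - count;
		rc = fake_write_bytes(tpm, buf + count, chunk);
		if (rc < 0)
			return rc;
		count += chunk;
	}
	return 0;
}
```

With a burstcount of 8, a 20-byte command goes out as three transfers of 8, 8 and 4 bytes, and no trailing one-byte write is needed.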
Hi, Leendert!

On Mon, 2017-11-27 at 07:08 +0000, Leendert van Doorn wrote:
> Hmm, this is almost 20 years old code (
>
> I think the original code did a burst write and didn't check for
> error conditions until the very last byte write. I seem to remember
> that there was some text in the original standard to that effect
> (this may have gone back as far as IBM's ESS spec).

I really appreciate your responding with your recollections.

> The current code does check for error conditions after each write
> byte(s) so I don't think there is any reason for this anymore.
> Changing the while condition to count < len and setting burstcnt =
> min_t(int, burstcnt, len - count) and remove the
> tpm_tis_write8/wait_for_tpm_stat/tpm_tis_status clauses after the
> while loop should be sufficient.

The context for my question was about removing the while loop and sending all of the data at once, relying on the bus wait states. Before making the change, I wanted to make sure there wasn't anything special about the last byte.

thanks,

Mimi
On Mon, Nov 27, 2017 at 07:08:51AM +0000, Leendert van Doorn wrote:
> Hmm, this is almost 20 years old code (
>
> I think the original code did a burst write and didn't check for error
> conditions until the very last byte write. I seem to remember that
> there was some text in the original standard to that effect (this may
> have gone back as far as IBM's ESS spec).
>
> The current code does check for error conditions after each write
> byte(s) so I don't think there is any reason for this anymore.
> Changing the while condition to count < len and setting burstcnt =
> min_t(int, burstcnt, len - count) and remove the
> tpm_tis_write8/wait_for_tpm_stat/tpm_tis_status clauses after the
> while loop should be sufficient.

Thank you for sharing this!

> Leendert

/Jarkko
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index b33126a35694..993328ae988c 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -316,7 +316,6 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len)
 {
 	struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
 	int rc, status, burstcnt;
-	size_t count = 0;
 	bool itpm = priv->flags & TPM_TIS_ITPM_WORKAROUND;
 
 	status = tpm_tis_status(chip);
@@ -330,35 +329,24 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 *buf, size_t len)
 		}
 	}
 
-	while (count < len - 1) {
-		burstcnt = get_burstcount(chip);
-		if (burstcnt < 0) {
-			dev_err(&chip->dev, "Unable to read burstcount\n");
-			rc = burstcnt;
-			goto out_err;
-		}
-		burstcnt = min_t(int, burstcnt, len - count - 1);
-		rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv->locality),
-					 burstcnt, buf + count);
-		if (rc < 0)
-			goto out_err;
-
-		count += burstcnt;
-
-		if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip->timeout_c,
-					&priv->int_queue, false) < 0) {
-			rc = -ETIME;
-			goto out_err;
-		}
-		status = tpm_tis_status(chip);
-		if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
-			rc = -EIO;
-			goto out_err;
-		}
+	/*
+	 * Get the initial burstcount to ensure TPM is ready to
+	 * accept data.
+	 */
+	burstcnt = get_burstcount(chip);
+	if (burstcnt < 0) {
+		dev_err(&chip->dev, "Unable to read burstcount\n");
+		rc = burstcnt;
+		goto out_err;
 	}
 
+	rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv->locality),
+			len - 1, buf);
+	if (rc < 0)
+		goto out_err;
+
 	/* write last byte */
-	rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality), buf[count]);
+	rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality), buf[len-1]);
 	if (rc < 0)
 		goto out_err;