Message ID | 20190613180931.65445-2-swboyd@chromium.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | tpm: Add driver for cr50 | expand |
On Thu, Jun 13, 2019 at 11:09:24AM -0700, Stephen Boyd wrote: > From: Andrey Pronin <apronin@chromium.org> > > Other drivers or userspace may initiate sending a message to the tpm > while the device itself and the controller of the bus it is on are > suspended. That may break the bus driver logic. > Block sending messages while the device is suspended. > > Signed-off-by: Andrey Pronin <apronin@chromium.org> > Signed-off-by: Stephen Boyd <swboyd@chromium.org> > > I don't think this was ever posted before. Use a real lock. Jason
On Thu, Jun 13, 2019 at 11:09:24AM -0700, Stephen Boyd wrote: > diff --git a/include/linux/tpm.h b/include/linux/tpm.h > index 1b5436b213a2..48df005228d0 100644 > --- a/include/linux/tpm.h > +++ b/include/linux/tpm.h > @@ -132,6 +132,8 @@ struct tpm_chip { > int dev_num; /* /dev/tpm# */ > unsigned long is_open; /* only one allowed */ > > + unsigned long is_suspended; > + > char hwrng_name[64]; > struct hwrng hwrng; I think it would better idea to have a bitmask of some sort that would have bits for 'open' and 'suspended'. /Jarkko
Quoting Jason Gunthorpe (2019-06-13 16:26:13) > On Thu, Jun 13, 2019 at 11:09:24AM -0700, Stephen Boyd wrote: > > From: Andrey Pronin <apronin@chromium.org> > > > > Other drivers or userspace may initiate sending a message to the tpm > > while the device itself and the controller of the bus it is on are > > suspended. That may break the bus driver logic. > > Block sending messages while the device is suspended. > > > > Signed-off-by: Andrey Pronin <apronin@chromium.org> > > Signed-off-by: Stephen Boyd <swboyd@chromium.org> > > > > I don't think this was ever posted before. > > Use a real lock. > To make sure the bit is tested under a lock so that suspend/resume can't update the bit in parallel?
Quoting Jarkko Sakkinen (2019-06-14 08:27:00) > On Thu, Jun 13, 2019 at 11:09:24AM -0700, Stephen Boyd wrote: > > diff --git a/include/linux/tpm.h b/include/linux/tpm.h > > index 1b5436b213a2..48df005228d0 100644 > > --- a/include/linux/tpm.h > > +++ b/include/linux/tpm.h > > @@ -132,6 +132,8 @@ struct tpm_chip { > > int dev_num; /* /dev/tpm# */ > > unsigned long is_open; /* only one allowed */ > > > > + unsigned long is_suspended; > > + > > char hwrng_name[64]; > > struct hwrng hwrng; > > I think it would better idea to have a bitmask of some sort that > would have bits for 'open' and 'suspended'. > Sure. I can combine is_open and is_suspended into some sort of 'unsigned long flags' member and then have #define TPM_IS_OPEN 0 and #define TPM_IS_SUSPENDED 1 defines?
On Fri, Jun 14, 2019 at 11:13:13AM -0700, Stephen Boyd wrote: > Quoting Jarkko Sakkinen (2019-06-14 08:27:00) > > On Thu, Jun 13, 2019 at 11:09:24AM -0700, Stephen Boyd wrote: > > > diff --git a/include/linux/tpm.h b/include/linux/tpm.h > > > index 1b5436b213a2..48df005228d0 100644 > > > --- a/include/linux/tpm.h > > > +++ b/include/linux/tpm.h > > > @@ -132,6 +132,8 @@ struct tpm_chip { > > > int dev_num; /* /dev/tpm# */ > > > unsigned long is_open; /* only one allowed */ > > > > > > + unsigned long is_suspended; > > > + > > > char hwrng_name[64]; > > > struct hwrng hwrng; > > > > I think it would better idea to have a bitmask of some sort that > > would have bits for 'open' and 'suspended'. > > > > Sure. I can combine is_open and is_suspended into some sort of 'unsigned > long flags' member and then have #define TPM_IS_OPEN 0 and #define > TPM_IS_SUSPENDED 1 defines? Sounds sustainable. /Jarkko
On Fri, Jun 14, 2019 at 11:12:36AM -0700, Stephen Boyd wrote: > Quoting Jason Gunthorpe (2019-06-13 16:26:13) > > On Thu, Jun 13, 2019 at 11:09:24AM -0700, Stephen Boyd wrote: > > > From: Andrey Pronin <apronin@chromium.org> > > > > > > Other drivers or userspace may initiate sending a message to the tpm > > > while the device itself and the controller of the bus it is on are > > > suspended. That may break the bus driver logic. > > > Block sending messages while the device is suspended. > > > > > > Signed-off-by: Andrey Pronin <apronin@chromium.org> > > > Signed-off-by: Stephen Boyd <swboyd@chromium.org> > > > > > > I don't think this was ever posted before. > > > > Use a real lock. > > > > To make sure the bit is tested under a lock so that suspend/resume can't > update the bit in parallel? No, just use a real lock, don't make locks out of test bit/set bit Jason
Quoting Jason Gunthorpe (2019-06-17 15:51:34) > On Fri, Jun 14, 2019 at 11:12:36AM -0700, Stephen Boyd wrote: > > Quoting Jason Gunthorpe (2019-06-13 16:26:13) > > > On Thu, Jun 13, 2019 at 11:09:24AM -0700, Stephen Boyd wrote: > > > > From: Andrey Pronin <apronin@chromium.org> > > > > > > > > Other drivers or userspace may initiate sending a message to the tpm > > > > while the device itself and the controller of the bus it is on are > > > > suspended. That may break the bus driver logic. > > > > Block sending messages while the device is suspended. > > > > > > > > Signed-off-by: Andrey Pronin <apronin@chromium.org> > > > > Signed-off-by: Stephen Boyd <swboyd@chromium.org> > > > > > > > > I don't think this was ever posted before. > > > > > > Use a real lock. > > > > > > > To make sure the bit is tested under a lock so that suspend/resume can't > > update the bit in parallel? > > No, just use a real lock, don't make locks out of test bit/set bit > Ok. I looked back on the history of this change in our kernel (seems it wasn't attempted upstream for some time) and it looks like the problem may have been that the khwrng kthread (i.e. hwrng_fill()) isn't frozen across suspend/resume. This kthread runs concurrently with devices being resumed, the cr50 hardware is still suspended, and then a tpm command is sent and it hangs the I2C bus because the device hasn't been properly resumed yet. I suspect a better approach than trying to hold of all TPM commands across suspend/resume would be to fix the caller here to not even try to read the hwrng during this time. It's a general problem for other hwrngs that have some suspend/resume hooks too. This kthread is going to be running while suspend/resume is going on if the random entropy gets too low, and that probably shouldn't be the case. What do you think of the attached patch? I haven't tested it, but it would make sure that the kthread is frozen so that the hardware can be resumed before the kthread is thawed and tries to go touch the hardware. ----8<----- diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c index 95be7228f327..3b88af3149a7 100644 --- a/drivers/char/hw_random/core.c +++ b/drivers/char/hw_random/core.c @@ -13,6 +13,7 @@ #include <linux/delay.h> #include <linux/device.h> #include <linux/err.h> +#include <linux/freezer.h> #include <linux/fs.h> #include <linux/hw_random.h> #include <linux/kernel.h> @@ -421,7 +422,9 @@ static int hwrng_fillfn(void *unused) { long rc; - while (!kthread_should_stop()) { + set_freezable(); + + while (!kthread_freezable_should_stop(NULL)) { struct hwrng *rng; rng = get_current_rng();
On Thu, Jun 20, 2019 at 06:03:17PM -0700, Stephen Boyd wrote: > > What do you think of the attached patch? I haven't tested it, but it > would make sure that the kthread is frozen so that the hardware can be > resumed before the kthread is thawed and tries to go touch the hardware. Looks good to me. Cheers,
Quoting Herbert Xu (2019-06-24 07:26:54) > On Thu, Jun 20, 2019 at 06:03:17PM -0700, Stephen Boyd wrote: > > > > What do you think of the attached patch? I haven't tested it, but it > > would make sure that the kthread is frozen so that the hardware can be > > resumed before the kthread is thawed and tries to go touch the hardware. > > Looks good to me. > Thanks for checking. I haven't been able to reproduce the problem yet to confirm this is actually fixing anything, even after tweaking the sysctl values for khwrng to try and force the thread to run continuously. From reading the bug report that caused this 'is_suspended' code to be added to the driver I'm fairly convinced this is the right solution. To give some more background, it looks like we use s2idle suspend (i.e. all CPUs are idle across suspend) and then we have the kthread running to ask the hardware for some more random numbers. The i2c transaction fails when asking the hwrng for data, and we see these messages printed on the resume path: i2c_designware i2c_designware.2: i2c_dw_handle_tx_abort: lost arbitration tpm tpm0: i2c transfer failed (attempt 3/3): -11 tpm0: tpm_transmit: tpm_send: error -11 hwrng: no data available Userspace tasks are thawed after this failure so it looks like something in the kernel kicks khwrng to grab more data before the i2c bus can be resumed.
diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c index ae1030c9b086..7232527652e8 100644 --- a/drivers/char/tpm/tpm-interface.c +++ b/drivers/char/tpm/tpm-interface.c @@ -86,6 +86,11 @@ static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz) return -E2BIG; } + if (test_bit(0, &chip->is_suspended)) { + dev_warn(&chip->dev, "blocking transmit while suspended\n"); + return -EAGAIN; + } + rc = chip->ops->send(chip, buf, count); if (rc < 0) { if (rc != -EPIPE) @@ -403,14 +408,19 @@ int tpm_pm_suspend(struct device *dev) return 0; if (!tpm_chip_start(chip)) { - if (chip->flags & TPM_CHIP_FLAG_TPM2) + if (chip->flags & TPM_CHIP_FLAG_TPM2) { tpm2_shutdown(chip, TPM2_SU_STATE); - else + set_bit(0, &chip->is_suspended); + } else { rc = tpm1_pm_suspend(chip, tpm_suspend_pcr); + } tpm_chip_stop(chip); } + if (!rc) + set_bit(0, &chip->is_suspended); + return rc; } EXPORT_SYMBOL_GPL(tpm_pm_suspend); @@ -426,6 +436,8 @@ int tpm_pm_resume(struct device *dev) if (chip == NULL) return -ENODEV; + clear_bit(0, &chip->is_suspended); + return 0; } EXPORT_SYMBOL_GPL(tpm_pm_resume); diff --git a/include/linux/tpm.h b/include/linux/tpm.h index 1b5436b213a2..48df005228d0 100644 --- a/include/linux/tpm.h +++ b/include/linux/tpm.h @@ -132,6 +132,8 @@ struct tpm_chip { int dev_num; /* /dev/tpm# */ unsigned long is_open; /* only one allowed */ + unsigned long is_suspended; + char hwrng_name[64]; struct hwrng hwrng;