TPM error path on probe

Message ID CAJ+vNU2irDtEn=VD7N2ySofr5wTeko6OHy-QD_SLLZudOwWj0A@mail.gmail.com (mailing list archive)
State New, archived

Commit Message

Tim Harvey Feb. 29, 2024, 12:46 a.m. UTC
Greetings,

Commit 481c2d14627d ("tpm,tpm_tis: Disable interrupts after 1000
unhandled IRQs") introduced a kernel warning for boards that do not
have a TPM loaded but have one defined in the device-tree (i.e. a
subloaded board assembly):

[    2.434431] WARNING: CPU: 3 PID: 55 at kernel/workqueue.c:3397 __flush_work.isra.0+0x2ac/0x2d8
[    2.443069] Modules linked in:
[    2.446133] CPU: 3 PID: 55 Comm: kworker/u8:3 Not tainted 6.6.8-00021-g232153790aa6-dirty #434
[    2.454758] Hardware name: Gateworks Venice GW73xx-2x i.MX8MP Development Kit (DT)
[    2.462337] Workqueue: events_unbound async_run_entry_fn
[    2.467669] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    2.474647] pc : __flush_work.isra.0+0x2ac/0x2d8
[    2.479278] lr : flush_work+0x10/0x1c
[    2.482950] sp : ffff80008270ba40
[    2.486271] x29: ffff80008270ba40 x28: 00000000ffffffff x27: 00000000ffffffff
[    2.493425] x26: ffff000000293000 x25: ffff80008218aff8 x24: ffff800080a1c6a8
[    2.500578] x23: 00000000ffffffff x22: ffff80008213ca18 x21: 0000000000000001
[    2.507731] x20: ffff000000e142c0 x19: ffff000000e14280 x18: 0000000000000000
[    2.514884] x17: 0000000000000000 x16: 000000000000001c x15: 0000000000000000
[    2.522038] x14: ffff00003fd94100 x13: 0000000000000002 x12: 0000000000000000
[    2.529191] x11: 0000000000000400 x10: 0000000000000910 x9 : 00000000ffffffff
[    2.536346] x8 : 0000000000000cc1 x7 : ffff8000804c76d8 x6 : 00000000090ee8ad
[    2.543500] x5 : ffff00000026c380 x4 : 0000000000000000 x3 : 0000000000000000
[    2.550655] x2 : 0000000000000000 x1 : ffff00000026c380 x0 : 0000000000000000
[    2.557806] Call trace:
[    2.560261]  __flush_work.isra.0+0x2ac/0x2d8
[    2.564544]  flush_work+0x10/0x1c
[    2.567869]  tpm_tis_remove+0x8c/0xc4
[    2.571545]  tpm_tis_core_init+0x194/0x7b8
[    2.575656]  tpm_tis_spi_probe+0xa4/0xd4
[    2.579593]  tpm_tis_spi_driver_probe+0x34/0x64
[    2.584136]  spi_probe+0x84/0xe4
[    2.587377]  really_probe+0x148/0x2c0
[    2.591052]  __driver_probe_device+0x78/0x12c
[    2.595417]  driver_probe_device+0xd8/0x15c
[    2.599609]  __device_attach_driver+0xb8/0x134
[    2.604064]  bus_for_each_drv+0x84/0xe0
[    2.607913]  __device_attach_async_helper+0xac/0xd0
[    2.612808]  async_run_entry_fn+0x34/0xe0
[    2.616832]  process_one_work+0x138/0x260
[    2.620851]  worker_thread+0x32c/0x438
[    2.624609]  kthread+0x118/0x11c
[    2.627844]  ret_from_fork+0x10/0x20
[    2.631428] ---[ end trace 0000000000000000 ]---

The issue is caused by tpm_tis_remove() calling flush_work() on a work
item (priv->free_irq_work) that has not yet been initialized. I can move
the INIT_WORK call added in this commit earlier in tpm_tis_core_init()
to fix this (see the patch below).

I'm not sure if a hard-coded 0xffffffff check is best here.

Best regards,

Tim

Comments

Lino Sanfilippo Feb. 29, 2024, 7:08 a.m. UTC | #1
Hi Tim,

On 29.02.24 01:46, Tim Harvey wrote:
> The issue is caused by tpm_tis_remove() calling flush_work() on the
> work queue that has not yet been initialized. I can move the INIT_WORK
> call added in this commit into the probe to fix this:

I posted a fix for this issue a few weeks ago:
https://lore.kernel.org/all/20240201113646.31734-1-l.sanfilippo@kunbus.com/

It is already included in Jarkko's tree.

BR,
Lino

Tim Harvey Feb. 29, 2024, 4:02 p.m. UTC | #2
On Wed, Feb 28, 2024 at 11:09 PM Lino Sanfilippo <LinoSanfilippo@gmx.de> wrote:
>
> Hi Tim,
>
> I posted a fix for this issue a few weeks ago:
> https://lore.kernel.org/all/20240201113646.31734-1-l.sanfilippo@kunbus.com/
>
> It is already included in Jarkko's tree.
>

Hi Lino,

Thank you!

Tim
Jarkko Sakkinen March 1, 2024, 8:49 p.m. UTC | #3
On Thu Feb 29, 2024 at 9:08 AM EET, Lino Sanfilippo wrote:
> I posted a fix for this issue a few weeks ago:
> https://lore.kernel.org/all/20240201113646.31734-1-l.sanfilippo@kunbus.com/
>
> It is already included in Jarkko's tree.

Sanity checked, it is there! Making a pull request next week.

>
> BR,
> Lino

BR, Jarkko

Patch

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index d5ea22196b5b..e62294b3a437 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -1133,6 +1133,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,

        dev_set_drvdata(&chip->dev, priv);

+       INIT_WORK(&priv->free_irq_work, tpm_tis_free_irq_func);
+
        rc = tpm_tis_read32(priv, TPM_DID_VID(0), &vendor);
        if (rc < 0)
                return rc;
@@ -1159,8 +1161,6 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
                }
        }

-       INIT_WORK(&priv->free_irq_work, tpm_tis_free_irq_func);
-
        if (chip->ops->clk_enable != NULL)
                chip->ops->clk_enable(chip, true);
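
To illustrate why the position of INIT_WORK matters, here is a small
user-space model of the probe error path. It is only a sketch, not kernel
code: every name ending in _model is a hypothetical stand-in, and the one
kernel behaviour it assumes is that flush_work() warns and returns early
when the work item has no handler set, which is consistent with the WARNING
in __flush_work() reported above. Which step of tpm_tis_core_init() actually
fails on a board without a TPM is not modelled; the sketch simply assumes
some step fails before the late initialization is reached.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct work_struct. */
struct work_model {
	void (*func)(struct work_model *work);
};

/* Hypothetical stand-in for the driver's private data. */
struct priv_model {
	struct work_model free_irq_work;
};

/* Models INIT_WORK(): attaching a handler marks the item as initialized. */
static void init_work_model(struct work_model *work,
			    void (*func)(struct work_model *work))
{
	work->func = func;
}

/* Models flush_work(): count a warning if the item was never initialized. */
static bool flush_work_model(struct work_model *work, int *warnings)
{
	if (work->func == NULL) {
		(*warnings)++;
		return false;
	}
	return true;
}

/* Placeholder handler; it is never run in this model. */
static void free_irq_func_model(struct work_model *work)
{
	(void)work;
}

/* Models the remove path, which flushes the work item unconditionally. */
static void remove_model(struct priv_model *priv, int *warnings)
{
	(void)flush_work_model(&priv->free_irq_work, warnings);
}

/*
 * Models a probe in which an early step fails (assumed: chip absent) and the
 * error path calls the remove function. init_early selects whether the work
 * item is set up before or after the failing step. Returns the number of
 * warnings raised during cleanup.
 */
static int probe_model(bool init_early)
{
	struct priv_model priv = { .free_irq_work = { .func = NULL } };
	bool chip_present = false;	/* assumption: no TPM fitted */
	int warnings = 0;

	if (init_early)
		init_work_model(&priv.free_irq_work, free_irq_func_model);

	if (!chip_present) {
		remove_model(&priv, &warnings);	/* error path */
		return warnings;
	}

	if (!init_early)
		init_work_model(&priv.free_irq_work, free_irq_func_model);

	return warnings;
}
```

In this model probe_model(false) reports one warning and probe_model(true)
reports none, which mirrors the effect of moving the initialization ahead of
the first step that can fail.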

However, I wonder if we should simply check for an invalid vendor ID of
0xffffffff. Is there any reason not to return -ENODEV if
vendor == 0xffffffff?

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index d5ea22196b5b..1bb1b3e1b2d0 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -1136,6 +1136,8 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
        rc = tpm_tis_read32(priv, TPM_DID_VID(0), &vendor);
        if (rc < 0)
                return rc;
+       if (vendor == 0xffffffff)
+               return -ENODEV;

        priv->manufacturer_id = vendor;
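
For reference, the vendor check asked about above can be sketched as a
stand-alone helper. This is a hedged illustration only: check_did_vid() and
DID_VID_ALL_ONES are hypothetical names, and treating an all-ones read as
"no device" is the assumption under discussion in this thread, not
established driver behaviour. The point of the sketch is that the comparison
applies to the value read from TPM_DID_VID(0), while the return code is only
tested for a negative error.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical name for the all-ones value assumed to mean "nothing there". */
#define DID_VID_ALL_ONES 0xffffffffu

/*
 * rc:     return code of the register read (negative on error)
 * vendor: value read from the DID/VID register
 *
 * Returns rc on a failed read, -ENODEV when the register reads all ones,
 * and 0 otherwise.
 */
static int check_did_vid(int rc, uint32_t vendor)
{
	if (rc < 0)
		return rc;
	if (vendor == DID_VID_ALL_ONES)
		return -ENODEV;
	return 0;
}
```

A named constant is used here only to make the intent of the hard-coded
value explicit; whether such a check belongs in the driver is left open, as
in the question above.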