diff mbox

tpm: don't destroy chip device prematurely

Message ID 1475393971-12715-1-git-send-email-tomas.winkler@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Winkler, Tomas Oct. 2, 2016, 7:39 a.m. UTC
In tpm_del_char_device device_del is called
prior to tpm2_shutdown where it is still used.

Fortunately, so far chip->dev was used only for printouts
int tpm2_shutdown flow, hence system didn't crash. But with
the introduction of runtime power management it will result in
shutting down the parent device while it still in use.

Fixes: 20e0152393b41 ("tpm: fix crash in tpm_tis deinitialization")
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
---
 drivers/char/tpm/tpm-chip.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Jarkko Sakkinen Oct. 2, 2016, 10:17 a.m. UTC | #1
On Sun, Oct 02, 2016 at 10:39:31AM +0300, Tomas Winkler wrote:
> In tpm_del_char_device device_del is called
> prior to tpm2_shutdown where it is still used.
> 
> Fortunately, so far chip->dev was used only for printouts
> int tpm2_shutdown flow, hence system didn't crash. But with
> the introduction of runtime power management it will result in
> shutting down the parent device while it still in use.
> 
> Fixes: 20e0152393b41 ("tpm: fix crash in tpm_tis deinitialization")
> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>

Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>

/Jarkko

> ---
>  drivers/char/tpm/tpm-chip.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
> index e5950131bd90..b1cb0aae8e66 100644
> --- a/drivers/char/tpm/tpm-chip.c
> +++ b/drivers/char/tpm/tpm-chip.c
> @@ -261,7 +261,6 @@ static int tpm_add_char_device(struct tpm_chip *chip)
>  static void tpm_del_char_device(struct tpm_chip *chip)
>  {
>  	cdev_del(&chip->cdev);
> -	device_del(&chip->dev);
>  
>  	/* Make the chip unavailable. */
>  	mutex_lock(&idr_lock);
> @@ -274,6 +273,8 @@ static void tpm_del_char_device(struct tpm_chip *chip)
>  		tpm2_shutdown(chip, TPM2_SU_CLEAR);
>  	chip->ops = NULL;
>  	up_write(&chip->ops_sem);
> +
> +	device_del(&chip->dev);
>  }
>  
>  static int tpm1_chip_register(struct tpm_chip *chip)
> -- 
> 2.7.4
> 

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jarkko Sakkinen Oct. 2, 2016, 10:24 a.m. UTC | #2
On Sun, Oct 02, 2016 at 01:17:55PM +0300, Jarkko Sakkinen wrote:
> On Sun, Oct 02, 2016 at 10:39:31AM +0300, Tomas Winkler wrote:
> > In tpm_del_char_device device_del is called
> > prior to tpm2_shutdown where it is still used.
> > 
> > Fortunately, so far chip->dev was used only for printouts
> > int tpm2_shutdown flow, hence system didn't crash. But with
> > the introduction of runtime power management it will result in
> > shutting down the parent device while it still in use.
> > 
> > Fixes: 20e0152393b41 ("tpm: fix crash in tpm_tis deinitialization")
> > Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
> 
> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>

Applied.

/Jarkko

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jason Gunthorpe Oct. 2, 2016, 9:21 p.m. UTC | #3
On Sun, Oct 02, 2016 at 01:24:55PM +0300, Jarkko Sakkinen wrote:
> On Sun, Oct 02, 2016 at 01:17:55PM +0300, Jarkko Sakkinen wrote:
> > On Sun, Oct 02, 2016 at 10:39:31AM +0300, Tomas Winkler wrote:
> > > In tpm_del_char_device device_del is called
> > > prior to tpm2_shutdown where it is still used.
> > > 
> > > Fortunately, so far chip->dev was used only for printouts
> > > int tpm2_shutdown flow, hence system didn't crash. But with
> > > the introduction of runtime power management it will result in
> > > shutting down the parent device while it still in use.
> > > 
> > > Fixes: 20e0152393b41 ("tpm: fix crash in tpm_tis deinitialization")
> > > Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
> > 
> > Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> 
> Applied.

This patch is wrong, I though the comments were clear. All entry
points to find the device must be deleted before we commit to shutting
down the device.

You need to figure out some other way to solve your problem.

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Winkler, Tomas Oct. 3, 2016, 7:05 a.m. UTC | #4
> On Sun, Oct 02, 2016 at 01:24:55PM +0300, Jarkko Sakkinen wrote:
> > On Sun, Oct 02, 2016 at 01:17:55PM +0300, Jarkko Sakkinen wrote:
> > > On Sun, Oct 02, 2016 at 10:39:31AM +0300, Tomas Winkler wrote:
> > > > In tpm_del_char_device device_del is called prior to tpm2_shutdown
> > > > where it is still used.
> > > >
> > > > Fortunately, so far chip->dev was used only for printouts int
> > > > tpm2_shutdown flow, hence system didn't crash. But with the
> > > > introduction of runtime power management it will result in
> > > > shutting down the parent device while it still in use.
> > > >
> > > > Fixes: 20e0152393b41 ("tpm: fix crash in tpm_tis
> > > > deinitialization")
> > > > Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
> > >
> > > Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > > Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> >
> > Applied.
> 
> This patch is wrong, I though the comments were clear. All entry points to find
> the device must be deleted before we commit to shutting down the device.
> 
> You need to figure out some other way to solve your problem.

Please  be more specific regarding flows you think will be wrong  with this patch, you must agree that the current code is broken even w/o runtime pm.
Thanks
Tomas


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Winkler, Tomas Oct. 3, 2016, 7:38 a.m. UTC | #5
> 
> > On Sun, Oct 02, 2016 at 01:24:55PM +0300, Jarkko Sakkinen wrote:
> > > On Sun, Oct 02, 2016 at 01:17:55PM +0300, Jarkko Sakkinen wrote:
> > > > On Sun, Oct 02, 2016 at 10:39:31AM +0300, Tomas Winkler wrote:
> > > > > In tpm_del_char_device device_del is called prior to
> > > > > tpm2_shutdown where it is still used.
> > > > >
> > > > > Fortunately, so far chip->dev was used only for printouts int
> > > > > tpm2_shutdown flow, hence system didn't crash. But with the
> > > > > introduction of runtime power management it will result in
> > > > > shutting down the parent device while it still in use.
> > > > >
> > > > > Fixes: 20e0152393b41 ("tpm: fix crash in tpm_tis
> > > > > deinitialization")
> > > > > Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
> > > >
> > > > Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > > > Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > >
> > > Applied.
> >
> > This patch is wrong, I though the comments were clear. All entry
> > points to find the device must be deleted before we commit to shutting
> down the device.
> >
> > You need to figure out some other way to solve your problem.
> 
> Please  be more specific regarding flows you think will be wrong  with this
> patch, you must agree that the current code is broken even w/o runtime pm.

I've looked to the registration code and it indeed has few more issues
Maybe TPM_CHIP_FLAG_REGISTERED can be used for sealing the access to the device during deregistration, current usage is void. 

> Thanks
> Tomas
> 
> 
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most engaging
> tech sites, SlashDot.org! http://sdm.link/slashdot
> _______________________________________________
> tpmdd-devel mailing list
> tpmdd-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/tpmdd-devel

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jarkko Sakkinen Oct. 3, 2016, 12:42 p.m. UTC | #6
On Mon, Oct 03, 2016 at 07:38:44AM +0000, Winkler, Tomas wrote:
> > 
> > > On Sun, Oct 02, 2016 at 01:24:55PM +0300, Jarkko Sakkinen wrote:
> > > > On Sun, Oct 02, 2016 at 01:17:55PM +0300, Jarkko Sakkinen wrote:
> > > > > On Sun, Oct 02, 2016 at 10:39:31AM +0300, Tomas Winkler wrote:
> > > > > > In tpm_del_char_device device_del is called prior to
> > > > > > tpm2_shutdown where it is still used.
> > > > > >
> > > > > > Fortunately, so far chip->dev was used only for printouts int
> > > > > > tpm2_shutdown flow, hence system didn't crash. But with the
> > > > > > introduction of runtime power management it will result in
> > > > > > shutting down the parent device while it still in use.
> > > > > >
> > > > > > Fixes: 20e0152393b41 ("tpm: fix crash in tpm_tis
> > > > > > deinitialization")
> > > > > > Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
> > > > >
> > > > > Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > > > > Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > > >
> > > > Applied.
> > >
> > > This patch is wrong, I though the comments were clear. All entry
> > > points to find the device must be deleted before we commit to shutting
> > down the device.
> > >
> > > You need to figure out some other way to solve your problem.
> > 
> > Please  be more specific regarding flows you think will be wrong  with this
> > patch, you must agree that the current code is broken even w/o runtime pm.
> 
> I've looked to the registration code and it indeed has few more issues
> Maybe TPM_CHIP_FLAG_REGISTERED can be used for sealing the access to the device during deregistration, current usage is void. 

Good catch BTW. This flag has gone quite obsolote.

/Jarkko

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jarkko Sakkinen Oct. 3, 2016, 12:48 p.m. UTC | #7
On Mon, Oct 03, 2016 at 07:05:48AM +0000, Winkler, Tomas wrote:
> 
> > On Sun, Oct 02, 2016 at 01:24:55PM +0300, Jarkko Sakkinen wrote:
> > > On Sun, Oct 02, 2016 at 01:17:55PM +0300, Jarkko Sakkinen wrote:
> > > > On Sun, Oct 02, 2016 at 10:39:31AM +0300, Tomas Winkler wrote:
> > > > > In tpm_del_char_device device_del is called prior to tpm2_shutdown
> > > > > where it is still used.
> > > > >
> > > > > Fortunately, so far chip->dev was used only for printouts int
> > > > > tpm2_shutdown flow, hence system didn't crash. But with the
> > > > > introduction of runtime power management it will result in
> > > > > shutting down the parent device while it still in use.
> > > > >
> > > > > Fixes: 20e0152393b41 ("tpm: fix crash in tpm_tis
> > > > > deinitialization")
> > > > > Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
> > > >
> > > > Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > > > Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > >
> > > Applied.
> > 
> > This patch is wrong, I though the comments were clear. All entry points to find
> > the device must be deleted before we commit to shutting down the device.
> > 
> > You need to figure out some other way to solve your problem.
> 
> Please  be more specific regarding flows you think will be wrong  with
> this patch, you must agree that the current code is broken even w/o
> runtime pm.

Make the driver uncallable first. The worst race that can happen is that
open("/dev/tpm0", ...) returns -EPIPE. I do not consider this fatal at
all.

> Thanks
> Tomas

/Jarkko

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jason Gunthorpe Oct. 3, 2016, 4 p.m. UTC | #8
On Mon, Oct 03, 2016 at 07:05:48AM +0000, Winkler, Tomas wrote:

> > This patch is wrong, I though the comments were clear. All entry points to find
> > the device must be deleted before we commit to shutting down the device.
> > 
> > You need to figure out some other way to solve your problem.
> 
> Please be more specific regarding flows you think will be wrong with
> this patch, you must agree that the current code is broken even w/o
> runtime pm.

No, I don't agree. Accessing dev->name is OK after the device_del.

What devicde_del does is fence off all sorts of ways to access the
device, eg sysfs files, bus registrations, etc, etc. That absolutely
must be done before calling tpm_suspend.

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jason Gunthorpe Oct. 3, 2016, 4:03 p.m. UTC | #9
On Mon, Oct 03, 2016 at 03:42:25PM +0300, Jarkko Sakkinen wrote:

> > I've looked to the registration code and it indeed has few more issues

?

> > Maybe TPM_CHIP_FLAG_REGISTERED can be used for sealing the access
> > to the device during deregistration, current usage is void.

This is done via chip->ops = NULL and the rwlock scheme.
 
> Good catch BTW. This flag has gone quite obsolote.

I think all the drivers have been updated at this point so we can
probably get rid of it entirely.

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Winkler, Tomas Oct. 3, 2016, 5:16 p.m. UTC | #10
> On Mon, Oct 03, 2016 at 07:05:48AM +0000, Winkler, Tomas wrote:
> 
> > > This patch is wrong, I though the comments were clear. All entry
> > > points to find the device must be deleted before we commit to shutting
> down the device.
> > >
> > > You need to figure out some other way to solve your problem.
> >
> > Please be more specific regarding flows you think will be wrong with
> > this patch, you must agree that the current code is broken even w/o
> > runtime pm.
> 
> No, I don't agree. Accessing dev->name is OK after the device_del.

But you cannot assume that just dev->name is accessed and and runtime_pm breaks this assumption.. 
>
> 
> What devicde_del does is fence off all sorts of ways to access the device, eg
> sysfs files, bus registrations, etc, etc. That absolutely must be done before
> calling tpm_suspend.


But tpm2_shutdown is acceded via tpm_chip , we cannot call device_del before, this is just wrong from the Linux device mode perspective, we have to use other means to close the access to the device.

Thanks
Tomas

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jason Gunthorpe Oct. 3, 2016, 5:30 p.m. UTC | #11
On Mon, Oct 03, 2016 at 05:16:18PM +0000, Winkler, Tomas wrote:

> > > Please be more specific regarding flows you think will be wrong with
> > > this patch, you must agree that the current code is broken even w/o
> > > runtime pm.
> > 
> > No, I don't agree. Accessing dev->name is OK after the device_del.
> 
> But you cannot assume that just dev->name is accessed and and
> runtime_pm breaks this assumption..

It was built around that assumption.

Our chip methods have been built to not require the device to be
registered. We call many of them before we even do device
registration, for instance.

The pm patches can't break that. What is the actual problem anyhow?

> > What devicde_del does is fence off all sorts of ways to access the device, eg
> > sysfs files, bus registrations, etc, etc. That absolutely must be done before
> > calling tpm_suspend.
> 
> But tpm2_shutdown is acceded via tpm_chip , we cannot call
> device_del before, this is just wrong from the Linux device mode
> perspective, we have to use other means to close the access to the
> device.

device_del is the means to close access - that it what it does -
unregister the device from the system. The tpm_chip must be
operational independent of a *registered* device. chip methods can
only assume that the device is *initialized*

This is a basic pattern followed by other subsystems.

This is why it is OK to look at dev->name, but not okay to muck with
other stuff under a chip method.

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Winkler, Tomas Oct. 3, 2016, 5:30 p.m. UTC | #12
> 
> On Mon, Oct 03, 2016 at 03:42:25PM +0300, Jarkko Sakkinen wrote:
> 
> > > I've looked to the registration code and it indeed has few more
> > > issues
> 
> ?
> 
> > > Maybe TPM_CHIP_FLAG_REGISTERED can be used for sealing the access to
> > > the device during deregistration, current usage is void.
> 
> This is done via chip->ops = NULL and the rwlock scheme.
I'm not this is the best choice, kind of unusual in the subsystems.  
> 
> > Good catch BTW. This flag has gone quite obsolote.
> 
> I think all the drivers have been updated at this point so we can probably get
> rid of it entirely.

I would actually keep it for the tpm2_shutdown exception, not sure this can be handled by ops  = NULL and rwlock.


Thanks
Tomas

> 
> Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jarkko Sakkinen Oct. 4, 2016, 5:19 a.m. UTC | #13
On Mon, Oct 03, 2016 at 03:48:36PM +0300, Jarkko Sakkinen wrote:
> On Mon, Oct 03, 2016 at 07:05:48AM +0000, Winkler, Tomas wrote:
> > 
> > > On Sun, Oct 02, 2016 at 01:24:55PM +0300, Jarkko Sakkinen wrote:
> > > > On Sun, Oct 02, 2016 at 01:17:55PM +0300, Jarkko Sakkinen wrote:
> > > > > On Sun, Oct 02, 2016 at 10:39:31AM +0300, Tomas Winkler wrote:
> > > > > > In tpm_del_char_device device_del is called prior to tpm2_shutdown
> > > > > > where it is still used.
> > > > > >
> > > > > > Fortunately, so far chip->dev was used only for printouts int
> > > > > > tpm2_shutdown flow, hence system didn't crash. But with the
> > > > > > introduction of runtime power management it will result in
> > > > > > shutting down the parent device while it still in use.
> > > > > >
> > > > > > Fixes: 20e0152393b41 ("tpm: fix crash in tpm_tis
> > > > > > deinitialization")
> > > > > > Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
> > > > >
> > > > > Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > > > > Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > > >
> > > > Applied.
> > > 
> > > This patch is wrong, I though the comments were clear. All entry points to find
> > > the device must be deleted before we commit to shutting down the device.
> > > 
> > > You need to figure out some other way to solve your problem.
> > 
> > Please  be more specific regarding flows you think will be wrong  with
> > this patch, you must agree that the current code is broken even w/o
> > runtime pm.
> 
> Make the driver uncallable first. The worst race that can happen is that
> open("/dev/tpm0", ...) returns -EPIPE. I do not consider this fatal at
> all.

No responses for this reasonable proposal so I'll show what I mean:

/* Make the driver uncallable. */
down_write(&chip->ops_sem);
if (chip->flags & TPM_CHIP_FLAG_TPM2)
	tpm2_shutdown(chip, TPM2_SU_CLEAR);
chip->ops = NULL;
up_write(&chip->ops_sem);

cdev_del(&chip->cdev);
device_del(&chip->dev);

/* Make the chip unavailable. */
mutex_lock(&idr_lock);
idr_replace(&dev_nums_idr, NULL, chip->dev_num);
mutex_unlock(&idr_lock);

The worst thing that can happen is -EPIPE.

/Jarkko

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jason Gunthorpe Oct. 4, 2016, 4:47 p.m. UTC | #14
On Tue, Oct 04, 2016 at 08:19:46AM +0300, Jarkko Sakkinen wrote:

> > Make the driver uncallable first. The worst race that can happen is that
> > open("/dev/tpm0", ...) returns -EPIPE. I do not consider this fatal at
> > all.
> 
> No responses for this reasonable proposal so I'll show what I mean:

How is this any better than what Thomas proposed? It seems much worse
to me since now we have even more stuff in the wrong order.

There are three purposes to the ordering as it stands today
 1) To guarantee that tpm2_shutdown is the last command delivered to
    the TPM. When it is issued all other ways to access the device
    are hard fenced off.
 2) To hard fence the tpm subsystem for the 'platform' driver. Once
    tpm_del_char_device completes no callback into the driver
    is possible *at all*. The driver can destroy everything
    (iounmap, dereg irq, etc) and the driver module can be unloaded.
 3) To prevent oopsing with the sysfs code. Recall this comment

        /* The sysfs routines rely on an implicit tpm_try_get_ops, device_del
         * is called before ops is null'd and the sysfs core synchronizes this
         * removal so that no callbacks are running or can run again
         */

    device_del is what eliminates the sysfs access path, so
    ordering device_del after ops = null is just unconditionally
    wrong.

I still haven't heard an explanation why Thomas's other patches need
this, or why trying to change this ordering makes any sense at
all considering how the subsystem is constructed.

Further, if tpm_crb now needs a registered device, how on earth do all
the chip ops we call work *before* registration? Or is that another
bug?

Why can't tpm_crb return to the pre-registration operating state
in the driver remove function before calling unregister?

None of this makes any sense to me.

This whole thing was very carefully constructed to work *correctly*
during unregister. Many other subsystems have races and bugs during
remove (eg see the securityfs discussion). TPM has a hard requirement
to support safe unregister due to the vtpm stuff, so we don't get to
screw it up just to support one driver.

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Winkler, Tomas Oct. 4, 2016, 9:55 p.m. UTC | #15
> On Tue, Oct 04, 2016 at 08:19:46AM +0300, Jarkko Sakkinen wrote:
> 
> > > Make the driver uncallable first. The worst race that can happen is
> > > that open("/dev/tpm0", ...) returns -EPIPE. I do not consider this
> > > fatal at all.
> >
> > No responses for this reasonable proposal so I'll show what I mean:
> 
> How is this any better than what Thomas proposed? It seems much worse to
> me since now we have even more stuff in the wrong order.
> 
> There are three purposes to the ordering as it stands today
>  1) To guarantee that tpm2_shutdown is the last command delivered to
>     the TPM. When it is issued all other ways to access the device
>     are hard fenced off.

I'm not sure where are you taking this requirements from simple bit is just enough to make the HW inaccessible if the interface is designed right.

>  2) To hard fence the tpm subsystem for the 'platform' driver. Once
>     tpm_del_char_device completes no callback into the driver
>     is possible *at all*. The driver can destroy everything
>     (iounmap, dereg irq, etc) and the driver module can be unloaded.

There is some wrong terminology character device is related to user space only,  a device driver can function w/o it. 

>  3) To prevent oopsing with the sysfs code. Recall this comment

> 
>         /* The sysfs routines rely on an implicit tpm_try_get_ops, device_del
>          * is called before ops is null'd and the sysfs core synchronizes this
>          * removal so that no callbacks are running or can run again
>          */
> 
>     device_del is what eliminates the sysfs access path, so
>     ordering device_del after ops = null is just unconditionally
>     wrong.

The ordering can be resolved, like this 

down_write(&chip->ops_sem);
        if (chip->flags & TPM_CHIP_FLAG_TPM2)
                tpm2_shutdown(chip, TPM2_SU_CLEAR);
up_write(&chip->ops_sem);

device_del(&chip->dev);

down_write(&chip->ops_sem);
chip->ops = NULL;
up_write(&chip->ops_sem);

> 
> I still haven't heard an explanation why Thomas's other patches need this, or
> why trying to change this ordering makes any sense at all considering how the
> subsystem is constructed.

I thought it's quite clear form the commit message, the device_del naturally toggles runtime_pm of the parent device, it tries to resume the parent device so it can perform denationalization  and then suspend the parent device back which caused tpm2_shutdown to fail. 
> 
> Further, if tpm_crb now needs a registered device, how on earth do all the
> chip ops we call work *before* registration? Or is that another bug?
> 
> Why can't tpm_crb return to the pre-registration operating state in the driver
> remove function before calling unregister?
> 
> None of this makes any sense to me. 

I general we can not to implement power management via runtime_pm and resolve the issue within tpm_crb driver but it's not abouth tpm_crb.
tpm2_shutdown is a  tpm stack call it's not tpm_crb function, it uses tpm_transmit_cmd and friends it should have valid  tpm_chip initialized and valid.
I'm not sure what could be more clearer than that. 

> This whole thing was very carefully constructed to work *correctly* during
> unregister. Many other subsystems have races and bugs during remove (eg see
> the securityfs discussion). TPM has a hard requirement to support safe
> unregister due to the vtpm stuff, so we don't get to screw it up just to support
> one driver.

I have to admit that I'm not sure what the vtpm does yet,  but I have a feeling that a simple flag can fix this. 


Thanks
Tomas


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jason Gunthorpe Oct. 4, 2016, 11:10 p.m. UTC | #16
On Tue, Oct 04, 2016 at 09:55:36PM +0000, Winkler, Tomas wrote:
> 
> > On Tue, Oct 04, 2016 at 08:19:46AM +0300, Jarkko Sakkinen wrote:
> > 
> > > > Make the driver uncallable first. The worst race that can happen is
> > > > that open("/dev/tpm0", ...) returns -EPIPE. I do not consider this
> > > > fatal at all.
> > >
> > > No responses for this reasonable proposal so I'll show what I mean:
> > 
> > How is this any better than what Thomas proposed? It seems much worse to
> > me since now we have even more stuff in the wrong order.
> > 
> > There are three purposes to the ordering as it stands today
> >  1) To guarantee that tpm2_shutdown is the last command delivered to
> >     the TPM. When it is issued all other ways to access the device
> >     are hard fenced off.
> 
> I'm not sure where are you taking this requirements from simple bit
> is just enough to make the HW inaccessible if the interface is
> designed right.

I'm having a hard time understanding the english in your
email. (Jarkko do you know what Tomas is talking about??)

> The ordering can be resolved, like this 
> 
> down_write(&chip->ops_sem);
>         if (chip->flags & TPM_CHIP_FLAG_TPM2)
>                 tpm2_shutdown(chip, TPM2_SU_CLEAR);
> up_write(&chip->ops_sem);
> 
> device_del(&chip->dev);
> 
> down_write(&chip->ops_sem);
> chip->ops = NULL;
> up_write(&chip->ops_sem);

No, that is wrong as well, another thread can issue a TPM command
between the device_del and the ops = NULL. Presumably that will fail
the same as tpm2_shutdown does.

> > I still haven't heard an explanation why Thomas's other patches need this, or
> > why trying to change this ordering makes any sense at all considering how the
> > subsystem is constructed.
> 
> I thought it's quite clear form the commit message, the device_del

Not clear at all the commit message describes the 'solution' not the
problem.  This doesn't help..

> naturally toggles runtime_pm of the parent device, it tries to
> resume the parent device so it can perform denationalization and
> then suspend the parent device back which caused tpm2_shutdown to
> fail.

What code actually fails? I don't see anything in the runtime pm patch
that relies on chip->dev at all.

What code fails and why?

> I general we can not to implement power management via runtime_pm
> and resolve the issue within tpm_crb driver but it's not abouth
> tpm_crb.  tpm2_shutdown is a tpm stack call it's not tpm_crb
> function, it uses tpm_transmit_cmd and friends it should have valid
> tpm_chip initialized and valid.  I'm not sure what could be more
> clearer than that.

I'll say it again, the tpm_transmit_cmd path must not require a
registered chip->dev.

device_del only unregisters the dev, it does not deinitialize it, nor
does it free any memory. I still don't understand how this has any
impact on the pm stuff when all the pm stuff is attached only to the
pdev.

> I have to admit that I'm not sure what the vtpm does yet, but I have
> a feeling that a simple flag can fix this.

What flag? Fix what?

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Winkler, Tomas Oct. 5, 2016, 7:48 a.m. UTC | #17
> >
> > > On Tue, Oct 04, 2016 at 08:19:46AM +0300, Jarkko Sakkinen wrote:
> > >
> > > > > Make the driver uncallable first. The worst race that can happen
> > > > > is that open("/dev/tpm0", ...) returns -EPIPE. I do not consider
> > > > > this fatal at all.
> > > >
> > > > No responses for this reasonable proposal so I'll show what I mean:
> > >
> > > How is this any better than what Thomas proposed? It seems much
> > > worse to me since now we have even more stuff in the wrong order.
> > >
> > > There are three purposes to the ordering as it stands today
> > >  1) To guarantee that tpm2_shutdown is the last command delivered to
> > >     the TPM. When it is issued all other ways to access the device
> > >     are hard fenced off.
> >
> > I'm not sure where are you taking this requirements from simple bit is
> > just enough to make the HW inaccessible if the interface is designed
> > right.
> 
> I'm having a hard time understanding the english in your email. (Jarkko do you
> know what Tomas is talking about??)

Will try to do better. 

> 
> > The ordering can be resolved, like this
> >
> > down_write(&chip->ops_sem);
> >         if (chip->flags & TPM_CHIP_FLAG_TPM2)
> >                 tpm2_shutdown(chip, TPM2_SU_CLEAR);
> > up_write(&chip->ops_sem);
> >
> > device_del(&chip->dev);
> >
> > down_write(&chip->ops_sem);
> > chip->ops = NULL;
> > up_write(&chip->ops_sem);
> 
> No, that is wrong as well, another thread can issue a TPM command between
> the device_del and the ops = NULL. Presumably that will fail the same as
> tpm2_shutdown does.
>

Right, but that's why we need the TPM_CHIP_FLAG_REGISTERED bit to stay.
Second, tmp2_shutdown only assure that the tpm state is saved,  we are taking too hardline here.  If another command is issued, this is a problem of the upper layers and it has to be fixed in the upper layer.
On the other hand it is much worse if tpm2_shutdown is not sent at all.

> > > I still haven't heard an explanation why Thomas's other patches need
> > > this, or why trying to change this ordering makes any sense at all
> > > considering how the subsystem is constructed.
> >
> > I thought it's quite clear form the commit message, the device_del
> 
> Not clear at all the commit message describes the 'solution' not the problem.
> This doesn't help..

This is the problem statement form the commit message: 

' But with the introduction of runtime power management it will result in
shutting down the parent device while it still in use.' 

We are talking about 
https://lkml.org/lkml/2016/9/12/352 
https://sourceforge.net/p/tpmdd/mailman/message/35395799/

But again, the real bug is in design, where a device is used after device_del()  is called.

> 
> > naturally toggles runtime_pm of the parent device, it tries to resume
> > the parent device so it can perform denationalization and then suspend
> > the parent device back which caused tpm2_shutdown to fail.
> 
> What code actually fails? I don't see anything in the runtime pm patch that
> relies on chip->dev at all.

 "chip-dev.parent''
 dev_get_drvdata(&chip->dev);

> 
> What code fails and why?

   device_del(dev)
          bus_remove_device(dev)
                device_release_driver(dev)
                       __device_release_driver(dev)
                                      pm_runtime_reinit(dev) {
                                            if (dev->parent)
                                                   pm_runtime_put(dev->parent);
                                       }

 
> > I general we can not to implement power management via runtime_pm and
> > resolve the issue within tpm_crb driver but it's not abouth tpm_crb.
> > tpm2_shutdown is a tpm stack call it's not tpm_crb function, it uses
> > tpm_transmit_cmd and friends it should have valid tpm_chip initialized
> > and valid.  I'm not sure what could be more clearer than that.
> 
> I'll say it again, the tpm_transmit_cmd path must not require a registered chip-
> >dev.

I rephrase it again as well, this requirement is  just abusing of the device interface.

> device_del only unregisters the dev, it does not deinitialize it, nor does it free
> any memory. 

Someone will do a legitimate fix in device_del and all you house will crash on you, the device should not be used after device_del is called. 
There is nothing in the interface that promises that nothing is destructed. 

I still don't understand how this has any impact on the pm stuff
> when all the pm stuff is attached only to the pdev.

There is device hierarchy which is important for power management, please read the code.

> > I have to admit that I'm not sure what the vtpm does yet, but I have a
> > feeling that a simple flag can fix this.
> 
> What flag? Fix what?

TPM_CHIP_FLAG_REGISTERED

Tomas


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jarkko Sakkinen Oct. 5, 2016, 10:02 a.m. UTC | #18
On Tue, Oct 04, 2016 at 10:47:38AM -0600, Jason Gunthorpe wrote:
> On Tue, Oct 04, 2016 at 08:19:46AM +0300, Jarkko Sakkinen wrote:
> 
> > > Make the driver uncallable first. The worst race that can happen is that
> > > open("/dev/tpm0", ...) returns -EPIPE. I do not consider this fatal at
> > > all.
> > 
> > No responses for this reasonable proposal so I'll show what I mean:
> 
> How is this any better than what Thomas proposed? It seems much worse
> to me since now we have even more stuff in the wrong order.

It moves a logical block to the front instead of moving one thing
from one logical block to another place.

I'll repeat my question: what worse can happen than returning -EPIPE?  I
though the whole rw lock scheme was introduced just for this purpose.

Why there's even that branch in tpm-dev.c if it's so bad to let it
happen?

/Jarkko

> There are three purposes to the ordering as it stands today
>  1) To guarantee that tpm2_shutdown is the last command delivered to
>     the TPM. When it is issued all other ways to access the device
>     are hard fenced off.
>  2) To hard fence the tpm subsystem for the 'platform' driver. Once
>     tpm_del_char_device completes no callback into the driver
>     is possible *at all*. The driver can destroy everything
>     (iounmap, dereg irq, etc) and the driver module can be unloaded.
>  3) To prevent oopsing with the sysfs code. Recall this comment
> 
>         /* The sysfs routines rely on an implicit tpm_try_get_ops, device_del
>          * is called before ops is null'd and the sysfs core synchronizes this
>          * removal so that no callbacks are running or can run again
>          */
> 
>     device_del is what eliminates the sysfs access path, so
>     ordering device_del after ops = null is just unconditionally
>     wrong.
> 
> I still haven't heard an explanation why Thomas's other patches need
> this, or why trying to change this ordering makes any sense at
> all considering how the subsystem is constructed.
> 
> Further, if tpm_crb now needs a registered device, how on earth do all
> the chip ops we call work *before* registration? Or is that another
> bug?
> 
> Why can't tpm_crb return to the pre-registration operating state
> in the driver remove function before calling unregister?
> 
> None of this makes any sense to me.
> 
> This whole thing was very carefully constructed to work *correctly*
> during unregister. Many other subsystems have races and bugs during
> remove (eg see the securityfs discussion). TPM has a hard requirement
> to support safe unregister due to the vtpm stuff, so we don't get to
> screw it up just to support one driver.
> 
> Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jarkko Sakkinen Oct. 5, 2016, 10:09 a.m. UTC | #19
On Tue, Oct 04, 2016 at 10:47:38AM -0600, Jason Gunthorpe wrote:
> On Tue, Oct 04, 2016 at 08:19:46AM +0300, Jarkko Sakkinen wrote:
> 
> > > Make the driver uncallable first. The worst race that can happen is that
> > > open("/dev/tpm0", ...) returns -EPIPE. I do not consider this fatal at
> > > all.
> > 
> > No responses for this reasonable proposal so I'll show what I mean:
> 
> How is this any better than what Thomas proposed? It seems much worse
> to me since now we have even more stuff in the wrong order.
> 
> There are three purposes to the ordering as it stands today
>  1) To guarantee that tpm2_shutdown is the last command delivered to
>     the TPM. When it is issued all other ways to access the device
>     are hard fenced off.
>  2) To hard fence the tpm subsystem for the 'platform' driver. Once
>     tpm_del_char_device completes no callback into the driver
>     is possible *at all*. The driver can destroy everything
>     (iounmap, dereg irq, etc) and the driver module can be unloaded.
>  3) To prevent oopsing with the sysfs code. Recall this comment
> 
>         /* The sysfs routines rely on an implicit tpm_try_get_ops, device_del
>          * is called before ops is null'd and the sysfs core synchronizes this
>          * removal so that no callbacks are running or can run again
>          */
> 
>     device_del is what eliminates the sysfs access path, so
>     ordering device_del after ops = null is just unconditionally
>     wrong.
> 
> I still haven't heard an explanation why Thomas's other patches need
> this, or why trying to change this ordering makes any sense at
> all considering how the subsystem is constructed.
> 
> Further, if tpm_crb now needs a registered device, how on earth do all
> the chip ops we call work *before* registration? Or is that another
> bug?
> 
> Why can't tpm_crb return to the pre-registration operating state
> in the driver remove function before calling unregister?
> 
> None of this makes any sense to me.
> 
> This whole thing was very carefully constructed to work *correctly*
> during unregister. Many other subsystems have races and bugs during
> remove (eg see the securityfs discussion). TPM has a hard requirement
> to support safe unregister due to the vtpm stuff, so we don't get to
> screw it up just to support one driver.

Obviously a device is needed because it's required by the PM runtime
FW. I'm not following what you're saying about tpm2_shutdown(). With
the change I proposed it's the *very last* command delivered to the
device (because it's fenced by write lock).

> Jason

/Jarkko

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jarkko Sakkinen Oct. 5, 2016, 3:15 p.m. UTC | #20
On Wed, Oct 05, 2016 at 07:48:59AM +0000, Winkler, Tomas wrote:
> > >
> > > > On Tue, Oct 04, 2016 at 08:19:46AM +0300, Jarkko Sakkinen wrote:
> > > >
> > > > > > Make the driver uncallable first. The worst race that can happen
> > > > > > is that open("/dev/tpm0", ...) returns -EPIPE. I do not consider
> > > > > > this fatal at all.
> > > > >
> > > > > No responses for this reasonable proposal so I'll show what I mean:
> > > >
> > > > How is this any better than what Thomas proposed? It seems much
> > > > worse to me since now we have even more stuff in the wrong order.
> > > >
> > > > There are three purposes to the ordering as it stands today
> > > >  1) To guarantee that tpm2_shutdown is the last command delivered to
> > > >     the TPM. When it is issued all other ways to access the device
> > > >     are hard fenced off.
> > >
> > > I'm not sure where are you taking this requirements from simple bit is
> > > just enough to make the HW inaccessible if the interface is designed
> > > right.
> > 
> > I'm having a hard time understanding the english in your email. (Jarkko do you
> > know what Tomas is talking about??)
> 
> Will try to do better. 

Jason, sorry your question slipped while going through the dicussion
:-)

I think I'll take the standpoint that I'll wait for the next version.

The important thing is to notice that runtime PM requires the device
to be "alive" and in the device hierarchy. It's a constraint...

/Jarkko

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jason Gunthorpe Oct. 5, 2016, 4:27 p.m. UTC | #21
On Wed, Oct 05, 2016 at 01:02:34PM +0300, Jarkko Sakkinen wrote:

> I'll repeat my question: what worse can happen than returning -EPIPE?  I
> though the whole rw lock scheme was introduced just for this purpose.

I thought I explained this, if device_del is moved after ops = null
then if sysfs looses the race it will oops the kernel. device_del hard
fences sysfs.

> Why there's even that branch in tpm-dev.c if it's so bad to let it
> happen?

Because cdev_del and device_del do not guarentee that the cdev is
fenced. They just prevent new calls into open(). So the branch in
tpm-dev.c is necessary to avoid a kernel oops if user space holds the
fd open across unregister.

It is the same sitatuion you identified in the securityfs discussion -
user space holding the fd open across a driver unregister.

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jason Gunthorpe Oct. 5, 2016, 4:37 p.m. UTC | #22
On Wed, Oct 05, 2016 at 06:15:26PM +0300, Jarkko Sakkinen wrote:

> The important thing is to notice that runtime PM requires the device
> to be "alive" and in the device hierarchy. It's a constraint...

There are two devices.

The chip->dev and the chip->dev.parent (aka the acpi_device)

Runtime PM is *only* attached to the chip->dev.parent - it does not
interact in any significant way with the chip->dev.

device_del is on the chip->dev. The acpi_device remains intact, and
fully functional.

This is why the whole patch is so confusing to me.

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jason Gunthorpe Oct. 5, 2016, 5:11 p.m. UTC | #23
On Wed, Oct 05, 2016 at 07:48:59AM +0000, Winkler, Tomas wrote:
> > > down_write(&chip->ops_sem);
> > > chip->ops = NULL;
> > > up_write(&chip->ops_sem);
> > 
> > No, that is wrong as well, another thread can issue a TPM command between
> > the device_del and the ops = NULL. Presumably that will fail the same as
> > tpm2_shutdown does.
> 
> Right, but that's why we need the TPM_CHIP_FLAG_REGISTERED bit to
> stay.

How does that help?

We already null ops to signal the driver is removed, but we don't
check ops in all places today. Notably sysfs doesn't do the test, and
relies on the hard fence device_del provides.

So at a minimum to go forward with this approach you have to fix sysfs
to be safe - which I don't think is worthwhile..

> Second, tmp2_shutdown only assure that the tpm state is saved, we
> are taking too hardline here.  If another command is issued, this is
> a problem of the upper layers and it has to be fixed in the upper
> layer.

We can't fix it at the upper layers, this is classic removal
race. Whatever happens it must not cause the kernel to malfunction.

> This is the problem statement form the commit message: 
> 
> ' But with the introduction of runtime power management it will result in
> shutting down the parent device while it still in use.' 

> https://sourceforge.net/p/tpmdd/mailman/message/35395799/

Great, your commit message should include the klog message this patch
is fixing.

> > But again, the real bug is in design, where a device is used after
> > device_del()  is called.

No, I don't think so..

> > What code actually fails? I don't see anything in the runtime pm patch that
> > relies on chip->dev at all.
> 
>  "chip-dev.parent''
>  dev_get_drvdata(&chip->dev);

Also chip->dev.name because we can call dev_log/etc(&chip->dev..)

These are all fine, obviously. Todays kernel retains those values
across device_del and we set those values in tpmm_chip_alloc/etc. So
correct values are present as long as the chip memory exists. tpm
continues to hold a kref on the chip so the memory will be around.

'dev' values are only being used for clarity and convenience, if ever
the kernel changes behavior and nulls those values during device_del
then we will copy the values into chip and stop using dev. No
algorithm needs to change, and we don't need a registered 'dev'
to function.

However, I find such a possible change to be deeply unlikely. If you
look around the kernel I think you will find that many subsystems
subtly depend on these same invariants for corectness during various
unregister races.

> > What code fails and why?
> 
>    device_del(dev)
>           bus_remove_device(dev)
>                 device_release_driver(dev)
>                        __device_release_driver(dev)
>                                       pm_runtime_reinit(dev) {
>                                             if (dev->parent)
>                                                    pm_runtime_put(dev->parent);

Eh?? The full code is:

void pm_runtime_reinit(struct device *dev)
{
 if (!pm_runtime_enabled(dev)) {
     if (dev->power.irq_safe) {
         if (dev->parent)
              pm_runtime_put(dev->parent);

And irq_safe is only set by pm_runtime_irq_safe(). I can't find
any place that looks like that is called on chip->dev

Is there some other PM path where dev->parent becomes invovled?

Are you just guessing this solves a problem, or were you able to
reproduce Jarkko's report?

Considering that Jarkko cannot reliably reproduce the original bug,
I'm deeply skeptical this patch actually does anything more than
fiddle with timing around some kind of undiscovered race condition
scenario.

Even if that pm_runtime_put is happening, why doesn't the

+    pm_runtime_get_sync(chip->dev.parent);

The runtime_pm patch adds to tpm_transmit take care of bringing the
device back?

I'm still not hearing from you an explanation for what is actually
happening to cause the 325 error.. What does that error code decode
to anyhow?

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Winkler, Tomas Oct. 5, 2016, 8:09 p.m. UTC | #24
> 
> On Wed, Oct 05, 2016 at 07:48:59AM +0000, Winkler, Tomas wrote:
> > > > down_write(&chip->ops_sem);
> > > > chip->ops = NULL;
> > > > up_write(&chip->ops_sem);
> > >
> > > No, that is wrong as well, another thread can issue a TPM command
> > > between the device_del and the ops = NULL. Presumably that will fail
> > > the same as tpm2_shutdown does.
> >
> > Right, but that's why we need the TPM_CHIP_FLAG_REGISTERED bit to
> > stay.
> 
> How does that help?
> 
> We already null ops to signal the driver is removed, but we don't check ops in
> all places today. Notably sysfs doesn't do the test, and relies on the hard fence
> device_del provides.
> 
> So at a minimum to go forward with this approach you have to fix sysfs to be
> safe - which I don't think is worthwhile..

Yes, if this approach is taken this has to go across the whole stack, there is no question about it. 


> > Second, tmp2_shutdown only assure that the tpm state is saved, we are
> > taking too hardline here.  If another command is issued, this is a
> > problem of the upper layers and it has to be fixed in the upper layer.
> 
> We can't fix it at the upper layers, this is classic removal race. Whatever
> happens it must not cause the kernel to malfunction.
> 
> > This is the problem statement form the commit message:
> >
> > ' But with the introduction of runtime power management it will result
> > in shutting down the parent device while it still in use.'
> 
> > https://sourceforge.net/p/tpmdd/mailman/message/35395799/
> 
> Great, your commit message should include the klog message this patch is
> fixing.
>

It could, but that  patch was not merged yet, and I believe even if the issue is exposed only with runtime_pm currently, we have a bug in design even w/o runtime pm.

> > > But again, the real bug is in design, where a device is used after
> > > device_del()  is called.
> 
> No, I don't think so..

I do :)
> 
> > > What code actually fails? I don't see anything in the runtime pm
> > > patch that relies on chip->dev at all.
> >
> >  "chip-dev.parent''
> >  dev_get_drvdata(&chip->dev);
> 
> Also chip->dev.name because we can call dev_log/etc(&chip->dev..)

Yes, of course but this was not the point of disagreement
 
> These are all fine, obviously. Todays kernel retains those values across
> device_del and we set those values in tpmm_chip_alloc/etc. So correct values
> are present as long as the chip memory exists. tpm continues to hold a kref on
> the chip so the memory will be around.

I'm not saying they are not, but calling deep into the tpm stack and even to the parent device with unutilized device is not sane. 

> 'dev' values are only being used for clarity and convenience, if ever the kernel
> changes behavior and nulls those values during device_del then we will copy
> the values into chip and stop using dev. No algorithm needs to change, and we
> don't need a registered 'dev'
> to function.

Frankly I would suggest to stay with the device, it let you add tpm spec attributes (sysfs) if needed to the stack, you cannot hang those on the air.
You should distinguish between chip->dev and cdev though, those are not the same things. 

> However, I find such a possible change to be deeply unlikely. If you look
> around the kernel I think you will find that many subsystems subtly depend on
> these same invariants for corectness during various unregister races.

Yes, but the code around is aware that it's unregistering, you have device_del, put_device  pair enclosing device removal in the same function., unlike tmp2_shutdown which requires most of the stack functional. 
 
> > > What code fails and why?
> >
> >    device_del(dev)
> >           bus_remove_device(dev)
> >                 device_release_driver(dev)
> >                        __device_release_driver(dev)
> >                                       pm_runtime_reinit(dev) {
> >                                             if (dev->parent)
> >
> > pm_runtime_put(dev->parent);
> 
> Eh?? The full code is:
> 
> void pm_runtime_reinit(struct device *dev) {  if (!pm_runtime_enabled(dev)) {
>      if (dev->power.irq_safe) {
>          if (dev->parent)
>               pm_runtime_put(dev->parent);
> 
> And irq_safe is only set by pm_runtime_irq_safe(). I can't find any place that
> looks like that is called on chip->dev
> 
> Is there some other PM path where dev->parent becomes invovled?

Of course, the power management utilize the device hierarch, it assumes there is power dependencies between parents and child devices, such as bus controllers and the devices on that bus. 

> Are you just guessing this solves a problem, or were you able to reproduce
> Jarkko's report?

No, guesses are not my style :), this solves the issue, as you see this was also validated by Jarrko on his setup.

> Considering that Jarkko cannot reliably reproduce the original bug, I'm deeply
> skeptical this patch actually does anything more than fiddle with timing
> around some kind of undiscovered race condition scenario.
> 
> Even if that pm_runtime_put is happening, why doesn't the
> 
> +    pm_runtime_get_sync(chip->dev.parent);
> 
> The runtime_pm patch adds to tpm_transmit take care of bringing the device
> back?

Unfortunately not because, because it gets out of sync  and what is actually happening 
is that idle callback is called and device is put to idle between send and receive.

Now we can find a trick to fix this, but this would be rather w/o while we know what the real issue is.

> I'm still not hearing from you an explanation for what is actually happening to
> cause the 325 error.. What does that error code decode to anyhow?

Error code 0x145 RC_AUTH_CONTEXT, but the error code may vary, basically we lost context because the device went to the idle state. 
Thanks
Tomas



------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jason Gunthorpe Oct. 5, 2016, 9:16 p.m. UTC | #25
On Wed, Oct 05, 2016 at 08:09:17PM +0000, Winkler, Tomas wrote:

> It could, but that patch was not merged yet, and I believe even if
> the issue is exposed only with runtime_pm currently, we have a bug
> in design even w/o runtime pm.

Please don't make changes without any justification :(

> > These are all fine, obviously. Todays kernel retains those values across
> > device_del and we set those values in tpmm_chip_alloc/etc. So correct values
> > are present as long as the chip memory exists. tpm continues to hold a kref on
> > the chip so the memory will be around.
> 
> I'm not saying they are not, but calling deep into the tpm stack and
> even to the parent device with unutilized device is not sane.

You keep asserting that, but it just isn't true at all.

For a long time the tpm subsystem didn't even have a
'struct device'. That is something Jarkko and I added.

The *ONLY* thing it does is act as the anchor for user space - eg it
holds the sysfs, contains the 'dev' file for the cdev, etc, etc. This
was an important clean up.

Internally to the tpm core, and the drivers, the chip->dev does
*NOTHING* except hold the few variables you pointed out. That is it.

We could go back to the old code, without the 'dev' and things
could still work correctly.

This is why your assertion the struct device needs to be registered
makes no sense.

If the runtime_pm patches change this, then we have a very serious
problem. Removing this assumption is much harder than a one line patch
moving device_del.

I actually have no idea how you'd do it, since we call all sorts of
tpm ops between device_init and device_add - again device_del is the
least of the problems if runtime pm insists the chip->dev be
registered when running transmit_cmd.

So, I again, strongly advise you to give up on this idea, it is too
hard for TPM, and does not seem technically needed at this time. Even
it it does seem to make some kind of intuitive sense.

> > Is there some other PM path where dev->parent becomes invovled?
> 
> Of course, the power management utilize the device hierarch, it
> assumes there is power dependencies between parents and child
> devices, such as bus controllers and the devices on that bus.

Sure, but that relationship only maters if something does a PM call on
the chip->dev, and AFAIK, nothing does that.

Do you know differently?

You pointed at something that isn't even run and said it is the source
of the problem.. You really need to set up here and explain exactly
what is happening.

> > Are you just guessing this solves a problem, or were you able to reproduce
> > Jarkko's report?
> 
> No, guesses are not my style :), this solves the issue, as you see
> this was also validated by Jarrko on his setup.

In the thread you pointed to Jarkko said he could not reproduce the
original issue. Jarkko can you clarify??

> > Even if that pm_runtime_put is happening, why doesn't the
> > 
> > +    pm_runtime_get_sync(chip->dev.parent);
> > 
> > The runtime_pm patch adds to tpm_transmit take care of bringing the device
> > back?
> 
> Unfortunately not because, because it gets out of sync and what is
> actually happening is that idle callback is called and device is put
> to idle between send and receive.

What? As far as I understand this PM stuff, I would call that a very
serious bug.

If a PM transition during transmit_cmd causes the TPM to abort/fail
command execution then it *MUST* be prevented. Period.

pm_runtime_get_sync appears to be the correct thing to get the
guarentee, so I'm very confused by your statement.

> Now we can find a trick to fix this, but this would be rather w/o
> while we know what the real issue is.

don't understand this.. Are you saying that going into idle during
tpm_transmit is not a bug?

Sounds like there is some sort of race condition with the pm stuff
that needs fixing.

I still don't see what your actual bug is, other than what I already
knew - somehow PM causes transmit_cmd to fail.

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Winkler, Tomas Oct. 6, 2016, 12:43 a.m. UTC | #26
> On Wed, Oct 05, 2016 at 08:09:17PM +0000, Winkler, Tomas wrote:
> 
> > It could, but that patch was not merged yet, and I believe even if the
> > issue is exposed only with runtime_pm currently, we have a bug in
> > design even w/o runtime pm.
> 
> Please don't make changes without any justification :(

> 
> > > These are all fine, obviously. Todays kernel retains those values
> > > across device_del and we set those values in tpmm_chip_alloc/etc. So
> > > correct values are present as long as the chip memory exists. tpm
> > > continues to hold a kref on the chip so the memory will be around.
> >
> > I'm not saying they are not, but calling deep into the tpm stack and
> > even to the parent device with unutilized device is not sane.
> 
> You keep asserting that, but it just isn't true at all.

Okay, let's rephrase, that calling device_del before tpm_transmit is not sane when using runtime_pm.

> For a long time the tpm subsystem didn't even have a 'struct device'. That is
> something Jarkko and I added.


> The *ONLY* thing it does is act as the anchor for user space - eg it holds the
> sysfs, contains the 'dev' file for the cdev, etc, etc. This was an important clean
> up.
> 
> Internally to the tpm core, and the drivers, the chip->dev does
> *NOTHING* except hold the few variables you pointed out. That is it.
> 
> We could go back to the old code, without the 'dev' and things could still work
> correctly.

You can write the code in one big loop in assembly and it would work as well, this is not the point.

> This is why your assertion the struct device needs to be registered makes no
> sense.

But you did change the code to get more benefits but this also comes with more dependencies and new rules.
I also can go the simple way and go_idle, cmd_ready callbacks and nothing will breaks, but we wanted all those goodies that runtime_pm has (autosuspend and sysfs), but the feature  has longer roots. 


> If the runtime_pm patches change this, then we have a very serious problem.
> Removing this assumption is much harder than a one line patch moving
> device_del.

I'm just saying that the  runtime patches only exposed issue in the design.
 
> I actually have no idea how you'd do it, since we call all sorts of tpm ops
> between device_init and device_add - again device_del is the least of the
> problems if runtime pm insists the chip->dev be registered when running
> transmit_cmd.

This has to be revisited as well. 

> 
> So, I again, strongly advise you to give up on this idea, it is too hard for TPM,
> and does not seem technically needed at this time. Even it it does seem to
> make some kind of intuitive sense.

I'm the last one who want to do extra jobs, we need to make our hw working asap, but we need to do it right.

> > > Is there some other PM path where dev->parent becomes invovled?
> >
> > Of course, the power management utilize the device hierarch, it
> > assumes there is power dependencies between parents and child devices,
> > such as bus controllers and the devices on that bus.
> 
> Sure, but that relationship only maters if something does a PM call on the
> chip->dev, and AFAIK, nothing does that.
> 
> Do you know differently?

You've pasted that code in in the previous mail, parent is involved on device remove.

> 
> You pointed at something that isn't even run and said it is the source of the
> problem.. You really need to set up here and explain exactly what is
> happening.

Sorry, lost you here.
 
> > > Are you just guessing this solves a problem, or were you able to
> > > reproduce Jarkko's report?
> >
> > No, guesses are not my style :), this solves the issue, as you see
> > this was also validated by Jarrko on his setup.
> 
> In the thread you pointed to Jarkko said he could not reproduce the original
> issue. Jarkko can you clarify??

No, he rolled back the runtime_pm patch and the issue disappeared and that's how we found the root cause. 
 
> > > Even if that pm_runtime_put is happening, why doesn't the
> > >
> > > +    pm_runtime_get_sync(chip->dev.parent);
> > >
> > > The runtime_pm patch adds to tpm_transmit take care of bringing the
> > > device back?
> >
> > Unfortunately not because, because it gets out of sync and what is
> > actually happening is that idle callback is called and device is put
> > to idle between send and receive.
> 
> What? As far as I understand this PM stuff, I would call that a very serious bug.

Maybe, but then you have to find what a bug,  currently it looks like wrong usage of the device.
> 

> If a PM transition during transmit_cmd causes the TPM to abort/fail command
> execution then it *MUST* be prevented. Period.

Or, we can call device_del after tpm2_shutdown. 
 
> pm_runtime_get_sync appears to be the correct thing to get the guarentee, so
> I'm very confused by your statement.
> 
> > Now we can find a trick to fix this, but this would be rather w/o
> > while we know what the real issue is.
> 
> don't understand this.. Are you saying that going into idle during tpm_transmit
> is not a bug?

> Sounds like there is some sort of race condition with the pm stuff that needs
> fixing.

> I still don't see what your actual bug is, other than what I already knew -
> somehow PM causes transmit_cmd to fail.

I will send you the actual trace, anyhow I've respin the original version with go_idle and cmd_ready handlers, this is contra productive, the time is just not right. 

Thanks
Tomas


 
Tomas 

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jason Gunthorpe Oct. 6, 2016, 2:07 a.m. UTC | #27
On Thu, Oct 06, 2016 at 12:43:13AM +0000, Winkler, Tomas wrote:

> > You keep asserting that, but it just isn't true at all.
> 
> Okay, let's rephrase, that calling device_del before tpm_transmit is not sane when using runtime_pm.

Maybe, but I haven't heard an explanation from you why that is the case,
and I haven't found one on my own..

> > Sure, but that relationship only maters if something does a PM call on the
> > chip->dev, and AFAIK, nothing does that.
> > 
> > Do you know differently?
> 
> You've pasted that code in in the previous mail, parent is involved on device remove.

I pasted the code to show it didn't seem possible to hit it because
irq_safe should not be true for chip->dev. You never explained how
this code can run.

> > You pointed at something that isn't even run and said it is the source of the
> > problem.. You really need to set up here and explain exactly what is
> > happening.
> 
> Sorry, lost you here.

You haven't done a good job explaining the problem in detail beyond some general
blame toward PM.

> > > > Even if that pm_runtime_put is happening, why doesn't the
> > > >
> > > > +    pm_runtime_get_sync(chip->dev.parent);
> > > >
> > > > The runtime_pm patch adds to tpm_transmit take care of bringing the
> > > > device back?
> > >
> > > Unfortunately not because, because it gets out of sync and what is
> > > actually happening is that idle callback is called and device is put
> > > to idle between send and receive.
> > 
> > What? As far as I understand this PM stuff, I would call that a very serious bug.
> 
> Maybe, but then you have to find what a bug,  currently it looks
> like wrong usage of the device.

Yes, you actually have to find and explain the bug to fix it.

You still haven't explained at all why device_del on the child causes
pm_runtime_get_sync() on the parent to malfunction. There is certainly
seems to be nothing intrinsic about the PM core that would cause that.

*That* is the really critical bit of explanation that is missing.

Until you can provide it there is no reason to continue discussing.

> > If a PM transition during transmit_cmd causes the TPM to abort/fail command
> > execution then it *MUST* be prevented. Period.
> 
> Or, we can call device_del after tpm2_shutdown. 

What? You haven't even established root cause, how do you know this
bug won't manifest in other cases? It could very well be some kind of
generic race-bug triggered by the proximity of device_del and
pm_runtime_get_sync. Or a HW bug of some kind..

> I will send you the actual trace, anyhow I've respin the original
> version with go_idle and cmd_ready handlers, this is contra
> productive, the time is just not right.

I'm deeply skeptical about all your patches if you can't root-cause
identify why the existing implementation isn't working.

But, you can sort out with Jarkko what to do with crb.

As long at the rest of the drivers and the core subsystem are not
broken by the device_del change. Jarkko - can you confirm you will
drop that patch?

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jarkko Sakkinen Oct. 6, 2016, 11:23 a.m. UTC | #28
On Wed, Oct 05, 2016 at 10:27:41AM -0600, Jason Gunthorpe wrote:
> On Wed, Oct 05, 2016 at 01:02:34PM +0300, Jarkko Sakkinen wrote:
> 
> > I'll repeat my question: what worse can happen than returning -EPIPE?  I
> > though the whole rw lock scheme was introduced just for this purpose.
> 
> I thought I explained this, if device_del is moved after ops = null
> then if sysfs looses the race it will oops the kernel. device_del hard
> fences sysfs.

Sorry, I missed that comment somehow. Looking at the code it is like
that.

I think that they should be fenced then for the sake of consistency.
I do not see why sysfs code is privileged not to do fencing while other
peers have to do it.

/Jarkko

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jason Gunthorpe Oct. 6, 2016, 4:22 p.m. UTC | #29
On Thu, Oct 06, 2016 at 02:23:57PM +0300, Jarkko Sakkinen wrote:

> I think that they should be fenced then for the sake of consistency.
> I do not see why sysfs code is privileged not to do fencing while other
> peers have to do it.

Certainly the locking could be changed, but it would be nice to have a
reason other than aesthetics.

sysfs is not unique, we also do not grab the rwlock lock during any
commands executed as part of probe. There are basically two locking
regimes - stuff that is proven to by synchronous with probe/remove
(sysfs, probe cmds) and everything else (kapi, cdev)

Further, the current sysfs implementation is nice and sane: the file
accesses cannot fail with ENODEV. That is a useful concrete property
and I don't think we should change it without a good reason.

Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jarkko Sakkinen Oct. 6, 2016, 4:46 p.m. UTC | #30
On Thu, Oct 06, 2016 at 10:22:45AM -0600, Jason Gunthorpe wrote:
> On Thu, Oct 06, 2016 at 02:23:57PM +0300, Jarkko Sakkinen wrote:
> 
> > I think that they should be fenced then for the sake of consistency.
> > I do not see why sysfs code is privileged not to do fencing while other
> > peers have to do it.
> 
> Certainly the locking could be changed, but it would be nice to have a
> reason other than aesthetics.
> 
> sysfs is not unique, we also do not grab the rwlock lock during any
> commands executed as part of probe. There are basically two locking
> regimes - stuff that is proven to by synchronous with probe/remove
> (sysfs, probe cmds) and everything else (kapi, cdev)
> 
> Further, the current sysfs implementation is nice and sane: the file
> accesses cannot fail with ENODEV. That is a useful concrete property
> and I don't think we should change it without a good reason.

The last point is certainly legit. I think it even might deserve a
comment of its own in tpm_del_char_device.

I think I have a good idea now what to do. Hold on for RFC patch.

> Jason

/Jarkko

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Winkler, Tomas Oct. 7, 2016, 2:24 p.m. UTC | #31
> On Thu, Oct 06, 2016 at 12:43:13AM +0000, Winkler, Tomas wrote:
> 
> > > You keep asserting that, but it just isn't true at all.
> >
> > Okay, let's rephrase, that calling device_del before tpm_transmit is not sane
> when using runtime_pm.
> 
> Maybe, but I haven't heard an explanation from you why that is the case, and I
> haven't found one on my own..
> 
> > > Sure, but that relationship only maters if something does a PM call
> > > on the
> > > chip->dev, and AFAIK, nothing does that.
> > >
> > > Do you know differently?
> >
> > You've pasted that code in in the previous mail, parent is involved on device
> remove.
> 
> I pasted the code to show it didn't seem possible to hit it because irq_safe
> should not be true for chip->dev. You never explained how this code can run.
> 
> > > You pointed at something that isn't even run and said it is the
> > > source of the problem.. You really need to set up here and explain
> > > exactly what is happening.
> >
> > Sorry, lost you here.
> 
> You haven't done a good job explaining the problem in detail beyond some
> general blame toward PM.
> 
> > > > > Even if that pm_runtime_put is happening, why doesn't the
> > > > >
> > > > > +    pm_runtime_get_sync(chip->dev.parent);
> > > > >
> > > > > The runtime_pm patch adds to tpm_transmit take care of bringing
> > > > > the device back?
> > > >
> > > > Unfortunately not because, because it gets out of sync and what is
> > > > actually happening is that idle callback is called and device is
> > > > put to idle between send and receive.
> > >
> > > What? As far as I understand this PM stuff, I would call that a very serious
> bug.
> >
> > Maybe, but then you have to find what a bug,  currently it looks like
> > wrong usage of the device.
> 
> Yes, you actually have to find and explain the bug to fix it.
> 
> You still haven't explained at all why device_del on the child causes
> pm_runtime_get_sync() on the parent to malfunction. There is certainly seems
> to be nothing intrinsic about the PM core that would cause that.
> 
> *That* is the really critical bit of explanation that is missing.
> 
> Until you can provide it there is no reason to continue discussing.
> 
> > > If a PM transition during transmit_cmd causes the TPM to abort/fail
> > > command execution then it *MUST* be prevented. Period.
> >
> > Or, we can call device_del after tpm2_shutdown.
> 
> What? You haven't even established root cause, how do you know this bug
> won't manifest in other cases? It could very well be some kind of generic race-
> bug triggered by the proximity of device_del and pm_runtime_get_sync. Or a
> HW bug of some kind..
> 
> > I will send you the actual trace, anyhow I've respin the original
> > version with go_idle and cmd_ready handlers, this is contra
> > productive, the time is just not right.
> 
> I'm deeply skeptical about all your patches if you can't root-cause identify why
> the existing implementation isn't working.
> 
> But, you can sort out with Jarkko what to do with crb.
> 
> As long at the rest of the drivers and the core subsystem are not broken by the
> device_del change. Jarkko - can you confirm you will drop that patch?

So  here I'm to say I'm sorry for misleading this, after all the doubts I got back to debugging and traces. 
One thing for a reason moving the device_del, had really made the problem go away,  but the real  problem was unbalance runtime_pm PUT/GET from the tpm_crb probe function. 
I will post the fixed patch, of course, this one should be dropped. 
In any case,  and this is not just to keep my ego up, that calling to the tom stack with unutilized dev is not healthy and we should look for that. 
Thanks
Tomas


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jason Gunthorpe Oct. 7, 2016, 7:17 p.m. UTC | #32
On Fri, Oct 07, 2016 at 02:24:59PM +0000, Winkler, Tomas wrote:

> So here I'm to say I'm sorry for misleading this, after all the
> doubts I got back to debugging and traces.  One thing for a reason
> moving the device_del, had really made the problem go away, but the
> real problem was unbalance runtime_pm PUT/GET from the tpm_crb probe
> function.

Oh this is very good news, I'm glad this was resolved in crb!

Presumably the unbalanced put made the ref count go negative and the
balanced get caused it to go to zero, so pm locking was basically totally
broken? That would explain how an idle callback could run
concurrently with transmit_cmd.

Though a bit of a mystery why device_del had any impact? I'm still
very unclear exactly how the child device effects the parent - and
that seems like pretty important information going forward..

Thanks,
Jason

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Winkler, Tomas Oct. 7, 2016, 8:10 p.m. UTC | #33
> Subject: Re: [PATCH] tpm: don't destroy chip device prematurely
> 
> On Fri, Oct 07, 2016 at 02:24:59PM +0000, Winkler, Tomas wrote:
> 
> > So here I'm to say I'm sorry for misleading this, after all the doubts
> > I got back to debugging and traces.  One thing for a reason moving the
> > device_del, had really made the problem go away, but the real problem
> > was unbalance runtime_pm PUT/GET from the tpm_crb probe function.
> 
> Oh this is very good news, I'm glad this was resolved in crb!
> 
> Presumably the unbalanced put made the ref count go negative and the
> balanced get caused it to go to zero, so pm locking was basically totally
> broken? That would explain how an idle callback could run concurrently with
> transmit_cmd.

This is not due to locking and refcount, but similar. The usage_count went negative  and the idle callback kicked in from the pm work queue, and suspended the device.

> 
> Though a bit of a mystery why device_del had any impact? I'm still very
> unclear exactly how the child device effects the parent - and that seems like
> pretty important information going forward..

Yes, there is some dependency as if device_del is not called the idle callback doesn't kick in between send and receive and that was misleading.  I'm not sure but this could be due to scheduling of the pm worker, but I'm not sure.   In any case we hit the issue even w/o device_del if the device is exercise enough.  I will dig into that later.

Thanks
Tomas




------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
Jarkko Sakkinen Oct. 8, 2016, 10:47 a.m. UTC | #34
On Fri, Oct 07, 2016 at 02:24:59PM +0000, Winkler, Tomas wrote:
> So  here I'm to say I'm sorry for misleading this, after all the
> doubts I got back to debugging and traces.  One thing for a reason
> moving the device_del, had really made the problem go away,  but the
> real  problem was unbalance runtime_pm PUT/GET from the tpm_crb probe
> function.  I will post the fixed patch, of course, this one should be
> dropped.  In any case,  and this is not just to keep my ego up, that
> calling to the tom stack with unutilized dev is not healthy and we
> should look for that. 

Great. I'll test the fix once it's available.

/Jarkko

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
diff mbox

Patch

diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index e5950131bd90..b1cb0aae8e66 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -261,7 +261,6 @@  static int tpm_add_char_device(struct tpm_chip *chip)
 static void tpm_del_char_device(struct tpm_chip *chip)
 {
 	cdev_del(&chip->cdev);
-	device_del(&chip->dev);
 
 	/* Make the chip unavailable. */
 	mutex_lock(&idr_lock);
@@ -274,6 +273,8 @@  static void tpm_del_char_device(struct tpm_chip *chip)
 		tpm2_shutdown(chip, TPM2_SU_CLEAR);
 	chip->ops = NULL;
 	up_write(&chip->ops_sem);
+
+	device_del(&chip->dev);
 }
 
 static int tpm1_chip_register(struct tpm_chip *chip)