spi: ensure timely release of driver-allocated resources

Message ID	YFf2RD931nq3RudJ@google.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-spi-owner@kernel.org> Date: Sun, 21 Mar 2021 18:43:32 -0700 From: Dmitry Torokhov <dmitry.torokhov@gmail.com> To: Mark Brown <broonie@kernel.org> Cc: linux-spi@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] spi: ensure timely release of driver-allocated resources Message-ID: <YFf2RD931nq3RudJ@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Precedence: bulk
Series	spi: ensure timely release of driver-allocated resources \| expand spi: ensure timely release of driver-allocated resources

Dmitry Torokhov March 22, 2021, 1:43 a.m. UTC

More and more drivers rely on devres to manage their resources, however
if bus' probe() and release() methods are not trivial and control some
of resources as well (for example enable or disable clocks, or attach
device to a power domain), we need to make sure that driver-allocated
resources are released immediately after driver's remove() method
returns, and not postponed until driver core gets around to releasing
resources. To fix that we open a new devres group before calling
driver's probe() and explicitly release it when we return from driver's
remove().

Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
---

Note that this is not SPI-specific issue. I already send a similar
patch for I2C and will be sending more.

 drivers/spi/spi.c       | 25 +++++++++++++++++++++++--
 include/linux/spi/spi.h |  4 ++++
 2 files changed, 27 insertions(+), 2 deletions(-)

Mark Brown March 22, 2021, 12:37 p.m. UTC | #1

On Sun, Mar 21, 2021 at 06:43:32PM -0700, Dmitry Torokhov wrote:

> Note that this is not SPI-specific issue. I already send a similar
> patch for I2C and will be sending more.

This feels like it might make sense to push up to the driver core level
then rather than doing in individual buses?

Dmitry Torokhov March 22, 2021, 7:38 p.m. UTC | #2

On Mon, Mar 22, 2021 at 12:37:07PM +0000, Mark Brown wrote:
> On Sun, Mar 21, 2021 at 06:43:32PM -0700, Dmitry Torokhov wrote:
> 
> > Note that this is not SPI-specific issue. I already send a similar
> > patch for I2C and will be sending more.
> 
> This feels like it might make sense to push up to the driver core level
> then rather than doing in individual buses?

That is exactly the issue: we can't. Driver core already releases all
resources when a device is being unbound but that happens after bus
"remove" code is executed and therefore is too late. The device might
already be powered down, but various devm release() callbacks will be
trying to access it.

devm only works when you do not mix manual resources with managed ones,
and when bus code allocates resources themselves (attaching a device to
a power domain can be viewed as resource acquisition) we violate this
principle. We could, of course, to make SPI bus' probe() use
devm_add_action_or_reset() to work in removal of the device from the
power domain into the stream of devm resources, but that still requires
changes at bus code, and I believe will complicate matters if we need to
extend SPI bus code to allocate more resources in probe(). So I opted
for opening a devm group to separate resources allocated before and
after probe() to be able to release them in the right order.

Thanks.

Mark Brown March 23, 2021, 5:36 p.m. UTC | #3

On Mon, Mar 22, 2021 at 12:38:15PM -0700, Dmitry Torokhov wrote:
> On Mon, Mar 22, 2021 at 12:37:07PM +0000, Mark Brown wrote:

> > This feels like it might make sense to push up to the driver core level
> > then rather than doing in individual buses?

> That is exactly the issue: we can't. Driver core already releases all
> resources when a device is being unbound but that happens after bus
> "remove" code is executed and therefore is too late. The device might
> already be powered down, but various devm release() callbacks will be
> trying to access it.

Can you provide a concrete example of something that is causing problems
here?  If something is trying to access the device after remove() has
run that sounds like it's abusing devres somehow.  It sounded from your
commit log like this was something to do with the amount of time it took
the driver core to action the frees rather than an ordering issue.

> devm only works when you do not mix manual resources with managed ones,
> and when bus code allocates resources themselves (attaching a device to
> a power domain can be viewed as resource acquisition) we violate this
> principle. We could, of course, to make SPI bus' probe() use
> devm_add_action_or_reset() to work in removal of the device from the
> power domain into the stream of devm resources, but that still requires
> changes at bus code, and I believe will complicate matters if we need to
> extend SPI bus code to allocate more resources in probe(). So I opted
> for opening a devm group to separate resources allocated before and
> after probe() to be able to release them in the right order.

Sure, these are standard issues that people create with excessive use of
devm but the device's remove() callback is surely already a concern by
itself here?

Dmitry Torokhov March 23, 2021, 7:04 p.m. UTC | #4

On Tue, Mar 23, 2021 at 05:36:06PM +0000, Mark Brown wrote:
> On Mon, Mar 22, 2021 at 12:38:15PM -0700, Dmitry Torokhov wrote:
> > On Mon, Mar 22, 2021 at 12:37:07PM +0000, Mark Brown wrote:
> 
> > > This feels like it might make sense to push up to the driver core level
> > > then rather than doing in individual buses?
> 
> > That is exactly the issue: we can't. Driver core already releases all
> > resources when a device is being unbound but that happens after bus
> > "remove" code is executed and therefore is too late. The device might
> > already be powered down, but various devm release() callbacks will be
> > trying to access it.
> 
> Can you provide a concrete example of something that is causing problems
> here?  If something is trying to access the device after remove() has
> run that sounds like it's abusing devres somehow.  It sounded from your
> commit log like this was something to do with the amount of time it took
> the driver core to action the frees rather than an ordering issue.

No it is ordering issue. I do not have a proven real-life example for
SPI, but we do have one for I2C:

https://lore.kernel.org/linux-devicetree/20210305041236.3489-7-jeff@labundy.com/

However, if we consider fairly typical SPI driver, such as
drivers/input/touchscreen/ad7877.c, you can see that it uses devm in its
probe() and because all resources are managed, it does not define
remove() at all.

So during proble we have:

<driver core allocations>
SPI: dev_pm_domain_attach
AD7877: devm_kzalloc driver structure
AD7877: devm allocation of input device
AD7877: devm custom action to disable the chip on removal
AD7877: devm IRQ request
AD7877: devm sysfs attribute group
AD7877: devm input registration
<additional devm driver core allocations?>

And on remove:

SPI: dev_pm_domain_detach !!!!!!
<deallocate additional devm driver core allocations?>
AD7877: devm input unregistration
AD7877: devm sysfs attribute group removal
AD7877: devm freeing IRQ
AD7877: devm disable the chip
AD7877: devm freeing of input device
AD7877: devm free driver structure
<deallocate driver core allocations>

Note how dev_pm_domain_detach() jumped ahead of everything, and
strictly speaking past this point we can no longer guarantee that we can
access the chip and disable it.

> 
> > devm only works when you do not mix manual resources with managed ones,
> > and when bus code allocates resources themselves (attaching a device to
> > a power domain can be viewed as resource acquisition) we violate this
> > principle. We could, of course, to make SPI bus' probe() use
> > devm_add_action_or_reset() to work in removal of the device from the
> > power domain into the stream of devm resources, but that still requires
> > changes at bus code, and I believe will complicate matters if we need to
> > extend SPI bus code to allocate more resources in probe(). So I opted
> > for opening a devm group to separate resources allocated before and
> > after probe() to be able to release them in the right order.
> 
> Sure, these are standard issues that people create with excessive use of

devm is a fact of life and we need to live with it. I am unconvinced if
it solved more issues that it brought in, but it is something that
driver authors like to use and are pushed towards.

> devm but the device's remove() callback is surely already a concern by
> itself here?

In the example above there is not one, but even if it exists, it is
called first, so in some limited cases you could have non-managed
resources allocated very last and released first in remove(), and then
have devm release the rest. However driver's remove() is not issue here,
it is bus' non-trivial remove.

Thanks.

Mark Brown March 24, 2021, 9:32 p.m. UTC | #5

On Tue, Mar 23, 2021 at 12:04:34PM -0700, Dmitry Torokhov wrote:
> On Tue, Mar 23, 2021 at 05:36:06PM +0000, Mark Brown wrote:

> No it is ordering issue. I do not have a proven real-life example for
> SPI, but we do have one for I2C:

> https://lore.kernel.org/linux-devicetree/20210305041236.3489-7-jeff@labundy.com/

TBH that looks like a fairly standard case where you probably don't want
to be using devm for the interrupts in the first place.  Leaving the
interrupts live after the bus thinks it freed the device doesn't seem
like the best idea, I'm not sure I'd expect that to work reliably when
the device tries to call into the bus code to interact with the device
that the bus thought was already freed anyway.

If we want this to work reliably it really feels like we should have two
remove callbacks in the driver core doing this rather than open coding
in every single bus which is what we'd need to do - this is going to
affect any bus that does anything other than just call the device's
remove() callback.  PCI looks like it might have issues too for example,
and platform does as well and those were simply the first two buses I
looked at.  Possibly we want a driver core callback which is scheduled
via devm (remove_devm(), cleanup() or something).  We'd still need to
move things about in all the buses but it seems preferable to do it that
way rather than open coding opening a group and the comments about
what's going on and the ordering requirements everywhere, it's a little
less error prone going forward.

> Note how dev_pm_domain_detach() jumped ahead of everything, and
> strictly speaking past this point we can no longer guarantee that we can
> access the chip and disable it.

Frankly it looks like the PM domain stuff shouldn't be in the probe()
and remove() paths at all and this has been bogusly copies from other
buses, it should be in the device registration paths.  The device is in
the domain no matter what's going on with binding it.  Given how generic
code is I'm not even sure why it's in the buses.

Dmitry Torokhov March 24, 2021, 10:27 p.m. UTC | #6

On Wed, Mar 24, 2021 at 09:32:26PM +0000, Mark Brown wrote:
> On Tue, Mar 23, 2021 at 12:04:34PM -0700, Dmitry Torokhov wrote:
> > On Tue, Mar 23, 2021 at 05:36:06PM +0000, Mark Brown wrote:
> 
> > No it is ordering issue. I do not have a proven real-life example for
> > SPI, but we do have one for I2C:
> 
> > https://lore.kernel.org/linux-devicetree/20210305041236.3489-7-jeff@labundy.com/
> 
> TBH that looks like a fairly standard case where you probably don't want
> to be using devm for the interrupts in the first place.  Leaving the
> interrupts live after the bus thinks it freed the device doesn't seem
> like the best idea, I'm not sure I'd expect that to work reliably when
> the device tries to call into the bus code to interact with the device
> that the bus thought was already freed anyway.

That is not an argument really. By that token we should not be using
devm for anything but memory, and that is only until we implement some
kind of memleak checker that will ensure that driver-allocated memory is
released after driver's remove() method exits.

If we have devm API we need to make sure it works.

You also misread that the issue is limited to interrupts, when i fact
in this particular driver it is the input device that is being
unregistered too late and fails to timely quiesce the device. Resulting
interrupt storm is merely a side effect of this.

> 
> If we want this to work reliably it really feels like we should have two
> remove callbacks in the driver core doing this rather than open coding
> in every single bus which is what we'd need to do - this is going to
> affect any bus that does anything other than just call the device's
> remove() callback.  PCI looks like it might have issues too for example,
> and platform does as well and those were simply the first two buses I
> looked at.  Possibly we want a driver core callback which is scheduled
> via devm (remove_devm(), cleanup() or something).  We'd still need to
> move things about in all the buses but it seems preferable to do it that
> way rather than open coding opening a group and the comments about
> what's going on and the ordering requirements everywhere, it's a little
> less error prone going forward.

From the driver's perspective they expect devm-allocated resources to
happen immediately after ->remove() method is run. I do not believe any
driver is interested in additional callback, and you'd need to modify
a boatload of drivers to fix this issue.

I agree with you that manual group handling might be a bit confusing
and sprinkling the same comment everywhere is not too nice, so how about
we provide:

	void *devm_mark_driver_resources(struct device *dev)

and

	void devm_release_driver_resources()

and keep the commentary there? The question is where to keep
driver_res_id field - in generic device structure or put it into bus'
device structure as some buses and classes do not need it and we'd be
sawing 4-8 bytes per device structure this way.

Another way is to force buses to use devm for their resource management
(I.e. in case of SPI wrap dev_pm_domain_detach() in
devm_add_action_or_release()). It works for buses that have small number
of resource allocated, but gets more unwieldy and more expensive the
more resources are managed at bus level, that is why I opted for opening
a group.

> 
> > Note how dev_pm_domain_detach() jumped ahead of everything, and
> > strictly speaking past this point we can no longer guarantee that we can
> > access the chip and disable it.
> 
> Frankly it looks like the PM domain stuff shouldn't be in the probe()
> and remove() paths at all and this has been bogusly copies from other
> buses, it should be in the device registration paths.  The device is in
> the domain no matter what's going on with binding it.  Given how generic
> code is I'm not even sure why it's in the buses.

Here I will agree with you, bit I think it comes from power domain
"duality". In OF power domain represents grouping of devices, and is
static as devices do not move around, whereas in ACPI domain means
control, and we are putting a device under control of ACPI PM when we
bind it to a driver. As part of that control we bring the device into
_D0, etc.

Yay for mixing concepts, but this is not really material to the question
of how to orderly release resources.

Thanks.

Mark Brown March 24, 2021, 11:39 p.m. UTC | #7

On Wed, Mar 24, 2021 at 03:27:23PM -0700, Dmitry Torokhov wrote:
> On Wed, Mar 24, 2021 at 09:32:26PM +0000, Mark Brown wrote:

> > TBH that looks like a fairly standard case where you probably don't want
> > to be using devm for the interrupts in the first place.  Leaving the
> > interrupts live after the bus thinks it freed the device doesn't seem
> > like the best idea, I'm not sure I'd expect that to work reliably when
> > the device tries to call into the bus code to interact with the device
> > that the bus thought was already freed anyway.
> 
> That is not an argument really. By that token we should not be using
> devm for anything but memory, and that is only until we implement some
> kind of memleak checker that will ensure that driver-allocated memory is
> released after driver's remove() method exits.

I pretty much agree with that perspective TBH, I rarely see interrupt
users that I conside safe.  It's the things that actively do stuff that
cause issues, interrupts and registration of userspace interfaces both
being particularly likely to do so as work comes in asynchronously.

> You also misread that the issue is limited to interrupts, when i fact
> in this particular driver it is the input device that is being
> unregistered too late and fails to timely quiesce the device. Resulting
> interrupt storm is merely a side effect of this.

My understanding was that the issue is that the driver is generating
work because the interrupt is firing, if the interrupt had been
unregistered we'd not be getting the interupts delivered (probably
they'd have been disabled at the interrupt controller level) and not
have the problem of trying to handle them on a partially unwound device.

> > looked at.  Possibly we want a driver core callback which is scheduled
> > via devm (remove_devm(), cleanup() or something).  We'd still need to
> > move things about in all the buses but it seems preferable to do it that
> > way rather than open coding opening a group and the comments about
> > what's going on and the ordering requirements everywhere, it's a little
> > less error prone going forward.

> From the driver's perspective they expect devm-allocated resources to
> happen immediately after ->remove() method is run. I do not believe any
> driver is interested in additional callback, and you'd need to modify
> a boatload of drivers to fix this issue.

Not a callback for the device drivers, for the buses.  This is
essentially about converting the unwinding the bus does to be sequenced
for devm.

> I agree with you that manual group handling might be a bit confusing
> and sprinkling the same comment everywhere is not too nice, so how about
> we provide:

> 	void *devm_mark_driver_resources(struct device *dev)

> and

> 	void devm_release_driver_resources()

> and keep the commentary there? The question is where to keep
> driver_res_id field - in generic device structure or put it into bus'
> device structure as some buses and classes do not need it and we'd be
> sawing 4-8 bytes per device structure this way.

I guess bus' device :/

> Another way is to force buses to use devm for their resource management
> (I.e. in case of SPI wrap dev_pm_domain_detach() in
> devm_add_action_or_release()). It works for buses that have small number
> of resource allocated, but gets more unwieldy and more expensive the
> more resources are managed at bus level, that is why I opted for opening
> a group.

If the driver core were doing it and scheduling a single callback the
bus provides then that callback could do as much as it likes...

Dmitry Torokhov March 25, 2021, 12:17 a.m. UTC | #8

On Wed, Mar 24, 2021 at 11:39:35PM +0000, Mark Brown wrote:
> On Wed, Mar 24, 2021 at 03:27:23PM -0700, Dmitry Torokhov wrote:
> > On Wed, Mar 24, 2021 at 09:32:26PM +0000, Mark Brown wrote:
> 
> > > TBH that looks like a fairly standard case where you probably don't want
> > > to be using devm for the interrupts in the first place.  Leaving the
> > > interrupts live after the bus thinks it freed the device doesn't seem
> > > like the best idea, I'm not sure I'd expect that to work reliably when
> > > the device tries to call into the bus code to interact with the device
> > > that the bus thought was already freed anyway.
> > 
> > That is not an argument really. By that token we should not be using
> > devm for anything but memory, and that is only until we implement some
> > kind of memleak checker that will ensure that driver-allocated memory is
> > released after driver's remove() method exits.
> 
> I pretty much agree with that perspective TBH, I rarely see interrupt
> users that I conside safe.  It's the things that actively do stuff that
> cause issues, interrupts and registration of userspace interfaces both
> being particularly likely to do so as work comes in asynchronously.
> 
> > You also misread that the issue is limited to interrupts, when i fact
> > in this particular driver it is the input device that is being
> > unregistered too late and fails to timely quiesce the device. Resulting
> > interrupt storm is merely a side effect of this.
> 
> My understanding was that the issue is that the driver is generating
> work because the interrupt is firing, if the interrupt had been
> unregistered we'd not be getting the interupts delivered (probably
> they'd have been disabled at the interrupt controller level) and not
> have the problem of trying to handle them on a partially unwound device.

Yes, but the root of the problem is that we rely that the device will be
stopped when input core "closes" it upon device unregistering, so even
if we did not have this particular issue with interrupt storm we would
still be at risk of accessing half-dead device. What I am trying to say
(and I believe we are actually agreeing with each other) is that not
only interrupts, but other devm-allocated objects are also can cause
problems here.

> 
> > > looked at.  Possibly we want a driver core callback which is scheduled
> > > via devm (remove_devm(), cleanup() or something).  We'd still need to
> > > move things about in all the buses but it seems preferable to do it that
> > > way rather than open coding opening a group and the comments about
> > > what's going on and the ordering requirements everywhere, it's a little
> > > less error prone going forward.
> 
> > From the driver's perspective they expect devm-allocated resources to
> > happen immediately after ->remove() method is run. I do not believe any
> > driver is interested in additional callback, and you'd need to modify
> > a boatload of drivers to fix this issue.
> 
> Not a callback for the device drivers, for the buses.  This is
> essentially about converting the unwinding the bus does to be sequenced
> for devm.

Ah, OK.

> 
> > I agree with you that manual group handling might be a bit confusing
> > and sprinkling the same comment everywhere is not too nice, so how about
> > we provide:
> 
> > 	void *devm_mark_driver_resources(struct device *dev)
> 
> > and
> 
> > 	void devm_release_driver_resources()
> 
> > and keep the commentary there? The question is where to keep
> > driver_res_id field - in generic device structure or put it into bus'
> > device structure as some buses and classes do not need it and we'd be
> > sawing 4-8 bytes per device structure this way.
> 
> I guess bus' device :/
> 
> > Another way is to force buses to use devm for their resource management
> > (I.e. in case of SPI wrap dev_pm_domain_detach() in
> > devm_add_action_or_release()). It works for buses that have small number
> > of resource allocated, but gets more unwieldy and more expensive the
> > more resources are managed at bus level, that is why I opted for opening
> > a group.
> 
> If the driver core were doing it and scheduling a single callback the
> bus provides then that callback could do as much as it likes...

OK, I can look into it. In the meantime wills something like below a bit
easier for you to stomach? If so I'll resubmit formally.

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index c19a09201358..a0db83bcb3d4 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -375,6 +375,11 @@ static int spi_uevent(struct device *dev, struct kobj_uevent_env *env)
 	return add_uevent_var(env, "MODALIAS=%s%s", SPI_MODULE_PREFIX, spi->modalias);
 }
 
+static void spi_dev_pm_domain_detach(void *_dev)
+{
+	dev_pm_domain_detach(_dev, true);
+}
+
 static int spi_probe(struct device *dev)
 {
 	const struct spi_driver		*sdrv = to_spi_driver(dev->driver);
@@ -421,11 +426,12 @@ static int spi_probe(struct device *dev)
 	if (ret)
 		return ret;
 
-	if (sdrv->probe) {
+	ret = devm_add_action_or_reset(dev, spi_dev_pm_domain_detach, dev);
+	if (ret)
+		return ret;
+
+	if (sdrv->probe)
 		ret = sdrv->probe(spi);
-		if (ret)
-			dev_pm_domain_detach(dev, true);
-	}
 
 	return ret;
 }
@@ -444,8 +450,6 @@ static int spi_remove(struct device *dev)
 				 ERR_PTR(ret));
 	}
 
-	dev_pm_domain_detach(dev, true);
-
 	return 0;
 }

Mark Brown March 30, 2021, 5:19 p.m. UTC | #9

On Wed, Mar 24, 2021 at 05:17:52PM -0700, Dmitry Torokhov wrote:

> OK, I can look into it. In the meantime wills something like below a bit
> easier for you to stomach? If so I'll resubmit formally.

Well, it's a bit nicer for a bus level bodge so I guess, but really I'm
more thinking we need something in the driver core for this.

spi: ensure timely release of driver-allocated resources

Commit Message

Comments

Patch