Message ID | 20230723-thermal-fix-of-memory-corruption-v1-1-ed4fa16d199d@kernel.org (mailing list archive) |
---|---|
State | Superseded, archived |
Series | thermal/of: Fix double free of params during unregistration |
Hi Mark,

On 23/07/2023 01:26, Mark Brown wrote:
> Unlike the other data structures provided during registration the
> thermal core takes a copy of the thermal_zone_params provided to it and
> stores that copy in the thermal_zone_device, taking care to free it on
> unregistration. This is done because the parameters will be modified at
> runtime.
>
> Unfortunately the thermal_of code assumes that the params structure it
> provides will be used throughout the lifetime of the device and since
> the params are dynamically allocated based on the bindings it attempts
> to free it on unregistration. This results in not only leaking the
> original params but also double freeing the copy the core made, leading
> to memory corruption.
>
> Fix this by instead freeing the params parsed from the DT during
> registration.
>
> This issue causes instability on systems where thermal zones are
> unregistered, especially visible on those systems where some zones
> provided by a device have no trip points such as Allwinner systems.
> For example with current mainline an arm64 defconfig is unbootable on
> Pine64 Plus and LibreTech Tritium is massively unstable. These issues
> have been there for a while and have been made more prominent by recent
> memory management changes.
>
> Fixes: 3fd6d6e2b4e80 ("thermal/of: Rework the thermal device tree initialization")
> Signed-off-by: Mark Brown <broonie@kernel.org>
> Cc: stable@vger.kernel.org

I think this issue has been fixed by:

https://lore.kernel.org/all/20230708112720.2897484-2-a.fatoum@pengutronix.de/

Rafael? Did you pick it up?

On Sun, Jul 23, 2023 at 11:57:52AM +0200, Daniel Lezcano wrote:
> On 23/07/2023 01:26, Mark Brown wrote:

> I think this issue has been fixed by:

> https://lore.kernel.org/all/20230708112720.2897484-2-a.fatoum@pengutronix.de/

Yes, that should fix the same issue.

> Rafael? Did you pick it up?

There was a message on the thread saying the patches have been applied
for v6.5 but I can't see them in either mainline or -next.

On Sun, Jul 23, 2023 at 4:32 PM Mark Brown <broonie@kernel.org> wrote:
>
> On Sun, Jul 23, 2023 at 11:57:52AM +0200, Daniel Lezcano wrote:
> > On 23/07/2023 01:26, Mark Brown wrote:
>
> > I think this issue has been fixed by:
>
> > https://lore.kernel.org/all/20230708112720.2897484-2-a.fatoum@pengutronix.de/
>
> Yes, that should fix the same issue.
>
> > Rafael? Did you pick it up?
>
> There was a message on the thread saying the patches have been applied
> for v6.5 but I can't see them in either mainline or -next.

They should be there in linux-next (as of today).

Surely, they are present in my linux-next branch.

On Wed, Jul 26, 2023 at 08:42:39PM +0200, Rafael J. Wysocki wrote:
> On Sun, Jul 23, 2023 at 4:32 PM Mark Brown <broonie@kernel.org> wrote:

> > There was a message on the thread saying the patches have been applied
> > for v6.5 but I can't see them in either mainline or -next.

> They should be there in linux-next (as of today).

Yes, they're there now. They weren't at time of writing the above (on
Sunday).

> Surely, they are present in my linux-next branch.

Are they queued as fixes? It'd be really good to get these into v6.5,
they're rendering the Allwinner platforms I have unusable.

On Wed, Jul 26, 2023 at 8:47 PM Mark Brown <broonie@kernel.org> wrote:
>
> On Wed, Jul 26, 2023 at 08:42:39PM +0200, Rafael J. Wysocki wrote:
> > On Sun, Jul 23, 2023 at 4:32 PM Mark Brown <broonie@kernel.org> wrote:
>
> > > There was a message on the thread saying the patches have been applied
> > > for v6.5 but I can't see them in either mainline or -next.
>
> > They should be there in linux-next (as of today).
>
> Yes, they're there now. They weren't at time of writing the above (on
> Sunday).
>
> > Surely, they are present in my linux-next branch.
>
> Are they queued as fixes?

They are.

> It'd be really good to get these into v6.5,
> they're rendering the Allwinner platforms I have unusable.

I'm going to send a pull request with them tomorrow or on Friday.

On Wed, Jul 26, 2023 at 08:51:20PM +0200, Rafael J. Wysocki wrote:
> On Wed, Jul 26, 2023 at 8:47 PM Mark Brown <broonie@kernel.org> wrote:

> > > Surely, they are present in my linux-next branch.

> > Are they queued as fixes?

> They are.

> > It'd be really good to get these into v6.5,
> > they're rendering the Allwinner platforms I have unusable.

> I'm going to send a pull request with them tomorrow or on Friday.

Ah, excellent - thanks!

diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index 6fb14e521197..0af11cdfa2c1 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -442,13 +442,11 @@ static int thermal_of_unbind(struct thermal_zone_device *tz,
 static void thermal_of_zone_unregister(struct thermal_zone_device *tz)
 {
 	struct thermal_trip *trips = tz->trips;
-	struct thermal_zone_params *tzp = tz->tzp;
 	struct thermal_zone_device_ops *ops = tz->ops;
 
 	thermal_zone_device_disable(tz);
 	thermal_zone_device_unregister(tz);
 	kfree(trips);
-	kfree(tzp);
 	kfree(ops);
 }
 
@@ -530,6 +528,9 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
 		goto out_kfree_tzp;
 	}
 
+	/* The core will take a copy of tzp, free our copy here. */
+	kfree(tzp);
+
 	ret = thermal_zone_device_enable(tz);
 	if (ret) {
 		pr_err("Failed to enabled thermal zone '%s', id=%d: %d\n",

Unlike the other data structures provided during registration the
thermal core takes a copy of the thermal_zone_params provided to it and
stores that copy in the thermal_zone_device, taking care to free it on
unregistration. This is done because the parameters will be modified at
runtime.

Unfortunately the thermal_of code assumes that the params structure it
provides will be used throughout the lifetime of the device and since
the params are dynamically allocated based on the bindings it attempts
to free it on unregistration. This results in not only leaking the
original params but also double freeing the copy the core made, leading
to memory corruption.

Fix this by instead freeing the params parsed from the DT during
registration.

This issue causes instability on systems where thermal zones are
unregistered, especially visible on those systems where some zones
provided by a device have no trip points such as Allwinner systems.
For example with current mainline an arm64 defconfig is unbootable on
Pine64 Plus and LibreTech Tritium is massively unstable. These issues
have been there for a while and have been made more prominent by recent
memory management changes.

Fixes: 3fd6d6e2b4e80 ("thermal/of: Rework the thermal device tree initialization")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
---
 drivers/thermal/thermal_of.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

---
base-commit: fdf0eaf11452d72945af31804e2a1048ee1b574c
change-id: 20230722-thermal-fix-of-memory-corruption-73c023f8612b

Best regards,
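
For readers who want to see the ownership rule from the commit message in isolation, here is a minimal userspace sketch. It is not kernel code: `struct zone_params`, `struct zone_device`, `core_register()`, `core_unregister()`, `of_zone_register()` and `of_zone_unregister()` are hypothetical stand-ins, and `malloc()`/`free()` stand in for the kernel allocator. It only assumes what the commit message states, namely that the core copies the params at registration and frees its copy at unregistration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures, not the real definitions. */
struct zone_params {
	int sustainable_power;
};

struct zone_device {
	struct zone_params *tzp;	/* copy owned by the "core" */
};

/* Models core registration: the core duplicates the caller's params. */
static struct zone_device *core_register(const struct zone_params *tzp)
{
	struct zone_device *tz = calloc(1, sizeof(*tz));

	if (!tz)
		return NULL;

	tz->tzp = malloc(sizeof(*tz->tzp));
	if (!tz->tzp) {
		free(tz);
		return NULL;
	}
	memcpy(tz->tzp, tzp, sizeof(*tz->tzp));
	return tz;
}

/* Models core unregistration: the core frees the copy it made. */
static void core_unregister(struct zone_device *tz)
{
	free(tz->tzp);
	free(tz);
}

/* Models the fixed caller: release the original right after registration. */
static struct zone_device *of_zone_register(int sustainable_power)
{
	struct zone_params *tzp = calloc(1, sizeof(*tzp));
	struct zone_device *tz;

	if (!tzp)
		return NULL;
	tzp->sustainable_power = sustainable_power;

	tz = core_register(tzp);
	/* The core holds its own copy now, so the original is freed here. */
	free(tzp);
	return tz;
}

/* Models the fixed caller: tz->tzp is left for the core to free. */
static void of_zone_unregister(struct zone_device *tz)
{
	assert(tz != NULL);
	core_unregister(tz);
}
```

In this model, the pre-patch behaviour would correspond to dropping the `free(tzp)` in `of_zone_register()` and instead calling `free(tz->tzp)` after `core_unregister(tz)`: the original allocation is never released, and the core's copy is freed twice.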