Message ID | 20230113180235.1604526-1-daniel.lezcano@linaro.org (mailing list archive) |
---|---|
Headers | show |
Series | Thermal ACPI APIs for generic trip points | expand |
Hi, On 13/01/2023 19:02, Daniel Lezcano wrote: > Recently sent as a RFC, the thermal ACPI for generic trip points is a set of > functions to fill the generic trip points structure which will become the > standard structure for the thermal framework and its users. > > Different Intel drivers and the ACPI thermal driver are using the ACPI tables to > get the thermal zone information. As those are getting the same information, > providing this set of ACPI function with the generic trip points will > consolidate the code. > > Also, the Intel PCH and the Intel 34xx drivers are converted to use the generic > trip points relying on the ACPI generic trip point parsing functions. > > These changes have been tested on a Thinkpad Lenovo x280 with the PCH and > INT34xx drivers. No regression have been observed, the trip points remain the > same for what is described on this system. Are we ok with this series ? Sorry for insisting but I would like to go forward with the generic thermal trip work. There are more patches pending depending on this series. Thanks -- Daniel > Changelog: > - V5: > - Fixed GTSH unit conversion, deciK -> milli C > > - V4: > - Fixed Kconfig option dependency, select THERMAL_ACPI if ACPI is set > only for the PCH driver > > - V3: > - Took into account Rafael's comments > - Used a silence option THERMAL_ACPI in order to stay consistent > with THERMAL_OF. It is up to the API user to select the option. > > - V2: > - Fix the thermal ACPI patch where the thermal_acpi.c was not included in > the series > - Provide a couple of users of this API which could have been tested on a > real system > > Daniel Lezcano (3): > thermal/acpi: Add ACPI trip point routines > thermal/drivers/intel: Use generic trip points for intel_pch > thermal/drivers/intel: Use generic trip points int340x > > drivers/thermal/Kconfig | 4 + > drivers/thermal/Makefile | 1 + > drivers/thermal/intel/Kconfig | 1 + > drivers/thermal/intel/int340x_thermal/Kconfig | 1 + > .../int340x_thermal/int340x_thermal_zone.c | 177 ++++----------- > .../int340x_thermal/int340x_thermal_zone.h | 10 +- > drivers/thermal/intel/intel_pch_thermal.c | 88 ++------ > drivers/thermal/thermal_acpi.c | 210 ++++++++++++++++++ > include/linux/thermal.h | 8 + > 9 files changed, 286 insertions(+), 214 deletions(-) > create mode 100644 drivers/thermal/thermal_acpi.c >
On Wed, 2023-01-18 at 10:53 +0100, Daniel Lezcano wrote: > Hi, > > On 13/01/2023 19:02, Daniel Lezcano wrote: > > Recently sent as a RFC, the thermal ACPI for generic trip points is > > a set of > > functions to fill the generic trip points structure which will > > become the > > standard structure for the thermal framework and its users. > > > > Different Intel drivers and the ACPI thermal driver are using the > > ACPI tables to > > get the thermal zone information. As those are getting the same > > information, > > providing this set of ACPI function with the generic trip points > > will > > consolidate the code. > > > > Also, the Intel PCH and the Intel 34xx drivers are converted to use > > the generic > > trip points relying on the ACPI generic trip point parsing > > functions. > > > > These changes have been tested on a Thinkpad Lenovo x280 with the > > PCH and > > INT34xx drivers. No regression have been observed, the trip points > > remain the > > same for what is described on this system. > > Are we ok with this series ? > > Sorry for insisting but I would like to go forward with the generic > thermal trip work. There are more patches pending depending on this > series. The whole series looks good to me. Reviwed-by: Zhang Rui <rui.zhang@intel.com> But we'd better wait for the thermald test result from Srinvias. thanks, rui > > Thanks > -- Daniel > > > Changelog: > > - V5: > > - Fixed GTSH unit conversion, deciK -> milli C > > > > - V4: > > - Fixed Kconfig option dependency, select THERMAL_ACPI if ACPI > > is set > > only for the PCH driver > > > > - V3: > > - Took into account Rafael's comments > > - Used a silence option THERMAL_ACPI in order to stay > > consistent > > with THERMAL_OF. It is up to the API user to select the > > option. > > > > - V2: > > - Fix the thermal ACPI patch where the thermal_acpi.c was not > > included in > > the series > > - Provide a couple of users of this API which could have been > > tested on a > > real system > > > > Daniel Lezcano (3): > > thermal/acpi: Add ACPI trip point routines > > thermal/drivers/intel: Use generic trip points for intel_pch > > thermal/drivers/intel: Use generic trip points int340x > > > > drivers/thermal/Kconfig | 4 + > > drivers/thermal/Makefile | 1 + > > drivers/thermal/intel/Kconfig | 1 + > > drivers/thermal/intel/int340x_thermal/Kconfig | 1 + > > .../int340x_thermal/int340x_thermal_zone.c | 177 ++++--------- > > -- > > .../int340x_thermal/int340x_thermal_zone.h | 10 +- > > drivers/thermal/intel/intel_pch_thermal.c | 88 ++------ > > drivers/thermal/thermal_acpi.c | 210 > > ++++++++++++++++++ > > include/linux/thermal.h | 8 + > > 9 files changed, 286 insertions(+), 214 deletions(-) > > create mode 100644 drivers/thermal/thermal_acpi.c > >
On 18/01/2023 14:48, Zhang, Rui wrote: > On Wed, 2023-01-18 at 10:53 +0100, Daniel Lezcano wrote: >> Hi, >> >> On 13/01/2023 19:02, Daniel Lezcano wrote: >>> Recently sent as a RFC, the thermal ACPI for generic trip points is >>> a set of >>> functions to fill the generic trip points structure which will >>> become the >>> standard structure for the thermal framework and its users. >>> >>> Different Intel drivers and the ACPI thermal driver are using the >>> ACPI tables to >>> get the thermal zone information. As those are getting the same >>> information, >>> providing this set of ACPI function with the generic trip points >>> will >>> consolidate the code. >>> >>> Also, the Intel PCH and the Intel 34xx drivers are converted to use >>> the generic >>> trip points relying on the ACPI generic trip point parsing >>> functions. >>> >>> These changes have been tested on a Thinkpad Lenovo x280 with the >>> PCH and >>> INT34xx drivers. No regression have been observed, the trip points >>> remain the >>> same for what is described on this system. >> >> Are we ok with this series ? >> >> Sorry for insisting but I would like to go forward with the generic >> thermal trip work. There are more patches pending depending on this >> series. > > The whole series looks good to me. > > Reviwed-by: Zhang Rui <rui.zhang@intel.com> > > But we'd better wait for the thermald test result from Srinvias. Sure, thanks for the review !
On Wed, 2023-01-18 at 13:48 +0000, Zhang, Rui wrote: > On Wed, 2023-01-18 at 10:53 +0100, Daniel Lezcano wrote: > > Hi, > > > > On 13/01/2023 19:02, Daniel Lezcano wrote: > > > Recently sent as a RFC, the thermal ACPI for generic trip points > > > is > > > a set of > > > functions to fill the generic trip points structure which will > > > become the > > > standard structure for the thermal framework and its users. > > > > > > Different Intel drivers and the ACPI thermal driver are using the > > > ACPI tables to > > > get the thermal zone information. As those are getting the same > > > information, > > > providing this set of ACPI function with the generic trip points > > > will > > > consolidate the code. > > > > > > Also, the Intel PCH and the Intel 34xx drivers are converted to > > > use > > > the generic > > > trip points relying on the ACPI generic trip point parsing > > > functions. > > > > > > These changes have been tested on a Thinkpad Lenovo x280 with the > > > PCH and > > > INT34xx drivers. No regression have been observed, the trip > > > points > > > remain the > > > same for what is described on this system. > > > > Are we ok with this series ? > > > > Sorry for insisting but I would like to go forward with the generic > > thermal trip work. There are more patches pending depending on this > > series. > > The whole series looks good to me. > > Reviwed-by: Zhang Rui <rui.zhang@intel.com> > > But we'd better wait for the thermald test result from Srinvias. A quick test show that things still work with thermald and these changes. Thanks, Srinivas > > thanks, > rui > > > > Thanks > > -- Daniel > > > > > Changelog: > > > - V5: > > > - Fixed GTSH unit conversion, deciK -> milli C > > > > > > - V4: > > > - Fixed Kconfig option dependency, select THERMAL_ACPI if > > > ACPI > > > is set > > > only for the PCH driver > > > > > > - V3: > > > - Took into account Rafael's comments > > > - Used a silence option THERMAL_ACPI in order to stay > > > consistent > > > with THERMAL_OF. It is up to the API user to select the > > > option. > > > > > > - V2: > > > - Fix the thermal ACPI patch where the thermal_acpi.c was not > > > included in > > > the series > > > - Provide a couple of users of this API which could have been > > > tested on a > > > real system > > > > > > Daniel Lezcano (3): > > > thermal/acpi: Add ACPI trip point routines > > > thermal/drivers/intel: Use generic trip points for intel_pch > > > thermal/drivers/intel: Use generic trip points int340x > > > > > > drivers/thermal/Kconfig | 4 + > > > drivers/thermal/Makefile | 1 + > > > drivers/thermal/intel/Kconfig | 1 + > > > drivers/thermal/intel/int340x_thermal/Kconfig | 1 + > > > .../int340x_thermal/int340x_thermal_zone.c | 177 ++++------- > > > -- > > > -- > > > .../int340x_thermal/int340x_thermal_zone.h | 10 +- > > > drivers/thermal/intel/intel_pch_thermal.c | 88 ++------ > > > drivers/thermal/thermal_acpi.c | 210 > > > ++++++++++++++++++ > > > include/linux/thermal.h | 8 + > > > 9 files changed, 286 insertions(+), 214 deletions(-) > > > create mode 100644 drivers/thermal/thermal_acpi.c > > >
Hi Srinivas, On 18/01/2023 20:01, srinivas pandruvada wrote: [ ... ] >> The whole series looks good to me. >> >> Reviwed-by: Zhang Rui <rui.zhang@intel.com> >> >> But we'd better wait for the thermald test result from Srinvias. > > A quick test show that things still work with thermald and these > changes. Thanks for testing. Shall I consider as Tested-by or do you want to test more ?
On Wed, 2023-01-18 at 11:01 -0800, srinivas pandruvada wrote: > On Wed, 2023-01-18 at 13:48 +0000, Zhang, Rui wrote: > > On Wed, 2023-01-18 at 10:53 +0100, Daniel Lezcano wrote: > > > Hi, > > > > > > On 13/01/2023 19:02, Daniel Lezcano wrote: > > > > Recently sent as a RFC, the thermal ACPI for generic trip > > > > points > > > > is > > > > a set of > > > > functions to fill the generic trip points structure which will > > > > become the > > > > standard structure for the thermal framework and its users. > > > > > > > > Different Intel drivers and the ACPI thermal driver are using > > > > the > > > > ACPI tables to > > > > get the thermal zone information. As those are getting the same > > > > information, > > > > providing this set of ACPI function with the generic trip > > > > points > > > > will > > > > consolidate the code. > > > > > > > > Also, the Intel PCH and the Intel 34xx drivers are converted to > > > > use > > > > the generic > > > > trip points relying on the ACPI generic trip point parsing > > > > functions. > > > > > > > > These changes have been tested on a Thinkpad Lenovo x280 with > > > > the > > > > PCH and > > > > INT34xx drivers. No regression have been observed, the trip > > > > points > > > > remain the > > > > same for what is described on this system. > > > > > > Are we ok with this series ? > > > > > > Sorry for insisting but I would like to go forward with the > > > generic > > > thermal trip work. There are more patches pending depending on > > > this > > > series. > > > > The whole series looks good to me. > > > > Reviwed-by: Zhang Rui <rui.zhang@intel.com> > > > > But we'd better wait for the thermald test result from Srinvias. > > A quick test show that things still work with thermald and these > changes. But I have a question. In some devices trip point temperature is not static. When hardware changes, we get notification. For example INT3403_PERF_TRIP_POINT_CHANGED for INT3403 drivers. Currently get_trip can get the latest changed value. But if we preregister, we need some mechanism to update them. Thanks, Srinivas > Thanks, > Srinivas > > > > > thanks, > > rui > > > > > > Thanks > > > -- Daniel > > > > > > > Changelog: > > > > - V5: > > > > - Fixed GTSH unit conversion, deciK -> milli C > > > > > > > > - V4: > > > > - Fixed Kconfig option dependency, select THERMAL_ACPI if > > > > ACPI > > > > is set > > > > only for the PCH driver > > > > > > > > - V3: > > > > - Took into account Rafael's comments > > > > - Used a silence option THERMAL_ACPI in order to stay > > > > consistent > > > > with THERMAL_OF. It is up to the API user to select the > > > > option. > > > > > > > > - V2: > > > > - Fix the thermal ACPI patch where the thermal_acpi.c was > > > > not > > > > included in > > > > the series > > > > - Provide a couple of users of this API which could have > > > > been > > > > tested on a > > > > real system > > > > > > > > Daniel Lezcano (3): > > > > thermal/acpi: Add ACPI trip point routines > > > > thermal/drivers/intel: Use generic trip points for intel_pch > > > > thermal/drivers/intel: Use generic trip points int340x > > > > > > > > drivers/thermal/Kconfig | 4 + > > > > drivers/thermal/Makefile | 1 + > > > > drivers/thermal/intel/Kconfig | 1 + > > > > drivers/thermal/intel/int340x_thermal/Kconfig | 1 + > > > > .../int340x_thermal/int340x_thermal_zone.c | 177 ++++----- > > > > -- > > > > -- > > > > -- > > > > .../int340x_thermal/int340x_thermal_zone.h | 10 +- > > > > drivers/thermal/intel/intel_pch_thermal.c | 88 ++------ > > > > drivers/thermal/thermal_acpi.c | 210 > > > > ++++++++++++++++++ > > > > include/linux/thermal.h | 8 + > > > > 9 files changed, 286 insertions(+), 214 deletions(-) > > > > create mode 100644 drivers/thermal/thermal_acpi.c > > > > >
On 18/01/2023 20:16, srinivas pandruvada wrote: [ ... ] >>> But we'd better wait for the thermald test result from Srinvias. >> >> A quick test show that things still work with thermald and these >> changes. > > But I have a question. In some devices trip point temperature is not > static. When hardware changes, we get notification. For example > INT3403_PERF_TRIP_POINT_CHANGED for INT3403 drivers. > Currently get_trip can get the latest changed value. But if we > preregister, we need some mechanism to update them. When the notification INT3403_PERF_TRIP_POINT_CHANGED happens, we call int340x_thermal_read_trips() which in turn updates the trip points.
On Wed, 2023-01-18 at 21:00 +0100, Daniel Lezcano wrote: > On 18/01/2023 20:16, srinivas pandruvada wrote: > > [ ... ] > > > > > But we'd better wait for the thermald test result from > > > > Srinvias. > > > > > > A quick test show that things still work with thermald and these > > > changes. > > > > But I have a question. In some devices trip point temperature is > > not > > static. When hardware changes, we get notification. For example > > INT3403_PERF_TRIP_POINT_CHANGED for INT3403 drivers. > > Currently get_trip can get the latest changed value. But if we > > preregister, we need some mechanism to update them. > > When the notification INT3403_PERF_TRIP_POINT_CHANGED happens, we > call > int340x_thermal_read_trips() which in turn updates the trip points. > Not sure how we handle concurrency here when driver can freely update trips while thermal core is using trips. Thanks, Srinivas > >
On 18/01/2023 21:53, srinivas pandruvada wrote: > On Wed, 2023-01-18 at 21:00 +0100, Daniel Lezcano wrote: >> On 18/01/2023 20:16, srinivas pandruvada wrote: >> >> [ ... ] >> >>>>> But we'd better wait for the thermald test result from >>>>> Srinvias. >>>> >>>> A quick test show that things still work with thermald and these >>>> changes. >>> >>> But I have a question. In some devices trip point temperature is >>> not >>> static. When hardware changes, we get notification. For example >>> INT3403_PERF_TRIP_POINT_CHANGED for INT3403 drivers. >>> Currently get_trip can get the latest changed value. But if we >>> preregister, we need some mechanism to update them. >> >> When the notification INT3403_PERF_TRIP_POINT_CHANGED happens, we >> call >> int340x_thermal_read_trips() which in turn updates the trip points. >> > > Not sure how we handle concurrency here when driver can freely update > trips while thermal core is using trips. Don't we have the same race without this patch ? The thermal core can call get_trip_temp() while there is an update, no ?
On Wed, 2023-01-18 at 22:01 +0100, Daniel Lezcano wrote: > On 18/01/2023 21:53, srinivas pandruvada wrote: > > On Wed, 2023-01-18 at 21:00 +0100, Daniel Lezcano wrote: > > > On 18/01/2023 20:16, srinivas pandruvada wrote: > > > > > > [ ... ] > > > > > > > > > But we'd better wait for the thermald test result from > > > > > > Srinvias. > > > > > > > > > > A quick test show that things still work with thermald and > > > > > these > > > > > changes. > > > > > > > > But I have a question. In some devices trip point temperature > > > > is > > > > not > > > > static. When hardware changes, we get notification. For example > > > > INT3403_PERF_TRIP_POINT_CHANGED for INT3403 drivers. > > > > Currently get_trip can get the latest changed value. But if we > > > > preregister, we need some mechanism to update them. > > > > > > When the notification INT3403_PERF_TRIP_POINT_CHANGED happens, we > > > call > > > int340x_thermal_read_trips() which in turn updates the trip > > > points. > > > > > > > Not sure how we handle concurrency here when driver can freely > > update > > trips while thermal core is using trips. > > Don't we have the same race without this patch ? The thermal core can > call get_trip_temp() while there is an update, no ? Yes it is. But I can add a mutex locally here to solve. But not any longer. I think you need some thermal_zone_read_lock/unlock() in core, which can use rcu. Even mutex is fine as there will be no contention as updates to trips will be rare. Thanks, Srinivas >
On 18/01/2023 22:16, srinivas pandruvada wrote: > On Wed, 2023-01-18 at 22:01 +0100, Daniel Lezcano wrote: >> On 18/01/2023 21:53, srinivas pandruvada wrote: >>> On Wed, 2023-01-18 at 21:00 +0100, Daniel Lezcano wrote: >>>> On 18/01/2023 20:16, srinivas pandruvada wrote: >>>> >>>> [ ... ] >>>> >>>>>>> But we'd better wait for the thermald test result from >>>>>>> Srinvias. >>>>>> >>>>>> A quick test show that things still work with thermald and >>>>>> these >>>>>> changes. >>>>> >>>>> But I have a question. In some devices trip point temperature >>>>> is >>>>> not >>>>> static. When hardware changes, we get notification. For example >>>>> INT3403_PERF_TRIP_POINT_CHANGED for INT3403 drivers. >>>>> Currently get_trip can get the latest changed value. But if we >>>>> preregister, we need some mechanism to update them. >>>> >>>> When the notification INT3403_PERF_TRIP_POINT_CHANGED happens, we >>>> call >>>> int340x_thermal_read_trips() which in turn updates the trip >>>> points. >>>> >>> >>> Not sure how we handle concurrency here when driver can freely >>> update >>> trips while thermal core is using trips. >> >> Don't we have the same race without this patch ? The thermal core can >> call get_trip_temp() while there is an update, no ? > Yes it is. But I can add a mutex locally here to solve. > But not any longer. > > I think you need some thermal_zone_read_lock/unlock() in core, which > can use rcu. Even mutex is fine as there will be no contention as > updates to trips will be rare. I was planning to provide a thermal_trips_update(tz, trips) and from there handle the locking. As the race was already existing, can we postpone this change after the generic trip points changes? There is still a lot of work to do to consolidate the code. One of them is to provide a generic function to browse the trip points and ensure the code is using it instead of directly inspect the thermal zone internals structure. I'm almost there but I need the remaining Intel drivers changes to be merged (as well as ACPI which is finished but depending on this series).
On Wed, 2023-01-18 at 23:14 +0100, Daniel Lezcano wrote: > On 18/01/2023 22:16, srinivas pandruvada wrote: > > On Wed, 2023-01-18 at 22:01 +0100, Daniel Lezcano wrote: > > > On 18/01/2023 21:53, srinivas pandruvada wrote: > > > > On Wed, 2023-01-18 at 21:00 +0100, Daniel Lezcano wrote: > > > > > On 18/01/2023 20:16, srinivas pandruvada wrote: > > > > > > > > > > [ ... ] > > > > > > > > > > > > > But we'd better wait for the thermald test result from > > > > > > > > Srinvias. > > > > > > > > > > > > > > A quick test show that things still work with thermald > > > > > > > and > > > > > > > these > > > > > > > changes. > > > > > > > > > > > > But I have a question. In some devices trip point > > > > > > temperature > > > > > > is > > > > > > not > > > > > > static. When hardware changes, we get notification. For > > > > > > example > > > > > > INT3403_PERF_TRIP_POINT_CHANGED for INT3403 drivers. > > > > > > Currently get_trip can get the latest changed value. But if > > > > > > we > > > > > > preregister, we need some mechanism to update them. > > > > > > > > > > When the notification INT3403_PERF_TRIP_POINT_CHANGED > > > > > happens, we > > > > > call > > > > > int340x_thermal_read_trips() which in turn updates the trip > > > > > points. > > > > > > > > > > > > > Not sure how we handle concurrency here when driver can freely > > > > update > > > > trips while thermal core is using trips. > > > > > > Don't we have the same race without this patch ? The thermal core > > > can > > > call get_trip_temp() while there is an update, no ? > > Yes it is. But I can add a mutex locally here to solve. > > But not any longer. > > > > I think you need some thermal_zone_read_lock/unlock() in core, > > which > > can use rcu. Even mutex is fine as there will be no contention as > > updates to trips will be rare. > > I was planning to provide a thermal_trips_update(tz, trips) and from > there handle the locking. > > As the race was already existing, can we postpone this change after > the > generic trip points changes? I think so. > > There is still a lot of work to do to consolidate the code. One of > them > is to provide a generic function to browse the trip points and ensure > the code is using it instead of directly inspect the thermal zone > internals structure. > > I'm almost there but I need the remaining Intel drivers changes to be > merged (as well as ACPI which is finished but depending on this > series). > Sounds good. You can add my tested by for this. Thanks, Srinivas
On Thu, Jan 19, 2023 at 12:04 AM srinivas pandruvada <srinivas.pandruvada@linux.intel.com> wrote: > > On Wed, 2023-01-18 at 23:14 +0100, Daniel Lezcano wrote: > > On 18/01/2023 22:16, srinivas pandruvada wrote: > > > On Wed, 2023-01-18 at 22:01 +0100, Daniel Lezcano wrote: > > > > On 18/01/2023 21:53, srinivas pandruvada wrote: > > > > > On Wed, 2023-01-18 at 21:00 +0100, Daniel Lezcano wrote: > > > > > > On 18/01/2023 20:16, srinivas pandruvada wrote: > > > > > > > > > > > > [ ... ] > > > > > > > > > > > > > > > But we'd better wait for the thermald test result from > > > > > > > > > Srinvias. > > > > > > > > > > > > > > > > A quick test show that things still work with thermald > > > > > > > > and > > > > > > > > these > > > > > > > > changes. > > > > > > > > > > > > > > But I have a question. In some devices trip point > > > > > > > temperature > > > > > > > is > > > > > > > not > > > > > > > static. When hardware changes, we get notification. For > > > > > > > example > > > > > > > INT3403_PERF_TRIP_POINT_CHANGED for INT3403 drivers. > > > > > > > Currently get_trip can get the latest changed value. But if > > > > > > > we > > > > > > > preregister, we need some mechanism to update them. > > > > > > > > > > > > When the notification INT3403_PERF_TRIP_POINT_CHANGED > > > > > > happens, we > > > > > > call > > > > > > int340x_thermal_read_trips() which in turn updates the trip > > > > > > points. > > > > > > > > > > > > > > > > Not sure how we handle concurrency here when driver can freely > > > > > update > > > > > trips while thermal core is using trips. > > > > > > > > Don't we have the same race without this patch ? The thermal core > > > > can > > > > call get_trip_temp() while there is an update, no ? > > > Yes it is. But I can add a mutex locally here to solve. > > > But not any longer. > > > > > > I think you need some thermal_zone_read_lock/unlock() in core, > > > which > > > can use rcu. Even mutex is fine as there will be no contention as > > > updates to trips will be rare. > > > > I was planning to provide a thermal_trips_update(tz, trips) and from > > there handle the locking. > > > > As the race was already existing, can we postpone this change after > > the > > generic trip points changes? > I think so. Well, what if this bug is reported by a user and a fix needs to be backported to "stable"? Are we going to backport the whole framework redesign along with it? Or is this extremely unlikely?
On Thu, 2023-01-19 at 13:17 +0100, Rafael J. Wysocki wrote: > On Thu, Jan 19, 2023 at 12:04 AM srinivas pandruvada > <srinivas.pandruvada@linux.intel.com> wrote: > > > > On Wed, 2023-01-18 at 23:14 +0100, Daniel Lezcano wrote: > > > On 18/01/2023 22:16, srinivas pandruvada wrote: > > > > On Wed, 2023-01-18 at 22:01 +0100, Daniel Lezcano wrote: > > > > > On 18/01/2023 21:53, srinivas pandruvada wrote: > > > > > > On Wed, 2023-01-18 at 21:00 +0100, Daniel Lezcano wrote: > > > > > > > On 18/01/2023 20:16, srinivas pandruvada wrote: > > > > > > > > > > > > > > [ ... ] > > > > > > > > > > > > > > > > > But we'd better wait for the thermald test result > > > > > > > > > > from > > > > > > > > > > Srinvias. > > > > > > > > > > > > > > > > > > A quick test show that things still work with > > > > > > > > > thermald > > > > > > > > > and > > > > > > > > > these > > > > > > > > > changes. > > > > > > > > > > > > > > > > But I have a question. In some devices trip point > > > > > > > > temperature > > > > > > > > is > > > > > > > > not > > > > > > > > static. When hardware changes, we get notification. For > > > > > > > > example > > > > > > > > INT3403_PERF_TRIP_POINT_CHANGED for INT3403 drivers. > > > > > > > > Currently get_trip can get the latest changed value. > > > > > > > > But if > > > > > > > > we > > > > > > > > preregister, we need some mechanism to update them. > > > > > > > > > > > > > > When the notification INT3403_PERF_TRIP_POINT_CHANGED > > > > > > > happens, we > > > > > > > call > > > > > > > int340x_thermal_read_trips() which in turn updates the > > > > > > > trip > > > > > > > points. > > > > > > > > > > > > > > > > > > > Not sure how we handle concurrency here when driver can > > > > > > freely > > > > > > update > > > > > > trips while thermal core is using trips. > > > > > > > > > > Don't we have the same race without this patch ? The thermal > > > > > core > > > > > can > > > > > call get_trip_temp() while there is an update, no ? > > > > Yes it is. But I can add a mutex locally here to solve. > > > > But not any longer. > > > > > > > > I think you need some thermal_zone_read_lock/unlock() in core, > > > > which > > > > can use rcu. Even mutex is fine as there will be no contention > > > > as > > > > updates to trips will be rare. > > > > > > I was planning to provide a thermal_trips_update(tz, trips) and > > > from > > > there handle the locking. > > > > > > As the race was already existing, can we postpone this change > > > after > > > the > > > generic trip points changes? > > I think so. > > Well, what if this bug is reported by a user and a fix needs to be > backported to "stable"? > > Are we going to backport the whole framework redesign along with it? > > Or is this extremely unlikely? These trips are read at the start of DTT/Thermald and will be read once after notification is received. So extremely unlikely. But we can add the patch before this series to address this issue, which can be marked stable. I can submit this. Thanks, Srinivas
On Thu, Jan 19, 2023 at 5:58 PM srinivas pandruvada <srinivas.pandruvada@linux.intel.com> wrote: > > On Thu, 2023-01-19 at 13:17 +0100, Rafael J. Wysocki wrote: > > On Thu, Jan 19, 2023 at 12:04 AM srinivas pandruvada > > <srinivas.pandruvada@linux.intel.com> wrote: > > > > > > On Wed, 2023-01-18 at 23:14 +0100, Daniel Lezcano wrote: > > > > On 18/01/2023 22:16, srinivas pandruvada wrote: > > > > > On Wed, 2023-01-18 at 22:01 +0100, Daniel Lezcano wrote: > > > > > > On 18/01/2023 21:53, srinivas pandruvada wrote: > > > > > > > On Wed, 2023-01-18 at 21:00 +0100, Daniel Lezcano wrote: > > > > > > > > On 18/01/2023 20:16, srinivas pandruvada wrote: > > > > > > > > > > > > > > > > [ ... ] > > > > > > > > > > > > > > > > > > > But we'd better wait for the thermald test result > > > > > > > > > > > from > > > > > > > > > > > Srinvias. > > > > > > > > > > > > > > > > > > > > A quick test show that things still work with > > > > > > > > > > thermald > > > > > > > > > > and > > > > > > > > > > these > > > > > > > > > > changes. > > > > > > > > > > > > > > > > > > But I have a question. In some devices trip point > > > > > > > > > temperature > > > > > > > > > is > > > > > > > > > not > > > > > > > > > static. When hardware changes, we get notification. For > > > > > > > > > example > > > > > > > > > INT3403_PERF_TRIP_POINT_CHANGED for INT3403 drivers. > > > > > > > > > Currently get_trip can get the latest changed value. > > > > > > > > > But if > > > > > > > > > we > > > > > > > > > preregister, we need some mechanism to update them. > > > > > > > > > > > > > > > > When the notification INT3403_PERF_TRIP_POINT_CHANGED > > > > > > > > happens, we > > > > > > > > call > > > > > > > > int340x_thermal_read_trips() which in turn updates the > > > > > > > > trip > > > > > > > > points. > > > > > > > > > > > > > > > > > > > > > > Not sure how we handle concurrency here when driver can > > > > > > > freely > > > > > > > update > > > > > > > trips while thermal core is using trips. > > > > > > > > > > > > Don't we have the same race without this patch ? The thermal > > > > > > core > > > > > > can > > > > > > call get_trip_temp() while there is an update, no ? > > > > > Yes it is. But I can add a mutex locally here to solve. > > > > > But not any longer. > > > > > > > > > > I think you need some thermal_zone_read_lock/unlock() in core, > > > > > which > > > > > can use rcu. Even mutex is fine as there will be no contention > > > > > as > > > > > updates to trips will be rare. > > > > > > > > I was planning to provide a thermal_trips_update(tz, trips) and > > > > from > > > > there handle the locking. > > > > > > > > As the race was already existing, can we postpone this change > > > > after > > > > the > > > > generic trip points changes? > > > I think so. > > > > Well, what if this bug is reported by a user and a fix needs to be > > backported to "stable"? > > > > Are we going to backport the whole framework redesign along with it? > > > > Or is this extremely unlikely? > These trips are read at the start of DTT/Thermald and will be read once > after notification is received. So extremely unlikely. > But we can add the patch before this series to address this issue, > which can be marked stable. I can submit this. Looks reasonable to me.