Message ID | 20180724234636.57137-1-mka@chromium.org (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Eduardo Valentin |
Series | [v5,1/3] thermal: qcom-spmi: Use PMIC thermal stage 2 for critical trip points |
On Tue, Jul 24, 2018 at 04:46:34PM -0700, Matthias Kaehlcke wrote:
> There are three thermal stages defined in the PMIC:
>
> stage 1: warning
> stage 2: system should shut down
> stage 3: emergency shut down
>
> By default the PMIC assumes that the OS isn't doing anything and thus
> at stage 2 it does a partial PMIC shutdown and at stage 3 it kills
> all power. When switching between thermal stages the PMIC generates an
> interrupt which is handled by the driver. The partial PMIC shutdown at
> stage 2 can be disabled by software, which allows the OS to initiate a
> shutdown at stage 2 with a thermal zone configured accordingly.
>
> If a critical trip point is configured in the thermal zone the driver
> adjusts the stage 1-3 temperature thresholds to (closely) match the
> critical temperature with a stage 2 threshold (125/130/135/140 °C).
> If a suitable match is found the partial shutdown at stage 2 is
> disabled. If for some reason the system doesn't shutdown at stage 2
> the emergency shutdown at stage 3 kicks in.
>
> The partial shutdown at stage 2 remains enabled in these cases:
> - no critical trip point defined
> - the temperature of the critical trip point is < 125°C
> - the temperature of the critical trip point is > 140°C and no
>   ADC channel is configured (thus the OS is not notified when the critical
>   temperature is reached)
>
> Suggested-by: Douglas Anderson <dianders@chromium.org>
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> ---
> Changes in v5:
> - patch added to the series
> ---
>  drivers/thermal/qcom-spmi-temp-alarm.c | 161 ++++++++++++++++++++++---
>  1 file changed, 142 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/thermal/qcom-spmi-temp-alarm.c b/drivers/thermal/qcom-spmi-temp-alarm.c
> index ad4f3a8d6560..936e4dde4298 100644
> --- a/drivers/thermal/qcom-spmi-temp-alarm.c
> +++ b/drivers/thermal/qcom-spmi-temp-alarm.c
>
> ...
>
> +static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip,
> +					     int temp)
> +{
> +	u8 reg;
> +	bool disable_s2_shutdown = false;
> +	int ret;
> +
> +	WARN_ON(!mutex_is_locked(&chip->lock));
> +
> +	/*
> +	 * Default: S2 and S3 shutdown enabled, thresholds at
> +	 * 105C/125C/145C, monitoring at 25Hz
> +	 */
> +	reg = SHUTDOWN_CTRL1_RATE_25HZ;
> +
> +	if ((temp == THERMAL_TEMP_INVALID) ||
> +	    (temp < STAGE2_THRESHOLD_MIN)) {
> +		chip->thresh = THRESH_MIN;
> +		goto skip;
> +	}
> +
> +	if (temp <= STAGE2_THRESHOLD_MAX) {
> +		chip->thresh = THRESH_MAX -
> +			((STAGE2_THRESHOLD_MAX - temp) /
> +			 TEMP_THRESH_STEP);
> +		disable_s2_shutdown = true;
> +	} else {
> +		chip->thresh = THRESH_MAX;
> +
> +		if (!IS_ERR(chip->adc))

Note to self: with commit 7a4ca51b7040 ("thermal/drivers/qcom-spmi: Use
devm_iio_channel_get") this should be 'if (chip->adc)'.

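The threshold selection described in the commit message can be modelled in user space. What follows is a minimal sketch, not the driver code: the constants are copied from the patch, `THERMAL_TEMP_INVALID` is assumed to be -274000 as in the kernel's thermal header, and `shutdown_ctrl1_for()` with its `has_adc` parameter is a hypothetical helper invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants as they appear in the patch under review. */
#define SHUTDOWN_CTRL1_OVERRIDE_S2	(1u << 6)
#define SHUTDOWN_CTRL1_RATE_25HZ	(1u << 3)
#define TEMP_THRESH_STEP		5000	/* 5 C, in millidegrees */
#define THRESH_MIN			0
#define THRESH_MAX			3
#define STAGE2_THRESHOLD_MIN		125000
#define STAGE2_THRESHOLD_MAX		140000

/* Assumption: mirrors the kernel's THERMAL_TEMP_INVALID. */
#define THERMAL_TEMP_INVALID		(-274000)

/*
 * Hypothetical helper: computes the value that would be written to
 * SHUTDOWN_CTRL1 for a critical trip temperature given in millidegrees C.
 */
static uint8_t shutdown_ctrl1_for(int crit_temp, bool has_adc)
{
	unsigned int reg = SHUTDOWN_CTRL1_RATE_25HZ;
	unsigned int thresh = THRESH_MIN;
	bool disable_s2_shutdown = false;

	/* Without a usable critical trip the defaults stay in place. */
	if (crit_temp != THERMAL_TEMP_INVALID &&
	    crit_temp >= STAGE2_THRESHOLD_MIN) {
		if (crit_temp <= STAGE2_THRESHOLD_MAX) {
			/* Pick the stage 2 threshold at or above the trip. */
			thresh = THRESH_MAX -
				 (STAGE2_THRESHOLD_MAX - crit_temp) /
				 TEMP_THRESH_STEP;
			disable_s2_shutdown = true;
		} else {
			/* Trip above 140 C: only safe if an ADC reports it. */
			thresh = THRESH_MAX;
			disable_s2_shutdown = has_adc;
		}
	}

	reg |= thresh;
	if (disable_s2_shutdown)
		reg |= SHUTDOWN_CTRL1_OVERRIDE_S2;

	return (uint8_t)reg;
}
```

With these constants a critical trip at 130 C selects threshold index 1 with the stage 2 override set (0x49), while a trip below 125 C leaves the register at its default of 0x08 and the partial shutdown enabled.
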
Hi,

On Tue, Jul 24, 2018 at 4:46 PM, Matthias Kaehlcke <mka@chromium.org> wrote:
> +static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip,
> +					     int temp)
> +{
> +	u8 reg;
> +	bool disable_s2_shutdown = false;
> +	int ret;
> +
> +	WARN_ON(!mutex_is_locked(&chip->lock));
> +
> +	/*
> +	 * Default: S2 and S3 shutdown enabled, thresholds at
> +	 * 105C/125C/145C, monitoring at 25Hz
> +	 */
> +	reg = SHUTDOWN_CTRL1_RATE_25HZ;
> +
> +	if ((temp == THERMAL_TEMP_INVALID) ||
> +	    (temp < STAGE2_THRESHOLD_MIN)) {
> +		chip->thresh = THRESH_MIN;
> +		goto skip;
> +	}
> +
> +	if (temp <= STAGE2_THRESHOLD_MAX) {
> +		chip->thresh = THRESH_MAX -
> +			((STAGE2_THRESHOLD_MAX - temp) /
> +			 TEMP_THRESH_STEP);
> +		disable_s2_shutdown = true;
> +	} else {
> +		chip->thresh = THRESH_MAX;
> +
> +		if (!IS_ERR(chip->adc))
> +			disable_s2_shutdown = true;
> +		else
> +			dev_warn(chip->dev,
> +				 "No ADC is configured and critical temperature is above the maximum stage 2 threshold of 140°C! Configuring stage 2 shutdown at 140°C.\n");

Putting a non-ASCII character (the degree symbol) in your commit
message is one thing, but are you sure it's wise to put it in the
kernel logs?

> +	}
> +
> +skip:
> +	reg |= chip->thresh;
> +	if (disable_s2_shutdown)
> +		reg |= SHUTDOWN_CTRL1_OVERRIDE_S2;
> +
> +	ret = qpnp_tm_write(chip, QPNP_TM_REG_SHUTDOWN_CTRL1, reg);
> +	if (ret < 0)
> +		return ret;
> +
> +	return ret;

Simplify the above lines to:

return qpnp_tm_write(chip, QPNP_TM_REG_SHUTDOWN_CTRL1, reg);

> @@ -313,12 +441,7 @@ static int qpnp_tm_probe(struct platform_device *pdev)
>  	if (ret < 0)
>  		return ret;
>
> -	chip->tz_dev = devm_thermal_zone_of_sensor_register(&pdev->dev, 0, chip,
> -							&qpnp_tm_sensor_ops);
> -	if (IS_ERR(chip->tz_dev)) {
> -		dev_err(&pdev->dev, "failed to register sensor\n");
> -		return PTR_ERR(chip->tz_dev);
> -	}
> +	chip->initialized = true;

Should we add "thermal_zone_device_update(chip->tz_dev,
THERMAL_EVENT_UNSPECIFIED);" here?

...also: do we care about any type of locking for chip->initialized?
Technically we can be running on weakly ordered memory so if
qpnp_tm_update_temp_no_adc() is running on a different processor then
possibly it could still keep returning the default temperature for a
little while. We could try to analyze whether there's some sort of
implicit barrier or we could add manual memory barriers, but generally
I try to avoid that and just do the simple locking... What about just
setting chip->initialized = true at the end of qpnp_tm_init() while the
mutex is still held?

I'd also love to hear from someone with more thermal framework
experience to make sure it's legit to return a default value if
someone calls us while we're initting. It seems sane to me but nice
to confirm it's OK.

Overall I like the idea of this patch so hopefully others do too.
Thanks for sending it out!

-Doug

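The "simple locking" option suggested above can be illustrated with a user-space analogy. This is a sketch under stated assumptions, not the driver: it uses POSIX mutexes where the kernel code would use `mutex_lock()`/`mutex_unlock()`, and `struct fake_chip`, `fake_chip_init()` and `fake_chip_get_temp()` are hypothetical names. The point is only that the flag is written as the last step while the lock is held and read under the same lock, so no explicit memory barriers are needed.

```c
#include <pthread.h>
#include <stdbool.h>

#define DEFAULT_TEMP	37000	/* millidegrees C, as in the driver */

/* Hypothetical, simplified stand-in for struct qpnp_tm_chip. */
struct fake_chip {
	pthread_mutex_t lock;	/* protects .temp and .initialized */
	bool initialized;
	int temp;
};

/* Models the end of qpnp_tm_init(): publish the flag before unlocking. */
static void fake_chip_init(struct fake_chip *chip, int first_reading)
{
	pthread_mutex_lock(&chip->lock);
	chip->temp = first_reading;
	chip->initialized = true;	/* last step, lock still held */
	pthread_mutex_unlock(&chip->lock);
}

/* Models the get_temp callback: the flag is read under the same lock. */
static int fake_chip_get_temp(struct fake_chip *chip)
{
	int temp;

	pthread_mutex_lock(&chip->lock);
	temp = chip->initialized ? chip->temp : DEFAULT_TEMP;
	pthread_mutex_unlock(&chip->lock);

	return temp;
}
```

A reader that takes the lock after initialization has released it is guaranteed to see both the flag and the first reading; a reader that gets in earlier sees the default temperature.
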
Hi Doug,

On Wed, Jul 25, 2018 at 04:19:56PM -0700, Doug Anderson wrote:
> On Tue, Jul 24, 2018 at 4:46 PM, Matthias Kaehlcke <mka@chromium.org> wrote:
> > +static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip,
> > +					     int temp)
> > +{
> > +	u8 reg;
> > +	bool disable_s2_shutdown = false;
> > +	int ret;
> > +
> > +	WARN_ON(!mutex_is_locked(&chip->lock));
> > +
> > +	/*
> > +	 * Default: S2 and S3 shutdown enabled, thresholds at
> > +	 * 105C/125C/145C, monitoring at 25Hz
> > +	 */
> > +	reg = SHUTDOWN_CTRL1_RATE_25HZ;
> > +
> > +	if ((temp == THERMAL_TEMP_INVALID) ||
> > +	    (temp < STAGE2_THRESHOLD_MIN)) {
> > +		chip->thresh = THRESH_MIN;
> > +		goto skip;
> > +	}
> > +
> > +	if (temp <= STAGE2_THRESHOLD_MAX) {
> > +		chip->thresh = THRESH_MAX -
> > +			((STAGE2_THRESHOLD_MAX - temp) /
> > +			 TEMP_THRESH_STEP);
> > +		disable_s2_shutdown = true;
> > +	} else {
> > +		chip->thresh = THRESH_MAX;
> > +
> > +		if (!IS_ERR(chip->adc))
> > +			disable_s2_shutdown = true;
> > +		else
> > +			dev_warn(chip->dev,
> > +				 "No ADC is configured and critical temperature is above the maximum stage 2 threshold of 140°C! Configuring stage 2 shutdown at 140°C.\n");
>
> Putting a non-ASCII character (the degree symbol) in your commit
> message is one thing, but are you sure it's wise to put it in the
> kernel logs?

A few other drivers also do this
(drivers/gpu/drm/nouveau/nvkm/subdev/clk/base.c,
drivers/macintosh/windfarm_pm121.c), however that doesn't mean it's a
good idea. Will change to degC or C.

> > +	}
> > +
> > +skip:
> > +	reg |= chip->thresh;
> > +	if (disable_s2_shutdown)
> > +		reg |= SHUTDOWN_CTRL1_OVERRIDE_S2;
> > +
> > +	ret = qpnp_tm_write(chip, QPNP_TM_REG_SHUTDOWN_CTRL1, reg);
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	return ret;
>
> Simplify the above lines to:
>
> return qpnp_tm_write(chip, QPNP_TM_REG_SHUTDOWN_CTRL1, reg);

Ouch, my code is indeed dumb ...

> > @@ -313,12 +441,7 @@ static int qpnp_tm_probe(struct platform_device *pdev)
> >  	if (ret < 0)
> >  		return ret;
> >
> > -	chip->tz_dev = devm_thermal_zone_of_sensor_register(&pdev->dev, 0, chip,
> > -							&qpnp_tm_sensor_ops);
> > -	if (IS_ERR(chip->tz_dev)) {
> > -		dev_err(&pdev->dev, "failed to register sensor\n");
> > -		return PTR_ERR(chip->tz_dev);
> > -	}
> > +	chip->initialized = true;
>
> Should we add "thermal_zone_device_update(chip->tz_dev,
> THERMAL_EVENT_UNSPECIFIED);" here

Seems reasonable, will do.

> ...also: do we care about any type of locking for chip->initialized?
> Technically we can be running on weakly ordered memory so if
> qpnp_tm_update_temp_no_adc() is running on a different processor then
> possibly it could still keep returning the default temperature for a
> little while. We could try to analyze whether there's some sort of
> implicit barrier or we could add manual memory barriers, but generally
> I try to avoid that and just do the simple locking... What about just
> setting chip->initialized = true at the end of qpnp_tm_init() while the
> mutex is still held?

Thanks for pointing that out. I agree that we should keep things
simple; setting chip->initialized to true at the end of qpnp_tm_init()
sounds good to me.

> I'd also love to hear from someone with more thermal framework
> experience to make sure it's legit to return a default value if
> someone calls us while we're initting. It seems sane to me but nice
> to confirm it's OK.

An alternative could be to return THERMAL_TEMP_INVALID, however I
don't see this handled outside of thermal_core.c, not sure if it could
throw some other code off.

Comments from thermal folks on either approach (or alternatives) are
definitely welcome :)

> Overall I like the idea of this patch so hopefully others do too.
> Thanks for sending it out!

Thanks for the review!

Matthias

On Wed, Jul 25, 2018 at 06:12:28PM -0700, Matthias Kaehlcke wrote:
> Hi Doug,
>
> On Wed, Jul 25, 2018 at 04:19:56PM -0700, Doug Anderson wrote:
> [...]
> > Overall I like the idea of this patch so hopefully others do too.
> > Thanks for sending it out!
>

minor ask for next version

WARNING: line over 80 characters
#159: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:65:
+#define STAGE2_THRESHOLD_MIN		125000 /* Stage 2 Threshold Min: 125 C */

WARNING: line over 80 characters
#160: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:66:
+#define STAGE2_THRESHOLD_MAX		140000 /* Stage 2 Threshold Max: 140 C */

ERROR: trailing statements should be on next line
#201: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:186:
+	if (!chip->adc)) {

CHECK: Unnecessary parentheses around 'temp == THERMAL_TEMP_INVALID'
#227: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:220:
+	if ((temp == THERMAL_TEMP_INVALID) ||
+	    (temp < STAGE2_THRESHOLD_MIN)) {

CHECK: Unnecessary parentheses around 'temp < STAGE2_THRESHOLD_MIN'
#227: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:220:
+	if ((temp == THERMAL_TEMP_INVALID) ||
+	    (temp < STAGE2_THRESHOLD_MIN)) {

CHECK: Unnecessary parentheses around 'trips[i].type == THERMAL_TRIP_CRITICAL'
#305: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:302:
+		if (of_thermal_is_trip_valid(chip->tz_dev, i) &&
+		    (trips[i].type == THERMAL_TRIP_CRITICAL))

CHECK: Alignment should match open parenthesis
#386: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:427:
+	chip->tz_dev = devm_thermal_zone_of_sensor_register(&pdev->dev, 0, chip,
+							&qpnp_tm_sensor_ops);

> Thanks for the review!
>
> Matthias

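The "Unnecessary parentheses" checks in the report above are purely stylistic: in C, `==` and `<` bind more tightly than `||`, so the flagged condition means the same thing with or without the inner parentheses. A small user-space sketch of both spellings follows; the two function names are hypothetical and `THERMAL_TEMP_INVALID` is assumed to be -274000 as in the kernel's thermal header.

```c
#include <stdbool.h>

#define THERMAL_TEMP_INVALID	(-274000)	/* assumption: kernel value */
#define STAGE2_THRESHOLD_MIN	125000		/* from the patch */

/* The form flagged by checkpatch, with redundant inner parentheses. */
static bool keep_default_thresh_flagged(int temp)
{
	return (temp == THERMAL_TEMP_INVALID) ||
	       (temp < STAGE2_THRESHOLD_MIN);
}

/* The equivalent form without them, relying on operator precedence. */
static bool keep_default_thresh_clean(int temp)
{
	return temp == THERMAL_TEMP_INVALID || temp < STAGE2_THRESHOLD_MIN;
}
```

Both functions return the same result for every input, so dropping the parentheses changes no behaviour.
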
On Fri, Jul 27, 2018 at 03:40:52PM -0700, Eduardo Valentin wrote:
> On Wed, Jul 25, 2018 at 06:12:28PM -0700, Matthias Kaehlcke wrote:
> [...]
> minor ask for next version
>
> [...]
>
> CHECK: Alignment should match open parenthesis
> #386: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:427:
> +	chip->tz_dev = devm_thermal_zone_of_sensor_register(&pdev->dev, 0, chip,
> +							&qpnp_tm_sensor_ops);

And it would be great if you could combine these two into a single
series: when you fix this patch and send a new version of this
series, please include these too:

https://patchwork.kernel.org/patch/10543335/
https://patchwork.kernel.org/patch/10543333/

> > Thanks for the review!
> >
> > Matthias

On Fri, Jul 27, 2018 at 03:40:52PM -0700, Eduardo Valentin wrote:
> On Wed, Jul 25, 2018 at 06:12:28PM -0700, Matthias Kaehlcke wrote:
> [...]
> minor ask for next version
>
> WARNING: line over 80 characters
> #159: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:65:
> +#define STAGE2_THRESHOLD_MIN		125000 /* Stage 2 Threshold Min: 125 C */
>
> WARNING: line over 80 characters
> #160: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:66:
> +#define STAGE2_THRESHOLD_MAX		140000 /* Stage 2 Threshold Max: 140 C */
>
> ERROR: trailing statements should be on next line
> #201: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:186:
> +	if (!chip->adc)) {
>
> CHECK: Unnecessary parentheses around 'temp == THERMAL_TEMP_INVALID'
> #227: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:220:
> +	if ((temp == THERMAL_TEMP_INVALID) ||
> +	    (temp < STAGE2_THRESHOLD_MIN)) {
>
> CHECK: Unnecessary parentheses around 'temp < STAGE2_THRESHOLD_MIN'
> #227: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:220:
> +	if ((temp == THERMAL_TEMP_INVALID) ||
> +	    (temp < STAGE2_THRESHOLD_MIN)) {
>
> CHECK: Unnecessary parentheses around 'trips[i].type == THERMAL_TRIP_CRITICAL'
> #305: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:302:
> +		if (of_thermal_is_trip_valid(chip->tz_dev, i) &&
> +		    (trips[i].type == THERMAL_TRIP_CRITICAL))
>
> CHECK: Alignment should match open parenthesis
> #386: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:427:
> +	chip->tz_dev = devm_thermal_zone_of_sensor_register(&pdev->dev, 0, chip,
> +							&qpnp_tm_sensor_ops);

Thanks for the review, I'll fix these in the next version. Right after
sending the patches I realized that I forgot to run checkpatch.pl :(
Will try to do better in the future.

On Fri, Jul 27, 2018 at 03:45:06PM -0700, Eduardo Valentin wrote: > On Fri, Jul 27, 2018 at 03:40:52PM -0700, Eduardo Valentin wrote: > > On Wed, Jul 25, 2018 at 06:12:28PM -0700, Matthias Kaehlcke wrote: > > > Hi Doug, > > > > > > On Wed, Jul 25, 2018 at 04:19:56PM -0700, Doug Anderson wrote: > > > > > > > On Tue, Jul 24, 2018 at 4:46 PM, Matthias Kaehlcke <mka@chromium.org> wrote: > > > > > +static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip, > > > > > + int temp) > > > > > +{ > > > > > + u8 reg; > > > > > + bool disable_s2_shutdown = false; > > > > > + int ret; > > > > > + > > > > > + WARN_ON(!mutex_is_locked(&chip->lock)); > > > > > + > > > > > + /* > > > > > + * Default: S2 and S3 shutdown enabled, thresholds at > > > > > + * 105C/125C/145C, monitoring at 25Hz > > > > > + */ > > > > > + reg = SHUTDOWN_CTRL1_RATE_25HZ; > > > > > + > > > > > + if ((temp == THERMAL_TEMP_INVALID) || > > > > > + (temp < STAGE2_THRESHOLD_MIN)) { > > > > > + chip->thresh = THRESH_MIN; > > > > > + goto skip; > > > > > + } > > > > > + > > > > > + if (temp <= STAGE2_THRESHOLD_MAX) { > > > > > + chip->thresh = THRESH_MAX - > > > > > + ((STAGE2_THRESHOLD_MAX - temp) / > > > > > + TEMP_THRESH_STEP); > > > > > + disable_s2_shutdown = true; > > > > > + } else { > > > > > + chip->thresh = THRESH_MAX; > > > > > + > > > > > + if (!IS_ERR(chip->adc)) > > > > > + disable_s2_shutdown = true; > > > > > + else > > > > > + dev_warn(chip->dev, > > > > > + "No ADC is configured and critical temperature is above the maximum stage 2 threshold of 140°C! Configuring stage 2 shutdown at 140°C.\n"); > > > > > > > > Putting a non-ASCII character (the degree symbol) in your commit > > > > message is one thing, but are you sure it's wise to put it in the > > > > kernel logs? > > > > > > A few other drivers also do this > > > (drivers/gpu/drm/nouveau/nvkm/subdev/clk/base.c, > > > drivers/macintosh/windfarm_pm121.c), however that doesn't mean it's a > > > good idea. Will change to degC or C. 
> > > > > > > > + } > > > > > + > > > > > +skip: > > > > > + reg |= chip->thresh; > > > > > + if (disable_s2_shutdown) > > > > > + reg |= SHUTDOWN_CTRL1_OVERRIDE_S2; > > > > > + > > > > > + ret = qpnp_tm_write(chip, QPNP_TM_REG_SHUTDOWN_CTRL1, reg); > > > > > + if (ret < 0) > > > > > + return ret; > > > > > + > > > > > + return ret; > > > > > > > > Simplify the above lines to: > > > > > > > > return qpnp_tm_write(chip, QPNP_TM_REG_SHUTDOWN_CTRL1, reg); > > > > > > Ouch, my code is indeed dumb ... > > > > > > > > @@ -313,12 +441,7 @@ static int qpnp_tm_probe(struct platform_device *pdev) > > > > > if (ret < 0) > > > > > return ret; > > > > > > > > > > - chip->tz_dev = devm_thermal_zone_of_sensor_register(&pdev->dev, 0, chip, > > > > > - &qpnp_tm_sensor_ops); > > > > > - if (IS_ERR(chip->tz_dev)) { > > > > > - dev_err(&pdev->dev, "failed to register sensor\n"); > > > > > - return PTR_ERR(chip->tz_dev); > > > > > - } > > > > > + chip->initialized = true; > > > > > > > > Should we add "thermal_zone_device_update(chip->tz_dev, > > > > THERMAL_EVENT_UNSPECIFIED);" here > > > > > > Seems reasonable, will do. > > > > > > > ...also: do we care about any type of locking for chip->initialized? > > > > Technically we can be running on weakly ordered memory so if > > > > qpnp_tm_update_temp_no_adc() is running on a different processor then > > > > possibly it could still keep returning the default temperature for a > > > > little while. We could try to analyze whether there's some sort of > > > > implicit barrier or we could add manual memory barriers, but generally > > > > I try to avoid that and just do the simple locking... What about just > > > > setting chip-Initialized = true at the end of qpnp_tm_init() while the > > > > mutex is still held? > > > > > > Thanks for pointing that out. I agree that we should keep things > > > simple, chip->initialized to true at the end of qpnp_tm_init() sounds > > > good to me. 
> > >
> > > > I'd also love to hear from someone with more thermal framework
> > > > experience to make sure it's legit to return a default value if
> > > > someone calls us while we're initting. It seems sane to me but nice
> > > > to confirm it's OK.
> > >
> > > An alternative could be to return THERMAL_TEMP_INVALID, however I
> > > don't see this handled outside of thermal_core.c, not sure if it could
> > > throw some other code off.
> > >
> > > Comments from thermal folks on either approach (or alternatives) are
> > > definitely welcome :)
> > >
> > > > Overall I like the idea of this patch so hopefully others do too.
> > > > Thanks for sending it out!
> > >
> >
> > minor ask for next version
> >
> >
> > WARNING: line over 80 characters
> > #159: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:65:
> > +#define STAGE2_THRESHOLD_MIN		125000	/* Stage 2 Threshold
> > Min: 125 C */
> >
> > WARNING: line over 80 characters
> > #160: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:66:
> > +#define STAGE2_THRESHOLD_MAX		140000	/* Stage 2 Threshold
> > Max: 140 C */
> >
> > ERROR: trailing statements should be on next line
> > #201: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:186:
> > +	if (!chip->adc)) {
> >
> > CHECK: Unnecessary parentheses around 'temp == THERMAL_TEMP_INVALID'
> > #227: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:220:
> > +	if ((temp == THERMAL_TEMP_INVALID) ||
> > +	    (temp < STAGE2_THRESHOLD_MIN)) {
> >
> > CHECK: Unnecessary parentheses around 'temp < STAGE2_THRESHOLD_MIN'
> > #227: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:220:
> > +	if ((temp == THERMAL_TEMP_INVALID) ||
> > +	    (temp < STAGE2_THRESHOLD_MIN)) {
> >
> > CHECK: Unnecessary parentheses around 'trips[i].type ==
> > THERMAL_TRIP_CRITICAL'
> > #305: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:302:
> > +		if (of_thermal_is_trip_valid(chip->tz_dev, i) &&
> > +		    (trips[i].type == THERMAL_TRIP_CRITICAL))
> >
> > CHECK: Alignment should match open parenthesis
> > #386: FILE: drivers/thermal/qcom-spmi-temp-alarm.c:427:
> > +	chip->tz_dev = devm_thermal_zone_of_sensor_register(&pdev->dev,
> > 0, chip,
> > +
> > &qpnp_tm_sensor_ops);
> >
> And it would be great if you could combine these two in a single
> series, say when you fix this patch and send a new version of this
> series, please include these too:
> https://patchwork.kernel.org/patch/10543335/
> https://patchwork.kernel.org/patch/10543333/

Ok, will do
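To illustrate the chip->initialized locking point raised in the review,
here is a minimal userspace sketch, not driver code. It assumes a POSIX
pthread_mutex_t as a stand-in for the kernel's struct mutex; struct
chip_sketch, chip_sketch_init() and chip_sketch_get_temp() are
hypothetical names used only for this example. The flag is set while the
lock is still held, and the reader also takes the lock before testing it.
The reader-side locking is an assumption of this sketch: the ordering
guarantee only holds if both sides use the same mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define DEFAULT_TEMP 37000	/* same default as the driver, in millicelsius */

/* Hypothetical stand-in for the relevant parts of struct qpnp_tm_chip. */
struct chip_sketch {
	pthread_mutex_t lock;	/* protects .initialized and .temp */
	bool initialized;
	int temp;
};

/* Writer: publish the flag before dropping the lock. */
static void chip_sketch_init(struct chip_sketch *chip, int measured_temp)
{
	assert(chip != NULL);

	pthread_mutex_lock(&chip->lock);
	chip->temp = measured_temp;
	chip->initialized = true;	/* set while the mutex is still held */
	pthread_mutex_unlock(&chip->lock);
}

/* Reader: test the flag under the same lock, fall back to the default. */
static int chip_sketch_get_temp(struct chip_sketch *chip)
{
	int temp;

	pthread_mutex_lock(&chip->lock);
	temp = chip->initialized ? chip->temp : DEFAULT_TEMP;
	pthread_mutex_unlock(&chip->lock);

	return temp;
}
```

A reader that runs before chip_sketch_init() gets DEFAULT_TEMP; a reader
that runs afterwards is guaranteed to see both the flag and the
temperature written before it, without any explicit memory barriers.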
diff --git a/drivers/thermal/qcom-spmi-temp-alarm.c b/drivers/thermal/qcom-spmi-temp-alarm.c
index ad4f3a8d6560..936e4dde4298 100644
--- a/drivers/thermal/qcom-spmi-temp-alarm.c
+++ b/drivers/thermal/qcom-spmi-temp-alarm.c
@@ -23,6 +23,8 @@
 #include <linux/regmap.h>
 #include <linux/thermal.h>
 
+#include "thermal_core.h"
+
 #define QPNP_TM_REG_TYPE		0x04
 #define QPNP_TM_REG_SUBTYPE		0x05
 #define QPNP_TM_REG_STATUS		0x08
@@ -37,9 +39,11 @@
 #define STATUS_GEN2_STATE_MASK		GENMASK(6, 4)
 #define STATUS_GEN2_STATE_SHIFT		4
 
-#define SHUTDOWN_CTRL1_OVERRIDE_MASK	GENMASK(7, 6)
+#define SHUTDOWN_CTRL1_OVERRIDE_S2	BIT(6)
 #define SHUTDOWN_CTRL1_THRESHOLD_MASK	GENMASK(1, 0)
 
+#define SHUTDOWN_CTRL1_RATE_25HZ	BIT(3)
+
 #define ALARM_CTRL_FORCE_ENABLE		BIT(7)
 
 /*
@@ -56,12 +60,17 @@
 #define TEMP_THRESH_STEP		5000	/* Threshold step: 5 C */
 
 #define THRESH_MIN			0
+#define THRESH_MAX			3
+
+#define STAGE2_THRESHOLD_MIN		125000	/* Stage 2 Threshold Min: 125 C */
+#define STAGE2_THRESHOLD_MAX		140000	/* Stage 2 Threshold Max: 140 C */
 
 /* Temperature in Milli Celsius reported during stage 0 if no ADC is present */
 #define DEFAULT_TEMP			37000
 
 struct qpnp_tm_chip {
 	struct regmap			*map;
+	struct device			*dev;
 	struct thermal_zone_device	*tz_dev;
 	unsigned int			subtype;
 	long				temp;
@@ -69,6 +78,10 @@ struct qpnp_tm_chip {
 	unsigned int			stage;
 	unsigned int			prev_stage;
 	unsigned int			base;
+	/* protects .thresh, .stage and chip registers */
+	struct mutex			lock;
+	bool				initialized;
+
 	struct iio_channel		*adc;
 };
 
@@ -125,6 +138,8 @@ static int qpnp_tm_update_temp_no_adc(struct qpnp_tm_chip *chip)
 	unsigned int stage, stage_new, stage_old;
 	int ret;
 
+	WARN_ON(!mutex_is_locked(&chip->lock));
+
 	ret = qpnp_tm_get_temp_stage(chip);
 	if (ret < 0)
 		return ret;
@@ -163,8 +178,15 @@ static int qpnp_tm_get_temp(void *data, int *temp)
 	if (!temp)
 		return -EINVAL;
 
-	if (!chip->adc) {
+	if (!chip->initialized) {
+		*temp = DEFAULT_TEMP;
+		return 0;
+	}
+
+	if (!chip->adc)) {
+		mutex_lock(&chip->lock);
 		ret = qpnp_tm_update_temp_no_adc(chip);
+		mutex_unlock(&chip->lock);
 		if (ret < 0)
 			return ret;
 	} else {
@@ -180,8 +202,77 @@ static int qpnp_tm_get_temp(void *data, int *temp)
 	return 0;
 }
 
+static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip,
+					     int temp)
+{
+	u8 reg;
+	bool disable_s2_shutdown = false;
+	int ret;
+
+	WARN_ON(!mutex_is_locked(&chip->lock));
+
+	/*
+	 * Default: S2 and S3 shutdown enabled, thresholds at
+	 * 105C/125C/145C, monitoring at 25Hz
+	 */
+	reg = SHUTDOWN_CTRL1_RATE_25HZ;
+
+	if ((temp == THERMAL_TEMP_INVALID) ||
+	    (temp < STAGE2_THRESHOLD_MIN)) {
+		chip->thresh = THRESH_MIN;
+		goto skip;
+	}
+
+	if (temp <= STAGE2_THRESHOLD_MAX) {
+		chip->thresh = THRESH_MAX -
+			((STAGE2_THRESHOLD_MAX - temp) /
+			 TEMP_THRESH_STEP);
+		disable_s2_shutdown = true;
+	} else {
+		chip->thresh = THRESH_MAX;
+
+		if (!IS_ERR(chip->adc))
+			disable_s2_shutdown = true;
+		else
+			dev_warn(chip->dev,
+				 "No ADC is configured and critical temperature is above the maximum stage 2 threshold of 140°C! Configuring stage 2 shutdown at 140°C.\n");
+	}
+
+skip:
+	reg |= chip->thresh;
+	if (disable_s2_shutdown)
+		reg |= SHUTDOWN_CTRL1_OVERRIDE_S2;
+
+	ret = qpnp_tm_write(chip, QPNP_TM_REG_SHUTDOWN_CTRL1, reg);
+	if (ret < 0)
+		return ret;
+
+	return ret;
+}
+
+static int qpnp_tm_set_trip_temp(void *data, int trip, int temp)
+{
+	struct qpnp_tm_chip *chip = data;
+	const struct thermal_trip *trip_points;
+	int ret;
+
+	trip_points = of_thermal_get_trip_points(chip->tz_dev);
+	if (!trip_points)
+		return -EINVAL;
+
+	if (trip_points[trip].type != THERMAL_TRIP_CRITICAL)
+		return 0;
+
+	mutex_lock(&chip->lock);
+	ret = qpnp_tm_update_critical_trip_temp(chip, temp);
+	mutex_unlock(&chip->lock);
+
+	return ret;
+}
+
 static const struct thermal_zone_of_device_ops qpnp_tm_sensor_ops = {
 	.get_temp = qpnp_tm_get_temp,
+	.set_trip_temp = qpnp_tm_set_trip_temp,
 };
 
 static irqreturn_t qpnp_tm_isr(int irq, void *data)
@@ -193,6 +284,29 @@ static irqreturn_t qpnp_tm_isr(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
+static int qpnp_tm_get_critical_trip_temp(struct qpnp_tm_chip *chip)
+{
+	int ntrips;
+	const struct thermal_trip *trips;
+	int i;
+
+	ntrips = of_thermal_get_ntrips(chip->tz_dev);
+	if (ntrips <= 0)
+		return THERMAL_TEMP_INVALID;
+
+	trips = of_thermal_get_trip_points(chip->tz_dev);
+	if (!trips)
+		return THERMAL_TEMP_INVALID;
+
+	for (i = 0; i < ntrips; i++) {
+		if (of_thermal_is_trip_valid(chip->tz_dev, i) &&
+		    (trips[i].type == THERMAL_TRIP_CRITICAL))
+			return trips[i].temperature;
+	}
+
+	return THERMAL_TEMP_INVALID;
+}
+
 /*
  * This function initializes the internal temp value based on only the
  * current thermal stage and threshold. Setup threshold control and
@@ -203,17 +317,20 @@ static int qpnp_tm_init(struct qpnp_tm_chip *chip)
 	unsigned int stage;
 	int ret;
 	u8 reg = 0;
+	int crit_temp;
+
+	mutex_lock(&chip->lock);
 
 	ret = qpnp_tm_read(chip, QPNP_TM_REG_SHUTDOWN_CTRL1, &reg);
 	if (ret < 0)
-		return ret;
+		goto out;
 
 	chip->thresh = reg & SHUTDOWN_CTRL1_THRESHOLD_MASK;
 	chip->temp = DEFAULT_TEMP;
 
 	ret = qpnp_tm_get_temp_stage(chip);
 	if (ret < 0)
-		return ret;
+		goto out;
 	chip->stage = ret;
 
 	stage = chip->subtype == QPNP_TM_SUBTYPE_GEN1
@@ -224,21 +341,17 @@ static int qpnp_tm_init(struct qpnp_tm_chip *chip)
 			     (stage - 1) * TEMP_STAGE_STEP +
 			     TEMP_THRESH_MIN;
 
-	/*
-	 * Set threshold and disable software override of stage 2 and 3
-	 * shutdowns.
-	 */
-	chip->thresh = THRESH_MIN;
-	reg &= ~(SHUTDOWN_CTRL1_OVERRIDE_MASK | SHUTDOWN_CTRL1_THRESHOLD_MASK);
-	reg |= chip->thresh & SHUTDOWN_CTRL1_THRESHOLD_MASK;
-	ret = qpnp_tm_write(chip, QPNP_TM_REG_SHUTDOWN_CTRL1, reg);
+	crit_temp = qpnp_tm_get_critical_trip_temp(chip);
+	ret = qpnp_tm_update_critical_trip_temp(chip, crit_temp);
 	if (ret < 0)
-		return ret;
+		goto out;
 
 	/* Enable the thermal alarm PMIC module in always-on mode. */
 	reg = ALARM_CTRL_FORCE_ENABLE;
 	ret = qpnp_tm_write(chip, QPNP_TM_REG_ALARM_CTRL, reg);
 
+out:
+	mutex_unlock(&chip->lock);
 	return ret;
 }
 
@@ -257,6 +370,9 @@ static int qpnp_tm_probe(struct platform_device *pdev)
 		return -ENOMEM;
 
 	dev_set_drvdata(&pdev->dev, chip);
+	chip->dev = &pdev->dev;
+
+	mutex_init(&chip->lock);
 
 	chip->map = dev_get_regmap(pdev->dev.parent, NULL);
 	if (!chip->map)
@@ -302,6 +418,18 @@ static int qpnp_tm_probe(struct platform_device *pdev)
 
 	chip->subtype = subtype;
 
+	/*
+	 * Register the sensor before initializing the hardware to be able to
+	 * read the trip points. get_temp() returns the default temperature
+	 * before the hardware initialization is completed.
+	 */
+	chip->tz_dev = devm_thermal_zone_of_sensor_register(&pdev->dev, 0, chip,
+							&qpnp_tm_sensor_ops);
+	if (IS_ERR(chip->tz_dev)) {
+		dev_err(&pdev->dev, "failed to register sensor\n");
+		return PTR_ERR(chip->tz_dev);
+	}
+
 	ret = qpnp_tm_init(chip);
 	if (ret < 0) {
 		dev_err(&pdev->dev, "init failed\n");
@@ -313,12 +441,7 @@ static int qpnp_tm_probe(struct platform_device *pdev)
 	if (ret < 0)
 		return ret;
 
-	chip->tz_dev = devm_thermal_zone_of_sensor_register(&pdev->dev, 0, chip,
-							&qpnp_tm_sensor_ops);
-	if (IS_ERR(chip->tz_dev)) {
-		dev_err(&pdev->dev, "failed to register sensor\n");
-		return PTR_ERR(chip->tz_dev);
-	}
+	chip->initialized = true;
 
 	return 0;
 }
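For readers following the register handling in the diff, the value
written to SHUTDOWN_CTRL1 can be sketched in plain userspace C as below.
This is only an illustration: the bit positions are copied from the
defines in the patch, BIT() is redefined locally because the kernel
headers are not available here, and shutdown_ctrl1_value() is a
hypothetical helper name that does not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)				(1u << (n))

/* Bit definitions as they appear in the patch. */
#define SHUTDOWN_CTRL1_OVERRIDE_S2	BIT(6)
#define SHUTDOWN_CTRL1_THRESHOLD_MASK	0x03	/* GENMASK(1, 0) */
#define SHUTDOWN_CTRL1_RATE_25HZ	BIT(3)

/*
 * Compose the register value the same way the patch does: always
 * monitor at 25 Hz, OR in the threshold index, and set the stage 2
 * override bit only when the OS takes over the stage 2 shutdown.
 */
static uint8_t shutdown_ctrl1_value(unsigned int thresh,
				    bool disable_s2_shutdown)
{
	uint8_t reg = SHUTDOWN_CTRL1_RATE_25HZ;

	/* The threshold index must fit in bits 1:0. */
	assert(thresh <= SHUTDOWN_CTRL1_THRESHOLD_MASK);

	reg |= thresh;
	if (disable_s2_shutdown)
		reg |= SHUTDOWN_CTRL1_OVERRIDE_S2;

	return reg;
}
```

With threshold index 1 and the stage 2 override set, the sketch yields
0x49 (bit 6, bit 3 and bit 0); with index 0 and no override it yields
0x08.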
There are three thermal stages defined in the PMIC:

stage 1: warning
stage 2: system should shut down
stage 3: emergency shut down

By default the PMIC assumes that the OS isn't doing anything and thus
at stage 2 it does a partial PMIC shutdown and at stage 3 it kills
all power. When switching between thermal stages the PMIC generates an
interrupt which is handled by the driver. The partial PMIC shutdown at
stage 2 can be disabled by software, which allows the OS to initiate a
shutdown at stage 2 with a thermal zone configured accordingly.

If a critical trip point is configured in the thermal zone the driver
adjusts the stage 1-3 temperature thresholds to (closely) match the
critical temperature with a stage 2 threshold (125/130/135/140 °C).
If a suitable match is found the partial shutdown at stage 2 is
disabled. If for some reason the system doesn't shutdown at stage 2
the emergency shutdown at stage 3 kicks in.

The partial shutdown at stage 2 remains enabled in these cases:
- no critical trip point defined
- the temperature of the critical trip point is < 125°C
- the temperature of the critical trip point is > 140°C and no
  ADC channel is configured (thus the OS is not notified when the critical
  temperature is reached)

Suggested-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
---
Changes in v5:
- patch added to the series
---
 drivers/thermal/qcom-spmi-temp-alarm.c | 161 ++++++++++++++++++++++---
 1 file changed, 142 insertions(+), 19 deletions(-)
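
The threshold selection described in the commit message can be
illustrated with a small userspace sketch. It mirrors the arithmetic in
qpnp_tm_update_critical_trip_temp() from the patch, but it is not the
driver code: struct s2_config and pick_s2_config() are hypothetical
names, TEMP_INVALID is a local stand-in for THERMAL_TEMP_INVALID, and
the presence of an ADC channel is passed in as a plain bool.

```c
#include <assert.h>
#include <stdbool.h>

#define TEMP_THRESH_STEP	5000	/* threshold step: 5 C */
#define THRESH_MIN		0
#define THRESH_MAX		3
#define STAGE2_THRESHOLD_MIN	125000	/* 125 C, in millicelsius */
#define STAGE2_THRESHOLD_MAX	140000	/* 140 C, in millicelsius */
#define TEMP_INVALID		(-274000)	/* stand-in for THERMAL_TEMP_INVALID */

/* Hypothetical result type: threshold index plus the override decision. */
struct s2_config {
	unsigned int thresh;
	bool disable_s2_shutdown;
};

static struct s2_config pick_s2_config(int crit_temp, bool have_adc)
{
	/* Default: lowest thresholds, PMIC keeps its own stage 2 shutdown. */
	struct s2_config cfg = { THRESH_MIN, false };

	/* No critical trip, or one below 125 C: keep the default. */
	if (crit_temp == TEMP_INVALID || crit_temp < STAGE2_THRESHOLD_MIN)
		return cfg;

	if (crit_temp <= STAGE2_THRESHOLD_MAX) {
		/* 125..140 C: count 5 C steps down from the highest index. */
		cfg.thresh = THRESH_MAX -
			(STAGE2_THRESHOLD_MAX - crit_temp) / TEMP_THRESH_STEP;
		cfg.disable_s2_shutdown = true;
	} else {
		/* Above 140 C: only safe to override if an ADC reports temps. */
		cfg.thresh = THRESH_MAX;
		cfg.disable_s2_shutdown = have_adc;
	}

	assert(cfg.thresh <= THRESH_MAX);
	return cfg;
}
```

For example, a critical trip at 130000 (130 C) gives index 3 - 10000 /
5000 = 1 with the stage 2 shutdown disabled, while a trip at 150000
without an ADC gives index 3 and leaves the PMIC's stage 2 shutdown
enabled. Because of the integer division, a value such as 127000 also
maps to index 1, so the stage 2 threshold lands at or slightly above the
critical temperature.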