[1/2] thermal: tegra-bpmp: Handle offline zones

Message ID 20230207135610.3100865-1-cyndis@kapsi.fi (mailing list archive)
State New, archived
Delegated to: Daniel Lezcano
Series [1/2] thermal: tegra-bpmp: Handle offline zones

Commit Message

Mikko Perttunen Feb. 7, 2023, 1:56 p.m. UTC
From: Mikko Perttunen <mperttunen@nvidia.com>

Thermal zones located in power domains may not be accessible when
the domain is powergated. In this situation, reading the temperature
will return -BPMP_EFAULT and the temperature is considered to be
-256C for calculating trips.

For smooth operation, for offline zones, return -EAGAIN when reading
the temperature and allow registration of zones even if they are
offline during probe.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
 drivers/thermal/tegra/tegra-bpmp-thermal.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Thierry Reding Feb. 8, 2023, 10:31 a.m. UTC | #1
On Tue, Feb 07, 2023 at 03:56:08PM +0200, Mikko Perttunen wrote:
> From: Mikko Perttunen <mperttunen@nvidia.com>
> 
> Thermal zones located in power domains may not be accessible when
> the domain is powergated. In this situation, reading the temperature
> will return -BPMP_EFAULT and the temperature is considered to be
> -256C for calculating trips.

Where's that -256C being set? I only see THERMAL_TEMP_INVALID being set
as the default for a zone, but that's -274C, not -256C. If that's
the temperature that you're referring to, it might be better to state
that we rely on the default temperature rather than any specific number.

Thierry

> 
> For smooth operation, for offline zones, return -EAGAIN when reading
> the temperature and allow registration of zones even if they are
> offline during probe.
> 
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
>  drivers/thermal/tegra/tegra-bpmp-thermal.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/tegra/tegra-bpmp-thermal.c b/drivers/thermal/tegra/tegra-bpmp-thermal.c
> index c76e1ea62c8a..628b18818ae9 100644
> --- a/drivers/thermal/tegra/tegra-bpmp-thermal.c
> +++ b/drivers/thermal/tegra/tegra-bpmp-thermal.c
> @@ -52,6 +52,8 @@ static int __tegra_bpmp_thermal_get_temp(struct tegra_bpmp_thermal_zone *zone,
>  	err = tegra_bpmp_transfer(zone->tegra->bpmp, &msg);
>  	if (err)
>  		return err;
> +	if (msg.rx.ret == -BPMP_EFAULT)
> +		return -EAGAIN;
>  	if (msg.rx.ret)
>  		return -EINVAL;
>  
> @@ -257,7 +259,12 @@ static int tegra_bpmp_thermal_probe(struct platform_device *pdev)
>  		zone->tegra = tegra;
>  
>  		err = __tegra_bpmp_thermal_get_temp(zone, &temp);
> -		if (err < 0) {
> +
> +		/*
> +		 * Sensors in powergated domains may temporarily fail to be read
> +		 * (-EAGAIN), but will become accessible when the domain is powered on.
> +		 */
> +		if (err < 0 && err != -EAGAIN) {
>  			devm_kfree(&pdev->dev, zone);
>  			continue;
>  		}
> -- 
> 2.39.0
>
Mikko Perttunen Feb. 8, 2023, 3:35 p.m. UTC | #2
On 2/8/23 12:31, Thierry Reding wrote:
> On Tue, Feb 07, 2023 at 03:56:08PM +0200, Mikko Perttunen wrote:
>> From: Mikko Perttunen <mperttunen@nvidia.com>
>>
>> Thermal zones located in power domains may not be accessible when
>> the domain is powergated. In this situation, reading the temperature
>> will return -BPMP_EFAULT and the temperature is considered to be
>> -256C for calculating trips.
> 
> Where's that -256C being set? I only see THERMAL_TEMP_INVALID being set
> as the default for a zone, but that's -274C, not -256C. If that's
> the temperature that you're referring to, it might be better to state
> that we rely on the default temperature rather than any specific number.
> 
> Thierry

It is based on BPMP's internal behavior.

Mikko

> 
>>
>> For smooth operation, for offline zones, return -EAGAIN when reading
>> the temperature and allow registration of zones even if they are
>> offline during probe.
>>
>> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
>> ---
>>   drivers/thermal/tegra/tegra-bpmp-thermal.c | 9 ++++++++-
>>   1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/thermal/tegra/tegra-bpmp-thermal.c b/drivers/thermal/tegra/tegra-bpmp-thermal.c
>> index c76e1ea62c8a..628b18818ae9 100644
>> --- a/drivers/thermal/tegra/tegra-bpmp-thermal.c
>> +++ b/drivers/thermal/tegra/tegra-bpmp-thermal.c
>> @@ -52,6 +52,8 @@ static int __tegra_bpmp_thermal_get_temp(struct tegra_bpmp_thermal_zone *zone,
>>   	err = tegra_bpmp_transfer(zone->tegra->bpmp, &msg);
>>   	if (err)
>>   		return err;
>> +	if (msg.rx.ret == -BPMP_EFAULT)
>> +		return -EAGAIN;
>>   	if (msg.rx.ret)
>>   		return -EINVAL;
>>   
>> @@ -257,7 +259,12 @@ static int tegra_bpmp_thermal_probe(struct platform_device *pdev)
>>   		zone->tegra = tegra;
>>   
>>   		err = __tegra_bpmp_thermal_get_temp(zone, &temp);
>> -		if (err < 0) {
>> +
>> +		/*
>> +		 * Sensors in powergated domains may temporarily fail to be read
>> +		 * (-EAGAIN), but will become accessible when the domain is powered on.
>> +		 */
>> +		if (err < 0 && err != -EAGAIN) {
>>   			devm_kfree(&pdev->dev, zone);
>>   			continue;
>>   		}
>> -- 
>> 2.39.0
>>
Thierry Reding Feb. 8, 2023, 4:05 p.m. UTC | #3
On Wed, Feb 08, 2023 at 05:35:48PM +0200, Mikko Perttunen wrote:
> On 2/8/23 12:31, Thierry Reding wrote:
> > On Tue, Feb 07, 2023 at 03:56:08PM +0200, Mikko Perttunen wrote:
> > > From: Mikko Perttunen <mperttunen@nvidia.com>
> > > 
> > > Thermal zones located in power domains may not be accessible when
> > > the domain is powergated. In this situation, reading the temperature
> > > will return -BPMP_EFAULT and the temperature is considered to be
> > > -256C for calculating trips.
> > 
> > Where's that -256C being set? I only see THERMAL_TEMP_INVALID being set
> > as the default for a zone, but that's -274C, not -256C. If that's
> > the temperature that you're referring to, it might be better to state
> > that we rely on the default temperature rather than any specific number.
> > 
> > Thierry
> 
> It is based on BPMP's internal behavior.

Okay, maybe clarify that part of the sentence then. Could be something
like:

	... will return -BPMP_EFAULT. When evaluating trips, BPMP will
	internally use -256C as the temperature for offline zones.

Thierry
Patch

diff --git a/drivers/thermal/tegra/tegra-bpmp-thermal.c b/drivers/thermal/tegra/tegra-bpmp-thermal.c
index c76e1ea62c8a..628b18818ae9 100644
--- a/drivers/thermal/tegra/tegra-bpmp-thermal.c
+++ b/drivers/thermal/tegra/tegra-bpmp-thermal.c
@@ -52,6 +52,8 @@  static int __tegra_bpmp_thermal_get_temp(struct tegra_bpmp_thermal_zone *zone,
 	err = tegra_bpmp_transfer(zone->tegra->bpmp, &msg);
 	if (err)
 		return err;
+	if (msg.rx.ret == -BPMP_EFAULT)
+		return -EAGAIN;
 	if (msg.rx.ret)
 		return -EINVAL;
 
@@ -257,7 +259,12 @@  static int tegra_bpmp_thermal_probe(struct platform_device *pdev)
 		zone->tegra = tegra;
 
 		err = __tegra_bpmp_thermal_get_temp(zone, &temp);
-		if (err < 0) {
+
+		/*
+		 * Sensors in powergated domains may temporarily fail to be read
+		 * (-EAGAIN), but will become accessible when the domain is powered on.
+		 */
+		if (err < 0 && err != -EAGAIN) {
 			devm_kfree(&pdev->dev, zone);
 			continue;
 		}
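
The combined effect of the two hunks can be illustrated outside the kernel with a minimal sketch. It is not part of the patch: the `sketch_*` names are hypothetical, and `SKETCH_BPMP_EFAULT` is an assumed stand-in for the firmware error code, whose real definition lives in the BPMP ABI header. The first function follows the error mapping added to `__tegra_bpmp_thermal_get_temp()`, and the second follows the relaxed check in `tegra_bpmp_thermal_probe()`.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Assumed stand-in for the firmware error code. The real BPMP_EFAULT
 * comes from the BPMP ABI header; this value is for illustration only.
 */
#define SKETCH_BPMP_EFAULT 14

/*
 * Follows the first hunk: translate the transport result and the
 * firmware return code into the errno the driver hands back.
 */
static int sketch_map_get_temp_result(int transfer_err, int fw_ret)
{
	if (transfer_err)
		return transfer_err;	/* transport failure is passed through */
	if (fw_ret == -SKETCH_BPMP_EFAULT)
		return -EAGAIN;		/* zone is offline, reading may work later */
	if (fw_ret)
		return -EINVAL;		/* any other firmware error */
	return 0;
}

/*
 * Follows the second hunk: probe keeps a zone when the initial read
 * succeeded or failed only because the zone is currently offline.
 */
static bool sketch_probe_keeps_zone(int err)
{
	return err >= 0 || err == -EAGAIN;
}
```

Under these assumptions, a zone whose first read is answered with the offline error code is still kept at probe time, while any other failure still causes it to be dropped, as before the patch.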