diff mbox series

thermal: gov_power_allocator: Allow binding without cooling devices

Message ID 20240321-gpa-no-cooling-devs-v1-1-5c9e0ef2062e@trvn.ru (mailing list archive)
State Changes Requested, archived
Headers show
Series thermal: gov_power_allocator: Allow binding without cooling devices | expand

Commit Message

Nikita Travkin March 21, 2024, 2:44 p.m. UTC
Commit e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip points earlier")
added a check that would fail binding the governer if there is no
cooling devices bound to the thermal zone. Unfortunately this causes
issues in cases when the TZ is bound to the governer before the cooling
devices are attached to it. (I.e. when the tz is registered using
thermal_zone_device_register_with_trips().)

Additionally, the documentation across gov_power_allocator suggests it's
intended to allow it to be bound to thermal zones without cooling
devices (and thus without passive/active trip points), however the same
change added a check for the trip point to be present, causing those TZ
to fail probing.

Those changes cause all thermal zones to fail on some devices (such as
sc7180-acer-aspire1) and prevent the kernel from controlling the cpu/gpu
frequency based on the temperature, as well as losing all the other
"informational" thermal zones if power_allocator is set as default.

This commit partially reverts the referenced one by dropping the trip
point check and by allowing the TZ to probe even if no actor buffer was
allocated to allow those TZ to probe again.

Fixes: e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip points earlier")
Signed-off-by: Nikita Travkin <nikita@trvn.ru>
---
I've noticed that all thermal zones fail probing with -EINVAL on my
sc7180 based Acer Aspire 1 since 6.8. This commit allows me to bring
them back.
---
 drivers/thermal/gov_power_allocator.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)


---
base-commit: e7528c088874326d3060a46f572252be43755a86
change-id: 20240321-gpa-no-cooling-devs-c79ee3288325

Best regards,

Comments

Rafael J. Wysocki March 27, 2024, 3:25 p.m. UTC | #1
On Thu, Mar 21, 2024 at 3:44 PM Nikita Travkin <nikita@trvn.ru> wrote:
>
> Commit e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip points earlier")
> added a check that would fail binding the governer if there is no
> cooling devices bound to the thermal zone. Unfortunately this causes
> issues in cases when the TZ is bound to the governer before the cooling
> devices are attached to it. (I.e. when the tz is registered using
> thermal_zone_device_register_with_trips().)
>
> Additionally, the documentation across gov_power_allocator suggests it's
> intended to allow it to be bound to thermal zones without cooling
> devices (and thus without passive/active trip points), however the same
> change added a check for the trip point to be present, causing those TZ
> to fail probing.
>
> Those changes cause all thermal zones to fail on some devices (such as
> sc7180-acer-aspire1) and prevent the kernel from controlling the cpu/gpu
> frequency based on the temperature, as well as losing all the other
> "informational" thermal zones if power_allocator is set as default.
>
> This commit partially reverts the referenced one by dropping the trip
> point check and by allowing the TZ to probe even if no actor buffer was
> allocated to allow those TZ to probe again.
>
> Fixes: e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip points earlier")
> Signed-off-by: Nikita Travkin <nikita@trvn.ru>
> ---
> I've noticed that all thermal zones fail probing with -EINVAL on my
> sc7180 based Acer Aspire 1 since 6.8. This commit allows me to bring
> them back.

Łukasz, any comments?

> ---
>  drivers/thermal/gov_power_allocator.c | 14 +++++---------
>  1 file changed, 5 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
> index 1b17dc4c219c..4f2d7f3b7508 100644
> --- a/drivers/thermal/gov_power_allocator.c
> +++ b/drivers/thermal/gov_power_allocator.c
> @@ -679,11 +679,6 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
>                 return -ENOMEM;
>
>         get_governor_trips(tz, params);
> -       if (!params->trip_max) {
> -               dev_warn(&tz->device, "power_allocator: missing trip_max\n");
> -               kfree(params);
> -               return -EINVAL;
> -       }
>
>         ret = check_power_actors(tz, params);
>         if (ret < 0) {
> @@ -693,7 +688,7 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
>         }
>
>         ret = allocate_actors_buffer(params, ret);
> -       if (ret) {
> +       if (ret && ret != -EINVAL) {
>                 dev_warn(&tz->device, "power_allocator: allocation failed\n");
>                 kfree(params);
>                 return ret;
> @@ -714,9 +709,10 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
>         else
>                 params->sustainable_power = tz->tzp->sustainable_power;
>
> -       estimate_pid_constants(tz, tz->tzp->sustainable_power,
> -                              params->trip_switch_on,
> -                              params->trip_max->temperature);
> +       if (params->trip_max)
> +               estimate_pid_constants(tz, tz->tzp->sustainable_power,
> +                                      params->trip_switch_on,
> +                                      params->trip_max->temperature);
>
>         reset_pid_controller(params);
>
>
> ---
> base-commit: e7528c088874326d3060a46f572252be43755a86
> change-id: 20240321-gpa-no-cooling-devs-c79ee3288325
>
> Best regards,
> --
> Nikita Travkin <nikita@trvn.ru>
>
Lukasz Luba March 28, 2024, 9:12 a.m. UTC | #2
Hi Rafael,

On 3/27/24 15:25, Rafael J. Wysocki wrote:
> On Thu, Mar 21, 2024 at 3:44 PM Nikita Travkin <nikita@trvn.ru> wrote:
>>
>> Commit e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip points earlier")
>> added a check that would fail binding the governer if there is no
>> cooling devices bound to the thermal zone. Unfortunately this causes
>> issues in cases when the TZ is bound to the governer before the cooling
>> devices are attached to it. (I.e. when the tz is registered using
>> thermal_zone_device_register_with_trips().)
>>
>> Additionally, the documentation across gov_power_allocator suggests it's
>> intended to allow it to be bound to thermal zones without cooling
>> devices (and thus without passive/active trip points), however the same
>> change added a check for the trip point to be present, causing those TZ
>> to fail probing.
>>
>> Those changes cause all thermal zones to fail on some devices (such as
>> sc7180-acer-aspire1) and prevent the kernel from controlling the cpu/gpu
>> frequency based on the temperature, as well as losing all the other
>> "informational" thermal zones if power_allocator is set as default.
>>
>> This commit partially reverts the referenced one by dropping the trip
>> point check and by allowing the TZ to probe even if no actor buffer was
>> allocated to allow those TZ to probe again.
>>
>> Fixes: e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip points earlier")
>> Signed-off-by: Nikita Travkin <nikita@trvn.ru>
>> ---
>> I've noticed that all thermal zones fail probing with -EINVAL on my
>> sc7180 based Acer Aspire 1 since 6.8. This commit allows me to bring
>> them back.
> 
> Łukasz, any comments?

Let me check this today.

> 
>> ---
>>   drivers/thermal/gov_power_allocator.c | 14 +++++---------
>>   1 file changed, 5 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
>> index 1b17dc4c219c..4f2d7f3b7508 100644
>> --- a/drivers/thermal/gov_power_allocator.c
>> +++ b/drivers/thermal/gov_power_allocator.c
>> @@ -679,11 +679,6 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
>>                  return -ENOMEM;
>>
>>          get_governor_trips(tz, params);
>> -       if (!params->trip_max) {
>> -               dev_warn(&tz->device, "power_allocator: missing trip_max\n");
>> -               kfree(params);
>> -               return -EINVAL;
>> -       }
>>
>>          ret = check_power_actors(tz, params);
>>          if (ret < 0) {
>> @@ -693,7 +688,7 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
>>          }
>>
>>          ret = allocate_actors_buffer(params, ret);
>> -       if (ret) {
>> +       if (ret && ret != -EINVAL) {
>>                  dev_warn(&tz->device, "power_allocator: allocation failed\n");
>>                  kfree(params);
>>                  return ret;
>> @@ -714,9 +709,10 @@ static int power_allocator_bind(struct thermal_zone_device *tz)
>>          else
>>                  params->sustainable_power = tz->tzp->sustainable_power;
>>
>> -       estimate_pid_constants(tz, tz->tzp->sustainable_power,
>> -                              params->trip_switch_on,
>> -                              params->trip_max->temperature);
>> +       if (params->trip_max)
>> +               estimate_pid_constants(tz, tz->tzp->sustainable_power,
>> +                                      params->trip_switch_on,
>> +                                      params->trip_max->temperature);
>>
>>          reset_pid_controller(params);
>>
>>
>> ---
>> base-commit: e7528c088874326d3060a46f572252be43755a86
>> change-id: 20240321-gpa-no-cooling-devs-c79ee3288325
>>
>> Best regards,
>> --
>> Nikita Travkin <nikita@trvn.ru>
>>
Lukasz Luba March 28, 2024, 9:50 a.m. UTC | #3
On 3/28/24 09:12, Lukasz Luba wrote:
> Hi Rafael,
> 
> On 3/27/24 15:25, Rafael J. Wysocki wrote:
>> On Thu, Mar 21, 2024 at 3:44 PM Nikita Travkin <nikita@trvn.ru> wrote:
>>>
>>> Commit e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip 
>>> points earlier")
>>> added a check that would fail binding the governer if there is no
>>> cooling devices bound to the thermal zone. Unfortunately this causes
>>> issues in cases when the TZ is bound to the governer before the cooling
>>> devices are attached to it. (I.e. when the tz is registered using
>>> thermal_zone_device_register_with_trips().)
>>>
>>> Additionally, the documentation across gov_power_allocator suggests it's
>>> intended to allow it to be bound to thermal zones without cooling
>>> devices (and thus without passive/active trip points), however the same
>>> change added a check for the trip point to be present, causing those TZ
>>> to fail probing.

This patch description is mixing trips and cooling devices and refers to
a commit which is only for guarding the trip points number to be
not less than 2. In IPA we require 2 trip points.

>>>
>>> Those changes cause all thermal zones to fail on some devices (such as
>>> sc7180-acer-aspire1) and prevent the kernel from controlling the cpu/gpu
>>> frequency based on the temperature, as well as losing all the other
>>> "informational" thermal zones if power_allocator is set as default.
>>>
>>> This commit partially reverts the referenced one by dropping the trip
>>> point check and by allowing the TZ to probe even if no actor buffer was
>>> allocated to allow those TZ to probe again.
>>>
>>> Fixes: e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip 
>>> points earlier")

Not that commit.

>>> Signed-off-by: Nikita Travkin <nikita@trvn.ru>
>>> ---
>>> I've noticed that all thermal zones fail probing with -EINVAL on my
>>> sc7180 based Acer Aspire 1 since 6.8. This commit allows me to bring
>>> them back.
>>
>> Łukasz, any comments?
> 
> Let me check this today.
> 
>>
>>> ---
>>>   drivers/thermal/gov_power_allocator.c | 14 +++++---------
>>>   1 file changed, 5 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/thermal/gov_power_allocator.c 
>>> b/drivers/thermal/gov_power_allocator.c
>>> index 1b17dc4c219c..4f2d7f3b7508 100644
>>> --- a/drivers/thermal/gov_power_allocator.c
>>> +++ b/drivers/thermal/gov_power_allocator.c
>>> @@ -679,11 +679,6 @@ static int power_allocator_bind(struct 
>>> thermal_zone_device *tz)
>>>                  return -ENOMEM;
>>>
>>>          get_governor_trips(tz, params);
>>> -       if (!params->trip_max) {
>>> -               dev_warn(&tz->device, "power_allocator: missing 
>>> trip_max\n");

This if() guards the binding of TZ with less than 2 trip points,
not the cooling devices.

>>> -               kfree(params);
>>> -               return -EINVAL;
>>> -       }
>>>
>>>          ret = check_power_actors(tz, params);
>>>          if (ret < 0) {
>>> @@ -693,7 +688,7 @@ static int power_allocator_bind(struct 
>>> thermal_zone_device *tz)
>>>          }
>>>
>>>          ret = allocate_actors_buffer(params, ret);
>>> -       if (ret) {

This if() is from different commit.

>>> +       if (ret && ret != -EINVAL) {

This is about 0 cooling devices in the thermal zone, but IPA won't work
so why to even fake and forward binding?

Rafael should we support binding with 0 cooling devices?

>>>                  dev_warn(&tz->device, "power_allocator: allocation 
>>> failed\n");
>>>                  kfree(params);
>>>                  return ret;
>>> @@ -714,9 +709,10 @@ static int power_allocator_bind(struct 
>>> thermal_zone_device *tz)
>>>          else
>>>                  params->sustainable_power = tz->tzp->sustainable_power;
>>>
>>> -       estimate_pid_constants(tz, tz->tzp->sustainable_power,
>>> -                              params->trip_switch_on,
>>> -                              params->trip_max->temperature);
>>> +       if (params->trip_max)

This is not supported, we need those 2 trip points.

>>> +               estimate_pid_constants(tz, tz->tzp->sustainable_power,
>>> +                                      params->trip_switch_on,
>>> +                                      params->trip_max->temperature);
>>>
>>>          reset_pid_controller(params);
>>>
>>>
>>> ---
>>> base-commit: e7528c088874326d3060a46f572252be43755a86
>>> change-id: 20240321-gpa-no-cooling-devs-c79ee3288325
>>>
>>> Best regards,
>>> -- 
>>> Nikita Travkin <nikita@trvn.ru>
>>>
>
Nikita Travkin March 29, 2024, 1:54 p.m. UTC | #4
Hi Lukasz, Rafael!

First of all, it seems like outlook mail server that serves @arm.com refuses
to communicate with mine so I can't send anything to it and it refuses to
send anything to me either...

I'm sorry for any inconvenience this may cause and I will deal with it 
somehow...

28.03.2024 14:50, Lukasz Luba пишет:
>
>
> On 3/28/24 09:12, Lukasz Luba wrote:
>> Hi Rafael,
>>
>> On 3/27/24 15:25, Rafael J. Wysocki wrote:
>>> On Thu, Mar 21, 2024 at 3:44 PM Nikita Travkin <nikita@trvn.ru> wrote:
>>>>
>>>> Commit e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip 
>>>> points earlier")
>>>> added a check that would fail binding the governer if there is no
>>>> cooling devices bound to the thermal zone. Unfortunately this causes
>>>> issues in cases when the TZ is bound to the governer before the 
>>>> cooling
>>>> devices are attached to it. (I.e. when the tz is registered using
>>>> thermal_zone_device_register_with_trips().)
>>>>
>>>> Additionally, the documentation across gov_power_allocator suggests 
>>>> it's
>>>> intended to allow it to be bound to thermal zones without cooling
>>>> devices (and thus without passive/active trip points), however the 
>>>> same
>>>> change added a check for the trip point to be present, causing 
>>>> those TZ
>>>> to fail probing. 
>
> This patch description is mixing trips and cooling devices and refers to
> a commit which is only for guarding the trip points number to be
> not less than 2. In IPA we require 2 trip points.

I'm sorry, I probably worded this poorly. I meant that there may be
a TZ that has no trip points because it's not meant to be used with
any cooling device.

>>>>
>>>> Those changes cause all thermal zones to fail on some devices (such as
>>>> sc7180-acer-aspire1) and prevent the kernel from controlling the 
>>>> cpu/gpu
>>>> frequency based on the temperature, as well as losing all the other
>>>> "informational" thermal zones if power_allocator is set as default.
>>>>
>>>> This commit partially reverts the referenced one by dropping the trip
>>>> point check and by allowing the TZ to probe even if no actor buffer 
>>>> was
>>>> allocated to allow those TZ to probe again.
>>>>
>>>> Fixes: e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip 
>>>> points earlier") 
>
> Not that commit.
>>>> Signed-off-by: Nikita Travkin <nikita@trvn.ru>
>>>> ---
>>>> I've noticed that all thermal zones fail probing with -EINVAL on my
>>>> sc7180 based Acer Aspire 1 since 6.8. This commit allows me to bring
>>>> them back. 
>>>
>>> Łukasz, any comments? 
>>
>> Let me check this today.
>>>> ---
>>>>   drivers/thermal/gov_power_allocator.c | 14 +++++---------
>>>>   1 file changed, 5 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/thermal/gov_power_allocator.c 
>>>> b/drivers/thermal/gov_power_allocator.c
>>>> index 1b17dc4c219c..4f2d7f3b7508 100644
>>>> --- a/drivers/thermal/gov_power_allocator.c
>>>> +++ b/drivers/thermal/gov_power_allocator.c
>>>> @@ -679,11 +679,6 @@ static int power_allocator_bind(struct 
>>>> thermal_zone_device *tz)
>>>>                  return -ENOMEM;
>>>>
>>>>          get_governor_trips(tz, params);
>>>> -       if (!params->trip_max) {
>>>> -               dev_warn(&tz->device, "power_allocator: missing 
>>>> trip_max\n"); 
>
> This if() guards the binding of TZ with less than 2 trip points,
> not the cooling devices.
>>>> - kfree(params);
>>>> -               return -EINVAL;
>>>> -       }
>>>>
>>>>          ret = check_power_actors(tz, params);
>>>>          if (ret < 0) {
>>>> @@ -693,7 +688,7 @@ static int power_allocator_bind(struct 
>>>> thermal_zone_device *tz)
>>>>          }
>>>>
>>>>          ret = allocate_actors_buffer(params, ret);
>>>> -       if (ret) { 
>
> This if() is from different commit.

Hm you're right, I misread the commit I think and on a second thought
this patch combines two different issues. I will split it up into two.

>>>> +       if (ret && ret != -EINVAL) { 
>
> This is about 0 cooling devices in the thermal zone, but IPA won't work
> so why to even fake and forward binding?
>
> Rafael should we support binding with 0 cooling devices? 

As I understand it, cooling devices can attach to existing thermal zones
so it's possible that TZ is registered before any cooling devices has
attached to it. This is exactly what happens in my case: first the
IPA is probed for the TZ and after that the cooling device is attached.

check_power_actors() kerneldoc in IPA says:

> * If all of the cooling  devices currently attached to @tz implement the power
> * actor API, return the  number of them (which may be 0, because some cooling
> * devices may be  attached later). Otherwise, return -EINVAL.

and afaiu this is exactly how it's used by 
thermal_zone_device_register_with_trips():

     result = thermal_set_governor(tz, governor);
     if (result) {
mutex_unlock(&thermal_governor_lock);
         goto unregister;
     }
     (...)
     /* Bind cooling devices for this zone */
     bind_tz(tz);

>>>> dev_warn(&tz->device, "power_allocator: allocation failed\n");
>>>>                  kfree(params);
>>>>                  return ret;
>>>> @@ -714,9 +709,10 @@ static int power_allocator_bind(struct 
>>>> thermal_zone_device *tz)
>>>>          else
>>>>                  params->sustainable_power = 
>>>> tz->tzp->sustainable_power;
>>>>
>>>> -       estimate_pid_constants(tz, tz->tzp->sustainable_power,
>>>> -                              params->trip_switch_on,
>>>> - params->trip_max->temperature);
>>>> +       if (params->trip_max) 
>
> This is not supported, we need those 2 trip points.

Yes, I understand that IPA is useless without trip points and/or cooling
devices. However from the snippet above the TZ won't be registered if
the governor fails to probe. This means that if one has IPA set as the
default, thermal core will try to probe it for TZ without trip points and,
when it fails, will fail to probe the whole TZ.

get_governor_trips() kerneldoc in IPA says:

 > * on.  If there are no passive or active trip points, then the
 > * governor won't do anything.  In fact, its throttle function
 > * won't be called at all.

Which suggests to me that being bound to such TZ is expected.

Unless you have any thoughts on how to solve this better, I will split
this into two commits and resubmit in a few days:

  - Allow binding without cooling devices (allow -EINVAL after allocate)
  - Allow binding without trip points (Rest of this patch)

Nikita


>>>> + estimate_pid_constants(tz, tz->tzp->sustainable_power,
>>>> + params->trip_switch_on,
>>>> + params->trip_max->temperature);
>>>>
>>>>          reset_pid_controller(params);
>>>>
>>>>
>>>> ---
>>>> base-commit: e7528c088874326d3060a46f572252be43755a86
>>>> change-id: 20240321-gpa-no-cooling-devs-c79ee3288325
>>>>
>>>> Best regards,
>>>> -- 
>>>> Nikita Travkin <nikita@trvn.ru>
Lukasz Luba April 2, 2024, 9:51 a.m. UTC | #5
On 3/29/24 13:54, Nikita Travkin wrote:
> Hi Lukasz, Rafael!
> 
> First of all, it seems like outlook mail server that serves @arm.com 
> refuses
> to communicate with mine so I can't send anything to it and it refuses to
> send anything to me either...
> 
> I'm sorry for any inconvenience this may cause and I will deal with it 
> somehow...
> 
> 28.03.2024 14:50, Lukasz Luba пишет:
>>
>>
>> On 3/28/24 09:12, Lukasz Luba wrote:
>>> Hi Rafael,
>>>
>>> On 3/27/24 15:25, Rafael J. Wysocki wrote:
>>>> On Thu, Mar 21, 2024 at 3:44 PM Nikita Travkin <nikita@trvn.ru> wrote:
>>>>>
>>>>> Commit e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip 
>>>>> points earlier")
>>>>> added a check that would fail binding the governer if there is no
>>>>> cooling devices bound to the thermal zone. Unfortunately this causes
>>>>> issues in cases when the TZ is bound to the governer before the 
>>>>> cooling
>>>>> devices are attached to it. (I.e. when the tz is registered using
>>>>> thermal_zone_device_register_with_trips().)
>>>>>
>>>>> Additionally, the documentation across gov_power_allocator suggests 
>>>>> it's
>>>>> intended to allow it to be bound to thermal zones without cooling
>>>>> devices (and thus without passive/active trip points), however the 
>>>>> same
>>>>> change added a check for the trip point to be present, causing 
>>>>> those TZ
>>>>> to fail probing. 
>>
>> This patch description is mixing trips and cooling devices and refers to
>> a commit which is only for guarding the trip points number to be
>> not less than 2. In IPA we require 2 trip points.
> 
> I'm sorry, I probably worded this poorly. I meant that there may be
> a TZ that has no trip points because it's not meant to be used with
> any cooling device.

The 'empty' thermal zone w/o trip points and w/o cooling devices, but
w/ governor looks odd but still live.

> 
>>>>>
>>>>> Those changes cause all thermal zones to fail on some devices (such as
>>>>> sc7180-acer-aspire1) and prevent the kernel from controlling the 
>>>>> cpu/gpu
>>>>> frequency based on the temperature, as well as losing all the other
>>>>> "informational" thermal zones if power_allocator is set as default.
>>>>>
>>>>> This commit partially reverts the referenced one by dropping the trip
>>>>> point check and by allowing the TZ to probe even if no actor buffer 
>>>>> was
>>>>> allocated to allow those TZ to probe again.
>>>>>
>>>>> Fixes: e83747c2f8e3 ("thermal: gov_power_allocator: Set up trip 
>>>>> points earlier") 
>>
>> Not that commit.
>>>>> Signed-off-by: Nikita Travkin <nikita@trvn.ru>
>>>>> ---
>>>>> I've noticed that all thermal zones fail probing with -EINVAL on my
>>>>> sc7180 based Acer Aspire 1 since 6.8. This commit allows me to bring
>>>>> them back. 
>>>>
>>>> Łukasz, any comments? 
>>>
>>> Let me check this today.
>>>>> ---
>>>>>   drivers/thermal/gov_power_allocator.c | 14 +++++---------
>>>>>   1 file changed, 5 insertions(+), 9 deletions(-)
>>>>>
>>>>> diff --git a/drivers/thermal/gov_power_allocator.c 
>>>>> b/drivers/thermal/gov_power_allocator.c
>>>>> index 1b17dc4c219c..4f2d7f3b7508 100644
>>>>> --- a/drivers/thermal/gov_power_allocator.c
>>>>> +++ b/drivers/thermal/gov_power_allocator.c
>>>>> @@ -679,11 +679,6 @@ static int power_allocator_bind(struct 
>>>>> thermal_zone_device *tz)
>>>>>                  return -ENOMEM;
>>>>>
>>>>>          get_governor_trips(tz, params);
>>>>> -       if (!params->trip_max) {
>>>>> -               dev_warn(&tz->device, "power_allocator: missing 
>>>>> trip_max\n"); 
>>
>> This if() guards the binding of TZ with less than 2 trip points,
>> not the cooling devices.
>>>>> - kfree(params);
>>>>> -               return -EINVAL;
>>>>> -       }
>>>>>
>>>>>          ret = check_power_actors(tz, params);
>>>>>          if (ret < 0) {
>>>>> @@ -693,7 +688,7 @@ static int power_allocator_bind(struct 
>>>>> thermal_zone_device *tz)
>>>>>          }
>>>>>
>>>>>          ret = allocate_actors_buffer(params, ret);
>>>>> -       if (ret) { 
>>
>> This if() is from different commit.
> 
> Hm you're right, I misread the commit I think and on a second thought
> this patch combines two different issues. I will split it up into two.
> 
>>>>> +       if (ret && ret != -EINVAL) { 
>>
>> This is about 0 cooling devices in the thermal zone, but IPA won't work
>> so why to even fake and forward binding?
>>
>> Rafael should we support binding with 0 cooling devices? 
> 
> As I understand it, cooling devices can attach to existing thermal zones
> so it's possible that TZ is registered before any cooling devices has
> attached to it. This is exactly what happens in my case: first the
> IPA is probed for the TZ and after that the cooling device is attached.

That's fair enough. IPA should support it.

> 
> check_power_actors() kerneldoc in IPA says:
> 
>> * If all of the cooling  devices currently attached to @tz implement 
>> the power
>> * actor API, return the  number of them (which may be 0, because some 
>> cooling
>> * devices may be  attached later). Otherwise, return -EINVAL.
> 
> and afaiu this is exactly how it's used by 
> thermal_zone_device_register_with_trips():
> 
>      result = thermal_set_governor(tz, governor);
>      if (result) {
> mutex_unlock(&thermal_governor_lock);
>          goto unregister;
>      }
>      (...)
>      /* Bind cooling devices for this zone */
>      bind_tz(tz);
> 
>>>>> dev_warn(&tz->device, "power_allocator: allocation failed\n");
>>>>>                  kfree(params);
>>>>>                  return ret;
>>>>> @@ -714,9 +709,10 @@ static int power_allocator_bind(struct 
>>>>> thermal_zone_device *tz)
>>>>>          else
>>>>>                  params->sustainable_power = 
>>>>> tz->tzp->sustainable_power;
>>>>>
>>>>> -       estimate_pid_constants(tz, tz->tzp->sustainable_power,
>>>>> -                              params->trip_switch_on,
>>>>> - params->trip_max->temperature);
>>>>> +       if (params->trip_max) 
>>
>> This is not supported, we need those 2 trip points.
> 
> Yes, I understand that IPA is useless without trip points and/or cooling
> devices. However from the snippet above the TZ won't be registered if
> the governor fails to probe. This means that if one has IPA set as the
> default, thermal core will try to probe it for TZ without trip points and,
> when it fails, will fail to probe the whole TZ.
> 
> get_governor_trips() kerneldoc in IPA says:
> 
>  > * on.  If there are no passive or active trip points, then the
>  > * governor won't do anything.  In fact, its throttle function
>  > * won't be called at all.
> 
> Which suggests to me that being bound to such TZ is expected.
> 
> Unless you have any thoughts on how to solve this better, I will split
> this into two commits and resubmit in a few days:

Yes, please split into 2 patches for different commits.

> 
>   - Allow binding without cooling devices (allow -EINVAL after allocate)

It won't be the fix, it's a hack. If the 0 cooling devices is going to 
be the normal state, then it has to be handled not as error exception
path.

Please change the allocate_actors_buffer():
          if (!num_actors) {
                  ret = -EINVAL;   <--- ret = 0 here

since it's normal state now.

>   - Allow binding without trip points (Rest of this patch)

Yes, the rest of that patch looks good.
diff mbox series

Patch

diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
index 1b17dc4c219c..4f2d7f3b7508 100644
--- a/drivers/thermal/gov_power_allocator.c
+++ b/drivers/thermal/gov_power_allocator.c
@@ -679,11 +679,6 @@  static int power_allocator_bind(struct thermal_zone_device *tz)
 		return -ENOMEM;
 
 	get_governor_trips(tz, params);
-	if (!params->trip_max) {
-		dev_warn(&tz->device, "power_allocator: missing trip_max\n");
-		kfree(params);
-		return -EINVAL;
-	}
 
 	ret = check_power_actors(tz, params);
 	if (ret < 0) {
@@ -693,7 +688,7 @@  static int power_allocator_bind(struct thermal_zone_device *tz)
 	}
 
 	ret = allocate_actors_buffer(params, ret);
-	if (ret) {
+	if (ret && ret != -EINVAL) {
 		dev_warn(&tz->device, "power_allocator: allocation failed\n");
 		kfree(params);
 		return ret;
@@ -714,9 +709,10 @@  static int power_allocator_bind(struct thermal_zone_device *tz)
 	else
 		params->sustainable_power = tz->tzp->sustainable_power;
 
-	estimate_pid_constants(tz, tz->tzp->sustainable_power,
-			       params->trip_switch_on,
-			       params->trip_max->temperature);
+	if (params->trip_max)
+		estimate_pid_constants(tz, tz->tzp->sustainable_power,
+				       params->trip_switch_on,
+				       params->trip_max->temperature);
 
 	reset_pid_controller(params);