Message ID | 20230712210505.1536416-1-Frank.Li@nxp.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/1] thermal/drivers/imx_sc_thermal: return -EAGAIN when SCFW turn off resource |
On 12/07/2023 23:05, Frank Li wrote:
> Avoid endless print following message when SCFW turns off resource.
> [ 1818.342337] thermal thermal_zone0: failed to read out thermal zone (-1)
>
> Signed-off-by: Frank Li <Frank.Li@nxp.com>
> ---
>  drivers/thermal/imx_sc_thermal.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/thermal/imx_sc_thermal.c b/drivers/thermal/imx_sc_thermal.c
> index 8d6b4ef23746..0533d58f199f 100644
> --- a/drivers/thermal/imx_sc_thermal.c
> +++ b/drivers/thermal/imx_sc_thermal.c
> @@ -58,7 +58,9 @@ static int imx_sc_thermal_get_temp(struct thermal_zone_device *tz, int *temp)
>          hdr->size = 2;
>
>          ret = imx_scu_call_rpc(thermal_ipc_handle, &msg, true);
> -        if (ret)
> +        if (ret == -EPERM) /* NO POWER */
> +                return -EAGAIN;

Isn't there a chain call somewhere when the resource is turned off, so the thermal zone can be disabled?

> +        else if (ret)
>                  return ret;
>
>          *temp = msg.data.resp.celsius * 1000 + msg.data.resp.tenths * 100;

On Thu, Jul 13, 2023 at 02:49:54PM +0200, Daniel Lezcano wrote:
> On 12/07/2023 23:05, Frank Li wrote:

[ ... ]

> >          ret = imx_scu_call_rpc(thermal_ipc_handle, &msg, true);
> > -        if (ret)
> > +        if (ret == -EPERM) /* NO POWER */
> > +                return -EAGAIN;
>
> Isn't there a chain call somewhere when the resource is turned off, so the
> thermal zone can be disabled?

A possible place is drivers/firmware/imx/scu-pd.c, but I am not sure how to get the thermal devices. I just found an API, thermal_zone_get_zone_by_name(). I am not sure if it is good to depend on "name", which adds coupling between two drivers, and if there are external thermal devices (such as) with the same name, it will wrongly turn them off.

If I add a power domain notification in the thermal driver, I am not sure how to get other devices' pd in the thermal driver.

Any example I can refer to?

Or this is a simple enough solution.

Frank

Hi Frank,

sorry for the delay

On 14/07/2023 19:19, Frank Li wrote:
> On Thu, Jul 13, 2023 at 02:49:54PM +0200, Daniel Lezcano wrote:

[ ... ]

>> Isn't there a chain call somewhere when the resource is turned off, so the
>> thermal zone can be disabled?
>
> A possible place is drivers/firmware/imx/scu-pd.c, but I am not sure how to
> get the thermal devices. I just found an API, thermal_zone_get_zone_by_name().
> I am not sure if it is good to depend on "name", which adds coupling between
> two drivers, and if there are external thermal devices (such as) with the
> same name, it will wrongly turn them off.

Correct

> If I add a power domain notification in the thermal driver, I am not sure
> how to get other devices' pd in the thermal driver.
>
> Any example I can refer to?
>
> Or this is a simple enough solution.

The solution works for removing the error message but it does not solve the root cause of the issue. The thermal zone keeps monitoring while the sensor is down.

So the question is why the sensor is shut down if it is in use?

On Wed, Aug 16, 2023 at 10:44:32AM +0200, Daniel Lezcano wrote:

[ ... ]

> The solution works for removing the error message but it does not solve the
> root cause of the issue. The thermal zone keeps monitoring while the sensor
> is down.
>
> So the question is why the sensor is shut down if it is in use?

Do you know if there is any code I can reference? I suppose it is quite common.

Frank

On 16/08/2023 18:28, Frank Li wrote:
> On Wed, Aug 16, 2023 at 10:44:32AM +0200, Daniel Lezcano wrote:

[ ... ]

>> So the question is why the sensor is shut down if it is in use?
>
> Do you know if there is any code I can reference? I suppose it is quite common.

Sorry, I don't get your comment

What I meant is why is the sensor turned off if it is in use?

On Wed, Aug 16, 2023 at 06:47:17PM +0200, Daniel Lezcano wrote:
> On 16/08/2023 18:28, Frank Li wrote:

[ ... ]

> Sorry, I don't get your comment
>
> What I meant is why is the sensor turned off if it is in use?

One typical example is CPU hotplug. The sensor is located in the CPU power domain. If a CPU is hotplugged off, the CPU power domain will be turned off.

It doesn't make sense to keep monitoring such a sensor when the CPU is already powered off. It doesn't make sense to keep the CPU powered on just to get sensor data.

Another example is the GPU: if there are GPU0 and GPU1, in most cases just GPU0 works. GPU1 may be turned off when the load is low.

Ideally, thermal can get a notification from the power domain driver. When such a power domain turns off, disable the thermal zone.

So far, I have no idea how to do that.

On 16/08/2023 19:07, Frank Li wrote:
> On Wed, Aug 16, 2023 at 06:47:17PM +0200, Daniel Lezcano wrote:

[ ... ]

>> What I meant is why is the sensor turned off if it is in use?
>
> One typical example is CPU hotplug. The sensor is located in the CPU power
> domain. If a CPU is hotplugged off, the CPU power domain will be turned off.
>
> It doesn't make sense to keep monitoring such a sensor when the CPU is
> already powered off. It doesn't make sense to keep the CPU powered on just
> to get sensor data.
>
> Another example is the GPU: if there are GPU0 and GPU1, in most cases just
> GPU0 works. GPU1 may be turned off when the load is low.
>
> Ideally, thermal can get a notification from the power domain driver.
> When such a power domain turns off, disable the thermal zone.
>
> So far, I have no idea how to do that.

Ulf,

do you have a guidance to link the thermal zone and the power domain in order to get a poweron/off notification leading to enable/disable the thermal zone?

On Wed, 16 Aug 2023 at 22:46, Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>
> On 16/08/2023 19:07, Frank Li wrote:

[ ... ]

> > Ideally, thermal can get a notification from the power domain driver.
> > When such a power domain turns off, disable the thermal zone.
> >
> > So far, I have no idea how to do that.
>
> Ulf,
>
> do you have a guidance to link the thermal zone and the power domain in
> order to get a poweron/off notification leading to enable/disable the
> thermal zone?

I don't know the details here, so apologize for my ignorance to start with. What platform is this?

A vague idea could be to hook up the thermal sensor to the corresponding CPU power domain. Assuming the CPU power domain is modelled as a genpd provider, then this allows the driver for the thermal sensor to register for power-on/off notifications of the genpd (see dev_pm_genpd_add_notifier()).

Can this work?

Kind regards
Uffe

Hi Ulf,

thanks for your answer

On 16/08/2023 23:23, Ulf Hansson wrote:
> On Wed, 16 Aug 2023 at 22:46, Daniel Lezcano <daniel.lezcano@linaro.org> wrote:

[ ... ]

>> Ulf,
>>
>> do you have a guidance to link the thermal zone and the power domain in
>> order to get a poweron/off notification leading to enable/disable the
>> thermal zone ?
>
> I don't know the details here, so apologize for my ignorance to start
> with. What platform is this?

I will let Frank answer this

> A vague idea could be to hook up the thermal sensor to the
> corresponding CPU power domain. Assuming the CPU power domain is
> modelled as a genpd provider, then this allows the driver for the
> thermal sensor to register for power-on/off notifications of the genpd
> (see dev_pm_genpd_add_notifier()).
>
> Can this work?

Yes indeed it sounds like what should be achieved.

Assuming it is not modeled with genpd how would you describe those in order to have the sensor belonging to one specific power domain?

On Wed, Aug 16, 2023 at 11:23:17PM +0200, Ulf Hansson wrote:
> On Wed, 16 Aug 2023 at 22:46, Daniel Lezcano <daniel.lezcano@linaro.org> wrote:

[ ... ]

> I don't know the details here, so apologize for my ignorance to start
> with. What platform is this?

i.MX8QM.

> A vague idea could be to hook up the thermal sensor to the
> corresponding CPU power domain. Assuming the CPU power domain is
> modelled as a genpd provider, then this allows the driver for the
> thermal sensor to register for power-on/off notifications of the genpd
> (see dev_pm_genpd_add_notifier()).
>
> Can this work?

I don't think so. dev_pm_genpd_add_notifier() needs a dev, which is bound to the pd.

tsens: thermal-sensor {
        compatible = "fsl,imx-sc-thermal";
        tsens-num = <6>;
        #thermal-sensor-cells = <1>;
};

We have 6 thermal sensors, which are associated with 6 pd:
IMX_SC_R_SYSTEM, IMX_SC_R_PMIC_0,
IMX_SC_R_AP_0, IMX_SC_R_AP_1,
IMX_SC_R_GPU_0_PID0, IMX_SC_R_GPU_1_PID0,
IMX_SC_R_DRC_0

We don't want to hold a PD on just to get the temperature. The GPU pd consumes much power.

I want to register one callback in the thermal-sensor driver: when the GPU pd is on, enable the thermal zone; when the GPU pd is off, disable the thermal zone.

We can do it in a more common way.

gpu-thermal1 {
        polling-delay-passive = <250>;
        polling-delay = <2000>;
>>>     pd=<&GPU1_PD>
        thermal-sensors = <&tsens IMX_SC_R_GPU_1_PID0>;
};

If GPU1_PD is on, then gpu-thermal1 is enabled;
if GPU1_PD is off, then gpu-thermal1 is disabled.

> Kind regards
> Uffe

On Thu, 17 Aug 2023 at 17:31, Frank Li <Frank.li@nxp.com> wrote:
>
> On Wed, Aug 16, 2023 at 11:23:17PM +0200, Ulf Hansson wrote:

[ ... ]

> > I don't know the details here, so apologize for my ignorance to start
> > with. What platform is this?
>
> i.MX8QM.

Thanks!

> > A vague idea could be to hook up the thermal sensor to the
> > corresponding CPU power domain. Assuming the CPU power domain is
> > modelled as a genpd provider, then this allows the driver for the
> > thermal sensor to register for power-on/off notifications of the genpd
> > (see dev_pm_genpd_add_notifier()).
> >
> > Can this work?
>
> I don't think so. dev_pm_genpd_add_notifier() needs a dev, which is bound to the pd.

Yes, correct.

> tsens: thermal-sensor {
>         compatible = "fsl,imx-sc-thermal";
>         tsens-num = <6>;
>         #thermal-sensor-cells = <1>;
> };

Are you saying that the above doesn't have a corresponding struct device created for it? That sounds like a problem that can be fixed, right? Not sure if it makes sense though.

> We have 6 thermal sensors, which are associated with 6 pd:
> IMX_SC_R_SYSTEM, IMX_SC_R_PMIC_0,
> IMX_SC_R_AP_0, IMX_SC_R_AP_1,
> IMX_SC_R_GPU_0_PID0, IMX_SC_R_GPU_1_PID0,
> IMX_SC_R_DRC_0
>
> We don't want to hold a PD on just to get the temperature. The GPU pd
> consumes much power.

Of course, that would be a bad idea it seems like.

The corresponding struct device that's hooked up to a genpd, can remain runtime suspended as long as you think it makes sense. Thus it would not keep the PM domain powered on when it isn't needed.

> I want to register one callback in the thermal-sensor driver: when the GPU
> pd is on, enable the thermal zone; when the GPU pd is off, disable the
> thermal zone.

Right, that should work fine too, I think.

It seems like this is just a matter of modelling this correctly in DT, I have no strong opinion in this regard.

> We can do it in a more common way.
>
> gpu-thermal1 {
>         polling-delay-passive = <250>;
>         polling-delay = <2000>;
> >>>     pd=<&GPU1_PD>
>         thermal-sensors = <&tsens IMX_SC_R_GPU_1_PID0>;
> };
>
> If GPU1_PD is on, then gpu-thermal1 is enabled;
> if GPU1_PD is off, then gpu-thermal1 is disabled.

Sounds like it's worth a try! Please keep me posted.

Kind regards
Uffe

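The notification idea discussed above (enable the thermal zone when its power domain comes up, disable it when the domain goes down) can be sketched as a small userspace model. This is only an illustration, not kernel code: `pd_event`, `zone_model`, `pd_notifier`, `zone_pd_notify()` and `pd_fire()` are hypothetical names invented here. In a real driver the registration would presumably go through dev_pm_genpd_add_notifier(), as Ulf suggests, and the callback would toggle the real thermal zone instead of a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Model of power-domain events; the ordering mirrors genpd-style
 * notifications but these names are hypothetical, not the kernel's. */
enum pd_event {
	PD_NOTIFY_PRE_OFF,
	PD_NOTIFY_OFF,
	PD_NOTIFY_PRE_ON,
	PD_NOTIFY_ON,
};

/* Hypothetical stand-in for a thermal zone: only tracks whether
 * polling is currently allowed. */
struct zone_model {
	const char *name;
	bool enabled;
};

/* Hypothetical stand-in for a notifier block tied to one zone. */
struct pd_notifier {
	int (*call)(struct pd_notifier *nb, enum pd_event ev);
	struct zone_model *zone;
};

/* Callback: stop polling before the domain loses power, resume only
 * once the domain is fully powered again. */
int zone_pd_notify(struct pd_notifier *nb, enum pd_event ev)
{
	switch (ev) {
	case PD_NOTIFY_PRE_OFF:
		nb->zone->enabled = false;
		break;
	case PD_NOTIFY_ON:
		nb->zone->enabled = true;
		break;
	default:
		/* PD_NOTIFY_OFF and PD_NOTIFY_PRE_ON need no action here. */
		break;
	}
	return 0;
}

/* What a power-domain provider would do on a state change. */
void pd_fire(struct pd_notifier *nb, enum pd_event ev)
{
	if (nb && nb->call)
		nb->call(nb, ev);
}
```

Disabling on the pre-off event rather than on off is a design choice of this sketch: it keeps the zone from being polled in the window where the sensor is about to lose power. How the zone would be associated with the right power domain in DT is exactly the open question in the thread and is not addressed here.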
diff --git a/drivers/thermal/imx_sc_thermal.c b/drivers/thermal/imx_sc_thermal.c
index 8d6b4ef23746..0533d58f199f 100644
--- a/drivers/thermal/imx_sc_thermal.c
+++ b/drivers/thermal/imx_sc_thermal.c
@@ -58,7 +58,9 @@ static int imx_sc_thermal_get_temp(struct thermal_zone_device *tz, int *temp)
         hdr->size = 2;

         ret = imx_scu_call_rpc(thermal_ipc_handle, &msg, true);
-        if (ret)
+        if (ret == -EPERM) /* NO POWER */
+                return -EAGAIN;
+        else if (ret)
                 return ret;

         *temp = msg.data.resp.celsius * 1000 + msg.data.resp.tenths * 100;

Avoid endless print following message when SCFW turns off resource.
[ 1818.342337] thermal thermal_zone0: failed to read out thermal zone (-1)

Signed-off-by: Frank Li <Frank.Li@nxp.com>
---
 drivers/thermal/imx_sc_thermal.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
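
As a worked illustration of what the patch changes, the tail of the patched function can be modelled in userspace. This is a sketch under assumptions, not the driver itself: `sc_temp_resp` and `sc_get_temp_model()` are hypothetical names, and `rpc_ret` stands in for the return value of imx_scu_call_rpc(). The (-1) in the log line is -EPERM on Linux; the patch relies on the thermal core treating -EAGAIN as "no reading yet" rather than as a failure worth logging.

```c
#include <errno.h>

/* Hypothetical stand-in for the firmware response; the field names
 * follow msg.data.resp.celsius / .tenths in the patch context. */
struct sc_temp_resp {
	int celsius;	/* whole degrees Celsius */
	int tenths;	/* tenths of a degree */
};

/*
 * Userspace model of the patched error handling.
 * rpc_ret models the value returned by the firmware call.
 * On success, *temp receives the temperature in millidegrees Celsius.
 */
int sc_get_temp_model(int rpc_ret, const struct sc_temp_resp *resp, int *temp)
{
	if (rpc_ret == -EPERM)	/* resource has no power */
		return -EAGAIN;
	else if (rpc_ret)	/* any other error is passed through */
		return rpc_ret;

	*temp = resp->celsius * 1000 + resp->tenths * 100;
	return 0;
}
```

For example, a response of 45 degrees and 3 tenths yields 45 * 1000 + 3 * 100 = 45300 millidegrees, while a call failing with -EPERM returns -EAGAIN and leaves `*temp` untouched. As noted in the review, this only hides the message; the zone is still polled while the sensor is powered down.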