Message ID | 20241022-tisci-pd-boot-state-v1-1-849a6384131b@ideasonboard.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] pmdomain: ti-sci: Set PD on/off state according to the HW state | expand |
On Tue, 22 Oct 2024 at 08:41, Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> wrote: > > At the moment the driver sets the power state of all the PDs it creates > to off, regardless of the actual HW state. This has two drawbacks: > > 1) The kernel cannot disable unused PDs automatically for power saving, > as it thinks they are off already > > 2) A more specific case (but perhaps applicable to other scenarios > also): bootloader enabled splash-screen cannot be kept on the screen. > > The issue in 2) is that the driver framework automatically enables the > device's PD before calling probe() and disables it after the probe(). > This means that when the display subsystem (DSS) driver probes, but e.g. > fails due to deferred probing, the DSS PD gets turned off and the driver > cannot do anything to affect that. > > Solving the 2) requires more changes to actually keep the PD on during > the boot, but a prerequisite for it is to have the correct power state > for the PD. > > The downside with this patch is that it takes time to call the 'is_on' > op, and we need to call it for each PD. In my tests with AM62 SK, using > defconfig, I see an increase from ~3.5ms to ~7ms. However, the added > feature is valuable, so in my opinion it's worth it. > > The performance could probably be improved with a new firmware API which > returns the power states of all the PDs. > > There's also a related HW issue at play here: if the DSS IP is enabled > and active, and its PD is turned off without first disabling the DSS > display outputs, the DSS IP will hang and causes the kernel to halt if > and when the DSS driver accesses the DSS registers the next time. > > With the current upstream kernel, with this patch applied, this means > that if the bootloader enables the display, and the DSS driver is > compiled as a module, the kernel will at some point disable unused PDs, > including the DSS PD. When the DSS module is later loaded, it will hang > the kernel. > > The same issue is already there, even without this patch, as the DSS > driver may hit deferred probing, which causes the PD to be turned off, > and leading to kernel halt when the DSS driver is probed again. This > issue has been made quite rare with some arrangements in the DSS > driver's probe, but it's still there. > > So, because of the DSS hang issues, I think this patch is still an RFC. > Hopefully we can sort out all the issues, but this patch (or similar) > will be part of the solution so I'd like to get some acks/nacks/comments > for this. Also, this change might have side effects to other devices > too, if the drivers expect the PD to be on, so testing is needed. > > Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> This patch seems reasonable to me, however I am deferring to apply it until yours and other confirmations about tests. In regards to the "disable unused genpd" problem, I am working on it in-between all the other stuff. I do understand that it probably starts being annoying waiting for me, so I will try to get this prioritized. We really need to get this solved. Kind regards Uffe > --- > drivers/pmdomain/ti/ti_sci_pm_domains.c | 24 +++++++++++++++++++++++- > 1 file changed, 23 insertions(+), 1 deletion(-) > > diff --git a/drivers/pmdomain/ti/ti_sci_pm_domains.c b/drivers/pmdomain/ti/ti_sci_pm_domains.c > index 1510d5ddae3d..14c51a395d7e 100644 > --- a/drivers/pmdomain/ti/ti_sci_pm_domains.c > +++ b/drivers/pmdomain/ti/ti_sci_pm_domains.c > @@ -126,6 +126,23 @@ static bool ti_sci_pm_idx_exists(struct ti_sci_genpd_provider *pd_provider, u32 > return false; > } > > +static bool ti_sci_pm_pd_is_on(struct ti_sci_genpd_provider *pd_provider, > + int pd_idx) > +{ > + bool is_on; > + int ret; > + > + if (!pd_provider->ti_sci->ops.dev_ops.is_on) > + return false; > + > + ret = pd_provider->ti_sci->ops.dev_ops.is_on(pd_provider->ti_sci, > + pd_idx, NULL, &is_on); > + if (ret) > + return false; > + > + return is_on; > +} > + > static int ti_sci_pm_domain_probe(struct platform_device *pdev) > { > struct device *dev = &pdev->dev; > @@ -161,6 +178,8 @@ static int ti_sci_pm_domain_probe(struct platform_device *pdev) > break; > > if (args.args_count >= 1 && args.np == dev->of_node) { > + bool is_on; > + > if (args.args[0] > max_id) { > max_id = args.args[0]; > } else { > @@ -189,7 +208,10 @@ static int ti_sci_pm_domain_probe(struct platform_device *pdev) > pd->idx = args.args[0]; > pd->parent = pd_provider; > > - pm_genpd_init(&pd->pd, NULL, true); > + is_on = ti_sci_pm_pd_is_on(pd_provider, > + pd->idx); > + > + pm_genpd_init(&pd->pd, NULL, !is_on); > > list_add(&pd->node, &pd_provider->pd_list); > } > > --- > base-commit: 98f7e32f20d28ec452afb208f9cffc08448a2652 > change-id: 20241022-tisci-pd-boot-state-33cf02efd378 > > Best regards, > -- > Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> >
Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> writes: > At the moment the driver sets the power state of all the PDs it creates > to off, regardless of the actual HW state. This has two drawbacks: > > 1) The kernel cannot disable unused PDs automatically for power saving, > as it thinks they are off already > > 2) A more specific case (but perhaps applicable to other scenarios > also): bootloader enabled splash-screen cannot be kept on the screen. > > The issue in 2) is that the driver framework automatically enables the > device's PD before calling probe() and disables it after the probe(). > This means that when the display subsystem (DSS) driver probes, but e.g. > fails due to deferred probing, the DSS PD gets turned off and the driver > cannot do anything to affect that. > > Solving the 2) requires more changes to actually keep the PD on during > the boot, but a prerequisite for it is to have the correct power state > for the PD. > > The downside with this patch is that it takes time to call the 'is_on' > op, and we need to call it for each PD. In my tests with AM62 SK, using > defconfig, I see an increase from ~3.5ms to ~7ms. However, the added > feature is valuable, so in my opinion it's worth it. > > The performance could probably be improved with a new firmware API which > returns the power states of all the PDs. Agreed. I think we have to pay this performance price for correctness, and we can optimizie it later with improvements to the SCI firmware and a new API. > There's also a related HW issue at play here: if the DSS IP is enabled > and active, and its PD is turned off without first disabling the DSS > display outputs, the DSS IP will hang and causes the kernel to halt if > and when the DSS driver accesses the DSS registers the next time. Ouch. > With the current upstream kernel, with this patch applied, this means > that if the bootloader enables the display, and the DSS driver is > compiled as a module, the kernel will at some point disable unused PDs, > including the DSS PD. When the DSS module is later loaded, it will hang > the kernel. > > The same issue is already there, even without this patch, as the DSS > driver may hit deferred probing, which causes the PD to be turned off, > and leading to kernel halt when the DSS driver is probed again. This > issue has been made quite rare with some arrangements in the DSS > driver's probe, but it's still there. > > So, because of the DSS hang issues, I think this patch is still an RFC. Like you said, I think that DSS hang is an issue independently of this patch, so it shouldn't hold this up IMO. > Hopefully we can sort out all the issues, but this patch (or similar) > will be part of the solution so I'd like to get some acks/nacks/comments > for this. Also, this change might have side effects to other devices > too, if the drivers expect the PD to be on, so testing is needed. > > Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> We already discussed this a bit off-list, but for the record, I agree with the approach. I also tested it on k3-am62a7-sk where I've been doing the other TI SCI pmdomain work and everything still working fine. Reviewed-by: Kevin Hilman <khilman@baylibre.com> Tested-by: Kevin Hilman <khilman@baylibre.com> Kevin
Hi, On 30/10/2024 22:04, Kevin Hilman wrote: > Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> writes: > >> At the moment the driver sets the power state of all the PDs it creates >> to off, regardless of the actual HW state. This has two drawbacks: >> >> 1) The kernel cannot disable unused PDs automatically for power saving, >> as it thinks they are off already >> >> 2) A more specific case (but perhaps applicable to other scenarios >> also): bootloader enabled splash-screen cannot be kept on the screen. >> >> The issue in 2) is that the driver framework automatically enables the >> device's PD before calling probe() and disables it after the probe(). >> This means that when the display subsystem (DSS) driver probes, but e.g. >> fails due to deferred probing, the DSS PD gets turned off and the driver >> cannot do anything to affect that. >> >> Solving the 2) requires more changes to actually keep the PD on during >> the boot, but a prerequisite for it is to have the correct power state >> for the PD. >> >> The downside with this patch is that it takes time to call the 'is_on' >> op, and we need to call it for each PD. In my tests with AM62 SK, using >> defconfig, I see an increase from ~3.5ms to ~7ms. However, the added >> feature is valuable, so in my opinion it's worth it. >> >> The performance could probably be improved with a new firmware API which >> returns the power states of all the PDs. > > Agreed. I think we have to pay this performance price for correctness, > and we can optimizie it later with improvements to the SCI firmware and > a new API. > >> There's also a related HW issue at play here: if the DSS IP is enabled >> and active, and its PD is turned off without first disabling the DSS >> display outputs, the DSS IP will hang and causes the kernel to halt if >> and when the DSS driver accesses the DSS registers the next time. > > Ouch. > >> With the current upstream kernel, with this patch applied, this means >> that if the bootloader enables the display, and the DSS driver is >> compiled as a module, the kernel will at some point disable unused PDs, >> including the DSS PD. When the DSS module is later loaded, it will hang >> the kernel. >> >> The same issue is already there, even without this patch, as the DSS >> driver may hit deferred probing, which causes the PD to be turned off, >> and leading to kernel halt when the DSS driver is probed again. This >> issue has been made quite rare with some arrangements in the DSS >> driver's probe, but it's still there. >> >> So, because of the DSS hang issues, I think this patch is still an RFC. > > Like you said, I think that DSS hang is an issue independently of this > patch, so it shouldn't hold this up IMO. In current upstream, if the bootloader has enabled the display, we most likely won't hit the DSS hang issue as the PD will stay on until the DSS driver has had a chance to probe, and the driver takes actions to avoid the hang issue. With this patch applied, the PD may be turned off before the DSS driver has had a chance to probe, causing the board to hang when the DSS driver probes the first time. That's why I'm a bit hesitant to apply this. It could mean that for some people their board stops booting. I'm not even sure what would be the perfect fix for this hang problem... We could have some built-in early boot code which checks if the DSS is enabled, and disables it, so that the hang issue won't happen. But that's not good if we try to keep the boot splash on the screen until the userspace takes over. Alternatively we could, somehow, mark the DSS powerdomain to be handled in a special way: if the PD is enabled at boot time, it will be kept enabled until the DSS driver (somehow) changes the PD back to normal operation (and if DSS driver is never loaded, PD will stay on). Tomi >> Hopefully we can sort out all the issues, but this patch (or similar) >> will be part of the solution so I'd like to get some acks/nacks/comments >> for this. Also, this change might have side effects to other devices >> too, if the drivers expect the PD to be on, so testing is needed. >> >> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> > > We already discussed this a bit off-list, but for the record, I agree > with the approach. > > I also tested it on k3-am62a7-sk where I've been doing the other TI SCI > pmdomain work and everything still working fine. > > Reviewed-by: Kevin Hilman <khilman@baylibre.com> > Tested-by: Kevin Hilman <khilman@baylibre.com> > > Kevin
On Thu, 31 Oct 2024 at 08:39, Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> wrote: > > Hi, > > On 30/10/2024 22:04, Kevin Hilman wrote: > > Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> writes: > > > >> At the moment the driver sets the power state of all the PDs it creates > >> to off, regardless of the actual HW state. This has two drawbacks: > >> > >> 1) The kernel cannot disable unused PDs automatically for power saving, > >> as it thinks they are off already > >> > >> 2) A more specific case (but perhaps applicable to other scenarios > >> also): bootloader enabled splash-screen cannot be kept on the screen. > >> > >> The issue in 2) is that the driver framework automatically enables the > >> device's PD before calling probe() and disables it after the probe(). > >> This means that when the display subsystem (DSS) driver probes, but e.g. > >> fails due to deferred probing, the DSS PD gets turned off and the driver > >> cannot do anything to affect that. > >> > >> Solving the 2) requires more changes to actually keep the PD on during > >> the boot, but a prerequisite for it is to have the correct power state > >> for the PD. > >> > >> The downside with this patch is that it takes time to call the 'is_on' > >> op, and we need to call it for each PD. In my tests with AM62 SK, using > >> defconfig, I see an increase from ~3.5ms to ~7ms. However, the added > >> feature is valuable, so in my opinion it's worth it. > >> > >> The performance could probably be improved with a new firmware API which > >> returns the power states of all the PDs. > > > > Agreed. I think we have to pay this performance price for correctness, > > and we can optimizie it later with improvements to the SCI firmware and > > a new API. > > > >> There's also a related HW issue at play here: if the DSS IP is enabled > >> and active, and its PD is turned off without first disabling the DSS > >> display outputs, the DSS IP will hang and causes the kernel to halt if > >> and when the DSS driver accesses the DSS registers the next time. > > > > Ouch. > > > >> With the current upstream kernel, with this patch applied, this means > >> that if the bootloader enables the display, and the DSS driver is > >> compiled as a module, the kernel will at some point disable unused PDs, > >> including the DSS PD. When the DSS module is later loaded, it will hang > >> the kernel. > >> > >> The same issue is already there, even without this patch, as the DSS > >> driver may hit deferred probing, which causes the PD to be turned off, > >> and leading to kernel halt when the DSS driver is probed again. This > >> issue has been made quite rare with some arrangements in the DSS > >> driver's probe, but it's still there. > >> > >> So, because of the DSS hang issues, I think this patch is still an RFC. > > > > Like you said, I think that DSS hang is an issue independently of this > > patch, so it shouldn't hold this up IMO. > > In current upstream, if the bootloader has enabled the display, we most > likely won't hit the DSS hang issue as the PD will stay on until the DSS > driver has had a chance to probe, and the driver takes actions to avoid > the hang issue. > > With this patch applied, the PD may be turned off before the DSS driver > has had a chance to probe, causing the board to hang when the DSS driver > probes the first time. > > That's why I'm a bit hesitant to apply this. It could mean that for some > people their board stops booting. > > I'm not even sure what would be the perfect fix for this hang problem... > > We could have some built-in early boot code which checks if the DSS is > enabled, and disables it, so that the hang issue won't happen. But > that's not good if we try to keep the boot splash on the screen until > the userspace takes over. > > Alternatively we could, somehow, mark the DSS powerdomain to be handled > in a special way: if the PD is enabled at boot time, it will be kept > enabled until the DSS driver (somehow) changes the PD back to normal > operation (and if DSS driver is never loaded, PD will stay on). This option is kind of what I am working on. Although, the goal is to keep the code generic, so ideally we should not need any changes in the DSS driver to make this work. Let's see. That said, it sounds like we should defer $subject patch until we have a solution for the above, right? [...] Kind regards Uffe
Hi, On 31/10/2024 12:25, Ulf Hansson wrote: > On Thu, 31 Oct 2024 at 08:39, Tomi Valkeinen > <tomi.valkeinen@ideasonboard.com> wrote: >> >> Hi, >> >> On 30/10/2024 22:04, Kevin Hilman wrote: >>> Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> writes: >>> >>>> At the moment the driver sets the power state of all the PDs it creates >>>> to off, regardless of the actual HW state. This has two drawbacks: >>>> >>>> 1) The kernel cannot disable unused PDs automatically for power saving, >>>> as it thinks they are off already >>>> >>>> 2) A more specific case (but perhaps applicable to other scenarios >>>> also): bootloader enabled splash-screen cannot be kept on the screen. >>>> >>>> The issue in 2) is that the driver framework automatically enables the >>>> device's PD before calling probe() and disables it after the probe(). >>>> This means that when the display subsystem (DSS) driver probes, but e.g. >>>> fails due to deferred probing, the DSS PD gets turned off and the driver >>>> cannot do anything to affect that. >>>> >>>> Solving the 2) requires more changes to actually keep the PD on during >>>> the boot, but a prerequisite for it is to have the correct power state >>>> for the PD. >>>> >>>> The downside with this patch is that it takes time to call the 'is_on' >>>> op, and we need to call it for each PD. In my tests with AM62 SK, using >>>> defconfig, I see an increase from ~3.5ms to ~7ms. However, the added >>>> feature is valuable, so in my opinion it's worth it. >>>> >>>> The performance could probably be improved with a new firmware API which >>>> returns the power states of all the PDs. >>> >>> Agreed. I think we have to pay this performance price for correctness, >>> and we can optimizie it later with improvements to the SCI firmware and >>> a new API. >>> >>>> There's also a related HW issue at play here: if the DSS IP is enabled >>>> and active, and its PD is turned off without first disabling the DSS >>>> display outputs, the DSS IP will hang and causes the kernel to halt if >>>> and when the DSS driver accesses the DSS registers the next time. >>> >>> Ouch. >>> >>>> With the current upstream kernel, with this patch applied, this means >>>> that if the bootloader enables the display, and the DSS driver is >>>> compiled as a module, the kernel will at some point disable unused PDs, >>>> including the DSS PD. When the DSS module is later loaded, it will hang >>>> the kernel. >>>> >>>> The same issue is already there, even without this patch, as the DSS >>>> driver may hit deferred probing, which causes the PD to be turned off, >>>> and leading to kernel halt when the DSS driver is probed again. This >>>> issue has been made quite rare with some arrangements in the DSS >>>> driver's probe, but it's still there. >>>> >>>> So, because of the DSS hang issues, I think this patch is still an RFC. >>> >>> Like you said, I think that DSS hang is an issue independently of this >>> patch, so it shouldn't hold this up IMO. >> >> In current upstream, if the bootloader has enabled the display, we most >> likely won't hit the DSS hang issue as the PD will stay on until the DSS >> driver has had a chance to probe, and the driver takes actions to avoid >> the hang issue. >> >> With this patch applied, the PD may be turned off before the DSS driver >> has had a chance to probe, causing the board to hang when the DSS driver >> probes the first time. >> >> That's why I'm a bit hesitant to apply this. It could mean that for some >> people their board stops booting. >> >> I'm not even sure what would be the perfect fix for this hang problem... >> >> We could have some built-in early boot code which checks if the DSS is >> enabled, and disables it, so that the hang issue won't happen. But >> that's not good if we try to keep the boot splash on the screen until >> the userspace takes over. >> >> Alternatively we could, somehow, mark the DSS powerdomain to be handled >> in a special way: if the PD is enabled at boot time, it will be kept >> enabled until the DSS driver (somehow) changes the PD back to normal >> operation (and if DSS driver is never loaded, PD will stay on). > > This option is kind of what I am working on. Although, the goal is to > keep the code generic, so ideally we should not need any changes in > the DSS driver to make this work. Let's see. If you need someone to do some testing, you know who to ask =). > That said, it sounds like we should defer $subject patch until we have > a solution for the above, right? Yes, I would say so. I'll collect the tags, and will resend this patch (as a non-RFC) when I think it's safe. Thanks for the feedback, Kevin and Ulf. Tomi
diff --git a/drivers/pmdomain/ti/ti_sci_pm_domains.c b/drivers/pmdomain/ti/ti_sci_pm_domains.c index 1510d5ddae3d..14c51a395d7e 100644 --- a/drivers/pmdomain/ti/ti_sci_pm_domains.c +++ b/drivers/pmdomain/ti/ti_sci_pm_domains.c @@ -126,6 +126,23 @@ static bool ti_sci_pm_idx_exists(struct ti_sci_genpd_provider *pd_provider, u32 return false; } +static bool ti_sci_pm_pd_is_on(struct ti_sci_genpd_provider *pd_provider, + int pd_idx) +{ + bool is_on; + int ret; + + if (!pd_provider->ti_sci->ops.dev_ops.is_on) + return false; + + ret = pd_provider->ti_sci->ops.dev_ops.is_on(pd_provider->ti_sci, + pd_idx, NULL, &is_on); + if (ret) + return false; + + return is_on; +} + static int ti_sci_pm_domain_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; @@ -161,6 +178,8 @@ static int ti_sci_pm_domain_probe(struct platform_device *pdev) break; if (args.args_count >= 1 && args.np == dev->of_node) { + bool is_on; + if (args.args[0] > max_id) { max_id = args.args[0]; } else { @@ -189,7 +208,10 @@ static int ti_sci_pm_domain_probe(struct platform_device *pdev) pd->idx = args.args[0]; pd->parent = pd_provider; - pm_genpd_init(&pd->pd, NULL, true); + is_on = ti_sci_pm_pd_is_on(pd_provider, + pd->idx); + + pm_genpd_init(&pd->pd, NULL, !is_on); list_add(&pd->node, &pd_provider->pd_list); }
At the moment the driver sets the power state of all the PDs it creates to off, regardless of the actual HW state. This has two drawbacks: 1) The kernel cannot disable unused PDs automatically for power saving, as it thinks they are off already 2) A more specific case (but perhaps applicable to other scenarios also): bootloader enabled splash-screen cannot be kept on the screen. The issue in 2) is that the driver framework automatically enables the device's PD before calling probe() and disables it after the probe(). This means that when the display subsystem (DSS) driver probes, but e.g. fails due to deferred probing, the DSS PD gets turned off and the driver cannot do anything to affect that. Solving the 2) requires more changes to actually keep the PD on during the boot, but a prerequisite for it is to have the correct power state for the PD. The downside with this patch is that it takes time to call the 'is_on' op, and we need to call it for each PD. In my tests with AM62 SK, using defconfig, I see an increase from ~3.5ms to ~7ms. However, the added feature is valuable, so in my opinion it's worth it. The performance could probably be improved with a new firmware API which returns the power states of all the PDs. There's also a related HW issue at play here: if the DSS IP is enabled and active, and its PD is turned off without first disabling the DSS display outputs, the DSS IP will hang and causes the kernel to halt if and when the DSS driver accesses the DSS registers the next time. With the current upstream kernel, with this patch applied, this means that if the bootloader enables the display, and the DSS driver is compiled as a module, the kernel will at some point disable unused PDs, including the DSS PD. When the DSS module is later loaded, it will hang the kernel. The same issue is already there, even without this patch, as the DSS driver may hit deferred probing, which causes the PD to be turned off, and leading to kernel halt when the DSS driver is probed again. This issue has been made quite rare with some arrangements in the DSS driver's probe, but it's still there. So, because of the DSS hang issues, I think this patch is still an RFC. Hopefully we can sort out all the issues, but this patch (or similar) will be part of the solution so I'd like to get some acks/nacks/comments for this. Also, this change might have side effects to other devices too, if the drivers expect the PD to be on, so testing is needed. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> --- drivers/pmdomain/ti/ti_sci_pm_domains.c | 24 +++++++++++++++++++++++- 1 file changed, 23 insertions(+), 1 deletion(-) --- base-commit: 98f7e32f20d28ec452afb208f9cffc08448a2652 change-id: 20241022-tisci-pd-boot-state-33cf02efd378 Best regards,