Message ID | 20200605063724.9030-1-m.szyprowski@samsung.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
Series | regulator: do not balance 'boot-on' coupled regulators without constraints | expand |
+Saravana, On 05-06-20, 08:37, Marek Szyprowski wrote: > Balancing of the 'boot-on' coupled regulators must wait until the clients > set their constraints, otherwise the balancing code might change the > voltage of the not-yet-constrained regulator to the value below the > bootloader-configured operation point, what might cause a system crash. > This is achieved by assuming that, the minimal voltage allowed for the > given 'boot-on' regulator is equal to it's current voltage until > consumers apply their constraints. > > Suggested-by: Dmitry Osipenko <digetx@gmail.com> > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> > --- > This is yet another approach to fix the regulator coupling on > Exynos5800/5422 SoCs in the regulator core. I agree with Dmitry that this > issue is generic and if possible it should be handled in the core. > > This patchset is another attempt to fix the regulator coupling on > Exynos5800/5422 SoCs. Here are links to the previous attempts and > discussions: > > https://lore.kernel.org/linux-samsung-soc/20191008101709.qVNy8eijBi0LynOteWFMnTg4GUwKG599n6OyYoX1Abs@z/ > https://lore.kernel.org/lkml/20191017102758.8104-1-m.szyprowski@samsung.com/ > https://lore.kernel.org/linux-pm/cover.1589528491.git.viresh.kumar@linaro.org/ > https://lore.kernel.org/linux-pm/20200528131130.17984-1-m.szyprowski@samsung.com/ > https://lore.kernel.org/linux-samsung-soc/57cf3a15-5d9b-7636-4c69-60742e8cfae6@samsung.com/ > > The problem is with "vdd_int" regulator coupled with "vdd_arm" on Odroid > XU3/XU4 boards family. "vdd_arm" is handled by CPUfreq. "vdd_int" is > handled by devfreq. CPUfreq initialized quite early during boot and it > starts changing OPPs and "vdd_arm" value. Sometimes CPU activity during > boot goes down and some low-frequency OPPs are selected, what in turn > causes lowering "vdd_arm". This happens before devfreq applies its > requirements on "vdd_int". Regulator balancing code reduces "vdd_arm" > voltage value, what in turn causes lowering "vdd_int" value to the lowest > possible value. This is much below the operation point of the wcore bus, > which still runs at the highest frequency. > > The issue was hard to notice because in the most cases the board managed > to boot properly, even when the regulator was set to lowest value allowed > by the regulator constraints. However, it caused some random issues, > which can be observed as "Unhandled prefetch abort" or low USB stability. > > Best regards > Marek Szyprowski > --- > drivers/regulator/core.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c > index 03154f5b939f..7e9af7ea4bdf 100644 > --- a/drivers/regulator/core.c > +++ b/drivers/regulator/core.c > @@ -3553,6 +3553,17 @@ static int regulator_get_optimal_voltage(struct regulator_dev *rdev, > if (ret < 0) > return ret; > > + /* > + * If no constraints set yet and regulator has boot-on flag, > + * keep its voltage unchanged > + */ > + if (tmp_min == 0 && c_rdevs[i]->constraints->boot_on) { > + ret = regulator_get_voltage_rdev(c_rdevs[i]); > + if (ret < 0) > + return ret; > + tmp_min = ret; > + } > + > ret = regulator_check_voltage(c_rdevs[i], &tmp_min, &tmp_max); > if (ret < 0) > return ret; This is exactly what Saravana tried to solve earlier AFAIR, lets see what he has to say here.
On Fri, Jun 05, 2020 at 08:37:24AM +0200, Marek Szyprowski wrote: > Balancing of the 'boot-on' coupled regulators must wait until the clients > set their constraints, otherwise the balancing code might change the No, this is not what boot-on means at all. It is there for cases where we can't read the enable status from the hardware. Trying to infer *anything* about the runtime behaviour from it being present or absent is very badly broken. Saravana (CCed) was working on some patches which tried to deal with some stuff around this for enables using the sync_state() callback. Unfortunately there's quite a few problems with the current approach (the biggest one from my point of view being that it's implemented so that it requires every single consumer of every device on the PMIC to come up but there's others at more of an implementation level).
Hi Mark, On 05.06.2020 12:20, Mark Brown wrote: > On Fri, Jun 05, 2020 at 08:37:24AM +0200, Marek Szyprowski wrote: > >> Balancing of the 'boot-on' coupled regulators must wait until the clients >> set their constraints, otherwise the balancing code might change the > No, this is not what boot-on means at all. It is there for cases where > we can't read the enable status from the hardware. Trying to infer > *anything* about the runtime behaviour from it being present or absent > is very badly broken. Okay, what about the 'always-on' property? I don't think that we need another property for annotating this behavior, as in my opinion this is just an implementation issue on the Linux kernel and regulator framework. Alternatively I can drop the property check, but then it won't be possible to have a regulator without a consumer, which follows the other one (although we still don't have a real use case for it). If you don't like this idea at all, I will try to move this logic to the custom coupler again, although it would mean some code copying. > Saravana (CCed) was working on some patches which tried to deal with > some stuff around this for enables using the sync_state() callback. > Unfortunately there's quite a few problems with the current approach > (the biggest one from my point of view being that it's implemented so > that it requires every single consumer of every device on the PMIC to > come up but there's others at more of an implementation level). I'm not sure if we really need such complex solution for this... Best regards
On Fri, Jun 05, 2020 at 03:37:32PM +0200, Marek Szyprowski wrote: > On 05.06.2020 12:20, Mark Brown wrote: > > No, this is not what boot-on means at all. It is there for cases where > > we can't read the enable status from the hardware. Trying to infer > > *anything* about the runtime behaviour from it being present or absent > > is very badly broken. > Okay, what about the 'always-on' property? I don't think that we need > another property for annotating this behavior, as in my opinion this is No, that's just as disconnected from the need - we may as well do it based on the regulator name being an odd number of characters. > just an implementation issue on the Linux kernel and regulator > framework. Alternatively I can drop the property check, but then it > won't be possible to have a regulator without a consumer, which follows > the other one (although we still don't have a real use case for it). > If you don't like this idea at all, I will try to move this logic to the > custom coupler again, although it would mean some code copying. I think that's better TBH. > > Saravana (CCed) was working on some patches which tried to deal with > > some stuff around this for enables using the sync_state() callback. > > Unfortunately there's quite a few problems with the current approach > > (the biggest one from my point of view being that it's implemented so > > that it requires every single consumer of every device on the PMIC to > > come up but there's others at more of an implementation level). > I'm not sure if we really need such complex solution for this... So I think that the specific approach there is overly heavyweight and restrictive but I do see the general use case here for something per regulator providing we can avoid breaking anything that does actually need to change the regulator state (eg, raising the voltage for cpufreq). Previously to the past week I'd only really heard about it causing problems in the context of displays left on by the bootloader glitching during boot but this is a concrete use case and we already have the infrastructure to track dependencies at the device model level if we use it well. OTOH if you have a coupler already that needs to be doing stuff all the time at runtime it may be easier to just put this in the coupler, especially I think in this case where the lack of the devfreq driver wouldn't mean that the hardware being controlled wasn't being used at all. The coupler would end up backstopping a missing cpufreq or devfreq driver.
On Fri, Jun 5, 2020 at 8:59 AM Mark Brown <broonie@kernel.org> wrote: > > On Fri, Jun 05, 2020 at 03:37:32PM +0200, Marek Szyprowski wrote: > > On 05.06.2020 12:20, Mark Brown wrote: > > > > No, this is not what boot-on means at all. It is there for cases where > > > we can't read the enable status from the hardware. Trying to infer > > > *anything* about the runtime behaviour from it being present or absent > > > is very badly broken. > > > Okay, what about the 'always-on' property? I don't think that we need > > another property for annotating this behavior, as in my opinion this is > > No, that's just as disconnected from the need - we may as well do it > based on the regulator name being an odd number of characters. > > > just an implementation issue on the Linux kernel and regulator > > framework. Alternatively I can drop the property check, but then it > > won't be possible to have a regulator without a consumer, which follows > > the other one (although we still don't have a real use case for it). > > > If you don't like this idea at all, I will try to move this logic to the > > custom coupler again, although it would mean some code copying. > > I think that's better TBH. > > > > Saravana (CCed) was working on some patches which tried to deal with > > > some stuff around this for enables using the sync_state() callback. > > > Unfortunately there's quite a few problems with the current approach > > > (the biggest one from my point of view being that it's implemented so > > > that it requires every single consumer of every device on the PMIC to > > > come up but there's others at more of an implementation level). > > > I'm not sure if we really need such complex solution for this... > > So I think that the specific approach there is overly heavyweight and > restrictive but I do see the general use case here for something per > regulator providing we can avoid breaking anything that does actually > need to change the regulator state (eg, raising the voltage for > cpufreq). The changes I propose won't prevent anything from asking for more power/energy (will always allow turning on stuff, increasing voltage, increasing current, etc). It'll only prevent reducing power lower than what was provided when the bootloader left stuff on. This shouldn't break most boards -- because any other consumer could be setting similar limits and things don't break then. But even if that's a concern, we can still default to a timeout behavior and then give folks the choice of disabling the timeout if they know all their devices will probe. Btw, the patch series I sent fixes a lot of subtle use cases even with the timeout enabled. For example, in one hardware platform, a LDO is shared between camera, display, UFS and USB. The camera driver would probe first, enable the regulator, poll its HW and then disable the regulator. This causes the regulator to be disabled before display, UFS, and USB could probe and this caused hardware faults for those. > Previously to the past week I'd only really heard about it > causing problems in the context of displays left on by the bootloader > glitching during boot but this is a concrete Ah, finally! I have examples of pretty much the same issue in some downstream kernels -- the CPU and memory shares rails with other hardware blocks and things fail if this isn't taken care of. Glad that someone else found an example for me in the upstream kernel. > use case and we already > have the infrastructure to track dependencies at the device model level > if we use it well. I'll send out a v3 series in a couple of days to address Mark's earlier comments and also add the voltage support to address Marek's case. We can take it from there. -Saravana
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index 03154f5b939f..7e9af7ea4bdf 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -3553,6 +3553,17 @@ static int regulator_get_optimal_voltage(struct regulator_dev *rdev, if (ret < 0) return ret; + /* + * If no constraints set yet and regulator has boot-on flag, + * keep its voltage unchanged + */ + if (tmp_min == 0 && c_rdevs[i]->constraints->boot_on) { + ret = regulator_get_voltage_rdev(c_rdevs[i]); + if (ret < 0) + return ret; + tmp_min = ret; + } + ret = regulator_check_voltage(c_rdevs[i], &tmp_min, &tmp_max); if (ret < 0) return ret;
Balancing of the 'boot-on' coupled regulators must wait until the clients set their constraints, otherwise the balancing code might change the voltage of the not-yet-constrained regulator to the value below the bootloader-configured operation point, what might cause a system crash. This is achieved by assuming that, the minimal voltage allowed for the given 'boot-on' regulator is equal to it's current voltage until consumers apply their constraints. Suggested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> --- This is yet another approach to fix the regulator coupling on Exynos5800/5422 SoCs in the regulator core. I agree with Dmitry that this issue is generic and if possible it should be handled in the core. This patchset is another attempt to fix the regulator coupling on Exynos5800/5422 SoCs. Here are links to the previous attempts and discussions: https://lore.kernel.org/linux-samsung-soc/20191008101709.qVNy8eijBi0LynOteWFMnTg4GUwKG599n6OyYoX1Abs@z/ https://lore.kernel.org/lkml/20191017102758.8104-1-m.szyprowski@samsung.com/ https://lore.kernel.org/linux-pm/cover.1589528491.git.viresh.kumar@linaro.org/ https://lore.kernel.org/linux-pm/20200528131130.17984-1-m.szyprowski@samsung.com/ https://lore.kernel.org/linux-samsung-soc/57cf3a15-5d9b-7636-4c69-60742e8cfae6@samsung.com/ The problem is with "vdd_int" regulator coupled with "vdd_arm" on Odroid XU3/XU4 boards family. "vdd_arm" is handled by CPUfreq. "vdd_int" is handled by devfreq. CPUfreq initialized quite early during boot and it starts changing OPPs and "vdd_arm" value. Sometimes CPU activity during boot goes down and some low-frequency OPPs are selected, what in turn causes lowering "vdd_arm". This happens before devfreq applies its requirements on "vdd_int". Regulator balancing code reduces "vdd_arm" voltage value, what in turn causes lowering "vdd_int" value to the lowest possible value. This is much below the operation point of the wcore bus, which still runs at the highest frequency. The issue was hard to notice because in the most cases the board managed to boot properly, even when the regulator was set to lowest value allowed by the regulator constraints. However, it caused some random issues, which can be observed as "Unhandled prefetch abort" or low USB stability. Best regards Marek Szyprowski --- drivers/regulator/core.c | 11 +++++++++++ 1 file changed, 11 insertions(+)