Message ID | 540C202E.2060009@collabora.co.uk (mailing list archive) |
---|---|
State | New, archived |
On Sun, Sep 07, 2014 at 11:06:54AM +0200, Javier Martinez Canillas wrote: > But maybe we could add a boot argument similar to "clk_ignore_unused" but for > regulators? Something like "regulator_ignore_unused" that would prevent the > regulator core to disable unused regulators? If Mark agrees with that idea > I'll be glad to propose a patch. I'm not all that sympathetic to the idea; we already have quite enough quality problems with the way people hook up regulators without providing yet another way for them to hack around things, I'm concerned it'll just make things more fragile as people require magic command line arguments to get things working.
Hello Mark, On 09/07/2014 05:01 PM, Mark Brown wrote: > On Sun, Sep 07, 2014 at 11:06:54AM +0200, Javier Martinez Canillas wrote: > >> But maybe we could add a boot argument similar to "clk_ignore_unused" but for >> regulators? Something like "regulator_ignore_unused" that would prevent the >> regulator core to disable unused regulators? If Mark agrees with that idea >> I'll be glad to propose a patch. > > I'm not all that sympathetic to the idea; we already have quite enough > quality problems with the way people hook up regulators without > providing yet another way for them to hack around things, I'm concerned > it'll just make things more fragile as people require magic command line > arguments to get things working. > I understand your position and I fully agree, I just was thinking aloud. It seems the simplefb approach is somewhat fragile since the driver relies on the bootloader to correctly setup the display hardware and its needed resources (clock, regulators, etc) but also relies on the kernel to not disable those resources even when they are unused from its point of view. So, the best option for Will is to just use Ajay's proposed in-flight Exynos DRM patches or if he really wants to have simplefb working then he can carry the patch I shared to force tps65090 fet1 and fet6 regulators to be always on. Best regards, Javier
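The failure mode Javier describes can be sketched with a toy model. This is not kernel code: the `Regulator` class and `disable_unused` function are hypothetical names made up for illustration, and the only details taken from the thread are the fet1/fet6 regulator names and the idea that the kernel switches off regulators it knows about but sees no user for. The `always_on` field stands in for the `regulator-always-on` device tree property.

```python
from dataclasses import dataclass


@dataclass
class Regulator:
    """Toy stand-in for a regulator the kernel knows about (hypothetical)."""
    name: str
    enabled: bool            # state left behind by the bootloader
    always_on: bool = False  # models the "regulator-always-on" DT property
    use_count: int = 0       # number of consumers that have enabled it


def disable_unused(regulators):
    """Switch off every enabled regulator that nothing claims.

    Returns the names of the regulators that were turned off.
    """
    turned_off = []
    for reg in regulators:
        if reg.enabled and not reg.always_on and reg.use_count == 0:
            reg.enabled = False
            turned_off.append(reg.name)
    return turned_off


# The bootloader left both rails on; only fet6 is marked always-on here.
rails = [Regulator("fet1", enabled=True),
         Regulator("fet6", enabled=True, always_on=True)]
print(disable_unused(rails))
```

In this model a simplefb-style consumer never raises `use_count`, so a rail survives only if it is marked always-on, which is what the workaround patch mentioned above does for both rails.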
>>> Have you seen the very long "[PATCH 4/4] simplefb: add clock handling >>> code" thread [0]? I wonder if the problem is that the display clocks were >>> not known to the kernel before 3.15 but now are getting disabled and thus >>> the simplefb driver not working? >>> >>> So probably is worth to try passing clk_ignore_unused as a parameter to >>> the kernel command line. >>> >>> [0]: https://www.mail-archive.com/linux-sunxi@googlegroups.com/msg06623.html >> > > So my assumption was correct and the issue is that the kernel disables the > resources (clocks and regulators) needed to have display working and because > the simplefb expects the display hardware to have been already initialized by > the bootloader/firmware, it simply fails. > > You didn't face this issue before 3.15 because the default bootargs set by > nv_uboot-snow already includes the "clk_ignore_unused" parameter and the > kernel didn't know about the regulators but the latter changed with commit: > > b16be76 ("ARM: dts: add tps65090 power regulator for exynos5250-snow") > > This was included in 3.16, so the mentioned commit is what "broke" your > workflow since now the kernel is aware of the tps65090 fet1 and fet6 > regulators (used as supply for the backlight and panel respectively) and > disables them because nothing uses them from a kernel POV. So I believe we've got a process issue here. If you don't have normal support for display hardware, but you want to keep the display operational thanks to the bootloader already initializing it, you should not add anything to the kernel which breaks it, until full support comes in. This means that respective regulators should be either always-on or not listed at all (I'd favor the former) and respective clocks either somehow enabled at boot-up or completely ignored, including all their parents capable of being gated. Now with regulators this is pretty straightforward, but with clocks I believe it's an open issue. 
AFAIR we've discussed this on MLs some time ago (at least I remember Doug commenting on that topic) and kind of concluded that SoC clock drivers could include lists of clocks to be enabled at boot-up (as a HACK to enable things like simplefb until proper support for the respective features is added). I believe this would be the proper solution for $subject. Best regards, Tomasz
Hello Tomasz, On 09/07/2014 05:52 PM, Tomasz Figa wrote: > > So I believe we've got a process issue here. If you don't have normal > support for display hardware, but you want to keep the display > operational thanks to bootloader already initializing it, you should not > add anything to the kernel which breaks it, until full support comes in. > > This means that respective regulators should be either always-on or not > listed at all (I'd favor the former) So that means that do you think that the workaround patch I shared on the previous email could be considered as a correct solution? In that case I can post it as a proper patch. > somehow enabled at boot-up or completely ignored, including all their > parents capable of being gated. > AFAIU from the thread I mentioned before, Nvidia folks proposed the same to fix the simplefb issue on sunxi, to avoid the clocks in question being turned off at boot by modifying the sunxi clock driver. > Now with regulators this is pretty straightforward, but with clocks I > believe it's an open issue. AFAIR we've discussed this on MLs some time > ago (at least I remember Doug commenting on that topic) and kind of > concluded that SoC clock drivers could include lists of clocks to be > enabled at boot-up (as a HACK to enable things like simplefb until > proper support for respective features are added). > > I believe this would be the proper solution for $subject. > Clocks is not an issue at least on this machine since the bootloader already passes the clk_ignore_unused parameter to the kernel command line so in that sense there isn't a regression comparing with older kernels. If possible I would prefer to leave this way instead of adding quirks to the clock driver, specially since there are proposed patches to have the display working using the Exynos DRM driver on this machine. > Best regards, > Tomasz > Best regards, Javier
On 07.09.2014 18:12, Javier Martinez Canillas wrote: > Hello Tomasz, > > On 09/07/2014 05:52 PM, Tomasz Figa wrote: >> >> So I believe we've got a process issue here. If you don't have normal >> support for display hardware, but you want to keep the display >> operational thanks to bootloader already initializing it, you should not >> add anything to the kernel which breaks it, until full support comes in. >> >> This means that respective regulators should be either always-on or not >> listed at all (I'd favor the former) > > So that means that do you think that the workaround patch I shared on the > previous email could be considered as a correct solution? In that case I can > post it as a proper patch. Right. > >> somehow enabled at boot-up or completely ignored, including all their >> parents capable of being gated. >> > > AFAIU from the thread I mentioned before, Nvidia folks proposed the same to > fix the simplefb issue on sunxi, to avoid the clocks in question being turned > off at boot by modifying the sunxi clock driver. OK. > >> Now with regulators this is pretty straightforward, but with clocks I >> believe it's an open issue. AFAIR we've discussed this on MLs some time >> ago (at least I remember Doug commenting on that topic) and kind of >> concluded that SoC clock drivers could include lists of clocks to be >> enabled at boot-up (as a HACK to enable things like simplefb until >> proper support for respective features are added). >> >> I believe this would be the proper solution for $subject. >> > > Clocks is not an issue at least on this machine since the bootloader already > passes the clk_ignore_unused parameter to the kernel command line so in that > sense there isn't a regression comparing with older kernels. If possible I > would prefer to leave this way instead of adding quirks to the clock driver, > specially since there are proposed patches to have the display working using > the Exynos DRM driver on this machine. 
Well, clk_ignore_unused seems a bit too coarse-grained to me. Also forcing the user to add it in his bootloader (or any other way) is not really the best practice IMHO. At least for the next 3.17-rc I'd suggest fixing this up in the respective clock driver and dropping the hack only after the Exynos DRM patches are merged and confirmed working. Best regards, Tomasz
Hello Tomasz, On 09/07/2014 06:19 PM, Tomasz Figa wrote: > On 07.09.2014 18:12, Javier Martinez Canillas wrote: >> Clocks is not an issue at least on this machine since the bootloader already >> passes the clk_ignore_unused parameter to the kernel command line so in that >> sense there isn't a regression comparing with older kernels. If possible I >> would prefer to leave this way instead of adding quirks to the clock driver, >> specially since there are proposed patches to have the display working using >> the Exynos DRM driver on this machine. > > Well, clk_ignore_unused seems a bit too coarse grained to me. Also > forcing the user to add it in his bootloader (or any other way) is not > really the best practice IMHO. > Fair enough. > At least for next 3.17-rc I'd suggest fixing this up in respective clock > driver and dropping the hack only after Exynos DRM patches are merged > and confirmed working. > Ok, I'll prepare a patch to add the CLK_IGNORE_UNUSED flag to the needed clocks in drivers/clk/samsung/clk-exynos5250.c. That will be a more fine-grained solution since the clk_ignore_unused kernel parameter won't be needed. > Best regards, > Tomasz > Best regards, Javier
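The difference between the global clk_ignore_unused parameter and a per-clock CLK_IGNORE_UNUSED flag can be illustrated with a small model. It is only a sketch and not the clock framework itself: `gate_unused_clocks`, the dictionary layout, and the clock names are hypothetical, and the flag's bit value is picked for illustration.

```python
CLK_IGNORE_UNUSED = 1 << 3  # flag bit value chosen for illustration only


def gate_unused_clocks(clocks, ignore_unused_param=False):
    """Gate running clocks that have no users.

    clocks maps a name to {"flags": int, "enable_count": int, "running": bool}.
    ignore_unused_param models the global "clk_ignore_unused" boot argument.
    Returns the names of the clocks that were gated.
    """
    if ignore_unused_param:
        return []  # coarse-grained: nothing is gated at all
    gated = []
    for name, clk in clocks.items():
        if not clk["running"] or clk["enable_count"] > 0:
            continue  # already off, or claimed by a driver
        if clk["flags"] & CLK_IGNORE_UNUSED:
            continue  # fine-grained: this one clock is left alone
        clk["running"] = False
        gated.append(name)
    return gated


# Hypothetical clock names; only the flagged display clock is spared.
clocks = {
    "display_clk": {"flags": CLK_IGNORE_UNUSED, "enable_count": 0, "running": True},
    "spare_clk": {"flags": 0, "enable_count": 0, "running": True},
}
print(gate_unused_clocks(clocks))
```

With the boot argument, every unused clock keeps running. With the per-clock flag, only the clocks the display depends on are exempt and the rest can still be gated, which is the fine-grained behaviour discussed above.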
Hi, On Sun, Sep 7, 2014 at 8:52 AM, Tomasz Figa <tomasz.figa@gmail.com> wrote: > So I believe we've got a process issue here. If you don't have normal > support for display hardware, but you want to keep the display > operational thanks to bootloader already initializing it, you should not > add anything to the kernel which breaks it, until full support comes in. > > This means that respective regulators should be either always-on or not > listed at all (I'd favor the former) and respective clocks either > somehow enabled at boot-up or completely ignored, including all their > parents capable of being gated. It seems slightly broken to hack the device tree in this way. I'll be the first to admit that I often list regulators as "always-on" during bringup when not everything is done, and I guess it's not that different. ...but given everything going on upstream (and people working on Suspend/Resume, DRM, etc) it seems like it might be a bit of a pain. ...but if that's what everyone agrees on, I won't disagree too strongly. One (ugly?) solution would be to add a feature to your bootloader to modify the device tree to mark regulators as "always-on". Since the bootloader gets to touch the device tree and the bootloader is involved in communicating info about SimpleFB, it kinda makes sense.
Hi, On Sun, Sep 7, 2014 at 8:52 AM, Tomasz Figa <tomasz.figa@gmail.com> wrote: > Now with regulators this is pretty straightforward, but with clocks I > believe it's an open issue. AFAIR we've discussed this on MLs some time > ago (at least I remember Doug commenting on that topic) and kind of > concluded that SoC clock drivers could include lists of clocks to be > enabled at boot-up (as a HACK to enable things like simplefb until > proper support for respective features are added). I think my old problem was with earlyprintk and a core clock getting disabled. See (44ff025 clk: exynos5420: Remove aclk66_peric from the clock tree description). I think I've seen others solve the same problem with the concept of "critical clocks". I agree that regulator and clock frameworks allow very different "hacks". ;) -Doug
Hello Doug, On 09/08/2014 06:36 AM, Doug Anderson wrote: > > One (ugly?) solution would be to add a feature to your bootloader to > modify the device tree to mark regulators as "always-on". Since the > bootloader gets to touch the device tree and the bootloader is involved > in communicating info about SimpleFB, it kinda makes sense. > I can't say I like marking the regulators as always-on in the DT, and that's why I copied the patch in the response instead of posting it as a proper patch, but I think relying on the bootloader to modify the DT is not better. IMHO U-Boot should only modify what is strictly necessary, like the /chosen node, even though lately I've seen some attempts in the OMAP community to (ab)use U-Boot's fdt command to mangle the DT before passing it to the kernel in order to support different BeagleBone Black capes. Best regards, Javier
On Sun, Sep 07, 2014 at 09:36:56PM -0700, Doug Anderson wrote: > On Sun, Sep 7, 2014 at 8:52 AM, Tomasz Figa <tomasz.figa@gmail.com> wrote: > > So I believe we've got a process issue here. If you don't have normal > > support for display hardware, but you want to keep the display > > operational thanks to bootloader already initializing it, you should not > > add anything to the kernel which breaks it, until full support comes in. > > This means that respective regulators should be either always-on or not > > listed at all (I'd favor the former) and respective clocks either > > somehow enabled at boot-up or completely ignored, including all their > > parents capable of being gated. > It seems slightly broken to hack the device tree in this way. I'll be > the first to admit that I often list regulators as "always-on" during > bringup when not everything is done, and I guess it's not that > different. ...but given everything going on upstream (and people > working on Suspend/Resume, DRM, etc) it seems like it might be a bit > of a pain. ...but if that's what everyone agrees on, I won't disagree > too strongly. > One (ugly?) solution would be to add a feature to your bootloader to > modify the device tree to mark regulators as "always-on". Since the > booloader gets to touch the device tree and the bootloader is involved > in communicating into about SimpleFB, it kinda makes sense. That would seem to make sense, yes - we're apparently communicating this as a virtual device so we should make sure that virtual device has the resources it needs either directly or by reference to other devices so the driver can keep them on. Ideally we'd be doing this with fallback compatibles or something but this will probably work OK. 
I'd expect we're also going to run into the same problems with what people are currently doing with the SoC power domains, we also have code to power them off when they're idle, and this whole performance with adding hacks isn't going to be robust or scale - it's essentially just praying that nothing turns off the resources we need as far as I can tell.
On Sun, Sep 07, 2014 at 05:19:03PM +0100, Tomasz Figa wrote: > At least for next 3.17-rc I'd suggest fixing this up in respective clock > driver and dropping the hack only after Exynos DRM patches are merged > and confirmed working. Whilst I'm sympathetic to people working to enable DRM, I think this is the right solution to the problem. The transition from simplefb to DRM shouldn't break display for a bunch of kernel revisions whilst the code is in flux. Will
Hello Will, On 09/08/2014 01:21 PM, Will Deacon wrote: > On Sun, Sep 07, 2014 at 05:19:03PM +0100, Tomasz Figa wrote: >> At least for next 3.17-rc I'd suggest fixing this up in respective clock >> driver and dropping the hack only after Exynos DRM patches are merged >> and confirmed working. > > Whilst I'm sympathetic to people working to enable DRM, I think this is > the right solution to the problem. The transition from simplefb to DRM > shouldn't break display for a bunch of kernel revisions whilst the code is > in flux. > Agreed, I'm preparing a patch-set to prevent the kernel to disable both the clocks and regulators needed for the display, I'll try to post it today or tomorrow at worst. > Will > Best regards, Javier
On Mon, Sep 8, 2014 at 12:21 PM, Will Deacon <will.deacon@arm.com> wrote: > On Sun, Sep 07, 2014 at 05:19:03PM +0100, Tomasz Figa wrote: >> At least for next 3.17-rc I'd suggest fixing this up in respective clock >> driver and dropping the hack only after Exynos DRM patches are merged >> and confirmed working. > > Whilst I'm sympathetic to people working to enable DRM, I think this is > the right solution to the problem. The transition from simplefb to DRM > shouldn't break display for a bunch of kernel revisions whilst the code is > in flux. I would go further. The kernel behaviour has changed, and we have to deal with platforms that assume the old behaviour. That means either defaulting to leaving enabled regulators/clocks alone unless there is a flag in the DT saying they can be power managed, or black listing platforms that are known to depend on the regulator being on. Updating the device tree must not be required to get the kernel to boot, but it is valid to require a DT upgrade to get better performance (battery life) out of the platform. g.
On Mon, Sep 08, 2014 at 12:55:29PM +0100, Javier Martinez Canillas wrote: > On 09/08/2014 01:21 PM, Will Deacon wrote: > > On Sun, Sep 07, 2014 at 05:19:03PM +0100, Tomasz Figa wrote: > >> At least for next 3.17-rc I'd suggest fixing this up in respective clock > >> driver and dropping the hack only after Exynos DRM patches are merged > >> and confirmed working. > > > > Whilst I'm sympathetic to people working to enable DRM, I think this is > > the right solution to the problem. The transition from simplefb to DRM > > shouldn't break display for a bunch of kernel revisions whilst the code is > > in flux. > > > > Agreed, I'm preparing a patch-set to prevent the kernel to disable both the > clocks and regulators needed for the display, I'll try to post it today or > tomorrow at worst. Thanks, Javier. Please CC me on the patch and I'll take it for a spin. Will
On Mon, Sep 08, 2014 at 01:20:11PM +0100, Grant Likely wrote: > On Mon, Sep 8, 2014 at 12:21 PM, Will Deacon <will.deacon@arm.com> wrote: > > Whilst I'm sympathetic to people working to enable DRM, I think this is > > the right solution to the problem. The transition from simplefb to DRM > > shouldn't break display for a bunch of kernel revisions whilst the code is > > in flux. > I would go further. The kernel behaviour has changed, and we have to > deal with platforms that assume the old behaviour. That means either > defaulting to leaving enabled regulators/clocks alone unless there is > a flag in the DT saying they can be power managed, or black listing > platforms that are known to depend on the regulator being on. For regulators there is essentially a flag in DT already - the regulators should not be described in DT if the OS isn't supposed to be managing them. > Updating the device tree must not be required to get the kernel to > boot, but it is valid to require a DT upgrade to get better > performance (battery life) out of the platform. This has got to be a blacklist then, and it seems like we've got to fix simplefb to actually support managing the resources it's using. The current plan does not seem at all sensible - we're talking about adding hacks in every subsystem that provides resources and bodging DTs in order to work around simplefb.
Hello Will, On 09/08/2014 03:49 PM, Mark Brown wrote: > On Mon, Sep 08, 2014 at 01:20:11PM +0100, Grant Likely wrote: >> On Mon, Sep 8, 2014 at 12:21 PM, Will Deacon <will.deacon@arm.com> wrote: > >> > Whilst I'm sympathetic to people working to enable DRM, I think this is >> > the right solution to the problem. The transition from simplefb to DRM >> > shouldn't break display for a bunch of kernel revisions whilst the code is >> > in flux. > >> I would go further. The kernel behaviour has changed, and we have to >> deal with platforms that assume the old behaviour. That means either >> defaulting to leaving enabled regulators/clocks alone unless there is >> a flag in the DT saying they can be power managed, or black listing >> platforms that are known to depend on the regulator being on. > > For regulators there is essentially a flag in DT already - the > regulators should not be described in DT if the OS isn't supposed to be > managing them. > >> Updating the device tree must not be required to get the kernel to >> boot, but it is valid to require a DT upgrade to get better >> performance (battery life) out of the platform. > > This has got to be a blacklist then, and it seems like we've got to fix > simplefb to actually support managing the resources it's using. The > current plan does not seem at all sensible - we're talking about adding > hacks in every subsystem that provides resources and bodging DTs in > order to work around simplefb. > Since many folks don't agree that hacking different subsystems is the way forward I'll hold the patches and don't post them. The sunxi thread [0] already shows how different people have strong opposite positions on the correct approach to handle this. For now you can just disable the tps65090 PMIC support by not enabling the CONFIG_REGULATOR_TPS65090 kconfig symbol on your kernel config. 
That will give you exactly the same behavior as before tps65090 support was added to the Snow DT in commit b16be76 ("ARM: dts: add tps65090 power regulator for exynos5250-snow"), which AFAIU was good enough for your workflow. Best regards, Javier [0]: https://www.mail-archive.com/linux-sunxi@googlegroups.com/msg06623.html
Javier, On Sun, Sep 7, 2014 at 11:09 PM, Javier Martinez Canillas <javier.martinez@collabora.co.uk> wrote: > Hello Doug, > > On 09/08/2014 06:36 AM, Doug Anderson wrote: >> >> One (ugly?) solution would be to add a feature to your bootloader to >> modify the device tree to mark regulators as "always-on". Since the >> booloader gets to touch the device tree and the bootloader is involved >> in communicating into about SimpleFB, it kinda makes sense. >> > > I can't say I like to mark the regulators as always-on on the DT and that's > why I copied the patch in the response instead of posting it as a proper patch > but I think relying in the bootloaders to modify the DT is not better. > > IMHO U-boot should only modify the strictly necessary like the /chosen branch > even though lately I've seen some attempts in the OMAP community to (ab)use > U-Boot's fdt command to mangle the DT before passing to the kernel in order to > support different Beagle Bone Black capes. So "simple-framebuffer" is added to the device tree here: https://chromium-review.googlesource.com/#/c/49358/2/board/samsung/smdk5250/smdk5250.c That's one of the two patches to build your own U-Boot for enabling simplefb. You'll notice that's not a super official thing. It's a "DO NOT SUBMIT" patch sitting in a gerrit code review server, so I wouldn't exactly call it a stable ABI that we can't break. It's not something shipping in real products and it's not even landed in a git tree (I suppose maybe someone somewhere landed it, but...). To me, that means that if someone is using that patch and it works for them, then that's great! If it stops working (possibly because it was making assumptions about the state of the kernel) then it should be fixed up. In this case, that patch really should be adding references to regulators (and possibly clocks) that are needed. 
Given that this patch is already reaching into the device tree to add the "simple-framebuffer" node, it doesn't seem unreasonable to say that it should be grabbing the proper references (or mark regulators as always-on). ...as always, though, remember that my opinion doesn't count for much. I also sympathize with the problems people are running into. :(
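What "grabbing the proper references" buys can be shown with a minimal model. The `supplies` key, `claim_supplies`, and `unclaimed` are hypothetical names invented for this sketch and do not describe the actual simple-framebuffer binding or the U-Boot patch; the point is only that a rail with a recorded user is no longer a candidate for being switched off as unused.

```python
def claim_supplies(fb_node, use_counts):
    """Record one user for every supply the framebuffer node references.

    fb_node is a dict with an optional "supplies" list (hypothetical key).
    use_counts maps regulator names to their current number of users.
    """
    for name in fb_node.get("supplies", ()):
        use_counts[name] = use_counts.get(name, 0) + 1
    return use_counts


def unclaimed(use_counts):
    """Names of regulators that still have no user, in sorted order."""
    return sorted(name for name, count in use_counts.items() if count == 0)


# fet1 and fet6 feed the backlight and panel; "other_rail" is made up.
counts = {"fet1": 0, "fet6": 0, "other_rail": 0}
claim_supplies({"supplies": ["fet1", "fet6"]}, counts)
print(unclaimed(counts))
```

Compared with marking the rails always-on, this keeps the dependency attached to the framebuffer node itself, so the rails become eligible for power management again once that node goes away.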
Grant, On Mon, Sep 8, 2014 at 5:20 AM, Grant Likely <grant.likely@secretlab.ca> wrote: > On Mon, Sep 8, 2014 at 12:21 PM, Will Deacon <will.deacon@arm.com> wrote: >> On Sun, Sep 07, 2014 at 05:19:03PM +0100, Tomasz Figa wrote: >>> At least for next 3.17-rc I'd suggest fixing this up in respective clock >>> driver and dropping the hack only after Exynos DRM patches are merged >>> and confirmed working. >> >> Whilst I'm sympathetic to people working to enable DRM, I think this is >> the right solution to the problem. The transition from simplefb to DRM >> shouldn't break display for a bunch of kernel revisions whilst the code is >> in flux. > > I would go further. The kernel behaviour has changed, and we have to > deal with platforms that assume the old behaviour. That means either > defaulting to leaving enabled regulators/clocks alone unless there is > a flag in the DT saying they can be power managed, or black listing > platforms that are known to depend on the regulator being on. > > Updating the device tree must not be required to get the kernel to > boot, but it is valid to require a DT upgrade to get better > performance (battery life) out of the platform. In this case people using SImple FB are not really using an officially sanctioned device tree. The simple-fb fragment is created on the fly via a "DO NOT SUBMIT" patch sitting on a code review server. It's not something that's shipped with real firmware nor is it something present in the kernel. See <https://chromium-review.googlesource.com/#/c/49358/2/board/samsung/smdk5250/smdk5250.c> as I mentioned above. Is this really a device tree that we need to guarantee backward compatibility with? -Doug
On Mon, Sep 08, 2014 at 04:55:31PM +0100, Doug Anderson wrote: > So "simple-framebuffer" is added to the device tree here: > > https://chromium-review.googlesource.com/#/c/49358/2/board/samsung/smdk5250/smdk5250.c > > That's one of the two patches to build your own U-Boot for enabling > simplefb. You'll notice that's not a super official thing. It's a > "DO NOT SUBMIT" patch sitting in a gerrit code review server, so I > wouldn't exactly call it a stable ABI that we can't break. It's not > something shipping in real products and it's not even landed in a git > tree (I suppose maybe someone somewhere landed it, but...). I just took the uboot image linked to from the chromium.org page here: http://www.chromium.org/chromium-os/u-boot-porting-guide/using-nv-u-boot-on-the-samsung-arm-chromebook#TOC-Getting-nv-U-Boot Will
Will, On Mon, Sep 8, 2014 at 9:07 AM, Will Deacon <will.deacon@arm.com> wrote: > On Mon, Sep 08, 2014 at 04:55:31PM +0100, Doug Anderson wrote: >> So "simple-framebuffer" is added to the device tree here: >> >> https://chromium-review.googlesource.com/#/c/49358/2/board/samsung/smdk5250/smdk5250.c >> >> That's one of the two patches to build your own U-Boot for enabling >> simplefb. You'll notice that's not a super official thing. It's a >> "DO NOT SUBMIT" patch sitting in a gerrit code review server, so I >> wouldn't exactly call it a stable ABI that we can't break. It's not >> something shipping in real products and it's not even landed in a git >> tree (I suppose maybe someone somewhere landed it, but...). > > I just took the uboot image linked to from the chromium.org page here: > > http://www.chromium.org/chromium-os/u-boot-porting-guide/using-nv-u-boot-on-the-samsung-arm-chromebook#TOC-Getting-nv-U-Boot Ah, OK. It's still using the "DO NOT SUBMIT" patches, but I guess given that there is a binary built up there by a fairly official source... Hrmm. I think Olof is the one that built that. Perhaps he'd be willing to muck with that and see if he can grab the regulators? -Doug
On Mon, Sep 8, 2014 at 4:58 PM, Doug Anderson <dianders@chromium.org> wrote: > Grant, > > On Mon, Sep 8, 2014 at 5:20 AM, Grant Likely <grant.likely@secretlab.ca> wrote: >> On Mon, Sep 8, 2014 at 12:21 PM, Will Deacon <will.deacon@arm.com> wrote: >>> On Sun, Sep 07, 2014 at 05:19:03PM +0100, Tomasz Figa wrote: >>>> At least for next 3.17-rc I'd suggest fixing this up in respective clock >>>> driver and dropping the hack only after Exynos DRM patches are merged >>>> and confirmed working. >>> >>> Whilst I'm sympathetic to people working to enable DRM, I think this is >>> the right solution to the problem. The transition from simplefb to DRM >>> shouldn't break display for a bunch of kernel revisions whilst the code is >>> in flux. >> >> I would go further. The kernel behaviour has changed, and we have to >> deal with platforms that assume the old behaviour. That means either >> defaulting to leaving enabled regulators/clocks alone unless there is >> a flag in the DT saying they can be power managed, or black listing >> platforms that are known to depend on the regulator being on. >> >> Updating the device tree must not be required to get the kernel to >> boot, but it is valid to require a DT upgrade to get better >> performance (battery life) out of the platform. > > In this case people using SImple FB are not really using an officially > sanctioned device tree. The simple-fb fragment is created on the fly > via a "DO NOT SUBMIT" patch sitting on a code review server. It's not > something that's shipped with real firmware nor is it something > present in the kernel. See > <https://chromium-review.googlesource.com/#/c/49358/2/board/samsung/smdk5250/smdk5250.c> > as I mentioned above. > > Is this really a device tree that we need to guarantee backward > compatibility with? Well, lets see... We've got a real user complaining about a platform that used to work on mainline, and no longer does. 
The only loophole for ignoring breakage is if nobody cares that it is broken. That currently isn't the case. So even though it's based on a patch that has "DO NOT SUBMIT" in large friendly letters on the front cover, it doesn't change the situation that mainline has a regression. g.
Hi, Been travelling I'm buried in email, so a bit slow at responding. On Mon, Sep 8, 2014 at 12:40 PM, Grant Likely <grant.likely@secretlab.ca> wrote: > On Mon, Sep 8, 2014 at 4:58 PM, Doug Anderson <dianders@chromium.org> wrote: >> Grant, >> >> On Mon, Sep 8, 2014 at 5:20 AM, Grant Likely <grant.likely@secretlab.ca> wrote: >>> On Mon, Sep 8, 2014 at 12:21 PM, Will Deacon <will.deacon@arm.com> wrote: >>>> On Sun, Sep 07, 2014 at 05:19:03PM +0100, Tomasz Figa wrote: >>>>> At least for next 3.17-rc I'd suggest fixing this up in respective clock >>>>> driver and dropping the hack only after Exynos DRM patches are merged >>>>> and confirmed working. >>>> >>>> Whilst I'm sympathetic to people working to enable DRM, I think this is >>>> the right solution to the problem. The transition from simplefb to DRM >>>> shouldn't break display for a bunch of kernel revisions whilst the code is >>>> in flux. >>> >>> I would go further. The kernel behaviour has changed, and we have to >>> deal with platforms that assume the old behaviour. That means either >>> defaulting to leaving enabled regulators/clocks alone unless there is >>> a flag in the DT saying they can be power managed, or black listing >>> platforms that are known to depend on the regulator being on. >>> >>> Updating the device tree must not be required to get the kernel to >>> boot, but it is valid to require a DT upgrade to get better >>> performance (battery life) out of the platform. >> >> In this case people using SImple FB are not really using an officially >> sanctioned device tree. The simple-fb fragment is created on the fly >> via a "DO NOT SUBMIT" patch sitting on a code review server. It's not >> something that's shipped with real firmware nor is it something >> present in the kernel. See >> <https://chromium-review.googlesource.com/#/c/49358/2/board/samsung/smdk5250/smdk5250.c> >> as I mentioned above. >> >> Is this really a device tree that we need to guarantee backward >> compatibility with? 
> > Well, lets see... We've got a real user complaining about a platform > that used to work on mainline, and no longer does. The only loophole > for ignoring breakage is if there nobody cares that it is broken. That > currently isn't the case. So even though it's based on a patch that > has "DO NOT SUBMIT" in large friendly letters on the front cover, it > doesn't change the situation that mainline has a regression. Yeah, I'm with you on this, Grant; it doesn't matter what the patch is labelled as. For extra added complication, the firmware that is referenced above isn't what most people use; they use another binary, built by someone I don't even know, that boots the kernel in HYP mode. I expect the ARM guys to be using that version since they make use of KVM, etc. One way to deal with this could be to add a quirk at boot time: look for the simplefb node and, if found, modify the regulators to keep them on. That'd go in the kernel, not in firmware. Much better would have been if the DRM changes worked when they landed, so that the migration from simplefb to DRM was invisible to the user. Or at least, to get them working ASAP since they're still broken. :( -Olof
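The boot-time quirk Olof suggests could look roughly like the following. This is a sketch over plain dictionaries, not a real device tree API: `apply_simplefb_quirk` and the data layout are hypothetical, the default regulator list is taken from the fet1/fet6 rails named earlier in the thread, and `ldo1` is a made-up name for an unrelated rail.

```python
def apply_simplefb_quirk(nodes, regulators, needed=("fet1", "fet6")):
    """Pin display regulators when a simple-framebuffer node is present.

    nodes is a list of dicts, each with a "compatible" list of strings.
    regulators maps a regulator name to its property dict.
    Returns the names of the regulators that were newly marked always-on.
    """
    has_simplefb = any(
        "simple-framebuffer" in node.get("compatible", ()) for node in nodes
    )
    if not has_simplefb:
        return []  # no firmware framebuffer, so leave the tree untouched
    pinned = []
    for name in needed:
        props = regulators.get(name)
        if props is not None and not props.get("regulator-always-on", False):
            props["regulator-always-on"] = True
            pinned.append(name)
    return pinned


tree = [{"compatible": ["simple-framebuffer"]}]
rails = {"fet1": {}, "fet6": {}, "ldo1": {}}
print(apply_simplefb_quirk(tree, rails))
```

Keying the quirk on the presence of the simplefb node means systems booted without the firmware framebuffer keep normal power management for the same rails.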
On Wed, Sep 10, 2014 at 06:06:46AM -0700, Olof Johansson wrote: > On Mon, Sep 8, 2014 at 12:40 PM, Grant Likely <grant.likely@secretlab.ca> wrote: > > Well, lets see... We've got a real user complaining about a platform > > that used to work on mainline, and no longer does. The only loophole > > for ignoring breakage is if there nobody cares that it is broken. That > > currently isn't the case. So even though it's based on a patch that > > has "DO NOT SUBMIT" in large friendly letters on the front cover, it > > doesn't change the situation that mainline has a regression. > Yeah, I'm with you on this Grant, it doesn't matter what the patch is > labelled as. > One way to deal with this could be to add a quirk at boot time -- > looking for the simplefb and if found, modifies the regulators to keep > them on. That'd go in the kernel, not in firmware. Well, we should also be fixing simplefb to manage the resources it uses though that doesn't clean up after the broken DTs that are currently deployed. As well as the regulators we'll also need to fix the clocks. If we're going to start adding these fixups perhaps we want to consider having a wrapper stage that deals with rewriting DTs prior to trying to use them? I'm not sure if it makes much difference but there's overlap with other tools like the ATAGs conversion wrapper and building separately would let the fixup code run early without directly going into the early init code (which seems a bit scary). > Much better would have been if the DRM changes worked when they > landed, so that the migration form simplefb to drm was invisible to > the user. Or at least, to get them working ASAP since they're still > broken. :( As far as I can tell the problem here is coming from the decision to have simplefb use resources without knowing about them - can we agree that this is a bad idea?
On Wed, Sep 10, 2014 at 3:31 PM, Mark Brown <broonie@kernel.org> wrote: > On Wed, Sep 10, 2014 at 06:06:46AM -0700, Olof Johansson wrote: >> On Mon, Sep 8, 2014 at 12:40 PM, Grant Likely <grant.likely@secretlab.ca> wrote: > >> > Well, lets see... We've got a real user complaining about a platform >> > that used to work on mainline, and no longer does. The only loophole >> > for ignoring breakage is if there nobody cares that it is broken. That >> > currently isn't the case. So even though it's based on a patch that >> > has "DO NOT SUBMIT" in large friendly letters on the front cover, it >> > doesn't change the situation that mainline has a regression. > >> Yeah, I'm with you on this Grant, it doesn't matter what the patch is >> labelled as. > >> One way to deal with this could be to add a quirk at boot time -- >> looking for the simplefb and if found, modifies the regulators to keep >> them on. That'd go in the kernel, not in firmware. > > Well, we should also be fixing simplefb to manage the resources it uses > though that doesn't clean up after the broken DTs that are currently > deployed. > > As well as the regulators we'll also need to fix the clocks. If we're > going to start adding these fixups perhaps we want to consider having a > wrapper stage that deals with rewriting DTs prior to trying to use them? > I'm not sure if it makes much difference but there's overlap with other > tools like the ATAGs conversion wrapper and building separately would > let the fixup code run early without directly going into the early init > code (which seems a bit scary). > >> Much better would have been if the DRM changes worked when they >> landed, so that the migration form simplefb to drm was invisible to >> the user. Or at least, to get them working ASAP since they're still >> broken. :( > > As far as I can tell the problem here is coming from the decision to > have simplefb use resources without knowing about them - can we agree > that this is a bad idea? 
No, I don't think we can... there is a certain amount of "firmware got things working for us, and we're going to use it for a while" that is absolutely reasonable. simplefb is a good example, but there are certainly others. I /do/ think it would be better for the simplefb data to get embedded or linked into the node of the graphics controller so that it can be torn down appropriately, and we need a rule for how long boot-state can be considered valid so that a proper driver can either reserve the resources for a given SoC, or do a full handoff from the simplefb. Even without that though, we need to be able to handle the case of an anonymous simplefb node with no regulator information. If that means the default simplefb behaviour is to inhibit runtime PM on all resources until a real driver shows up, then that might just be what we need to do. Two things should probably be changed from the current setup. 1) simplefb shouldn't be a platform driver. It is a boot thing that handles initial state from the graphics chip. By implementing it as a platform driver, it prevents the real driver from binding to the real device if the simplefb data is embedded into it. 2) Make sure that an SoC driver can protect the needed resources before they are automatically disabled, either by putting them in an earlier initcall or by handling it in the subsystem code. I don't know enough about the regulator and clock runtime PM to know what the best way to do this is. g.
On Wed, Sep 10, 2014 at 03:56:16PM +0100, Grant Likely wrote: > On Wed, Sep 10, 2014 at 3:31 PM, Mark Brown <broonie@kernel.org> wrote: > > As far as I can tell the problem here is coming from the decision to > > have simplefb use resources without knowing about them - can we agree > > that this is a bad idea? > No, I don't think we can... there is a certain amount of "firmware got > things working for us, and we're going to use it for a while" that is > absolutely reasonable. simplefb is a good example, but there are > certainly others. That bit is fine - I definitely think it's reasonable to have things like this where the device is initialized prior to the kernel starting and we use some simplified subset. What I think is a big problem here is that we're not being told what parts of the system state are relevant to this initialization (worse, we're being told things that are actively wrong for some of the resources). This seems inherently fragile. > I /do/ think it would be better for the simplefb data to get embedded > or linked into the node of the graphics controller so that it can be > torn down appropriately, and we need a rule for how long boot-state > can be considered valid so that a proper driver can either reserve the > resources for a given SoC, or do a full handoff from the simplefb. > Even without that though, we need to be able to handle the case of an > anonymous simplefb node with no regulator information. If that means > the default simplefb behaviour is to inhibit runtime pm on all > resources until a real driver show up, then that might just be what we > need to do. I think saying that it's a good idea to have an simplefb node without resource management is exactly the problem here - if we start from the assumption that this is a good idea we do get dragged down this path but it seems like we took a wrong turn going that way in the first place. 
It's not just regulators - we've got exactly the same problem with clocks on this system for example, they're also getting disabled because they seem unused and users have to pass in a kernel command line bodge to avoid that. We'd also have an issue if something decided to change the rates of some of the clocks, and power domains have the same problem (Ulf's patches to genericise their code has the same behaviour with regard to powering off unused domains, some of the existing implementations do that already). > Two things should probably be changed from the current setup. 1) > simplefb shouldn't be a platform driver. It is a boot thing that > handles initial state from the graphics chip. By implementing it as a > platform driver, it prevents the real driver from binding to the real > device if the simplefb data embedded into it. 2) make sure that an SoC > driver can protect the needed resources before they are automatically > disabled. Either by putting them in an earlier initcall, or handling > it in the subsystem code. I don't know enough about the regulator and > clock runtime PM to know what the best way to do this is. Right, I agree with what you're saying here but what I'm saying is that the way to ensure that the resources are protected is for the simplefb node to tell the kernel what resources are being used, otherwise it seems like we're just guessing and will fall over ourselves sooner or later. We can't use initcall hacks as these only work in cases where we will at some point hand over to a real driver and there seems to be a clear use case for using simplefb prior to that driver being written; even where we will hand over to a real driver we can't put a definite timescale on that happening since in the distro case it might be being loaded from disk at some point after userspace is running. 
There's not a lot the subsystem can do without breaking other users or making the system substantially worse if we don't tell it what resources are affected; we'll end up being forced to tend too far towards being conservative about what we allow to happen.
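The approach Mark argues for here can be illustrated with a small userspace model. This is a sketch only, not kernel code: `struct supply`, `claim_declared()` and `sweep_unused()` are hypothetical names, and the supply `ldo9` is invented for the example (`fet1` and `fet6` are the tps65090 outputs mentioned earlier in the thread). The idea is that if the simplefb node lists the supplies it depends on, a stub driver can take a reference on each of them, and a later sweep of unused resources leaves them alone without any guessing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a supply (regulator or clock); not a kernel structure. */
struct supply {
    const char *name;
    bool enabled;   /* state left behind by firmware */
    int refs;       /* consumers the kernel knows about */
};

/*
 * Take a reference on every supply named in `declared`, as a stub driver
 * might do for resources listed in its device tree node.
 * Returns the number of declared names that matched a known supply.
 */
size_t claim_declared(struct supply *s, size_t n,
                      const char *const *declared, size_t ndecl)
{
    size_t claimed = 0;

    for (size_t d = 0; d < ndecl; d++) {
        for (size_t i = 0; i < n; i++) {
            if (strcmp(s[i].name, declared[d]) == 0) {
                s[i].refs++;
                claimed++;
            }
        }
    }
    return claimed;
}

/* Turn off every supply that is enabled but has no known consumer. */
void sweep_unused(struct supply *s, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (s[i].enabled && s[i].refs == 0)
            s[i].enabled = false;
    }
}
```

Calling `claim_declared()` with `"fet1"` and `"fet6"` before `sweep_unused()` keeps those two supplies enabled, while an undeclared supply that firmware left on is switched off. The model says nothing about rate or voltage changes, which Mark raises as a separate concern.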
On Wed, Sep 10, 2014 at 4:39 PM, Mark Brown <broonie@kernel.org> wrote: > On Wed, Sep 10, 2014 at 03:56:16PM +0100, Grant Likely wrote: >> On Wed, Sep 10, 2014 at 3:31 PM, Mark Brown <broonie@kernel.org> wrote: > >> > As far as I can tell the problem here is coming from the decision to >> > have simplefb use resources without knowing about them - can we agree >> > that this is a bad idea? > >> No, I don't think we can... there is a certain amount of "firmware got >> things working for us, and we're going to use it for a while" that is >> absolutely reasonable. simplefb is a good example, but there are >> certainly others. > > That bit is fine - I definitely think it's reasonable to have things > like this where the device is initialized prior to the kernel starting > and we use some simplified subset. What I think is a big problem here > is that we're not being told what parts of the system state are relevant > to this initialization (worse, we're being told things that are actively > wrong for some of the resources). This seems inherently fragile. > >> I /do/ think it would be better for the simplefb data to get embedded >> or linked into the node of the graphics controller so that it can be >> torn down appropriately, and we need a rule for how long boot-state >> can be considered valid so that a proper driver can either reserve the >> resources for a given SoC, or do a full handoff from the simplefb. >> Even without that though, we need to be able to handle the case of an >> anonymous simplefb node with no regulator information. If that means >> the default simplefb behaviour is to inhibit runtime pm on all >> resources until a real driver show up, then that might just be what we >> need to do. 
> > I think saying that it's a good idea to have an simplefb node without > resource management is exactly the problem here - if we start from the > assumption that this is a good idea we do get dragged down this path but > it seems like we took a wrong turn going that way in the first place. > > It's not just regulators - we've got exactly the same problem with > clocks on this system for example, they're also getting disabled because > they seem unused and users have to pass in a kernel command line bodge > to avoid that. We'd also have an issue if something decided to change > the rates of some of the clocks, and power domains have the same problem > (Ulf's patches to genericise their code has the same behaviour with > regard to powering off unused domains, some of the existing > implementations do that already). > >> Two things should probably be changed from the current setup. 1) >> simplefb shouldn't be a platform driver. It is a boot thing that >> handles initial state from the graphics chip. By implementing it as a >> platform driver, it prevents the real driver from binding to the real >> device if the simplefb data embedded into it. 2) make sure that an SoC >> driver can protect the needed resources before they are automatically >> disabled. Either by putting them in an earlier initcall, or handling >> it in the subsystem code. I don't know enough about the regulator and >> clock runtime PM to know what the best way to do this is. > > Right, I agree with what you're saying here but what I'm saying is that > the way to ensure that the resources are protected is for the simplefb > node to tell the kernel what resources are being used, otherwise it > seems like we're just guessing and will fall over ourselves sooner or > later. 
> > We can't use initcall hacks as these only work in cases where we will at > some point hand over to a real driver and there seems to be a clear use > case for using simplefb prior to that driver being written; even where > we will hand over to a real driver we can't put a definite timescale on > that happening since in the distro case it might be being loaded from > disk at some point after userspace is running. What we can do is have an inhibit flag for simplefb/simpleuart/simplewhatever that holds off PM. When a real driver, or a stub that understands parsing the resource dependencies, takes ownership of the device (or userspace tells the kernel to stop caring) it can clear the inhibit. I don't want to build knowledge of resource dependencies into the simple case. We'll simply frequently get it wrong. For example: A future kernel will have better PM and will turn off more devices which isn't accounted for in an older DT. g.
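Grant's inhibit-flag proposal can be sketched in the same spirit. The following is a userspace model under stated assumptions, not an existing kernel interface: `pm_inhibit_get()`, `pm_inhibit_put()` and `disable_unused()` are hypothetical names. While any simple stub (simplefb, simpleuart, ...) holds the inhibit, the sweep of unused resources does nothing; once the last holder releases it, the sweep behaves as usual.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a power resource (clock, regulator, power domain). */
struct resource {
    const char *name;
    bool enabled;    /* state left by firmware or set by a consumer */
    int use_count;   /* consumers known to the kernel */
};

/* Hypothetical inhibit counter, held while a firmware-configured stub is in use. */
static int pm_inhibit_count;

void pm_inhibit_get(void)
{
    pm_inhibit_count++;
}

void pm_inhibit_put(void)
{
    if (pm_inhibit_count > 0)
        pm_inhibit_count--;
}

/*
 * Disable enabled-but-unreferenced resources unless the inhibit is held.
 * Returns the number of resources that were switched off.
 */
size_t disable_unused(struct resource *res, size_t n)
{
    size_t disabled = 0;

    if (pm_inhibit_count > 0)
        return 0;

    for (size_t i = 0; i < n; i++) {
        if (res[i].enabled && res[i].use_count == 0) {
            res[i].enabled = false;
            disabled++;
        }
    }
    return disabled;
}
```

A counter is used rather than a single boolean so that several stubs can hold the inhibit independently. The trade-off discussed in the replies still applies: a system that never moves past the stub never releases the inhibit, so nothing unused is ever powered down.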
On Wed, Sep 10, 2014 at 7:31 AM, Mark Brown <broonie@kernel.org> wrote: > On Wed, Sep 10, 2014 at 06:06:46AM -0700, Olof Johansson wrote: >> On Mon, Sep 8, 2014 at 12:40 PM, Grant Likely <grant.likely@secretlab.ca> wrote: > >> > Well, lets see... We've got a real user complaining about a platform >> > that used to work on mainline, and no longer does. The only loophole >> > for ignoring breakage is if there nobody cares that it is broken. That >> > currently isn't the case. So even though it's based on a patch that >> > has "DO NOT SUBMIT" in large friendly letters on the front cover, it >> > doesn't change the situation that mainline has a regression. > >> Yeah, I'm with you on this Grant, it doesn't matter what the patch is >> labelled as. > >> One way to deal with this could be to add a quirk at boot time -- >> looking for the simplefb and if found, modifies the regulators to keep >> them on. That'd go in the kernel, not in firmware. > > Well, we should also be fixing simplefb to manage the resources it uses > though that doesn't clean up after the broken DTs that are currently > deployed. > > As well as the regulators we'll also need to fix the clocks. If we're > going to start adding these fixups perhaps we want to consider having a > wrapper stage that deals with rewriting DTs prior to trying to use them? > I'm not sure if it makes much difference but there's overlap with other > tools like the ATAGs conversion wrapper and building separately would > let the fixup code run early without directly going into the early init > code (which seems a bit scary). Yes, having a stage that fixes up broken device trees makes a lot of sense. It can likely be plugged into the machine descriptor today per platform, since I think most things we have going on right now are platform-specific quirks. I'm strongly against doing this outside of the kernel, since they're closely tied together today. 
We've always had the quirk tables for devices in the kernel, and we used to do this a long time ago on powerpc as well (we did it before we built the flat DT out of the OF equivalent there, most of the time). >> Much better would have been if the DRM changes worked when they >> landed, so that the migration form simplefb to drm was invisible to >> the user. Or at least, to get them working ASAP since they're still >> broken. :( > > As far as I can tell the problem here is coming from the decision to > have simplefb use resources without knowing about them - can we agree > that this is a bad idea? As already argued, there are good reasons to sometimes allow this, as long as it can be expected that it's something that's just used during early boot. For example, having DEBUG_LL output on a pre-mapped framebuffer could be really useful. Once DRM comes up, it'll tear down the existing one. -Olof
Grant, On Wed, Sep 10, 2014 at 9:29 AM, Grant Likely <grant.likely@secretlab.ca> wrote: > On Wed, Sep 10, 2014 at 4:39 PM, Mark Brown <broonie@kernel.org> wrote: >> On Wed, Sep 10, 2014 at 03:56:16PM +0100, Grant Likely wrote: >>> On Wed, Sep 10, 2014 at 3:31 PM, Mark Brown <broonie@kernel.org> wrote: >> >>> > As far as I can tell the problem here is coming from the decision to >>> > have simplefb use resources without knowing about them - can we agree >>> > that this is a bad idea? >> >>> No, I don't think we can... there is a certain amount of "firmware got >>> things working for us, and we're going to use it for a while" that is >>> absolutely reasonable. simplefb is a good example, but there are >>> certainly others. >> >> That bit is fine - I definitely think it's reasonable to have things >> like this where the device is initialized prior to the kernel starting >> and we use some simplified subset. What I think is a big problem here >> is that we're not being told what parts of the system state are relevant >> to this initialization (worse, we're being told things that are actively >> wrong for some of the resources). This seems inherently fragile. >> >>> I /do/ think it would be better for the simplefb data to get embedded >>> or linked into the node of the graphics controller so that it can be >>> torn down appropriately, and we need a rule for how long boot-state >>> can be considered valid so that a proper driver can either reserve the >>> resources for a given SoC, or do a full handoff from the simplefb. >>> Even without that though, we need to be able to handle the case of an >>> anonymous simplefb node with no regulator information. If that means >>> the default simplefb behaviour is to inhibit runtime pm on all >>> resources until a real driver show up, then that might just be what we >>> need to do. 
>> >> I think saying that it's a good idea to have an simplefb node without >> resource management is exactly the problem here - if we start from the >> assumption that this is a good idea we do get dragged down this path but >> it seems like we took a wrong turn going that way in the first place. >> >> It's not just regulators - we've got exactly the same problem with >> clocks on this system for example, they're also getting disabled because >> they seem unused and users have to pass in a kernel command line bodge >> to avoid that. We'd also have an issue if something decided to change >> the rates of some of the clocks, and power domains have the same problem >> (Ulf's patches to genericise their code has the same behaviour with >> regard to powering off unused domains, some of the existing >> implementations do that already). >> >>> Two things should probably be changed from the current setup. 1) >>> simplefb shouldn't be a platform driver. It is a boot thing that >>> handles initial state from the graphics chip. By implementing it as a >>> platform driver, it prevents the real driver from binding to the real >>> device if the simplefb data embedded into it. 2) make sure that an SoC >>> driver can protect the needed resources before they are automatically >>> disabled. Either by putting them in an earlier initcall, or handling >>> it in the subsystem code. I don't know enough about the regulator and >>> clock runtime PM to know what the best way to do this is. >> >> Right, I agree with what you're saying here but what I'm saying is that >> the way to ensure that the resources are protected is for the simplefb >> node to tell the kernel what resources are being used, otherwise it >> seems like we're just guessing and will fall over ourselves sooner or >> later. 
>> >> We can't use initcall hacks as these only work in cases where we will at >> some point hand over to a real driver and there seems to be a clear use >> case for using simplefb prior to that driver being written; even where >> we will hand over to a real driver we can't put a definite timescale on >> that happening since in the distro case it might be being loaded from >> disk at some point after userspace is running. > > What we can do is have an inhibit flag for > simplefb/simpleuart/simplewhatever that holds off PM. When a real > driver, or a stub that understands parsing the resource dependencies, > takes ownership of the device (or userspace tells the kernel to stop > caring) it can clear the inhibit. This doesn't seem crazy, though it means that if you're planning on using nothing but simplefb then you're never going to be able to get any power savings anywhere. Right now I know that clock disabling is supposed to be inhibited during the early boot process. I think regulators too? > I don't want to build knowledge of resource dependencies into the > simple case. We'll simply frequently get it wrong. For example: A > future kernel will have better PM and will turn off more devices which > isn't accounted for in an older DT. In IRC I made a suggestion that perhaps the "simplefb" ought to be put in the main in-kernel "dts" file but with no address information and set to "disabled". Then the firmware can do something very simple: set to enabled and fill in the address. Right now the firmware is taking a dts that it doesn't really own (it grabs it from the kernel FIT image) and making changes to it to add simplefb. That's inherently pretty fragile. If the kernel DTS file adds regulators then as we saw things break. I can imagine lots of other breakages, too. If the dts is in kernel then we can add regulator / clock references very easily. As regulators are added to the kernel dts file then the references can be added to simplefb. 
That doesn't solve the problem for people who have old copies of U-Boot. All of those people have already flashed a custom firmware though, so it doesn't seem unreasonable (to me) to solve their problem by giving them a new custom firmware. I agree that something needs to be done to help those people, but I don't feel that adding extra hacks to the kernel is the right way to help people who are clearly living on the bleeding edge. Note that I haven't actually worked with simplefb or the custom U-Boots, so perhaps there's something I'm missing... -Doug
On Wed, Sep 10, 2014 at 05:29:32PM +0100, Grant Likely wrote: > What we can do is have an inhibit flag for > simplefb/simpleuart/simplewhatever that holds off PM. When a real > driver, or a stub that understands parsing the resource dependencies, > takes ownership of the device (or userspace tells the kernel to stop > caring) it can clear the inhibit. It's not quite as simple as just disabling PM - for example in the clocks case we've also got to worry about what happens with rate changes (which is going to get more and more risky as we get smarter about being able to push configuration changes back up the tree), regulators have a similar thing with voltage changes. With simple enables and disables we have to worry about things like handling users who actively want to power things on and off but may potentially be sharing a resource with an undeclared dependency. If we are going to go with an approach like you suggest I think that rather than require a userspace notification that everything is OK we should have the stub drivers do something which causes the appropriate behaviour to happen so long as they're loaded. This means userspace doesn't need an update and ensures it doesn't have to worry about cases where we're using the stub driver at runtime due to a real driver not being available - we can figure this stuff out within the kernel ourselves. That said a kick from userspace when the first round of module loading has finished would be very helpful, I just don't think we should rely on it for this behaviour. > I don't want to build knowledge of resource dependencies into the > simple case. We'll simply frequently get it wrong. For example: A > future kernel will have better PM and will turn off more devices which > isn't accounted for in an older DT. That is tricky and there will be problems. 
Being fairly aggressive about doing these things, and avoiding runtime configuration hacks, helps since it makes it harder for people to introduce problems without noticing them; requiring an explicit request to do resource management at all is the most conservative option. Between them those strategies should help for anything that's getting tested at least, since they make it hard for the kernel to learn about a resource without it being handled safely from the get go.
On Wed, Sep 10, 2014 at 09:36:32AM -0700, Olof Johansson wrote: > On Wed, Sep 10, 2014 at 7:31 AM, Mark Brown <broonie@kernel.org> wrote: > > As well as the regulators we'll also need to fix the clocks. If we're > > going to start adding these fixups perhaps we want to consider having a > > wrapper stage that deals with rewriting DTs prior to trying to use them? > > I'm not sure if it makes much difference but there's overlap with other > > tools like the ATAGs conversion wrapper and building separately would > > let the fixup code run early without directly going into the early init > > code (which seems a bit scary). > I'm strongly against doing this outside of the kernel, since they're > closely tied together today. We've always had the quirk tables for > devices in the kernel, and we used to do this a long time ago on > powerpc as well (we did it before we built the flat DT out of the OF > equivalent there, most of the time). Indeed - sorry, the above wasn't adequately clear. I think that we should build this separately but keep it part of the kernel source. The split I was thinking of was purely technical. > > As far as I can tell the problem here is coming from the decision to > > have simplefb use resources without knowing about them - can we agree > > that this is a bad idea? > As already argued, there are good reasons to sometimes allow this, as > long as it can be expected that it's something that's just used during > early boot. For example, having DEBUG_LL output on a pre-mapped > framebuffer could be really useful. Once DRM comes up, it'll tear down > the existing one. The problem here seems to be that that just during early boot assumption isn't playing out so well...
On Wed, Sep 10, 2014 at 09:45:21AM -0700, Doug Anderson wrote: > Right now I know that clock disabling is supposed to be inhibited > during the early boot process. I think regulators too? No, for regulators we'll quite happily disable anything a consumer asks us to at any point but we'll only do a sweep for regulators that were enabled on startup then not subsequently referenced and turn them off in late_initcall.
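The two paths Mark describes, an explicit consumer request that takes effect at any time and a one-off sweep late in boot, can be modelled as follows. This is a simplified userspace sketch, not the regulator core: `struct reg_model` and the three functions are hypothetical names, and the `always_on` field stands in for the always-on constraint mentioned earlier in the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a regulator; not the kernel's own data structure. */
struct reg_model {
    const char *name;
    bool enabled;          /* current output state */
    bool always_on;        /* constraint: never switch this one off */
    bool ever_referenced;  /* some consumer has requested it since boot */
    int use_count;         /* outstanding consumer enables */
};

/* A consumer enable takes effect immediately, at any point after boot. */
void consumer_enable(struct reg_model *r)
{
    r->ever_referenced = true;
    r->use_count++;
    r->enabled = true;
}

/* A consumer disable also takes effect immediately once the count hits zero. */
void consumer_disable(struct reg_model *r)
{
    if (r->use_count > 0 && --r->use_count == 0 && !r->always_on)
        r->enabled = false;
}

/*
 * One-off sweep late in boot: switch off regulators that were left enabled
 * at startup and have never been referenced by any consumer.
 * Returns the number of regulators switched off.
 */
size_t late_sweep(struct reg_model *regs, size_t n)
{
    size_t off = 0;

    for (size_t i = 0; i < n; i++) {
        if (regs[i].enabled && !regs[i].ever_referenced && !regs[i].always_on) {
            regs[i].enabled = false;
            off++;
        }
    }
    return off;
}
```

In this model a supply that only simplefb depends on is exactly the kind of entry `late_sweep()` turns off, because no consumer ever referenced it. That matches the failure described at the start of the thread.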
Mark, On Wed, Sep 10, 2014 at 12:45 PM, Mark Brown <broonie@kernel.org> wrote: > On Wed, Sep 10, 2014 at 09:45:21AM -0700, Doug Anderson wrote: > >> Right now I know that clock disabling is supposed to be inhibited >> during the early boot process. I think regulators too? > > No, for regulators we'll quite happily disable anything a consumer asks > us to at any point but we'll only do a sweep for regulators that were > enabled on startup then not subsequently referenced and turn them off in > late_initcall. Ah, that sounds exactly like the clock framework then. I think I just didn't explain the clock framework properly. -Doug
On Wed, 10 Sep 2014 15:31:44 +0100, Mark Brown <broonie@kernel.org> wrote: > On Wed, Sep 10, 2014 at 06:06:46AM -0700, Olof Johansson wrote: > > On Mon, Sep 8, 2014 at 12:40 PM, Grant Likely <grant.likely@secretlab.ca> wrote: > > > > Well, lets see... We've got a real user complaining about a platform > > > that used to work on mainline, and no longer does. The only loophole > > > for ignoring breakage is if there nobody cares that it is broken. That > > > currently isn't the case. So even though it's based on a patch that > > > has "DO NOT SUBMIT" in large friendly letters on the front cover, it > > > doesn't change the situation that mainline has a regression. > > > Yeah, I'm with you on this Grant, it doesn't matter what the patch is > > labelled as. > > > One way to deal with this could be to add a quirk at boot time -- > > looking for the simplefb and if found, modifies the regulators to keep > > them on. That'd go in the kernel, not in firmware. > > Well, we should also be fixing simplefb to manage the resources it uses > though that doesn't clean up after the broken DTs that are currently > deployed. > > As well as the regulators we'll also need to fix the clocks. If we're > going to start adding these fixups perhaps we want to consider having a > wrapper stage that deals with rewriting DTs prior to trying to use them? > I'm not sure if it makes much difference but there's overlap with other > tools like the ATAGs conversion wrapper and building separately would > let the fixup code run early without directly going into the early init > code (which seems a bit scary). We've already got a dt fixup hook in the machine struct, created for exactly this reason. Fixing an incorrect DT provided by firmware: arch/arm/include/asm/mach/arch.h: struct machine_desc { ... void (*dt_fixup)(void); ... g.
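To show how such a hook could carry the quirk Olof suggested, here is a heavily simplified userspace model. It is an assumption-laden sketch rather than kernel code: the real `dt_fixup` hook takes no arguments and works on the flattened device tree, whereas this model passes an array of hypothetical `struct dt_node` entries so that it stays self-contained. `struct machine_model` and `simplefb_fixup()` are invented names; `simple-framebuffer` is the compatible string used by simplefb, and the `always_on` field models a `regulator-always-on` property.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device tree node. */
struct dt_node {
    const char *name;
    const char *compatible;  /* may be NULL */
    bool always_on;          /* models a "regulator-always-on" property */
};

/* Hypothetical stand-in for a machine descriptor with a fixup hook. */
struct machine_model {
    const char *name;
    void (*dt_fixup)(struct dt_node *nodes, size_t n);
};

/* Return true if any node carries the given compatible string. */
static bool has_compatible(const struct dt_node *nodes, size_t n,
                           const char *compat)
{
    for (size_t i = 0; i < n; i++) {
        if (nodes[i].compatible && strcmp(nodes[i].compatible, compat) == 0)
            return true;
    }
    return false;
}

/*
 * Example fixup: if firmware added a simple-framebuffer node, mark the
 * supplies it is known to depend on as always-on. The list of names is
 * board-specific knowledge baked into the quirk.
 */
void simplefb_fixup(struct dt_node *nodes, size_t n)
{
    static const char *const keep[] = { "fet1", "fet6" };

    if (!has_compatible(nodes, n, "simple-framebuffer"))
        return;

    for (size_t i = 0; i < n; i++) {
        for (size_t k = 0; k < sizeof(keep) / sizeof(keep[0]); k++) {
            if (strcmp(nodes[i].name, keep[k]) == 0)
                nodes[i].always_on = true;
        }
    }
}
```

A board would install the quirk by pointing its descriptor's `dt_fixup` member at `simplefb_fixup`. Trees without a simple-framebuffer node pass through unchanged, so the quirk costs nothing on systems that use a real display driver.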
On Wed, 10 Sep 2014 17:57:23 +0100, Mark Brown <broonie@kernel.org> wrote: > On Wed, Sep 10, 2014 at 05:29:32PM +0100, Grant Likely wrote: > > > What we can do is have an inhibit flag for > > simplefb/simpleuart/simplewhatever that holds off PM. When a real > > driver, or a stub that understands parsing the resource dependencies, > > takes ownership of the device (or userspace tells the kernel to stop > > caring) it can clear the inhibit. > > It's not quite as simple as just disabling PM - for example in the > clocks case we've also got to worry about what happens with rate changes > (which is going to get more and more risky as we get smarter about being > able to push configuration changes back up the tree), regulators have a > similar thing with voltage changes. With simple enables and disables we > have to worry about things like handling users who actively want to > power things on and and off but may potentially be sharing a resource > with an undeclared dependency. I think we can be okay with the above. This is a best-effort situation where we don't want to tear down how firmware has set up the board if it can be reasonably assumed that something depends on it (simplefb). However, if clocks or regulators are shared with other devices and those drivers ask for other settings, then there is simply no recourse. In that situation there must be a driver for the video device that takes care of any constraints. g.
On Thu, Sep 11, 2014 at 10:06:08AM +0100, Grant Likely wrote: > On Wed, 10 Sep 2014 15:31:44 +0100, Mark Brown <broonie@kernel.org> wrote: > > As well as the regulators we'll also need to fix the clocks. If we're > > going to start adding these fixups perhaps we want to consider having a > > wrapper stage that deals with rewriting DTs prior to trying to use them? > > I'm not sure if it makes much difference but there's overlap with other > > tools like the ATAGs conversion wrapper and building separately would > > let the fixup code run early without directly going into the early init > > code (which seems a bit scary). > We've already got a dt fixup hook in the machine struct, created for > exactly this reason. Fixing an incorrect DT provided by firmware: > arch/arm/include/asm/mach/arch.h: > struct machine_desc { > ... > void (*dt_fixup)(void); > ... Hrm, that's in the machine descriptor which doesn't seem the ideal place - something keying off machine ID would be nicer. But that's relatively speaking just detail.
On Thu, Sep 11, 2014 at 10:22:32AM +0100, Grant Likely wrote: > On Wed, 10 Sep 2014 17:57:23 +0100, Mark Brown <broonie@kernel.org> wrote: > > It's not quite as simple as just disabling PM - for example in the > > clocks case we've also got to worry about what happens with rate changes > > (which is going to get more and more risky as we get smarter about being > > able to push configuration changes back up the tree), regulators have a > > similar thing with voltage changes. With simple enables and disables we > > have to worry about things like handling users who actively want to > > power things on and and off but may potentially be sharing a resource > > with an undeclared dependency. > I think we can be okay with the above. This is a best-effort situation > where we don't want to tear down how firmware has set up the board if > it can be reasonably assumed that something depends on it (simplefb). > However, if clocks or regulators are shared with other devices and those > drivers ask for other settings, then there is simply no recourse. In > that situation there must be a driver for the video device that takes > care of any constraints. When things break I'm not sure that users are going to understand that something that used to work for them was only provided on a best effort basis, I think they will expect things to carry on working. It's not going to be great if enabling some driver for a device that happens to be in the same power domain as a component used in a framebuffer causes the display to vanish, or if better power management in an existing driver causes breakage. It's relatively OK to have a brief hiccup during boot but usage seems to have expanded beyond that point and I think we need to take robustness more seriously. Given that we have straightforward ways to communicate resource usage it seems sensible to add robustness to the system by making use of them.
Hi, On Thu, Sep 11, 2014 at 11:03 AM, Mark Brown <broonie@kernel.org> wrote: > On Thu, Sep 11, 2014 at 10:22:32AM +0100, Grant Likely wrote: >> On Wed, 10 Sep 2014 17:57:23 +0100, Mark Brown <broonie@kernel.org> wrote: > >> > It's not quite as simple as just disabling PM - for example in the >> > clocks case we've also got to worry about what happens with rate changes >> > (which is going to get more and more risky as we get smarter about being >> > able to push configuration changes back up the tree), regulators have a >> > similar thing with voltage changes. With simple enables and disables we >> > have to worry about things like handling users who actively want to >> > power things on and and off but may potentially be sharing a resource >> > with an undeclared dependency. > >> I think we can be okay with the above. This is a best-effort situation >> where we don't want to tear down how firmware has set up the board if >> it can be reasonably assumed that something depends on it (simplefb). >> However, if clocks or regulators are shared with other devices and those >> drivers ask for other settings, then there is simply no recourse. In >> that situation there must be a driver for the video device that takes >> care of any constraints. > > When things break I'm not sure that users are going to understand that > something that used to work for them was only provided on a best effort > basis, I think they will expect things to carry on working. Right. This is exactly what happened at the start of this thread. SimpleFB was working only on a best effort basis and then it stopped working. I agree that's pretty non-ideal. -Doug
On Wed, Sep 10, 2014 at 03:56:16PM +0100, Grant Likely wrote: > On Wed, Sep 10, 2014 at 3:31 PM, Mark Brown <broonie@kernel.org> wrote: > > On Wed, Sep 10, 2014 at 06:06:46AM -0700, Olof Johansson wrote: > >> On Mon, Sep 8, 2014 at 12:40 PM, Grant Likely <grant.likely@secretlab.ca> wrote: > > > >> > Well, lets see... We've got a real user complaining about a platform > >> > that used to work on mainline, and no longer does. The only loophole > >> > for ignoring breakage is if there nobody cares that it is broken. That > >> > currently isn't the case. So even though it's based on a patch that > >> > has "DO NOT SUBMIT" in large friendly letters on the front cover, it > >> > doesn't change the situation that mainline has a regression. > > > >> Yeah, I'm with you on this Grant, it doesn't matter what the patch is > >> labelled as. > > > >> One way to deal with this could be to add a quirk at boot time -- > >> looking for the simplefb and if found, modifies the regulators to keep > >> them on. That'd go in the kernel, not in firmware. > > > > Well, we should also be fixing simplefb to manage the resources it uses > > though that doesn't clean up after the broken DTs that are currently > > deployed. > > > > As well as the regulators we'll also need to fix the clocks. If we're > > going to start adding these fixups perhaps we want to consider having a > > wrapper stage that deals with rewriting DTs prior to trying to use them? > > I'm not sure if it makes much difference but there's overlap with other > > tools like the ATAGs conversion wrapper and building separately would > > let the fixup code run early without directly going into the early init > > code (which seems a bit scary). > > > >> Much better would have been if the DRM changes worked when they > >> landed, so that the migration form simplefb to drm was invisible to > >> the user. Or at least, to get them working ASAP since they're still > >> broken. 
:( > > > > As far as I can tell the problem here is coming from the decision to > > have simplefb use resources without knowing about them - can we agree > > that this is a bad idea? > > No, I don't think we can... there is a certain amount of "firmware got > things working for us, and we're going to use it for a while" that is > absolutely reasonable. simplefb is a good example, but there are > certainly others. > > I /do/ think it would be better for the simplefb data to get embedded > or linked into the node of the graphics controller so that it can be > torn down appropriately, and we need a rule for how long boot-state > can be considered valid so that a proper driver can either reserve the > resources for a given SoC, or do a full handoff from the simplefb. > Even without that though, we need to be able to handle the case of an > anonymous simplefb node with no regulator information. If that means > the default simplefb behaviour is to inhibit runtime pm on all > resources until a real driver show up, then that might just be what we > need to do. > > Two things should probably be changed from the current setup. 1) > simplefb shouldn't be a platform driver. It is a boot thing that > handles initial state from the graphics chip. By implementing it as a > platform driver, it prevents the real driver from binding to the real > device if the simplefb data embedded into it. 2) make sure that an SoC > driver can protect the needed resources before they are automatically > disabled. Either by putting them in an earlier initcall, or handling > it in the subsystem code. I don't know enough about the regulator and > clock runtime PM to know what the best way to do this is. I posted a patch[0] earlier to do this for the clock framework in "that other thread". The idea is that shim drivers for these types of firmware devices can tell the various subsystems that they might need resources that aren't explicitly requested. 
The current implementation simply uses the existing infrastructure already present for the clk_ignore_unused command-line argument and allows drivers to declare this requirement. It also allows these drivers to retire the request once they've properly handed off to the real driver.

Something similar could be done for other frameworks.

One of the objections to that in the other thread is that it won't prevent clocks from being disabled if some other driver was using those same clocks and doing a clk_enable()/clk_disable() on them. But quite frankly I don't think that's something we need to worry about.

Though there are two cases: one is to use simplefb as a means to have early boot messages on a graphical display (and optionally hand off to a real driver). The other is to use simplefb as the only framebuffer driver until a proper driver has been implemented. The latter would have the disadvantage of preventing unused resources from being garbage collected at all. Then again, I don't think power consumption is going to be a very big issue on hardware where no proper display driver is available.

Thierry

[0]: http://lists.infradead.org/pipermail/linux-arm-kernel/2014-September/291295.html
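To make the idea above concrete, here is a minimal userspace sketch of a "keep unused clocks" request that a shim driver can take and later retire. It is only a toy model: `shim_clk`, `shim_ignore_unused_get()`, `shim_ignore_unused_put()` and `shim_disable_unused()` are hypothetical names made up for illustration, not the interface from the patch in [0] and not the kernel's clk API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy clock: only what the late "disable unused" sweep needs to look at. */
struct shim_clk {
	int enable_count;	/* enables the framework knows about */
	bool hw_on;		/* gate state left behind by firmware */
};

/* Number of outstanding "please keep unused clocks" requests (hypothetical). */
static int shim_ignore_unused_requests;

/* A shim driver (e.g. for a firmware framebuffer) declares its requirement. */
static void shim_ignore_unused_get(void)
{
	shim_ignore_unused_requests++;
}

/* The shim driver retires the request once it has handed off to a real driver. */
static void shim_ignore_unused_put(void)
{
	if (shim_ignore_unused_requests > 0)
		shim_ignore_unused_requests--;
}

/*
 * Late sweep: gate every clock that is on but has no known user, unless
 * someone still holds a request. Returns the number of clocks gated.
 */
static size_t shim_disable_unused(struct shim_clk *clks, size_t n)
{
	size_t gated = 0;

	if (shim_ignore_unused_requests > 0)
		return 0;

	for (size_t i = 0; i < n; i++) {
		if (clks[i].enable_count == 0 && clks[i].hw_on) {
			clks[i].hw_on = false;
			gated++;
		}
	}
	return gated;
}
```

In this model the sweep does nothing while a request is held and gates the unclaimed clock once the request is retired. Note that it only covers the "unused" sweep; it says nothing about what happens when another driver enables and disables a clock that shares a parent.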
On Mon, Sep 29, 2014 at 1:57 PM, Thierry Reding <thierry.reding@gmail.com> wrote: > On Wed, Sep 10, 2014 at 03:56:16PM +0100, Grant Likely wrote: >> On Wed, Sep 10, 2014 at 3:31 PM, Mark Brown <broonie@kernel.org> wrote: >> > On Wed, Sep 10, 2014 at 06:06:46AM -0700, Olof Johansson wrote: >> >> On Mon, Sep 8, 2014 at 12:40 PM, Grant Likely <grant.likely@secretlab.ca> wrote: >> > >> >> > Well, lets see... We've got a real user complaining about a platform >> >> > that used to work on mainline, and no longer does. The only loophole >> >> > for ignoring breakage is if there nobody cares that it is broken. That >> >> > currently isn't the case. So even though it's based on a patch that >> >> > has "DO NOT SUBMIT" in large friendly letters on the front cover, it >> >> > doesn't change the situation that mainline has a regression. >> > >> >> Yeah, I'm with you on this Grant, it doesn't matter what the patch is >> >> labelled as. >> > >> >> One way to deal with this could be to add a quirk at boot time -- >> >> looking for the simplefb and if found, modifies the regulators to keep >> >> them on. That'd go in the kernel, not in firmware. >> > >> > Well, we should also be fixing simplefb to manage the resources it uses >> > though that doesn't clean up after the broken DTs that are currently >> > deployed. >> > >> > As well as the regulators we'll also need to fix the clocks. If we're >> > going to start adding these fixups perhaps we want to consider having a >> > wrapper stage that deals with rewriting DTs prior to trying to use them? >> > I'm not sure if it makes much difference but there's overlap with other >> > tools like the ATAGs conversion wrapper and building separately would >> > let the fixup code run early without directly going into the early init >> > code (which seems a bit scary). >> > >> >> Much better would have been if the DRM changes worked when they >> >> landed, so that the migration form simplefb to drm was invisible to >> >> the user. 
Or at least, to get them working ASAP since they're still >> >> broken. :( >> > >> > As far as I can tell the problem here is coming from the decision to >> > have simplefb use resources without knowing about them - can we agree >> > that this is a bad idea? >> >> No, I don't think we can... there is a certain amount of "firmware got >> things working for us, and we're going to use it for a while" that is >> absolutely reasonable. simplefb is a good example, but there are >> certainly others. >> >> I /do/ think it would be better for the simplefb data to get embedded >> or linked into the node of the graphics controller so that it can be >> torn down appropriately, and we need a rule for how long boot-state >> can be considered valid so that a proper driver can either reserve the >> resources for a given SoC, or do a full handoff from the simplefb. >> Even without that though, we need to be able to handle the case of an >> anonymous simplefb node with no regulator information. If that means >> the default simplefb behaviour is to inhibit runtime pm on all >> resources until a real driver show up, then that might just be what we >> need to do. >> >> Two things should probably be changed from the current setup. 1) >> simplefb shouldn't be a platform driver. It is a boot thing that >> handles initial state from the graphics chip. By implementing it as a >> platform driver, it prevents the real driver from binding to the real >> device if the simplefb data embedded into it. 2) make sure that an SoC >> driver can protect the needed resources before they are automatically >> disabled. Either by putting them in an earlier initcall, or handling >> it in the subsystem code. I don't know enough about the regulator and >> clock runtime PM to know what the best way to do this is. > > I posted a patch[0] earlier to do this for the clock framework in "that > other thread". 
The idea is that shim drivers for these types of firmware > devices can tell the various subsystems that they might need resources > that aren't explicitly requested. The current implementation simply uses > the existing infrastructure already present for the clk_ignore_unused > command-line argument and allows drivers to declare this requirement. It > also allows these drivers to retire the request once they've properly > handed off to the real driver. > > Something similar could be done other frameworks. > > One of the objections to that in the other thread is that it won't > prevent clocks from being disabled if some other driver was using those > same clocks and doing a clk_enable()/clk_disable() on them. But quite > frankly I don't think that's something we need to worry about. Agreed > Though there are two cases: one is to use simplefb as a means to have > early boot messages on a graphical display (and optionally hand off to a > real driver). The other is to use simplefb as the only framebuffer > driver until a proper driver has been implemented. The latter would have > the disadvantage of not allowing unused resources from being garbage > collected at all. Then again, I don't think power consumption is going > to be a very big issue on hardware where no proper display driver is > available. When simplefb is the only framebuffer to get a platform working, it is reasonable to have a placeholder driver that grabs the resources and nothing else. When a real driver is implemented, and merged, the placeholder driver should drop compatibility with the device node at the same time. g.
On Mon, Sep 29, 2014 at 02:12:43PM +0100, Grant Likely wrote: > On Mon, Sep 29, 2014 at 1:57 PM, Thierry Reding > > Though there are two cases: one is to use simplefb as a means to have > > early boot messages on a graphical display (and optionally hand off to a > > real driver). The other is to use simplefb as the only framebuffer > > driver until a proper driver has been implemented. The latter would have > > the disadvantage of not allowing unused resources from being garbage > > collected at all. Then again, I don't think power consumption is going > > to be a very big issue on hardware where no proper display driver is > > available. > When simplefb is the only framebuffer to get a platform working, it is > reasonable to have a placeholder driver that grabs the resources and > nothing else. When a real driver is implemented, and merged, the > placeholder driver should drop compatibility with the device node at > the same time. I'd thought there was some objection to doing this? It does seem like a sensible approach.
On Mon, Sep 29, 2014 at 02:57:19PM +0200, Thierry Reding wrote: > On Wed, Sep 10, 2014 at 03:56:16PM +0100, Grant Likely wrote: > > On Wed, Sep 10, 2014 at 3:31 PM, Mark Brown <broonie@kernel.org> wrote: > > > On Wed, Sep 10, 2014 at 06:06:46AM -0700, Olof Johansson wrote: > > >> On Mon, Sep 8, 2014 at 12:40 PM, Grant Likely <grant.likely@secretlab.ca> wrote: > > > > > >> > Well, lets see... We've got a real user complaining about a platform > > >> > that used to work on mainline, and no longer does. The only loophole > > >> > for ignoring breakage is if there nobody cares that it is broken. That > > >> > currently isn't the case. So even though it's based on a patch that > > >> > has "DO NOT SUBMIT" in large friendly letters on the front cover, it > > >> > doesn't change the situation that mainline has a regression. > > > > > >> Yeah, I'm with you on this Grant, it doesn't matter what the patch is > > >> labelled as. > > > > > >> One way to deal with this could be to add a quirk at boot time -- > > >> looking for the simplefb and if found, modifies the regulators to keep > > >> them on. That'd go in the kernel, not in firmware. > > > > > > Well, we should also be fixing simplefb to manage the resources it uses > > > though that doesn't clean up after the broken DTs that are currently > > > deployed. > > > > > > As well as the regulators we'll also need to fix the clocks. If we're > > > going to start adding these fixups perhaps we want to consider having a > > > wrapper stage that deals with rewriting DTs prior to trying to use them? > > > I'm not sure if it makes much difference but there's overlap with other > > > tools like the ATAGs conversion wrapper and building separately would > > > let the fixup code run early without directly going into the early init > > > code (which seems a bit scary). > > > > > >> Much better would have been if the DRM changes worked when they > > >> landed, so that the migration form simplefb to drm was invisible to > > >> the user. 
Or at least, to get them working ASAP since they're still > > >> broken. :( > > > > > > As far as I can tell the problem here is coming from the decision to > > > have simplefb use resources without knowing about them - can we agree > > > that this is a bad idea? > > > > No, I don't think we can... there is a certain amount of "firmware got > > things working for us, and we're going to use it for a while" that is > > absolutely reasonable. simplefb is a good example, but there are > > certainly others. > > > > I /do/ think it would be better for the simplefb data to get embedded > > or linked into the node of the graphics controller so that it can be > > torn down appropriately, and we need a rule for how long boot-state > > can be considered valid so that a proper driver can either reserve the > > resources for a given SoC, or do a full handoff from the simplefb. > > Even without that though, we need to be able to handle the case of an > > anonymous simplefb node with no regulator information. If that means > > the default simplefb behaviour is to inhibit runtime pm on all > > resources until a real driver show up, then that might just be what we > > need to do. > > > > Two things should probably be changed from the current setup. 1) > > simplefb shouldn't be a platform driver. It is a boot thing that > > handles initial state from the graphics chip. By implementing it as a > > platform driver, it prevents the real driver from binding to the real > > device if the simplefb data embedded into it. 2) make sure that an SoC > > driver can protect the needed resources before they are automatically > > disabled. Either by putting them in an earlier initcall, or handling > > it in the subsystem code. I don't know enough about the regulator and > > clock runtime PM to know what the best way to do this is. > > I posted a patch[0] earlier to do this for the clock framework in "that > other thread". 
The idea is that shim drivers for these types of firmware > devices can tell the various subsystems that they might need resources > that aren't explicitly requested. The current implementation simply uses > the existing infrastructure already present for the clk_ignore_unused > command-line argument and allows drivers to declare this requirement. It > also allows these drivers to retire the request once they've properly > handed off to the real driver. > > Something similar could be done other frameworks. > > One of the objections to that in the other thread is that it won't > prevent clocks from being disabled if some other driver was using those > same clocks and doing a clk_enable()/clk_disable() on them. But quite > frankly I don't think that's something we need to worry about.

That's not what has been said. What might happen is this:

  parent (gate)
    |
    +------> clock A (sound)
    |
    +------> clock B (display)

So, let's say that at boot we have the parent enabled and the display clock enabled, and that the display has been set up. We have a sound driver that is going to probe. If at *any* point in time the sound driver disables its clock, the clock framework will disable the parent clock as well, since there is no registered user left of the parent clock, effectively disabling the display clock. This can happen for various reasons: a failed probe, PM, whatever, or even if the sound clock is reparented.

Quite frankly, I think that's definitely something we need to worry about.

> Though there are two cases: one is to use simplefb as a means to have > early boot messages on a graphical display (and optionally hand off to a > real driver). The other is to use simplefb as the only framebuffer > driver until a proper driver has been implemented. The latter would have > the disadvantage of not allowing unused resources from being garbage > collected at all.
Then again, I don't think power consumption is going > to be a very big issue on hardware where no proper display driver is > available. Two use cases, one single driver, and a proper way to prevent all the issues your solution doesn't address, like the one we saw above, or preventing the clock rate to change. Maxime
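The shared-parent scenario described above can be reproduced with a small userspace model of enable refcounting. This is a sketch under stated assumptions: `toy_clk` and its helpers are invented for illustration and only mimic the general idea of propagating enable counts to the parent; they are not the kernel's clk implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of enable refcounting in a clock tree; not the kernel's clk API. */
struct toy_clk {
	const char *name;
	struct toy_clk *parent;
	int enable_count;	/* enables the framework knows about */
	bool hw_on;		/* actual gate state in hardware */
};

/* First enable of a clock also takes a reference on its parent. */
static void toy_clk_enable(struct toy_clk *clk)
{
	if (!clk)
		return;
	if (clk->enable_count++ == 0) {
		toy_clk_enable(clk->parent);
		clk->hw_on = true;
	}
}

/* Last disable gates the clock and drops the reference on its parent. */
static void toy_clk_disable(struct toy_clk *clk)
{
	if (!clk || clk->enable_count == 0)
		return;
	if (--clk->enable_count == 0) {
		clk->hw_on = false;
		toy_clk_disable(clk->parent);
	}
}

/* A clock only produces output if its own gate and all ancestors are on. */
static bool toy_clk_running(const struct toy_clk *clk)
{
	for (; clk; clk = clk->parent)
		if (!clk->hw_on)
			return false;
	return true;
}

/*
 * Firmware leaves parent and display running. The sound driver then does
 * one enable/disable cycle. Returns whether the display clock still runs.
 */
static bool display_survives_sound_cycle(bool display_claims_clock)
{
	struct toy_clk parent  = { "parent",  NULL,    0, true  };
	struct toy_clk sound   = { "sound",   &parent, 0, false };
	struct toy_clk display = { "display", &parent, 0, true  };

	if (display_claims_clock)
		toy_clk_enable(&display);	/* a driver that declares its resource */

	toy_clk_enable(&sound);
	toy_clk_disable(&sound);		/* failed probe, PM, ... */

	return toy_clk_running(&display);
}
```

With `display_claims_clock` false, the sound driver's disable drops the parent's count to zero and the parent gate goes off, taking the display clock with it. With it true, the parent keeps a reference and the display keeps running. The model ignores rate changes and reparenting.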
On Mon, Sep 29, 2014 at 02:12:43PM +0100, Grant Likely wrote: > On Mon, Sep 29, 2014 at 1:57 PM, Thierry Reding <thierry.reding@gmail.com> wrote: > > Though there are two cases: one is to use simplefb as a means to have > > early boot messages on a graphical display (and optionally hand off to a > > real driver). The other is to use simplefb as the only framebuffer > > driver until a proper driver has been implemented. The latter would have > > the disadvantage of not allowing unused resources from being garbage > > collected at all. Then again, I don't think power consumption is going > > to be a very big issue on hardware where no proper display driver is > > available. > > When simplefb is the only framebuffer to get a platform working, it is > reasonable to have a placeholder driver that grabs the resources and > nothing else. When a real driver is implemented, and merged, the > placeholder driver should drop compatibility with the device node at > the same time. You mean the device node for the real device should be compatible with "simplefb"? One problem I see with that is that there may be multiple dummy drivers for different pieces of hardware, all of them binding to the simplefb compatible and conflicting. Also this assumes that a device tree node exists for the device. One of the reasons for using simplefb is so that you don't have to write that device tree node and its binding yet. Presumably, though, if the firmware already knows what resources are needed and generate them at runtime it should be possible to come up with a static device tree node, too. Thierry
diff --git a/arch/arm/boot/dts/exynos5250-snow.dts b/arch/arm/boot/dts/exynos5250-snow.dts
index 2a62459..6a29b44 100644
--- a/arch/arm/boot/dts/exynos5250-snow.dts
+++ b/arch/arm/boot/dts/exynos5250-snow.dts
@@ -196,6 +196,7 @@
 			};
 			fet1 {
 				regulator-name = "vcd_led";
+				regulator-always-on;
 				ti,overcurrent-wait = <3>;
 			};
 			tps65090_fet2: fet2 {
@@ -219,6 +220,7 @@
 			};
 			fet6 {
 				regulator-name = "lcd_vdd";
+				regulator-always-on;
 				ti,overcurrent-wait = <3>;
 			};
 			tps65090_fet7: fet7 {
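As a rough illustration of why the patch above keeps the display supplies alive, the following userspace sketch models a late-boot sweep that switches off supplies nobody has claimed. It assumes the regulator core skips supplies marked always-on during such a sweep; `toy_regulator` and `toy_disable_unused_regulators()` are hypothetical names for this model, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a regulator; field names are illustrative, not the kernel's. */
struct toy_regulator {
	const char *name;
	bool always_on;	/* set when the DT node carries regulator-always-on */
	int use_count;	/* consumers that have enabled the supply */
	bool enabled;	/* state left behind by the bootloader */
};

/*
 * Late-boot cleanup: switch off every supply that is on, has no consumer,
 * and is not marked always-on. Returns how many supplies were switched off.
 */
static size_t toy_disable_unused_regulators(struct toy_regulator *regs, size_t n)
{
	size_t disabled = 0;

	for (size_t i = 0; i < n; i++) {
		if (!regs[i].enabled || regs[i].use_count > 0 || regs[i].always_on)
			continue;
		regs[i].enabled = false;
		disabled++;
	}
	return disabled;
}
```

In this model, "vcd_led" and "lcd_vdd" marked always-on survive the sweep even though simplefb never claims them, while an unclaimed supply without the property is switched off.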