Message ID | 1427730803-28635-3-git-send-email-javier.martinez@collabora.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Javier, Please see my comments inline. 2015-03-31 0:53 GMT+09:00 Javier Martinez Canillas <javier.martinez@collabora.co.uk>: [snip] > diff --git a/drivers/clk/samsung/clk-exynos5420.c b/drivers/clk/samsung/clk-exynos5420.c > index 07d666cc6a29..2d39b629144a 100644 > --- a/drivers/clk/samsung/clk-exynos5420.c > +++ b/drivers/clk/samsung/clk-exynos5420.c > @@ -151,6 +151,7 @@ enum exynos5x_plls { > > static void __iomem *reg_base; > static enum exynos5x_soc exynos5x_soc; > +struct samsung_clk_provider *ctx; static > > #ifdef CONFIG_PM_SLEEP > static struct samsung_clk_reg_dump *exynos5x_save; > @@ -275,8 +276,18 @@ static const struct samsung_clk_reg_dump exynos5420_set_clksrc[] = { > { .offset = GATE_IP_PERIC, .value = 0xffffffff, }, > }; > > +/* > + * list of clocks that have to be kept enabled during suspend/resume cycle. > + */ > +static unsigned int exynos5x_clk_suspend[] = { static const > + CLK_MDMA0, > +}; > + > static int exynos5420_clk_suspend(void) > { > + int i; > + struct clk *clk; > + > samsung_clk_save(reg_base, exynos5x_save, > ARRAY_SIZE(exynos5x_clk_regs)); > > @@ -287,11 +298,24 @@ static int exynos5420_clk_suspend(void) > samsung_clk_restore(reg_base, exynos5420_set_clksrc, > ARRAY_SIZE(exynos5420_set_clksrc)); > > + for (i = 0; i < ARRAY_SIZE(exynos5x_clk_suspend); i++) { > + clk = samsung_clk_lookup(ctx, exynos5x_clk_suspend[i]); If look-up speed is important here, maybe all the relevant clocks could be looked up once at initialization time and just prepared and enabled here? Best regards, Tomasz
Hello Tomasz, Thanks a lot for your feedback. On 03/30/2015 06:07 PM, Tomasz Figa wrote: > Hi Javier, > > Please see my comments inline. > > 2015-03-31 0:53 GMT+09:00 Javier Martinez Canillas > <javier.martinez@collabora.co.uk>: > [snip] >> diff --git a/drivers/clk/samsung/clk-exynos5420.c b/drivers/clk/samsung/clk-exynos5420.c >> index 07d666cc6a29..2d39b629144a 100644 >> --- a/drivers/clk/samsung/clk-exynos5420.c >> +++ b/drivers/clk/samsung/clk-exynos5420.c >> @@ -151,6 +151,7 @@ enum exynos5x_plls { >> >> static void __iomem *reg_base; >> static enum exynos5x_soc exynos5x_soc; >> +struct samsung_clk_provider *ctx; > > static > Ok. >> >> #ifdef CONFIG_PM_SLEEP >> static struct samsung_clk_reg_dump *exynos5x_save; >> @@ -275,8 +276,18 @@ static const struct samsung_clk_reg_dump exynos5420_set_clksrc[] = { >> { .offset = GATE_IP_PERIC, .value = 0xffffffff, }, >> }; >> >> +/* >> + * list of clocks that have to be kept enabled during suspend/resume cycle. >> + */ >> +static unsigned int exynos5x_clk_suspend[] = { > > static const > Ok. >> + CLK_MDMA0, >> +}; >> + >> static int exynos5420_clk_suspend(void) >> { >> + int i; >> + struct clk *clk; >> + >> samsung_clk_save(reg_base, exynos5x_save, >> ARRAY_SIZE(exynos5x_clk_regs)); >> >> @@ -287,11 +298,24 @@ static int exynos5420_clk_suspend(void) >> samsung_clk_restore(reg_base, exynos5420_set_clksrc, >> ARRAY_SIZE(exynos5420_set_clksrc)); >> >> + for (i = 0; i < ARRAY_SIZE(exynos5x_clk_suspend); i++) { >> + clk = samsung_clk_lookup(ctx, exynos5x_clk_suspend[i]); > > If look-up speed is important here, maybe all the relevant clocks > could be looked up once at initialization time and just prepared and > enabled here? > Yes, I'll do that indeed. In fact, I was wondering if we should let this clock be disabled at all. I noticed that the rockchip clk drivers do something similar and prepare / enable a list of clocks at init time [0,1]. Unfortunately I don't fully understand why this clock needs to be enabled. It would be good if someone at Samsung can explain in more detail what the real problem really is. > Best regards, > Tomasz > Best regards, Javier [0]: http://lxr.free-electrons.com/source/drivers/clk/rockchip/clk.c#L320 [1]: http://lxr.free-electrons.com/source/drivers/clk/rockchip/clk-rk3288.c#L874
Hello Abhilash, On 03/31/2015 04:38 PM, Abhilash Kesavan wrote: > javier.martinez@collabora.co.uk> wrote: >> On 03/30/2015 06:07 PM, Tomasz Figa wrote: >> > >> > If look-up speed is important here, maybe all the relevant clocks >> > could be looked up once at initialization time and just prepared and >> > enabled here? >> > >> >> Yes, I'll do that indeed. >> >> In fact, I was wondering if we should let this clock be disabled at >> all. I noticed that the rockchip clk drivers do something similar >> and prepare / enable a list of clocks at init time [0,1]. >> >> Unfortunately I don't fully understand why this clock needs to be >> enabled. It would be good if someone at Samsung can explain in >> more detail what the real problem really is. >> > > I had a look at this some more today. The problem actually occurs when the > mdma0 clock's parent - aclk266_g2d gets disabled. The run-time pm support > in the dma driver disables mdma0 and in turn aclk266_g2d which causes the > issue. > From the User Manual, it appears that aclk266_g2d should be gated only when > certain bits in the clock gating status register are 0. I cannot say for > certain, but our gating the aclk266_g2d clock without the CG_STATUS bits > being 0 could be a cause of the suspend failure. > Thanks a lot for the explanation. I see the NOTE at the bottom of section 7.9.1.159 CLK_GATE_BUS_TOP that mentions that. I'll add this information to the commit message when posting as a proper patch instead of a RFC. I confirmed that changing the patch to prevent "aclk266_g2d" to be gated instead of "mdm0" also makes the system to resume correctly from suspend so I'll change that on the patch as well. I see that many of the Exynos5420 clocks (including "aclk266_g2d") use the CLK_IGNORE_UNUSED flag but AFAIU it only prevents the common clock framework to disable the clocks on init but doesn't prevent the clocks to be disabled if all the clock childs are gated so the parent is gated as well. > As the CG_STATUS bits are not being checked anywhere in the kernel I think > aclk266_g2d (and others in GATE_BUS_TOP) should not be gated. I am OK with For now I'll just add "aclk266_g2d" but later if needed all the GATE_BUS_TOP clocks (and others) that should only be gated when CG_STATUS is 0 can be added. My patch iterates over a list of clocks to be kept during suspend even when there is only one for now so adding more later if needed will be trivial. Or do you think that I should add all the GATE_BUS_TOP clocks now? > the suspend specific approach you have posted. > Great, I'll wait a couple of days to see if others have more comments about the last RFC I shared [0] and I'll post it as a proper patch. > Regards, > Abhilash > Thanks a lot for all your help and best regards, Javier [0]: https://lkml.org/lkml/2015/3/31/152
Javier Martinez Canillas <javier.martinez@collabora.co.uk> writes: [...] > Unfortunately I don't fully understand why this clock needs to be > enabled. It would be good if someone at Samsung can explain in more > detail what the real problem really is. +1 Maybe Abhilash can shed some light here? We really should know *why* this is needed because having the fix in the clock driver just doesn't seem right. It seems like the DMA driver should be managing this clock. Kevin
Hi Kevin, On Wed, Apr 1, 2015 at 2:32 AM, Kevin Hilman <khilman@kernel.org> wrote: > Javier Martinez Canillas <javier.martinez@collabora.co.uk> writes: > > [...] > >> Unfortunately I don't fully understand why this clock needs to be >> enabled. It would be good if someone at Samsung can explain in more >> detail what the real problem really is. > > +1 > > Maybe Abhilash can shed some light here? > > We really should know *why* this is needed because having the fix in the > clock driver just doesn't seem right. It seems like the DMA driver > should be managing this clock. I think my last mail might not have reached you (was accidentally sent as html). We are gating the aclk266_g2d clock without checking the CG_STATUS0 register bits as specified in the UM. It looks like we need to keep several clocks alive or gate them only after checking the CG_STATUSx register bits. Regards, Abhilash
Abhilash Kesavan <kesavan.abhilash@gmail.com> writes: > On Wed, Apr 1, 2015 at 2:32 AM, Kevin Hilman <khilman@kernel.org> wrote: >> Javier Martinez Canillas <javier.martinez@collabora.co.uk> writes: >> >> [...] >> >>> Unfortunately I don't fully understand why this clock needs to be >>> enabled. It would be good if someone at Samsung can explain in more >>> detail what the real problem really is. >> >> +1 >> >> Maybe Abhilash can shed some light here? >> >> We really should know *why* this is needed because having the fix in the >> clock driver just doesn't seem right. It seems like the DMA driver >> should be managing this clock. > > I think my last mail might not have reached you (was accidentally sent > as html). Yeah, I saw it a bit later in Javier's reply. Thanks for doing the research and reporting back. > We are gating the aclk266_g2d clock without checking the > CG_STATUS0 register bits as specified in the UM. It looks like we need > to keep several clocks alive or gate them only after checking the > CG_STATUSx register bits. I dont' know much about this clock hardware, but to me it sounds like a clock driver bug. The suspend fix Javier is proposing would fix it, but to me it sounds like the clock driver needs to actually start checking these CG_STATUSx bits before gating clocks. Otherwise, we might fix this current bug but a similar one will come and bite us another day. Kevin
2015-04-01 6:03 GMT+02:00 Kevin Hilman <khilman@kernel.org>: > Abhilash Kesavan <kesavan.abhilash@gmail.com> writes: > >> On Wed, Apr 1, 2015 at 2:32 AM, Kevin Hilman <khilman@kernel.org> wrote: >>> Javier Martinez Canillas <javier.martinez@collabora.co.uk> writes: >>> >>> [...] >>> >>>> Unfortunately I don't fully understand why this clock needs to be >>>> enabled. It would be good if someone at Samsung can explain in more >>>> detail what the real problem really is. >>> >>> +1 >>> >>> Maybe Abhilash can shed some light here? >>> >>> We really should know *why* this is needed because having the fix in the >>> clock driver just doesn't seem right. It seems like the DMA driver >>> should be managing this clock. >> >> I think my last mail might not have reached you (was accidentally sent >> as html). > > Yeah, I saw it a bit later in Javier's reply. Thanks for doing the > research and reporting back. > >> We are gating the aclk266_g2d clock without checking the >> CG_STATUS0 register bits as specified in the UM. It looks like we need >> to keep several clocks alive or gate them only after checking the >> CG_STATUSx register bits. > > I dont' know much about this clock hardware, but to me it sounds like a > clock driver bug. The suspend fix Javier is proposing would fix it, but > to me it sounds like the clock driver needs to actually start checking > these CG_STATUSx bits before gating clocks. > > Otherwise, we might fix this current bug but a similar one will come and > bite us another day. Actually it looks kind a similar to issue with adma/mau_epll clocks: https://lkml.org/lkml/2014/12/5/291 In that case runtime PM for pl330 caused adma clock to be gated. This lead to gating its parent clock - mau_epll. The clock hierarchy is strange for maudio domain. Basically mau_epll is needed to access maudio clocks however these clocks are not children of mau_epll. I think we should not enable the mdma0 but instead we should find the proper parent which needs to be enabled. Anyway it is kind of annoying (or funny if one has sense of black humour) that two issues are exposed by runtime PM for pl330 driver. Best regards, Krzysztof
Hello, On 31/03/15 22:00, Javier Martinez Canillas wrote: > On 03/31/2015 04:38 PM, Abhilash Kesavan wrote: >> javier.martinez@collabora.co.uk> wrote: >>> Unfortunately I don't fully understand why this clock needs to be >>> enabled. It would be good if someone at Samsung can explain in >>> more detail what the real problem really is. >>> >> >> I had a look at this some more today. The problem actually occurs when the >> mdma0 clock's parent - aclk266_g2d gets disabled. The run-time pm support >> in the dma driver disables mdma0 and in turn aclk266_g2d which causes the >> issue. >> From the User Manual, it appears that aclk266_g2d should be gated only when >> certain bits in the clock gating status register are 0. I cannot say for >> certain, but our gating the aclk266_g2d clock without the CG_STATUS bits >> being 0 could be a cause of the suspend failure. >> > > Thanks a lot for the explanation. I see the NOTE at the bottom of section > 7.9.1.159 CLK_GATE_BUS_TOP that mentions that. I'll add this information > to the commit message when posting as a proper patch instead of a RFC. > > I confirmed that changing the patch to prevent "aclk266_g2d" to be gated > instead of "mdm0" also makes the system to resume correctly from suspend > so I'll change that on the patch as well. > > I see that many of the Exynos5420 clocks (including "aclk266_g2d") use the > CLK_IGNORE_UNUSED flag but AFAIU it only prevents the common clock framework > to disable the clocks on init but doesn't prevent the clocks to be disabled > if all the clock childs are gated so the parent is gated as well. > >> As the CG_STATUS bits are not being checked anywhere in the kernel I think >> aclk266_g2d (and others in GATE_BUS_TOP) should not be gated. I am OK with > > For now I'll just add "aclk266_g2d" but later if needed all the GATE_BUS_TOP > clocks (and others) that should only be gated when CG_STATUS is 0 can be > added. My patch iterates over a list of clocks to be kept during suspend even > when there is only one for now so adding more later if needed will be trivial. It's not clear what subsystems affect state of the CG_STATUSx registers, it would be good if we could get more information on that. They are in the PMU block and are related to LPI (Low Power Interface handshaking), but what subsystems/peripheral blocks exactly are associated with them it's not clear from the documentation. I think it's essential to understand what triggers changes in CG_STATUSx registers, before we start checking their value in the clock driver. Also it might be that there are indeed some clocks which must stay enabled over suspend/resume cycle, then the approach with enabling/disabling clocks in the clock driver might not be such a hack as it looks at first sight. > Or do you think that I should add all the GATE_BUS_TOP clocks now? No, please don't do that. That includes many important clocks and we should be certain what we are doing. I don't think it is expected to touch those clocks in that way, it would likely cause more issues.
Hello Sylwester, On 04/01/2015 01:03 PM, Sylwester Nawrocki wrote: > > On 31/03/15 22:00, Javier Martinez Canillas wrote: >> On 03/31/2015 04:38 PM, Abhilash Kesavan wrote: >>> javier.martinez@collabora.co.uk> wrote: > >>>> Unfortunately I don't fully understand why this clock needs to be >>>> enabled. It would be good if someone at Samsung can explain in >>>> more detail what the real problem really is. >>>> >>> >>> I had a look at this some more today. The problem actually occurs when the >>> mdma0 clock's parent - aclk266_g2d gets disabled. The run-time pm support >>> in the dma driver disables mdma0 and in turn aclk266_g2d which causes the >>> issue. >>> From the User Manual, it appears that aclk266_g2d should be gated only when >>> certain bits in the clock gating status register are 0. I cannot say for >>> certain, but our gating the aclk266_g2d clock without the CG_STATUS bits >>> being 0 could be a cause of the suspend failure. >>> >> >> Thanks a lot for the explanation. I see the NOTE at the bottom of section >> 7.9.1.159 CLK_GATE_BUS_TOP that mentions that. I'll add this information >> to the commit message when posting as a proper patch instead of a RFC. >> >> I confirmed that changing the patch to prevent "aclk266_g2d" to be gated >> instead of "mdm0" also makes the system to resume correctly from suspend >> so I'll change that on the patch as well. >> >> I see that many of the Exynos5420 clocks (including "aclk266_g2d") use the >> CLK_IGNORE_UNUSED flag but AFAIU it only prevents the common clock framework >> to disable the clocks on init but doesn't prevent the clocks to be disabled >> if all the clock childs are gated so the parent is gated as well. >> >>> As the CG_STATUS bits are not being checked anywhere in the kernel I think >>> aclk266_g2d (and others in GATE_BUS_TOP) should not be gated. I am OK with >> >> For now I'll just add "aclk266_g2d" but later if needed all the GATE_BUS_TOP >> clocks (and others) that should only be gated when CG_STATUS is 0 can be >> added. My patch iterates over a list of clocks to be kept during suspend even >> when there is only one for now so adding more later if needed will be trivial. > > It's not clear what subsystems affect state of the CG_STATUSx registers, it > would be good if we could get more information on that. They are in the PMU > block and are related to LPI (Low Power Interface handshaking), but what > subsystems/peripheral blocks exactly are associated with them it's not clear > from the documentation. > Yes, I've been looking at the docs again and found out a couple of things: * Each GC_STATUSx register bit is associated with an IP hw block * Some LPI_MASKx registers maps exactly with the GC_STATUSx (i.e: 0 and 1) and others maps only partially (i.e: LPI_MASK2 and GC_STATUS2) So it is related to LPI as you said and both LPI_MASKx and GC_STATUSx are part of the PMU register address space. In the particular case of aclk266_g2d, the doc says that the clock can only be gated when CG_STATUS0[20] and CG_STATUS0[21] are 0. These are associated with the SSS and SSS_SLIM respectively which AFAIU are crypto h/w modules. > I think it's essential to understand what triggers changes in CG_STATUSx > registers, before we start checking their value in the clock driver. > Indeed, we should really understand what the status on these registers means. Also is not clear from the docs how much time should be waited, how long until giving up, etc. > Also it might be that there are indeed some clocks which must stay enabled > over suspend/resume cycle, then the approach with enabling/disabling clocks > in the clock driver might not be such a hack as it looks at first sight. > Having a clock driver to both a provider and consumer feels hacky to me as well but I didn't find a better way to solve this issue... another option is to have this workaround to solve the S2R issue while we figure out what the the state in the CG_STATUSx really mean. >> Or do you think that I should add all the GATE_BUS_TOP clocks now? > > No, please don't do that. That includes many important clocks and we should > be certain what we are doing. I don't think it is expected to touch those > clocks in that way, it would likely cause more issues. > > Perfect, I just asked since it was not clear to me from Abhilash comment. But I also agree to only focus on the clock that is causing issues now. Best regards, Javier
Hello Javier, On 01/04/15 13:44, Javier Martinez Canillas wrote: > On 04/01/2015 01:03 PM, Sylwester Nawrocki wrote: >> On 31/03/15 22:00, Javier Martinez Canillas wrote: >>> On 03/31/2015 04:38 PM, Abhilash Kesavan wrote: >>>> javier.martinez@collabora.co.uk> wrote: >>>> I had a look at this some more today. The problem actually occurs when the >>>> mdma0 clock's parent - aclk266_g2d gets disabled. The run-time pm support >>>> in the dma driver disables mdma0 and in turn aclk266_g2d which causes the >>>> issue. >>>> From the User Manual, it appears that aclk266_g2d should be gated only when >>>> certain bits in the clock gating status register are 0. I cannot say for >>>> certain, but our gating the aclk266_g2d clock without the CG_STATUS bits >>>> being 0 could be a cause of the suspend failure. >>>> >>> >>> Thanks a lot for the explanation. I see the NOTE at the bottom of section >>> 7.9.1.159 CLK_GATE_BUS_TOP that mentions that. I'll add this information >>> to the commit message when posting as a proper patch instead of a RFC. >>> >>> I confirmed that changing the patch to prevent "aclk266_g2d" to be gated >>> instead of "mdm0" also makes the system to resume correctly from suspend >>> so I'll change that on the patch as well. >>> >>> I see that many of the Exynos5420 clocks (including "aclk266_g2d") use the >>> CLK_IGNORE_UNUSED flag but AFAIU it only prevents the common clock framework >>> to disable the clocks on init but doesn't prevent the clocks to be disabled >>> if all the clock childs are gated so the parent is gated as well. >>> >>>> As the CG_STATUS bits are not being checked anywhere in the kernel I think >>>> aclk266_g2d (and others in GATE_BUS_TOP) should not be gated. I am OK with >>> >>> For now I'll just add "aclk266_g2d" but later if needed all the GATE_BUS_TOP >>> clocks (and others) that should only be gated when CG_STATUS is 0 can be >>> added. My patch iterates over a list of clocks to be kept during suspend even >>> when there is only one for now so adding more later if needed will be trivial. >> >> It's not clear what subsystems affect state of the CG_STATUSx registers, it >> would be good if we could get more information on that. They are in the PMU >> block and are related to LPI (Low Power Interface handshaking), but what >> subsystems/peripheral blocks exactly are associated with them it's not clear >> from the documentation. > > Yes, I've been looking at the docs again and found out a couple of things: > > * Each GC_STATUSx register bit is associated with an IP hw block > * Some LPI_MASKx registers maps exactly with the GC_STATUSx (i.e: 0 and 1) > and others maps only partially (i.e: LPI_MASK2 and GC_STATUS2) The CG_STATUSx and LPI_MASKx bits meaning is not matching according to documentation I have. I guess you've got something newer than REV0.00? > So it is related to LPI as you said and both LPI_MASKx and GC_STATUSx are > part of the PMU register address space. > > In the particular case of aclk266_g2d, the doc says that the clock can only > be gated when CG_STATUS0[20] and CG_STATUS0[21] are 0. These are associated > with the SSS and SSS_SLIM respectively which AFAIU are crypto h/w modules. In my Exynos5420 UM ACLK_266_G2D is associated with CG_STATUS0 register bits 22, 21, which in turn correspond to NR3D and DIS IP blocks, i.e. the camera subsystem. Such a dependency would be rather surprising. >> I think it's essential to understand what triggers changes in CG_STATUSx >> registers, before we start checking their value in the clock driver. >> > > Indeed, we should really understand what the status on these registers > means. Also is not clear from the docs how much time should be waited, > how long until giving up, etc. Exactly, I checked some kernels from http://opensource.samsung.com (e.g. SM-N900_JB_Opensource.zip) for CG_STATUSx, but I didn't find anything related to these registers yet, except the address macro definitions and debug traces in the power domains driver. >> Also it might be that there are indeed some clocks which must stay enabled >> over suspend/resume cycle, then the approach with enabling/disabling clocks >> in the clock driver might not be such a hack as it looks at first sight. >> > > Having a clock driver to both a provider and consumer feels hacky to me as > well but I didn't find a better way to solve this issue... another option > is to have this workaround to solve the S2R issue while we figure out what > the the state in the CG_STATUSx really mean. Let's try to diagnose the issue best we can, then we would choose the most accurate bug fix.
Quoting Krzysztof Kozlowski (2015-04-01 02:16:08) > 2015-04-01 6:03 GMT+02:00 Kevin Hilman <khilman@kernel.org>: > > Abhilash Kesavan <kesavan.abhilash@gmail.com> writes: > > > >> On Wed, Apr 1, 2015 at 2:32 AM, Kevin Hilman <khilman@kernel.org> wrote: > >>> Javier Martinez Canillas <javier.martinez@collabora.co.uk> writes: > >>> > >>> [...] > >>> > >>>> Unfortunately I don't fully understand why this clock needs to be > >>>> enabled. It would be good if someone at Samsung can explain in more > >>>> detail what the real problem really is. > >>> > >>> +1 > >>> > >>> Maybe Abhilash can shed some light here? > >>> > >>> We really should know *why* this is needed because having the fix in the > >>> clock driver just doesn't seem right. It seems like the DMA driver > >>> should be managing this clock. > >> > >> I think my last mail might not have reached you (was accidentally sent > >> as html). > > > > Yeah, I saw it a bit later in Javier's reply. Thanks for doing the > > research and reporting back. > > > >> We are gating the aclk266_g2d clock without checking the > >> CG_STATUS0 register bits as specified in the UM. It looks like we need > >> to keep several clocks alive or gate them only after checking the > >> CG_STATUSx register bits. > > > > I dont' know much about this clock hardware, but to me it sounds like a > > clock driver bug. The suspend fix Javier is proposing would fix it, but > > to me it sounds like the clock driver needs to actually start checking > > these CG_STATUSx bits before gating clocks. > > > > Otherwise, we might fix this current bug but a similar one will come and > > bite us another day. > > Actually it looks kind a similar to issue with adma/mau_epll clocks: > https://lkml.org/lkml/2014/12/5/291 > > In that case runtime PM for pl330 caused adma clock to be gated. This > lead to gating its parent clock - mau_epll. The clock hierarchy is > strange for maudio domain. Basically mau_epll is needed to access > maudio clocks however these clocks are not children of mau_epll. This sounds like an interface clock. It must be enabled for you touch registers across a bus/interconnect. For this type of clock is perfectly correct for the driver to enable it in addition to any functional clocks (e.g. maudio clocks). There are many instances where a driver must enable an interface clock (iclk) and a functional clock (fclk) and this isn't an issue of the driver knowing the clock topology, but instead knowing it's operational requirements. Just to make life more confusing, it might also be appropriate for the clock driver to manage turning on the interface clocks if this is required to touching the functional clocks. In general the register address space is used to determine which IP "owns" such behavior. Regards, Mike > > I think we should not enable the mdma0 but instead we should find the > proper parent which needs to be enabled. > > Anyway it is kind of annoying (or funny if one has sense of black > humour) that two issues are exposed by runtime PM for pl330 driver. > > Best regards, > Krzysztof
Hello Sylwester, On 04/01/2015 07:31 PM, Sylwester Nawrocki wrote: > On 01/04/15 13:44, Javier Martinez Canillas wrote: >> On 04/01/2015 01:03 PM, Sylwester Nawrocki wrote: >>> It's not clear what subsystems affect state of the CG_STATUSx registers, it >>> would be good if we could get more information on that. They are in the PMU >>> block and are related to LPI (Low Power Interface handshaking), but what >>> subsystems/peripheral blocks exactly are associated with them it's not clear >>> from the documentation. >> >> Yes, I've been looking at the docs again and found out a couple of things: >> >> * Each GC_STATUSx register bit is associated with an IP hw block >> * Some LPI_MASKx registers maps exactly with the GC_STATUSx (i.e: 0 and 1) >> and others maps only partially (i.e: LPI_MASK2 and GC_STATUS2) > > The CG_STATUSx and LPI_MASKx bits meaning is not matching according to > documentation I have. I guess you've got something newer than REV0.00? > My Exynos5420 UM is Revision 1.00 dated February 2014 and GC_STATUS0 bits maps LPI_MASK0 with the exception of bit 16 (NR3D) that is not mentioned in GC_STATUS0, there is a hole between 15 (DIS) and 17 (FIMC_SCALERP). GC_STATUS1 maps exactly with LPI_MASK1 and GC_STATUS2 and LPI_MASK2 have the same bits from 0 to 5 and then differ from there. >> So it is related to LPI as you said and both LPI_MASKx and GC_STATUSx are >> part of the PMU register address space. >> >> In the particular case of aclk266_g2d, the doc says that the clock can only >> be gated when CG_STATUS0[20] and CG_STATUS0[21] are 0. These are associated >> with the SSS and SSS_SLIM respectively which AFAIU are crypto h/w modules. > > In my Exynos5420 UM ACLK_266_G2D is associated with CG_STATUS0 register > bits 22, 21, which in turn correspond to NR3D and DIS IP blocks, i.e. > the camera subsystem. Such a dependency would be rather surprising. > Sorry, it was a typo error and I actually meant CG_STATUS0 21 and 22 but these correspond in the documentation I've to 21 (SSS) and 22 (SSS_SLIM). As I mentioned before, DIS correspond to CG_STATUS0 15 and NR3D doesn't have a corresponding bit in CG_STATUS0. But I don't know if that is an error in the doc I've since is suspicious that is the only difference between LPI_MASK0 and CG_STATUS0. >>> I think it's essential to understand what triggers changes in CG_STATUSx >>> registers, before we start checking their value in the clock driver. >>> >> >> Indeed, we should really understand what the status on these registers >> means. Also is not clear from the docs how much time should be waited, >> how long until giving up, etc. > > Exactly, I checked some kernels from http://opensource.samsung.com > (e.g. SM-N900_JB_Opensource.zip) for CG_STATUSx, but I didn't find anything > related to these registers yet, except the address macro definitions > and debug traces in the power domains driver. > Yes, I also looked in the ChromiumOS v3.8 kernel but didn't find anything. >>> Also it might be that there are indeed some clocks which must stay enabled >>> over suspend/resume cycle, then the approach with enabling/disabling clocks >>> in the clock driver might not be such a hack as it looks at first sight. >>> >> >> Having a clock driver to both a provider and consumer feels hacky to me as >> well but I didn't find a better way to solve this issue... another option >> is to have this workaround to solve the S2R issue while we figure out what >> the the state in the CG_STATUSx really mean. > > Let's try to diagnose the issue best we can, then we would choose the most > accurate bug fix. > Sounds good to me. Best regards, Javier
Hi, On Thu, Apr 2, 2015 at 4:01 AM, Javier Martinez Canillas <javier.martinez@collabora.co.uk> wrote: > Hello Sylwester, > > On 04/01/2015 07:31 PM, Sylwester Nawrocki wrote: >> On 01/04/15 13:44, Javier Martinez Canillas wrote: >>> On 04/01/2015 01:03 PM, Sylwester Nawrocki wrote: >>>> It's not clear what subsystems affect state of the CG_STATUSx registers, it >>>> would be good if we could get more information on that. They are in the PMU >>>> block and are related to LPI (Low Power Interface handshaking), but what >>>> subsystems/peripheral blocks exactly are associated with them it's not clear >>>> from the documentation. >>> >>> Yes, I've been looking at the docs again and found out a couple of things: >>> >>> * Each GC_STATUSx register bit is associated with an IP hw block >>> * Some LPI_MASKx registers maps exactly with the GC_STATUSx (i.e: 0 and 1) >>> and others maps only partially (i.e: LPI_MASK2 and GC_STATUS2) >> >> The CG_STATUSx and LPI_MASKx bits meaning is not matching according to >> documentation I have. I guess you've got something newer than REV0.00? >> > > My Exynos5420 UM is Revision 1.00 dated February 2014 and GC_STATUS0 bits > maps LPI_MASK0 with the exception of bit 16 (NR3D) that is not mentioned > in GC_STATUS0, there is a hole between 15 (DIS) and 17 (FIMC_SCALERP). > > GC_STATUS1 maps exactly with LPI_MASK1 and GC_STATUS2 and LPI_MASK2 have > the same bits from 0 to 5 and then differ from there. > >>> So it is related to LPI as you said and both LPI_MASKx and GC_STATUSx are >>> part of the PMU register address space. >>> >>> In the particular case of aclk266_g2d, the doc says that the clock can only >>> be gated when CG_STATUS0[20] and CG_STATUS0[21] are 0. These are associated >>> with the SSS and SSS_SLIM respectively which AFAIU are crypto h/w modules. >> >> In my Exynos5420 UM ACLK_266_G2D is associated with CG_STATUS0 register >> bits 22, 21, which in turn correspond to NR3D and DIS IP blocks, i.e. >> the camera subsystem. Such a dependency would be rather surprising. >> > > Sorry, it was a typo error and I actually meant CG_STATUS0 21 and 22 but > these correspond in the documentation I've to 21 (SSS) and 22 (SSS_SLIM). > > As I mentioned before, DIS correspond to CG_STATUS0 15 and NR3D doesn't > have a corresponding bit in CG_STATUS0. But I don't know if that is an > error in the doc I've since is suspicious that is the only difference > between LPI_MASK0 and CG_STATUS0. > >>>> I think it's essential to understand what triggers changes in CG_STATUSx >>>> registers, before we start checking their value in the clock driver. >>>> >>> >>> Indeed, we should really understand what the status on these registers >>> means. Also is not clear from the docs how much time should be waited, >>> how long until giving up, etc. >> >> Exactly, I checked some kernels from http://opensource.samsung.com >> (e.g. SM-N900_JB_Opensource.zip) for CG_STATUSx, but I didn't find anything >> related to these registers yet, except the address macro definitions >> and debug traces in the power domains driver. >> > > Yes, I also looked in the ChromiumOS v3.8 kernel but didn't find anything. > >>>> Also it might be that there are indeed some clocks which must stay enabled >>>> over suspend/resume cycle, then the approach with enabling/disabling clocks >>>> in the clock driver might not be such a hack as it looks at first sight. >>>> >>> >>> Having a clock driver to both a provider and consumer feels hacky to me as >>> well but I didn't find a better way to solve this issue... another option >>> is to have this workaround to solve the S2R issue while we figure out what >>> the the state in the CG_STATUSx really mean. >> >> Let's try to diagnose the issue best we can, then we would choose the most >> accurate bug fix. >> > > Sounds good to me. Based on the earlier comments I was trying to isolate if: 1) s2r fails because we gate aclk266_g2d (but it is one of those clocks that needs to be always on prior to suspend). 2) s2r fails because we gate aclk266_g2d when CG_STATUS0[21:20] bits are not 0 (thus not following the spec). As I did not have access to the hardware guys who could possibly confirm 1), I decided to a) find a configuration where CG_STATUS0 allows gating of the aclk266_g2d clock (i.e. CG_STATUS0[22:21] are 0). b) disable the aclk266_g2d clock using such a configuration. c) check s2r. I found a configuration [1] which gave the following after boot-up: # devmem 0x10040914 0xFD800014 (CG_STATUS0[22:21] is 0) # devmem 0x10020700 0xC6F8DE9F (aclk266_g2d is enabled) At this point s2r works. I rebooted the board with the same config as above and then disabled aclk266_g2d. # devmem 0x10020700 32 0xC6F8DE9D # devmem 0x10020700 0xC6F8DE9D (aclk266_g2d is disabled) # devmem 0x10040914 0xFD800014 and tried s2r - It fails. From the results, disabling the clock seems to cause the issue rather than the CG_STATUS violation. This is all a little confusing, so please let me know if I have missed something. Regards, Abhilash
Hello Abhilash, On 04/02/2015 02:22 PM, Abhilash Kesavan wrote: > Hi, > > On Thu, Apr 2, 2015 at 4:01 AM, Javier Martinez Canillas > <javier.martinez@collabora.co.uk> wrote: >> Hello Sylwester, >> >> On 04/01/2015 07:31 PM, Sylwester Nawrocki wrote: >>> On 01/04/15 13:44, Javier Martinez Canillas wrote: >>>> On 04/01/2015 01:03 PM, Sylwester Nawrocki wrote: >>>>> It's not clear what subsystems affect state of the CG_STATUSx registers, it >>>>> would be good if we could get more information on that. They are in the PMU >>>>> block and are related to LPI (Low Power Interface handshaking), but what >>>>> subsystems/peripheral blocks exactly are associated with them it's not clear >>>>> from the documentation. >>>> >>>> Yes, I've been looking at the docs again and found out a couple of things: >>>> >>>> * Each GC_STATUSx register bit is associated with an IP hw block >>>> * Some LPI_MASKx registers maps exactly with the GC_STATUSx (i.e: 0 and 1) >>>> and others maps only partially (i.e: LPI_MASK2 and GC_STATUS2) >>> >>> The CG_STATUSx and LPI_MASKx bits meaning is not matching according to >>> documentation I have. I guess you've got something newer than REV0.00? >>> >> >> My Exynos5420 UM is Revision 1.00 dated February 2014 and GC_STATUS0 bits >> maps LPI_MASK0 with the exception of bit 16 (NR3D) that is not mentioned >> in GC_STATUS0, there is a hole between 15 (DIS) and 17 (FIMC_SCALERP). >> >> GC_STATUS1 maps exactly with LPI_MASK1 and GC_STATUS2 and LPI_MASK2 have >> the same bits from 0 to 5 and then differ from there. >> >>>> So it is related to LPI as you said and both LPI_MASKx and GC_STATUSx are >>>> part of the PMU register address space. >>>> >>>> In the particular case of aclk266_g2d, the doc says that the clock can only >>>> be gated when CG_STATUS0[20] and CG_STATUS0[21] are 0. These are associated >>>> with the SSS and SSS_SLIM respectively which AFAIU are crypto h/w modules. >>> >>> In my Exynos5420 UM ACLK_266_G2D is associated with CG_STATUS0 register >>> bits 22, 21, which in turn correspond to NR3D and DIS IP blocks, i.e. >>> the camera subsystem. Such a dependency would be rather surprising. >>> >> >> Sorry, it was a typo error and I actually meant CG_STATUS0 21 and 22 but >> these correspond in the documentation I've to 21 (SSS) and 22 (SSS_SLIM). >> >> As I mentioned before, DIS correspond to CG_STATUS0 15 and NR3D doesn't >> have a corresponding bit in CG_STATUS0. But I don't know if that is an >> error in the doc I've since is suspicious that is the only difference >> between LPI_MASK0 and CG_STATUS0. >> >>>>> I think it's essential to understand what triggers changes in CG_STATUSx >>>>> registers, before we start checking their value in the clock driver. >>>>> >>>> >>>> Indeed, we should really understand what the status on these registers >>>> means. Also is not clear from the docs how much time should be waited, >>>> how long until giving up, etc. >>> >>> Exactly, I checked some kernels from http://opensource.samsung.com >>> (e.g. SM-N900_JB_Opensource.zip) for CG_STATUSx, but I didn't find anything >>> related to these registers yet, except the address macro definitions >>> and debug traces in the power domains driver. >>> >> >> Yes, I also looked in the ChromiumOS v3.8 kernel but didn't find anything. >> >>>>> Also it might be that there are indeed some clocks which must stay enabled >>>>> over suspend/resume cycle, then the approach with enabling/disabling clocks >>>>> in the clock driver might not be such a hack as it looks at first sight. >>>>> >>>> >>>> Having a clock driver to both a provider and consumer feels hacky to me as >>>> well but I didn't find a better way to solve this issue... another option >>>> is to have this workaround to solve the S2R issue while we figure out what >>>> the the state in the CG_STATUSx really mean. >>> >>> Let's try to diagnose the issue best we can, then we would choose the most >>> accurate bug fix. >>> >> >> Sounds good to me. > > Based on the earlier comments I was trying to isolate if: > 1) s2r fails because we gate aclk266_g2d (but it is one of those > clocks that needs to be always on prior to suspend). > 2) s2r fails because we gate aclk266_g2d when CG_STATUS0[21:20] bits > are not 0 (thus not following the spec). > Thanks a lot for continue looking at this. I didn't have time to dig deeper on this since last week. > As I did not have access to the hardware guys who could possibly > confirm 1), I decided to > a) find a configuration where CG_STATUS0 allows gating of the > aclk266_g2d clock (i.e. CG_STATUS0[22:21] are 0). > b) disable the aclk266_g2d clock using such a configuration. > c) check s2r. > > I found a configuration [1] which gave the following after boot-up: I think you forgot the reference for [1] right? Since with latest linux-next (20150402) I got: > # devmem 0x10040914 > 0xFD800014 (CG_STATUS0[22:21] is 0) # devmem 0x10040914 0xFFE00000 (CG_STATUS0[22:21] is 1) > # devmem 0x10020700 > 0xC6F8DE9F (aclk266_g2d is enabled) > > At this point s2r works. > > I rebooted the board with the same config as above and then disabled > aclk266_g2d. > > # devmem 0x10020700 32 0xC6F8DE9D > # devmem 0x10020700 > 0xC6F8DE9D (aclk266_g2d is disabled) > # devmem 0x10040914 > 0xFD800014 > > and tried s2r - It fails. > > From the results, disabling the clock seems to cause the issue rather > than the CG_STATUS violation. This is all a little confusing, so > please let me know if I have missed something. > So IIUC the CG_STATUS0 bits were a red herring and the real problem is that the aclk266_g2d needs to be enabled during suspend (although we still don't know why). It seems were are at a dead end now. Without being able to ask the HW Samsung folks we would never know what's going on here... I can re-post my patches to keep aclk266_g2d enabled during suspend in the clk driver if that is the least bad option. But it would be great to solve this issue otherwise S2R will remain to be broken for Exynos5420 which will be really sad. > Regards, > Abhilash > Best regards, Javier
On 04/07/2015 12:59 PM, Javier Martinez Canillas wrote: > > So IIUC the CG_STATUS0 bits were a red herring and the real problem > is that the aclk266_g2d needs to be enabled during suspend (although > we still don't know why). > > It seems were are at a dead end now. Without being able to ask the HW > Samsung folks we would never know what's going on here... > Ok, I found another interesting data point. ACLK_266_G2D has as constraints that CG_STATUS0[21:22] needs to be 0 before gating the clock and as I mentioned before those are associated with the SSS and SSS_SLIM HW crypto modules according the docs I've. So I looked at the clock used by the SSS module and found that CLK_SSS parent is ACLK_266_G2D but CLK_SSS is never requested since drivers/crypto/s5p-sss.c isn't built for exynos_defconfig. Enabling CONFIG_CRYPTO_DEV_S5P makes the system to resume without any patches since the sss clock prevents aclk266_g2d to be gated. But I wanted to know if it was really aclk266_g2d that was needed or the actual sss clock since the kernel didn't manage that clock without the driver enabled. So I disabled the sss clock before trying a S2R: # devmem 0x10018800 32 0xFFFFFFFB (CLK_SSS in CLK_GATE_IP_G2D is gated) and S2R worked anyways but I see that CLK_GATE_IP_G2D is reset to its default value on S2R so maybe that is why it works anyways? # devmem 0x10018800 0xFFFFFFFF (all CLK_GATE_IP_G2D clocks enabled including CLK_SSS) Does this shed any more light? Could the problem be that the sss clock parent (aclk266_g2d) is gated during S2R? Is the SSS module required for S2R or is just that CLK_SSS prevents the parent to be gated and so it is another red herring? Best regards, Javier
2015-04-07 13:56 GMT+02:00 Javier Martinez Canillas <javier.martinez@collabora.co.uk>: > So I disabled the sss clock before trying a S2R: > > # devmem 0x10018800 32 0xFFFFFFFB > (CLK_SSS in CLK_GATE_IP_G2D is gated) > > and S2R worked anyways but I see that CLK_GATE_IP_G2D is reset to > its default value on S2R so maybe that is why it works anyways? Does the driver restore its value on resume (i.e. has it in the save/restore array)? Remember that suspend causes all clock registers to be reset. Then some of them will be configured by the lowest bootloader stage after wake-up reset, but the kernel needs to restore all of them. > > # devmem 0x10018800 > 0xFFFFFFFF (all CLK_GATE_IP_G2D clocks enabled including CLK_SSS) > > Does this shed any more light? Could the problem be that the sss > clock parent (aclk266_g2d) is gated during S2R? Is the SSS module > required for S2R or is just that CLK_SSS prevents the parent to > be gated and so it is another red herring? Does the board use secure firmware? If yes, it might require to do some encryption on suspend, so if the firmware is broken and doesn't control the clock itself, it might need the SSS clock to be running, when the SLEEP SMC operation is called. Anyway, I just realized that Exynos4 also need several clocks to be ungated on suspend and this is handled by code [1] based on arrays [2]. [1] http://lxr.free-electrons.com/source/drivers/clk/samsung/clk-exynos4.c#L309 [2] http://lxr.free-electrons.com/source/drivers/clk/samsung/clk-exynos4.c#L276 Could this method work for your case as well? There would be no need to call any clock API at all, just low level register writes, which is okay, since this is a low level driver anyway. Best regards, Tomasz
diff --git a/drivers/clk/samsung/clk-exynos5420.c b/drivers/clk/samsung/clk-exynos5420.c index 07d666cc6a29..2d39b629144a 100644 --- a/drivers/clk/samsung/clk-exynos5420.c +++ b/drivers/clk/samsung/clk-exynos5420.c @@ -151,6 +151,7 @@ enum exynos5x_plls { static void __iomem *reg_base; static enum exynos5x_soc exynos5x_soc; +struct samsung_clk_provider *ctx; #ifdef CONFIG_PM_SLEEP static struct samsung_clk_reg_dump *exynos5x_save; @@ -275,8 +276,18 @@ static const struct samsung_clk_reg_dump exynos5420_set_clksrc[] = { { .offset = GATE_IP_PERIC, .value = 0xffffffff, }, }; +/* + * list of clocks that have to be kept enabled during suspend/resume cycle. + */ +static unsigned int exynos5x_clk_suspend[] = { + CLK_MDMA0, +}; + static int exynos5420_clk_suspend(void) { + int i; + struct clk *clk; + samsung_clk_save(reg_base, exynos5x_save, ARRAY_SIZE(exynos5x_clk_regs)); @@ -287,11 +298,24 @@ static int exynos5420_clk_suspend(void) samsung_clk_restore(reg_base, exynos5420_set_clksrc, ARRAY_SIZE(exynos5420_set_clksrc)); + for (i = 0; i < ARRAY_SIZE(exynos5x_clk_suspend); i++) { + clk = samsung_clk_lookup(ctx, exynos5x_clk_suspend[i]); + clk_prepare_enable(clk); + } + return 0; } static void exynos5420_clk_resume(void) { + int i; + struct clk *clk; + + for (i = 0; i < ARRAY_SIZE(exynos5x_clk_suspend); i++) { + clk = samsung_clk_lookup(ctx, exynos5x_clk_suspend[i]); + clk_disable_unprepare(clk); + } + samsung_clk_restore(reg_base, exynos5x_save, ARRAY_SIZE(exynos5x_clk_regs)); @@ -1255,8 +1279,6 @@ static const struct of_device_id ext_clk_match[] __initconst = { static void __init exynos5x_clk_init(struct device_node *np, enum exynos5x_soc soc) { - struct samsung_clk_provider *ctx; - if (np) { reg_base = of_iomap(np, 0); if (!reg_base)
Commit ae43b3289186 ("ARM: 8202/1: dmaengine: pl330: Add runtime Power Management support v12") added pm support for the pl330 dma driver but it makes the clock for the Exynos5420 MDMA0 DMA controller to be gated during suspend and this clock needs to remain enabled in order to make the system resume from a system suspend state. To make sure that the clock is enabled during suspend, enable it prior to entering a suspend state and disable it once the system has resumed. Thanks to Abhilash Kesavan for figuring out that this was the issue. Fixes: ae43b32 ("ARM: 8202/1: dmaengine: pl330: Add runtime Power Management support v12") Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> --- drivers/clk/samsung/clk-exynos5420.c | 26 ++++++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-)