[3/8] clk: tegra114: add LP1 suspend/resume support

Message ID	1374830110-30685-4-git-send-email-josephl@nvidia.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org> From: Joseph Lo <josephl@nvidia.com> To: Stephen Warren <swarren@wwwdotorg.org> Subject: [PATCH 3/8] clk: tegra114: add LP1 suspend/resume support Date: Fri, 26 Jul 2013 17:15:05 +0800 Message-ID: <1374830110-30685-4-git-send-email-josephl@nvidia.com> In-Reply-To: <1374830110-30685-1-git-send-email-josephl@nvidia.com> References: <1374830110-30685-1-git-send-email-josephl@nvidia.com> MIME-Version: 1.0 Cc: linux-tegra@vger.kernel.org, Mike Turquette <mturquette@linaro.org>, linux-arm-kernel@lists.infradead.org, Joseph Lo <josephl@nvidia.com> Precedence: list Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org> Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

Joseph Lo July 26, 2013, 9:15 a.m. UTC

When the system suspends to LP1, the clock of the CPU would be switched to
CLK_M (12MHz Oscillator) during suspend/resume flow. The clock driver
needs to restore the clock of CPU after LP1 resume.

Cc: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Joseph Lo <josephl@nvidia.com>
---
 drivers/clk/tegra/clk-tegra114.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

Stephen Warren July 29, 2013, 10:51 p.m. UTC | #1

On 07/26/2013 03:15 AM, Joseph Lo wrote:
> When the system suspends to LP1, the clock of the CPU would be switched to
> CLK_M (12MHz Oscillator) during suspend/resume flow. The clock driver
> needs to restore the clock of CPU after LP1 resume.

It's unclear to me how the code change implements "restore the clock of
the CPU". A register name of CCLKG_BURST_POLICY doesn't sound like it's
anything to do with enabled/disabling the CPU clock, nor configuring its
rate. What exactly does this register do, and hence what does this new
code actually restore?

Why don't Tegra20/30 need a similar change?

Joseph Lo Aug. 2, 2013, 8:09 a.m. UTC | #2

On Tue, 2013-07-30 at 06:51 +0800, Stephen Warren wrote:
> On 07/26/2013 03:15 AM, Joseph Lo wrote:
> > When the system suspends to LP1, the clock of the CPU would be switched to
> > CLK_M (12MHz Oscillator) during suspend/resume flow. The clock driver
> > needs to restore the clock of CPU after LP1 resume.
> 
> It's unclear to me how the code change implements "restore the clock of
> the CPU". A register name of CCLKG_BURST_POLICY doesn't sound like it's
> anything to do with enabled/disabling the CPU clock, nor configuring its
> rate. What exactly does this register do, and hence what does this new
> code actually restore?
> 
When system suspend to LP1, most of the PLLs was clock gated. Because we
didn't cut off the core power, the settings of PLL still keep there. But
we switch the clock source of CPU to CLK_M before shut off PLLs by
CCLKG_BURSY_POLICY register. So we need to resume it back to original
clock source by CCLKG_BURST_POLICY register. Or it would be keep in low
rate (CLK_M) after resume.

> Why don't Tegra20/30 need a similar change?
For Tegra20/30, the same code had been implemented in the suspend/resume
function of tegra_cpu_car_ops. It restores the CPU clock ASAP when CPU
resume from a suspend state to get quick performance I believe.

For Tegra114, the resume performance is cool (although we can't see it
in upstream kernel now, it still need some other functions.). We can
implement all the clock related suspend/resume function in the clock
driver.

Stephen Warren Aug. 2, 2013, 8:32 p.m. UTC | #3

On 08/02/2013 02:09 AM, Joseph Lo wrote:
> On Tue, 2013-07-30 at 06:51 +0800, Stephen Warren wrote:
>> On 07/26/2013 03:15 AM, Joseph Lo wrote:
>>> When the system suspends to LP1, the clock of the CPU would be switched to
>>> CLK_M (12MHz Oscillator) during suspend/resume flow. The clock driver
>>> needs to restore the clock of CPU after LP1 resume.
>>
>> It's unclear to me how the code change implements "restore the clock of
>> the CPU". A register name of CCLKG_BURST_POLICY doesn't sound like it's
>> anything to do with enabled/disabling the CPU clock, nor configuring its
>> rate. What exactly does this register do, and hence what does this new
>> code actually restore?
>>
> When system suspend to LP1, most of the PLLs was clock gated. Because we
> didn't cut off the core power, the settings of PLL still keep there. But
> we switch the clock source of CPU to CLK_M before shut off PLLs by
> CCLKG_BURSY_POLICY register. So we need to resume it back to original
> clock source by CCLKG_BURST_POLICY register. Or it would be keep in low
> rate (CLK_M) after resume.

OK, I guess the register name was badly chosen by HW. I'd like to see
part of your description above in the patch description. How about
replacing the patch description with:

----------
When the system suspends to LP1, the CPU clock source is switched to
CLK_M (12MHz Oscillator) during suspend/resume flow[1]. The CPU clock
source is controlled by the CCLKG_BURST_POLICY register, and hence this
register must be restored during LP1 resume.
----------

[1] Question: where does this happen? This patch doesn't make that
change. I wonder why the suspend path can't save this register, rather
than implementing a separate suspend syscore op in the clock driver.
Does the HW auto-switch the value in the register itself?

>> Why don't Tegra20/30 need a similar change?
>
> For Tegra20/30, the same code had been implemented in the suspend/resume
> function of tegra_cpu_car_ops. It restores the CPU clock ASAP when CPU
> resume from a suspend state to get quick performance I believe.
> 
> For Tegra114, the resume performance is cool (although we can't see it
> in upstream kernel now, it still need some other functions.). We can
> implement all the clock related suspend/resume function in the clock
> driver.

OK, I do see something similar in tegra20/30_cpu_clock_suspend/resume.
Why can't this new code be part of the equivalent functions; does the
Tegra114 suspend/resume code in mach-tegra/ not call
tegra_cpu_car_ops.suspend/resume() in the same way it does on Tegra20/30?

Joseph Lo Aug. 5, 2013, 8:02 a.m. UTC | #4

On Sat, 2013-08-03 at 04:32 +0800, Stephen Warren wrote:
> On 08/02/2013 02:09 AM, Joseph Lo wrote:
> > On Tue, 2013-07-30 at 06:51 +0800, Stephen Warren wrote:
> >> On 07/26/2013 03:15 AM, Joseph Lo wrote:
> >>> When the system suspends to LP1, the clock of the CPU would be switched to
> >>> CLK_M (12MHz Oscillator) during suspend/resume flow. The clock driver
> >>> needs to restore the clock of CPU after LP1 resume.
> >>
> >> It's unclear to me how the code change implements "restore the clock of
> >> the CPU". A register name of CCLKG_BURST_POLICY doesn't sound like it's
> >> anything to do with enabled/disabling the CPU clock, nor configuring its
> >> rate. What exactly does this register do, and hence what does this new
> >> code actually restore?
> >>
> > When system suspend to LP1, most of the PLLs was clock gated. Because we
> > didn't cut off the core power, the settings of PLL still keep there. But
> > we switch the clock source of CPU to CLK_M before shut off PLLs by
> > CCLKG_BURSY_POLICY register. So we need to resume it back to original
> > clock source by CCLKG_BURST_POLICY register. Or it would be keep in low
> > rate (CLK_M) after resume.
> 
> OK, I guess the register name was badly chosen by HW. I'd like to see
> part of your description above in the patch description. How about
> replacing the patch description with:
> 
> ----------
> When the system suspends to LP1, the CPU clock source is switched to
> CLK_M (12MHz Oscillator) during suspend/resume flow[1]. The CPU clock
> source is controlled by the CCLKG_BURST_POLICY register, and hence this
> register must be restored during LP1 resume.
> ----------
> 
> [1] Question: where does this happen? This patch doesn't make that
> change. I wonder why the suspend path can't save this register, rather
> than implementing a separate suspend syscore op in the clock driver.
> Does the HW auto-switch the value in the register itself?
> 
If we switch CPU to CLK_M in clock driver, the system will become slowly
during the middle of suspending flow. We do this at the very end of the
LP1 suspending flow before the CPU disable all the PLL clocks.

> >> Why don't Tegra20/30 need a similar change?
> >
> > For Tegra20/30, the same code had been implemented in the suspend/resume
> > function of tegra_cpu_car_ops. It restores the CPU clock ASAP when CPU
> > resume from a suspend state to get quick performance I believe.
> > 
> > For Tegra114, the resume performance is cool (although we can't see it
> > in upstream kernel now, it still need some other functions.). We can
> > implement all the clock related suspend/resume function in the clock
> > driver.
> 
> OK, I do see something similar in tegra20/30_cpu_clock_suspend/resume.
> Why can't this new code be part of the equivalent functions; does the
> Tegra114 suspend/resume code in mach-tegra/ not call
> tegra_cpu_car_ops.suspend/resume() in the same way it does on Tegra20/30?
One of the main reasons is due to DFLL clock. The CPU clock of Tegra114
is going to switch to DFLL to a get higher clock rate. But it depends
some other HW (i.e. I2C), we can't resume it so early when the CPU just
resume like we did in tegra_cpu_car_ops.suspend/resume for Tegra20/30.

Stephen Warren Aug. 5, 2013, 5 p.m. UTC | #5

On 08/05/2013 02:02 AM, Joseph Lo wrote:
> On Sat, 2013-08-03 at 04:32 +0800, Stephen Warren wrote:
>> On 08/02/2013 02:09 AM, Joseph Lo wrote:
>>> On Tue, 2013-07-30 at 06:51 +0800, Stephen Warren wrote:
>>>> On 07/26/2013 03:15 AM, Joseph Lo wrote:
>>>>> When the system suspends to LP1, the clock of the CPU would be switched to
>>>>> CLK_M (12MHz Oscillator) during suspend/resume flow. The clock driver
>>>>> needs to restore the clock of CPU after LP1 resume.
>>>>
>>>> It's unclear to me how the code change implements "restore the clock of
>>>> the CPU". A register name of CCLKG_BURST_POLICY doesn't sound like it's
>>>> anything to do with enabled/disabling the CPU clock, nor configuring its
>>>> rate. What exactly does this register do, and hence what does this new
>>>> code actually restore?
>>>>
>>> When system suspend to LP1, most of the PLLs was clock gated. Because we
>>> didn't cut off the core power, the settings of PLL still keep there. But
>>> we switch the clock source of CPU to CLK_M before shut off PLLs by
>>> CCLKG_BURSY_POLICY register. So we need to resume it back to original
>>> clock source by CCLKG_BURST_POLICY register. Or it would be keep in low
>>> rate (CLK_M) after resume.
>>
>> OK, I guess the register name was badly chosen by HW. I'd like to see
>> part of your description above in the patch description. How about
>> replacing the patch description with:
>>
>> ----------
>> When the system suspends to LP1, the CPU clock source is switched to
>> CLK_M (12MHz Oscillator) during suspend/resume flow[1]. The CPU clock
>> source is controlled by the CCLKG_BURST_POLICY register, and hence this
>> register must be restored during LP1 resume.
>> ----------
>>
>> [1] Question: where does this happen? This patch doesn't make that
>> change. I wonder why the suspend path can't save this register, rather
>> than implementing a separate suspend syscore op in the clock driver.
>> Does the HW auto-switch the value in the register itself?
>
> If we switch CPU to CLK_M in clock driver, the system will become slowly
> during the middle of suspending flow. We do this at the very end of the
> LP1 suspending flow before the CPU disable all the PLL clocks.

I think you answered "why is the switch to CLK_M performed very late",
whereas I asked "where is the code that performs the switch to CLK_M".

>>>> Why don't Tegra20/30 need a similar change?
>>>
>>> For Tegra20/30, the same code had been implemented in the suspend/resume
>>> function of tegra_cpu_car_ops. It restores the CPU clock ASAP when CPU
>>> resume from a suspend state to get quick performance I believe.
>>>
>>> For Tegra114, the resume performance is cool (although we can't see it
>>> in upstream kernel now, it still need some other functions.). We can
>>> implement all the clock related suspend/resume function in the clock
>>> driver.
>>
>> OK, I do see something similar in tegra20/30_cpu_clock_suspend/resume.
>> Why can't this new code be part of the equivalent functions; does the
>> Tegra114 suspend/resume code in mach-tegra/ not call
>> tegra_cpu_car_ops.suspend/resume() in the same way it does on Tegra20/30?
>
> One of the main reasons is due to DFLL clock. The CPU clock of Tegra114
> is going to switch to DFLL to a get higher clock rate. But it depends
> some other HW (i.e. I2C), we can't resume it so early when the CPU just
> resume like we did in tegra_cpu_car_ops.suspend/resume for Tegra20/30.

Is there a guarantee that the syscore_ops are run after the call to
tegar_cpu_car_ops.resume() would be? Perhaps the mach-tegra/ code should
simply call tegra_cpu_car_ops.resume() later on Tegra114 than it does on
earlier chips, so it's run at a suitable time?

Stephen Warren Aug. 5, 2013, 5:39 p.m. UTC | #6

On 08/05/2013 11:00 AM, Stephen Warren wrote:
> On 08/05/2013 02:02 AM, Joseph Lo wrote:
>> On Sat, 2013-08-03 at 04:32 +0800, Stephen Warren wrote:
>>> On 08/02/2013 02:09 AM, Joseph Lo wrote:
>>>> On Tue, 2013-07-30 at 06:51 +0800, Stephen Warren wrote:
>>>>> On 07/26/2013 03:15 AM, Joseph Lo wrote:
>>>>>> When the system suspends to LP1, the clock of the CPU would be switched to
>>>>>> CLK_M (12MHz Oscillator) during suspend/resume flow. The clock driver
>>>>>> needs to restore the clock of CPU after LP1 resume.
>>>>>
>>>>> It's unclear to me how the code change implements "restore the clock of
>>>>> the CPU". A register name of CCLKG_BURST_POLICY doesn't sound like it's
>>>>> anything to do with enabled/disabling the CPU clock, nor configuring its
>>>>> rate. What exactly does this register do, and hence what does this new
>>>>> code actually restore?
>>>>>
>>>> When system suspend to LP1, most of the PLLs was clock gated. Because we
>>>> didn't cut off the core power, the settings of PLL still keep there. But
>>>> we switch the clock source of CPU to CLK_M before shut off PLLs by
>>>> CCLKG_BURSY_POLICY register. So we need to resume it back to original
>>>> clock source by CCLKG_BURST_POLICY register. Or it would be keep in low
>>>> rate (CLK_M) after resume.
>>>
>>> OK, I guess the register name was badly chosen by HW. I'd like to see
>>> part of your description above in the patch description. How about
>>> replacing the patch description with:
>>>
>>> ----------
>>> When the system suspends to LP1, the CPU clock source is switched to
>>> CLK_M (12MHz Oscillator) during suspend/resume flow[1]. The CPU clock
>>> source is controlled by the CCLKG_BURST_POLICY register, and hence this
>>> register must be restored during LP1 resume.
>>> ----------
>>>
>>> [1] Question: where does this happen? This patch doesn't make that
>>> change. I wonder why the suspend path can't save this register, rather
>>> than implementing a separate suspend syscore op in the clock driver.
>>> Does the HW auto-switch the value in the register itself?
>>
>> If we switch CPU to CLK_M in clock driver, the system will become slowly
>> during the middle of suspending flow. We do this at the very end of the
>> LP1 suspending flow before the CPU disable all the PLL clocks.
> 
> I think you answered "why is the switch to CLK_M performed very late",
> whereas I asked "where is the code that performs the switch to CLK_M".

To expand on this a bit more, I can't find any reference to register
CCLKG_BURST_POLICY in arch/arm/mach-tegra/ or drivers/clk/tegra/ except
for the definition of clock cclk_g, nor any reference to that clock in
those two directories. And that CCLKG_BURST_POLICY register is what this
patch saves/restores.

I do see function tegra30_switch_cpu_to_clk32k in patch 5/8 appears to
do something related to switching to clk_m and touches some other
burst-related registers, but not CCLKG_BURST_POLICY...

So, if this syscore_op is attempting to undo some change that happens to
CCLKG_BURST_POLICY during suspend, I still can't see what change is
happening to that register during suspemd, nor which piece of code
causes it.

If the issue is that the value in this register is simply lost during
LP1 because the power is turned off, then I wonder what sets up this
register during the original boot path?

Joseph Lo Aug. 6, 2013, 9:10 a.m. UTC | #7

On Tue, 2013-08-06 at 01:39 +0800, Stephen Warren wrote:
> On 08/05/2013 11:00 AM, Stephen Warren wrote:
> > On 08/05/2013 02:02 AM, Joseph Lo wrote:
> >> On Sat, 2013-08-03 at 04:32 +0800, Stephen Warren wrote:
> >>> On 08/02/2013 02:09 AM, Joseph Lo wrote:
> >>>> On Tue, 2013-07-30 at 06:51 +0800, Stephen Warren wrote:
> >>>>> On 07/26/2013 03:15 AM, Joseph Lo wrote:
> >>>>>> When the system suspends to LP1, the clock of the CPU would be switched to
> >>>>>> CLK_M (12MHz Oscillator) during suspend/resume flow. The clock driver
> >>>>>> needs to restore the clock of CPU after LP1 resume.
> >>>>>
> >>>>> It's unclear to me how the code change implements "restore the clock of
> >>>>> the CPU". A register name of CCLKG_BURST_POLICY doesn't sound like it's
> >>>>> anything to do with enabled/disabling the CPU clock, nor configuring its
> >>>>> rate. What exactly does this register do, and hence what does this new
> >>>>> code actually restore?
> >>>>>
> >>>> When system suspend to LP1, most of the PLLs was clock gated. Because we
> >>>> didn't cut off the core power, the settings of PLL still keep there. But
> >>>> we switch the clock source of CPU to CLK_M before shut off PLLs by
> >>>> CCLKG_BURSY_POLICY register. So we need to resume it back to original
> >>>> clock source by CCLKG_BURST_POLICY register. Or it would be keep in low
> >>>> rate (CLK_M) after resume.
> >>>
> >>> OK, I guess the register name was badly chosen by HW. I'd like to see
> >>> part of your description above in the patch description. How about
> >>> replacing the patch description with:
> >>>
> >>> ----------
> >>> When the system suspends to LP1, the CPU clock source is switched to
> >>> CLK_M (12MHz Oscillator) during suspend/resume flow[1]. The CPU clock
> >>> source is controlled by the CCLKG_BURST_POLICY register, and hence this
> >>> register must be restored during LP1 resume.
> >>> ----------
> >>>
> >>> [1] Question: where does this happen? This patch doesn't make that
> >>> change. I wonder why the suspend path can't save this register, rather
> >>> than implementing a separate suspend syscore op in the clock driver.
> >>> Does the HW auto-switch the value in the register itself?
> >>
> >> If we switch CPU to CLK_M in clock driver, the system will become slowly
> >> during the middle of suspending flow. We do this at the very end of the
> >> LP1 suspending flow before the CPU disable all the PLL clocks.
> > 
> > I think you answered "why is the switch to CLK_M performed very late",
> > whereas I asked "where is the code that performs the switch to CLK_M".
You can find to code in the function tegraXX_switch_cpu_to_clk32k of the
patch 5/8 and 6/8. It switches the SCLK and CCLK to CLKM before
disabling the PLLs.
> 
> To expand on this a bit more, I can't find any reference to register
> CCLKG_BURST_POLICY in arch/arm/mach-tegra/ or drivers/clk/tegra/ except
> for the definition of clock cclk_g, nor any reference to that clock in
> those two directories. And that CCLKG_BURST_POLICY register is what this
> patch saves/restores.
> 
> I do see function tegra30_switch_cpu_to_clk32k in patch 5/8 appears to
> do something related to switching to clk_m and touches some other
> burst-related registers, but not CCLKG_BURST_POLICY...
> 
OK. It's a little difference we do in clock driver side and low level
code side.

As you knew, we have two CPU clusters (CCLKG & CCLKLP) in Tegra chips.
And they have exactly the same clock source for Tegra20/30, but
Tegra114. From the low level side, it should be supported when the
system going to suspend with CPU_G or CPU_LP. So we switched the clock
source of CPU to CLK_M by CCLK_BURST_POLICY. This register can support
the running CPU switch to the clock source they want.

On the clock driver side, because the clock source of CCLKG and CCLKLP
is not exactly the same for the Tegra114. The CCLKG supports DFLL clock
but CCLKLP. We need to restore it separately to avoid the CPU restore to
the wrong clock source.

The CPU clock suspend/resume function in the tegra_cpu_car_ops of
Tegra20/30 is also using CCLK_BURST_POLICY register, because the
definition of the clock source of CCLKG and CCLKLP is the same. We can
simply the code for just using CCLK_BURST_POLICY.

I think the code should OK when CPU_G or CPU_LP running with it.

> So, if this syscore_op is attempting to undo some change that happens to
> CCLKG_BURST_POLICY during suspend, I still can't see what change is
> happening to that register during suspemd, nor which piece of code
> causes it.
> 
> If the issue is that the value in this register is simply lost during
> LP1 because the power is turned off, then I wonder what sets up this
> register during the original boot path?

Joseph Lo Aug. 6, 2013, 9:19 a.m. UTC | #8

On Tue, 2013-08-06 at 01:00 +0800, Stephen Warren wrote:
> On 08/05/2013 02:02 AM, Joseph Lo wrote:
> > On Sat, 2013-08-03 at 04:32 +0800, Stephen Warren wrote:
> >> On 08/02/2013 02:09 AM, Joseph Lo wrote:
> >>> On Tue, 2013-07-30 at 06:51 +0800, Stephen Warren wrote:
> >>>> On 07/26/2013 03:15 AM, Joseph Lo wrote:
> >>>>> When the system suspends to LP1, the clock of the CPU would be switched to
> >>>>> CLK_M (12MHz Oscillator) during suspend/resume flow. The clock driver
> >>>>> needs to restore the clock of CPU after LP1 resume.
> >>>>
[snip]
> >>>> Why don't Tegra20/30 need a similar change?
> >>>
> >>> For Tegra20/30, the same code had been implemented in the suspend/resume
> >>> function of tegra_cpu_car_ops. It restores the CPU clock ASAP when CPU
> >>> resume from a suspend state to get quick performance I believe.
> >>>
> >>> For Tegra114, the resume performance is cool (although we can't see it
> >>> in upstream kernel now, it still need some other functions.). We can
> >>> implement all the clock related suspend/resume function in the clock
> >>> driver.
> >>
> >> OK, I do see something similar in tegra20/30_cpu_clock_suspend/resume.
> >> Why can't this new code be part of the equivalent functions; does the
> >> Tegra114 suspend/resume code in mach-tegra/ not call
> >> tegra_cpu_car_ops.suspend/resume() in the same way it does on Tegra20/30?
> >
> > One of the main reasons is due to DFLL clock. The CPU clock of Tegra114
> > is going to switch to DFLL to a get higher clock rate. But it depends
> > some other HW (i.e. I2C), we can't resume it so early when the CPU just
> > resume like we did in tegra_cpu_car_ops.suspend/resume for Tegra20/30.
> 
> Is there a guarantee that the syscore_ops are run after the call to
> tegar_cpu_car_ops.resume() would be? Perhaps the mach-tegra/ code should
> simply call tegra_cpu_car_ops.resume() later on Tegra114 than it does on
> earlier chips, so it's run at a suitable time?

Yes, the tegra_cpu_car_ops.suspend/resume() is hooked to
suspend_cpu_complex()/resuem_cpu_complex() that in the "pm.c" file. It
would be the final/first function be invoked when system suspend/resume.

Stephen Warren Aug. 6, 2013, 6:37 p.m. UTC | #9

On 08/06/2013 03:10 AM, Joseph Lo wrote:
> On Tue, 2013-08-06 at 01:39 +0800, Stephen Warren wrote:
>> On 08/05/2013 11:00 AM, Stephen Warren wrote:
>>> On 08/05/2013 02:02 AM, Joseph Lo wrote:
>>>> On Sat, 2013-08-03 at 04:32 +0800, Stephen Warren wrote:
>>>>> On 08/02/2013 02:09 AM, Joseph Lo wrote:
>>>>>> On Tue, 2013-07-30 at 06:51 +0800, Stephen Warren wrote:
>>>>>>> On 07/26/2013 03:15 AM, Joseph Lo wrote:
>>>>>>>> When the system suspends to LP1, the clock of the CPU would be switched to
>>>>>>>> CLK_M (12MHz Oscillator) during suspend/resume flow. The clock driver
>>>>>>>> needs to restore the clock of CPU after LP1 resume.
>>>>>>>
>>>>>>> It's unclear to me how the code change implements "restore the clock of
>>>>>>> the CPU". A register name of CCLKG_BURST_POLICY doesn't sound like it's
>>>>>>> anything to do with enabled/disabling the CPU clock, nor configuring its
>>>>>>> rate. What exactly does this register do, and hence what does this new
>>>>>>> code actually restore?
>>>>>>>
>>>>>> When system suspend to LP1, most of the PLLs was clock gated. Because we
>>>>>> didn't cut off the core power, the settings of PLL still keep there. But
>>>>>> we switch the clock source of CPU to CLK_M before shut off PLLs by
>>>>>> CCLKG_BURSY_POLICY register. So we need to resume it back to original
>>>>>> clock source by CCLKG_BURST_POLICY register. Or it would be keep in low
>>>>>> rate (CLK_M) after resume.
>>>>>
>>>>> OK, I guess the register name was badly chosen by HW. I'd like to see
>>>>> part of your description above in the patch description. How about
>>>>> replacing the patch description with:
>>>>>
>>>>> ----------
>>>>> When the system suspends to LP1, the CPU clock source is switched to
>>>>> CLK_M (12MHz Oscillator) during suspend/resume flow[1]. The CPU clock
>>>>> source is controlled by the CCLKG_BURST_POLICY register, and hence this
>>>>> register must be restored during LP1 resume.
>>>>> ----------
>>>>>
>>>>> [1] Question: where does this happen? This patch doesn't make that
>>>>> change. I wonder why the suspend path can't save this register, rather
>>>>> than implementing a separate suspend syscore op in the clock driver.
>>>>> Does the HW auto-switch the value in the register itself?
>>>>
>>>> If we switch CPU to CLK_M in clock driver, the system will become slowly
>>>> during the middle of suspending flow. We do this at the very end of the
>>>> LP1 suspending flow before the CPU disable all the PLL clocks.
>>>
>>> I think you answered "why is the switch to CLK_M performed very late",
>>> whereas I asked "where is the code that performs the switch to CLK_M".
> You can find to code in the function tegraXX_switch_cpu_to_clk32k of the
> patch 5/8 and 6/8. It switches the SCLK and CCLK to CLKM before
> disabling the PLLs.
>>
>> To expand on this a bit more, I can't find any reference to register
>> CCLKG_BURST_POLICY in arch/arm/mach-tegra/ or drivers/clk/tegra/ except
>> for the definition of clock cclk_g, nor any reference to that clock in
>> those two directories. And that CCLKG_BURST_POLICY register is what this
>> patch saves/restores.
>>
>> I do see function tegra30_switch_cpu_to_clk32k in patch 5/8 appears to
>> do something related to switching to clk_m and touches some other
>> burst-related registers, but not CCLKG_BURST_POLICY...
>>
> OK. It's a little difference we do in clock driver side and low level
> code side.
> 
> As you knew, we have two CPU clusters (CCLKG & CCLKLP) in Tegra chips.
> And they have exactly the same clock source for Tegra20/30, but
> Tegra114. From the low level side, it should be supported when the
> system going to suspend with CPU_G or CPU_LP. So we switched the clock
> source of CPU to CLK_M by CCLK_BURST_POLICY. This register can support
> the running CPU switch to the clock source they want.
> 
> On the clock driver side, because the clock source of CCLKG and CCLKLP
> is not exactly the same for the Tegra114. The CCLKG supports DFLL clock
> but CCLKLP. We need to restore it separately to avoid the CPU restore to
> the wrong clock source.
> 
> The CPU clock suspend/resume function in the tegra_cpu_car_ops of
> Tegra20/30 is also using CCLK_BURST_POLICY register, because the
> definition of the clock source of CCLKG and CCLKLP is the same. We can
> simply the code for just using CCLK_BURST_POLICY.
> 
> I think the code should OK when CPU_G or CPU_LP running with it.

OK, so just to paraphrase what you've said, to make sure I understand it:

On Tegra114, there are 3 registers:

1) Controls just the clock source of the G cluster.
2) Controls just the clock source of the LP cluster.
3) Controls both at the same time, simulating a write to both (1) and (2)

In tegra30_switch_cpu_to_clk32k(), we use register (3) above to switch
to clk_m then clk_s. This is possible, because in both registers (1) and
(2), the values for clk_m and clk_s are identical.

In tegra30_lp1_reset(), we can't use the same technique (a write to
register 3) to restore the clock source for both G and LP clusters,
since the values we need to write to those registers to re-select their
clock source is different.

however, on Tegra30, there really is only one register, so we can use
that to both switch to clk_m/clk_s, /and/ to switch back.

That all explains the following part of patch 7/8, which disables the
clock register restore path except on Tegra30 where it will work:

> -	mov32	r4, ((1 << 28) | (0x8))	@ burst policy is PLLX
> -	str	r4, [r0, #CLK_RESET_CCLK_BURST]
> +	cmp	r10, #TEGRA30
> +	movweq	r4, #:lower16:((1 << 28) | (0x8))	@ burst policy is PLLX
> +	movteq	r4, #:upper16:((1 << 28) | (0x8))
> +	streq	r4, [r0, #CLK_RESET_CCLK_BURST]

However, none of this explains why the CPU clock restore logic on
Tegra114 needs to be a syscore_op, rather than simply having
tegra30_lp1_reset() execute some Tegra114-specific code.

Re-writing the part of the patch I quoted above in C, it looks like:

if (soc is tegra30)
    write to CLK_RESET_CCLK_BURST to select pllx

Instead of making that code do nothing on Tegra114, why can't that code be:

if (soc is tegra30)
    write to CLK_RESET_CCLK_BURST to re-select pllx
else
    write to G-cluster-specific register to restore saved value
    write to LP-cluster-specific register to restore saved value

or:

if (soc is tegra30)
    write to CLK_RESET_CCLK_BURST to re-select pllx
else
    write to G-cluster-specific register to re-select pllx/dfll/...
    write to LP-cluster-specific register to restore ...

Plenty of other registers are saved/restored by tegra30_lp1_reset(), so
surely you just need to have save two more registers in the suspend
path, and restore them as I showed above.

That way, we don't have to have some special-case syscore_op to do this;
all chips will restore the clock sources at the exact same place in the
code, and everything will be consistent.

Joseph Lo Aug. 7, 2013, 9:12 a.m. UTC | #10

On Wed, 2013-08-07 at 02:37 +0800, Stephen Warren wrote:
> On 08/06/2013 03:10 AM, Joseph Lo wrote:
> > On Tue, 2013-08-06 at 01:39 +0800, Stephen Warren wrote:
> >> On 08/05/2013 11:00 AM, Stephen Warren wrote:
> >>> On 08/05/2013 02:02 AM, Joseph Lo wrote:
> >>>> On Sat, 2013-08-03 at 04:32 +0800, Stephen Warren wrote:
> >>>>> On 08/02/2013 02:09 AM, Joseph Lo wrote:
> >>>>>> On Tue, 2013-07-30 at 06:51 +0800, Stephen Warren wrote:
> >>>>>>> On 07/26/2013 03:15 AM, Joseph Lo wrote:
[snip]
> > OK. It's a little difference we do in clock driver side and low level
> > code side.
> > 
> > As you knew, we have two CPU clusters (CCLKG & CCLKLP) in Tegra chips.
> > And they have exactly the same clock source for Tegra20/30, but
> > Tegra114. From the low level side, it should be supported when the
> > system going to suspend with CPU_G or CPU_LP. So we switched the clock
> > source of CPU to CLK_M by CCLK_BURST_POLICY. This register can support
> > the running CPU switch to the clock source they want.
> > 
> > On the clock driver side, because the clock source of CCLKG and CCLKLP
> > is not exactly the same for the Tegra114. The CCLKG supports DFLL clock
> > but CCLKLP. We need to restore it separately to avoid the CPU restore to
> > the wrong clock source.
> > 
> > The CPU clock suspend/resume function in the tegra_cpu_car_ops of
> > Tegra20/30 is also using CCLK_BURST_POLICY register, because the
> > definition of the clock source of CCLKG and CCLKLP is the same. We can
> > simply the code for just using CCLK_BURST_POLICY.
> > 
> > I think the code should OK when CPU_G or CPU_LP running with it.
> 
> OK, so just to paraphrase what you've said, to make sure I understand it:
> 
> On Tegra114, there are 3 registers:
It's on Tegra30 too.
> 
> 1) Controls just the clock source of the G cluster.
CCLKG_BURST_POLICY at offset 0x368.
> 2) Controls just the clock source of the LP cluster.
CCLKLP_BURST_POLICY at offset 0x370.
> 3) Controls both at the same time, simulating a write to both (1) and (2)
CCLK_BURST_POLICY at offset 0x20.
Yes, exactly. And I double confirm the item 3 again with a downstream
kernel that support cluster switch today. It's true.
> 
> In tegra30_switch_cpu_to_clk32k(), we use register (3) above to switch
> to clk_m then clk_s. This is possible, because in both registers (1) and
> (2), the values for clk_m and clk_s are identical.
Yes.
> 
> In tegra30_lp1_reset(), we can't use the same technique (a write to
> register 3) to restore the clock source for both G and LP clusters,
> since the values we need to write to those registers to re-select their
> clock source is different.
> 
> however, on Tegra30, there really is only one register, so we can use
> that to both switch to clk_m/clk_s, /and/ to switch back.
> 
> That all explains the following part of patch 7/8, which disables the
> clock register restore path except on Tegra30 where it will work:
> 
> > -	mov32	r4, ((1 << 28) | (0x8))	@ burst policy is PLLX
> > -	str	r4, [r0, #CLK_RESET_CCLK_BURST]
> > +	cmp	r10, #TEGRA30
> > +	movweq	r4, #:lower16:((1 << 28) | (0x8))	@ burst policy is PLLX
> > +	movteq	r4, #:upper16:((1 << 28) | (0x8))
> > +	streq	r4, [r0, #CLK_RESET_CCLK_BURST]
OK. The burst policy of CPU clock has 5 different states (STDBY, IDLE,
RUN, IRQ and FIQ). The SW only can set up the clock source of each
state. The switching to different state was controlled by HW.

So the Tegra30 code here was set up the clock source of IDLE state only
(the CPU runs in this state after resume) to make the performance
better. But It had no idea about the clock sources of the other states.
It still needs clock driver to take care of that.

The clock source id (0x8) means PLLX_OUT0 in Tegra30. In Tegra114, it
means PLLX_OUT0_LJ (low jitter). It's not available at this moment. I
can switch it to PLLX_OUT0 (0xe) for Tegra114 too. The code may like
below.

> -     mov32   r4, ((1 << 28) | (0x8)) @ burst policy is PLLX
> -     str     r4, [r0, #CLK_RESET_CCLK_BURST]
> +     cmp     r10, #TEGRA30
> +     movweq  r4, #:lower16:((1 << 28) | (0x8))
> +     movteq  r4, #:upper16:((1 << 28) | (0x8))
> +	movwne  r4, #:lower16:((1 << 28) | (0xe))
> +	movteq  r4, #:upper16:((1 << 28) | (0xe))
> +     str	r4, [r0, #CLK_RESET_CCLK_BURST]

I had tested this code before I sent this series. It didn't impact or
improve the resuming performance. So I removed them.
Do you want me to add them back?

> 
> However, none of this explains why the CPU clock restore logic on
> Tegra114 needs to be a syscore_op, rather than simply having
> tegra30_lp1_reset() execute some Tegra114-specific code.
> 
> Re-writing the part of the patch I quoted above in C, it looks like:
> 
> if (soc is tegra30)
>     write to CLK_RESET_CCLK_BURST to select pllx
> 
> Instead of making that code do nothing on Tegra114, why can't that code be:
> 
> if (soc is tegra30)
>     write to CLK_RESET_CCLK_BURST to re-select pllx
> else
>     write to G-cluster-specific register to restore saved value
>     write to LP-cluster-specific register to restore saved value
> 
> or:
> 
> if (soc is tegra30)
>     write to CLK_RESET_CCLK_BURST to re-select pllx
> else
>     write to G-cluster-specific register to re-select pllx/dfll/...
>     write to LP-cluster-specific register to restore ...
> 
Hope the explanation above is enough for you. The low level code had no
idea of clock source of each CPU state in BURST_POLICY register. It
needs to be done by clk driver itself.

Stephen Warren Aug. 7, 2013, 4:46 p.m. UTC | #11

On 08/07/2013 03:12 AM, Joseph Lo wrote:
> On Wed, 2013-08-07 at 02:37 +0800, Stephen Warren wrote:
>> On 08/06/2013 03:10 AM, Joseph Lo wrote:
>>> On Tue, 2013-08-06 at 01:39 +0800, Stephen Warren wrote:
>>>> On 08/05/2013 11:00 AM, Stephen Warren wrote:
>>>>> On 08/05/2013 02:02 AM, Joseph Lo wrote:
>>>>>> On Sat, 2013-08-03 at 04:32 +0800, Stephen Warren wrote:
>>>>>>> On 08/02/2013 02:09 AM, Joseph Lo wrote:
>>>>>>>> On Tue, 2013-07-30 at 06:51 +0800, Stephen Warren wrote:
>>>>>>>>> On 07/26/2013 03:15 AM, Joseph Lo wrote:
> [snip]
>>> OK. It's a little difference we do in clock driver side and low level
>>> code side.
>>>
>>> As you knew, we have two CPU clusters (CCLKG & CCLKLP) in Tegra chips.
>>> And they have exactly the same clock source for Tegra20/30, but
>>> Tegra114. From the low level side, it should be supported when the
>>> system going to suspend with CPU_G or CPU_LP. So we switched the clock
>>> source of CPU to CLK_M by CCLK_BURST_POLICY. This register can support
>>> the running CPU switch to the clock source they want.
>>>
>>> On the clock driver side, because the clock source of CCLKG and CCLKLP
>>> is not exactly the same for the Tegra114. The CCLKG supports DFLL clock
>>> but CCLKLP. We need to restore it separately to avoid the CPU restore to
>>> the wrong clock source.
>>>
>>> The CPU clock suspend/resume function in the tegra_cpu_car_ops of
>>> Tegra20/30 is also using CCLK_BURST_POLICY register, because the
>>> definition of the clock source of CCLKG and CCLKLP is the same. We can
>>> simply the code for just using CCLK_BURST_POLICY.
>>>
>>> I think the code should OK when CPU_G or CPU_LP running with it.
>>
>> OK, so just to paraphrase what you've said, to make sure I understand it:
>>
>> On Tegra114, there are 3 registers:
> It's on Tegra30 too.
>>
>> 1) Controls just the clock source of the G cluster.
> CCLKG_BURST_POLICY at offset 0x368.
>> 2) Controls just the clock source of the LP cluster.
> CCLKLP_BURST_POLICY at offset 0x370.
>> 3) Controls both at the same time, simulating a write to both (1) and (2)
> CCLK_BURST_POLICY at offset 0x20.
> Yes, exactly. And I double confirm the item 3 again with a downstream
> kernel that support cluster switch today. It's true.
>>
>> In tegra30_switch_cpu_to_clk32k(), we use register (3) above to switch
>> to clk_m then clk_s. This is possible, because in both registers (1) and
>> (2), the values for clk_m and clk_s are identical.
> Yes.
>>
>> In tegra30_lp1_reset(), we can't use the same technique (a write to
>> register 3) to restore the clock source for both G and LP clusters,
>> since the values we need to write to those registers to re-select their
>> clock source is different.
>>
>> however, on Tegra30, there really is only one register, so we can use
>> that to both switch to clk_m/clk_s, /and/ to switch back.
>>
>> That all explains the following part of patch 7/8, which disables the
>> clock register restore path except on Tegra30 where it will work:
>>
>>> -	mov32	r4, ((1 << 28) | (0x8))	@ burst policy is PLLX
>>> -	str	r4, [r0, #CLK_RESET_CCLK_BURST]
>>> +	cmp	r10, #TEGRA30
>>> +	movweq	r4, #:lower16:((1 << 28) | (0x8))	@ burst policy is PLLX
>>> +	movteq	r4, #:upper16:((1 << 28) | (0x8))
>>> +	streq	r4, [r0, #CLK_RESET_CCLK_BURST]
>
> OK. The burst policy of CPU clock has 5 different states (STDBY, IDLE,
> RUN, IRQ and FIQ). The SW only can set up the clock source of each
> state. The switching to different state was controlled by HW.
> 
> So the Tegra30 code here was set up the clock source of IDLE state only
> (the CPU runs in this state after resume) to make the performance
> better. But It had no idea about the clock sources of the other states.
> It still needs clock driver to take care of that.
> 
> The clock source id (0x8) means PLLX_OUT0 in Tegra30. In Tegra114, it
> means PLLX_OUT0_LJ (low jitter). It's not available at this moment. I
> can switch it to PLLX_OUT0 (0xe) for Tegra114 too. The code may like
> below.
> 
>> -     mov32   r4, ((1 << 28) | (0x8)) @ burst policy is PLLX
>> -     str     r4, [r0, #CLK_RESET_CCLK_BURST]
>> +     cmp     r10, #TEGRA30
>> +     movweq  r4, #:lower16:((1 << 28) | (0x8))
>> +     movteq  r4, #:upper16:((1 << 28) | (0x8))
>> +	movwne  r4, #:lower16:((1 << 28) | (0xe))
>> +	movteq  r4, #:upper16:((1 << 28) | (0xe))

I assume that should be movtne.

>> +     str	r4, [r0, #CLK_RESET_CCLK_BURST]
> 
> I had tested this code before I sent this series. It didn't impact or
> improve the resuming performance. So I removed them.
> Do you want me to add them back?

I think adding that code back would make sense. It would mean the code
works fundamentally the same way for all SoCs.

BTW, you mentioned that LP1 resume takes 10-15 seconds on Dalmore. Are
you sure this isn't because the above code is missing, and hence the CPU
keeps running at 32KHz or the crystal speed, since nothing restores the
CPU clock source until much later, when the syscore op runs?

>> However, none of this explains why the CPU clock restore logic on
>> Tegra114 needs to be a syscore_op, rather than simply having
>> tegra30_lp1_reset() execute some Tegra114-specific code.
>>
>> Re-writing the part of the patch I quoted above in C, it looks like:
>>
>> if (soc is tegra30)
>>     write to CLK_RESET_CCLK_BURST to select pllx
>>
>> Instead of making that code do nothing on Tegra114, why can't that code be:
>>
>> if (soc is tegra30)
>>     write to CLK_RESET_CCLK_BURST to re-select pllx
>> else
>>     write to G-cluster-specific register to restore saved value
>>     write to LP-cluster-specific register to restore saved value
>>
>> or:
>>
>> if (soc is tegra30)
>>     write to CLK_RESET_CCLK_BURST to re-select pllx
>> else
>>     write to G-cluster-specific register to re-select pllx/dfll/...
>>     write to LP-cluster-specific register to restore ...
>>
> Hope the explanation above is enough for you. The low level code had no
> idea of clock source of each CPU state in BURST_POLICY register. It
> needs to be done by clk driver itself.

I still have absolutely no idea why Tegra30 and Tegra114 are different.

You mentioned something about this low-level code only manipulating the
IDLE state, and the clock driver needing to restore the other 4 states.
This raises yet more questions:

1) Do we not need to restore the other 4 states on Tegra30? If not, why
not? If we do, presumably Tegra30 (and Tegra20?) need to the syscore_op
this patch series adds to Tegra114 only? If we don't, then why does
Tegra114 have to restore them?

2) What triggers the HW to switch from IDLE to RUN state? In other
words, I think you're saying that the existing Tegra30 code:

	mov32	r4, ((1 << 28) | (0x8))	@ burst policy is PLLX
	str	r4, [r0, #CLK_RESET_CCLK_BURST]

doesn't change the clock rate right away. When does it change?

Related, when tegra30_switch_cpu_to_clk32k() sets the CPU clock source
for the IDLE state to clk_s, when does that switch actually take place?
That function appears to assume it happens immediately, since it
immediately disables the PLLs that might have been the previous CPU
clock source. Perhaps this is what the MSELECT register writes do?  I
guess not since there's only 1 MSELECT write in tegra_lp1_reset() and
two changes of CPU clock source.

Joseph Lo Aug. 8, 2013, 2:23 a.m. UTC | #12

On Thu, 2013-08-08 at 00:46 +0800, Stephen Warren wrote:
> On 08/07/2013 03:12 AM, Joseph Lo wrote:
> > On Wed, 2013-08-07 at 02:37 +0800, Stephen Warren wrote:
> >> On 08/06/2013 03:10 AM, Joseph Lo wrote:
> >>> On Tue, 2013-08-06 at 01:39 +0800, Stephen Warren wrote:
> >>>> On 08/05/2013 11:00 AM, Stephen Warren wrote:
> >>>>> On 08/05/2013 02:02 AM, Joseph Lo wrote:
> >>>>>> On Sat, 2013-08-03 at 04:32 +0800, Stephen Warren wrote:
> >>>>>>> On 08/02/2013 02:09 AM, Joseph Lo wrote:
> >>>>>>>> On Tue, 2013-07-30 at 06:51 +0800, Stephen Warren wrote:
> >>>>>>>>> On 07/26/2013 03:15 AM, Joseph Lo wrote:
> > [snip]
[snip]
> >> In tegra30_lp1_reset(), we can't use the same technique (a write to
> >> register 3) to restore the clock source for both G and LP clusters,
> >> since the values we need to write to those registers to re-select their
> >> clock source is different.
> >>
> >> however, on Tegra30, there really is only one register, so we can use
> >> that to both switch to clk_m/clk_s, /and/ to switch back.
> >>
> >> That all explains the following part of patch 7/8, which disables the
> >> clock register restore path except on Tegra30 where it will work:
> >>
> >>> -	mov32	r4, ((1 << 28) | (0x8))	@ burst policy is PLLX
> >>> -	str	r4, [r0, #CLK_RESET_CCLK_BURST]
> >>> +	cmp	r10, #TEGRA30
> >>> +	movweq	r4, #:lower16:((1 << 28) | (0x8))	@ burst policy is PLLX
> >>> +	movteq	r4, #:upper16:((1 << 28) | (0x8))
> >>> +	streq	r4, [r0, #CLK_RESET_CCLK_BURST]
> >
> > OK. The burst policy of CPU clock has 5 different states (STDBY, IDLE,
> > RUN, IRQ and FIQ). The SW only can set up the clock source of each
> > state. The switching to different state was controlled by HW.
> > 
> > So the Tegra30 code here was set up the clock source of IDLE state only
> > (the CPU runs in this state after resume) to make the performance
> > better. But It had no idea about the clock sources of the other states.
> > It still needs clock driver to take care of that.
> > 
> > The clock source id (0x8) means PLLX_OUT0 in Tegra30. In Tegra114, it
> > means PLLX_OUT0_LJ (low jitter). It's not available at this moment. I
> > can switch it to PLLX_OUT0 (0xe) for Tegra114 too. The code may like
> > below.
> > 
> >> -     mov32   r4, ((1 << 28) | (0x8)) @ burst policy is PLLX
> >> -     str     r4, [r0, #CLK_RESET_CCLK_BURST]
> >> +     cmp     r10, #TEGRA30
> >> +     movweq  r4, #:lower16:((1 << 28) | (0x8))
> >> +     movteq  r4, #:upper16:((1 << 28) | (0x8))
> >> +	movwne  r4, #:lower16:((1 << 28) | (0xe))
> >> +	movteq  r4, #:upper16:((1 << 28) | (0xe))
> 
> I assume that should be movtne.
Ah, yes.
> 
> >> +     str	r4, [r0, #CLK_RESET_CCLK_BURST]
> > 
> > I had tested this code before I sent this series. It didn't impact or
> > improve the resuming performance. So I removed them.
> > Do you want me to add them back?
> 
> I think adding that code back would make sense. It would mean the code
> works fundamentally the same way for all SoCs.
OK.
> 
> BTW, you mentioned that LP1 resume takes 10-15 seconds on Dalmore. Are
> you sure this isn't because the above code is missing, and hence the CPU
> keeps running at 32KHz or the crystal speed, since nothing restores the
> CPU clock source until much later, when the syscore op runs?
It's due to the long time that SDRAM leaves self-refresh. Looks
something that related to BCT not be flashed during the flashing
process. Please also check the internal bug 1314918.
> 
> >> However, none of this explains why the CPU clock restore logic on
> >> Tegra114 needs to be a syscore_op, rather than simply having
> >> tegra30_lp1_reset() execute some Tegra114-specific code.
> >>
> >> Re-writing the part of the patch I quoted above in C, it looks like:
> >>
> >> if (soc is tegra30)
> >>     write to CLK_RESET_CCLK_BURST to select pllx
> >>
> >> Instead of making that code do nothing on Tegra114, why can't that code be:
> >>
> >> if (soc is tegra30)
> >>     write to CLK_RESET_CCLK_BURST to re-select pllx
> >> else
> >>     write to G-cluster-specific register to restore saved value
> >>     write to LP-cluster-specific register to restore saved value
> >>
> >> or:
> >>
> >> if (soc is tegra30)
> >>     write to CLK_RESET_CCLK_BURST to re-select pllx
> >> else
> >>     write to G-cluster-specific register to re-select pllx/dfll/...
> >>     write to LP-cluster-specific register to restore ...
> >>
> > Hope the explanation above is enough for you. The low level code had no
> > idea of clock source of each CPU state in BURST_POLICY register. It
> > needs to be done by clk driver itself.
> 
> I still have absolutely no idea why Tegra30 and Tegra114 are different.
> 
> You mentioned something about this low-level code only manipulating the
> IDLE state, and the clock driver needing to restore the other 4 states.
> This raises yet more questions:
> 
> 1) Do we not need to restore the other 4 states on Tegra30? If not, why
> not? If we do, presumably Tegra30 (and Tegra20?) need to the syscore_op
> this patch series adds to Tegra114 only? If we don't, then why does
> Tegra114 have to restore them?
We need to restore all of them for all Tegra chips. For Tegra20/30, we
had done it in the tegra_cpu_car_ops.suspend/resume. For Tegra114, the
patch was here.

The other reason is:
1) The PLLX is the main CPU clock source in Tegra20/30. We can restore
it ASAP to get a better performance.
2) For Tegra114, the PLLX is the CPU clock source when CPU runs at low
rates. When CPU in high rate, it uses DFLL as clock source. So it
depends on what the clock source of the CPU when it goes into suspend.
And the DFLL has its own resume code, it needs to be restored before the
CPU uses it as clock source again. It makes the CPU clock restore
sequence like this.

> 
> 2) What triggers the HW to switch from IDLE to RUN state? 
I also want to know more detail about it. The TRM only said it decided
by HW and gave an example about when switching to IRQ or FIQ state.
> In other
> words, I think you're saying that the existing Tegra30 code:
> 
> 	mov32	r4, ((1 << 28) | (0x8))	@ burst policy is PLLX
> 	str	r4, [r0, #CLK_RESET_CCLK_BURST]
> 
> doesn't change the clock rate right away. When does it change?
The code before this is the PLLX re-enable code. Then it switches to
PLLX. The rate is still kept the same when it suspended. The next rate
change after resume would be happened in the CPUfreq driver.
> 
> Related, when tegra30_switch_cpu_to_clk32k() sets the CPU clock source
> for the IDLE state to clk_s, when does that switch actually take place?
After the PLLs was disabled, only SCLK switched to CLK_S, the CPU and
MSELECT still keep in CLK_M. Because the CPU would be powered off
immediately.
> That function appears to assume it happens immediately, since it
> immediately disables the PLLs that might have been the previous CPU
> clock source. Perhaps this is what the MSELECT register writes do?  
Yes, switching them to the Oscillator before disabling the PLLs.
> I
> guess not since there's only 1 MSELECT write in tegra_lp1_reset() and
> two changes of CPU clock source.

Stephen Warren Aug. 8, 2013, 7:54 p.m. UTC | #13

On 08/07/2013 08:23 PM, Joseph Lo wrote:
> On Thu, 2013-08-08 at 00:46 +0800, Stephen Warren wrote:
..
>> I still have absolutely no idea why Tegra30 and Tegra114 are different.
>>
>> You mentioned something about this low-level code only manipulating the
>> IDLE state, and the clock driver needing to restore the other 4 states.
>> This raises yet more questions:
>>
>> 1) Do we not need to restore the other 4 states on Tegra30? If not, why
>> not? If we do, presumably Tegra30 (and Tegra20?) need to the syscore_op
>> this patch series adds to Tegra114 only? If we don't, then why does
>> Tegra114 have to restore them?
>
> We need to restore all of them for all Tegra chips. For Tegra20/30, we
> had done it in the tegra_cpu_car_ops.suspend/resume. For Tegra114, the
> patch was here.
> 
> The other reason is:
> 1) The PLLX is the main CPU clock source in Tegra20/30. We can restore
> it ASAP to get a better performance.
> 2) For Tegra114, the PLLX is the CPU clock source when CPU runs at low
> rates. When CPU in high rate, it uses DFLL as clock source. So it
> depends on what the clock source of the CPU when it goes into suspend.
> And the DFLL has its own resume code, it needs to be restored before the
> CPU uses it as clock source again. It makes the CPU clock restore
> sequence like this.

So, we don't yet support the DFLL upstream. Presumably, the CPU is
always running off PLLX on Tegra114 upstream right now. As such, we can
hard-code that into the resume path just like we do on earlier chips,
i.e. using tegra_cpu_car_ops.resume().

Once we do get DFLL support, presumably the DFLL resume path can switch
the register from PLLX to DFLL, and we still won't need a custom
syscore_ops.

Will that work?

>> 2) What triggers the HW to switch from IDLE to RUN state? 
>
> I also want to know more detail about it. The TRM only said it decided
> by HW and gave an example about when switching to IRQ or FIQ state.
>
>> In other
>> words, I think you're saying that the existing Tegra30 code:
>>
>> 	mov32	r4, ((1 << 28) | (0x8))	@ burst policy is PLLX
>> 	str	r4, [r0, #CLK_RESET_CCLK_BURST]
>>
>> doesn't change the clock rate right away. When does it change?
>
> The code before this is the PLLX re-enable code. Then it switches to
> PLLX. The rate is still kept the same when it suspended. The next rate
> change after resume would be happened in the CPUfreq driver.

Sorry, when I wrote "doesn't change the clock rate right away", I really
meant "doesn't change the clock *source* right away".

Joseph Lo Aug. 9, 2013, 9:23 a.m. UTC | #14

On Fri, 2013-08-09 at 03:54 +0800, Stephen Warren wrote:
> On 08/07/2013 08:23 PM, Joseph Lo wrote:
> > On Thu, 2013-08-08 at 00:46 +0800, Stephen Warren wrote:
> ..
> >> I still have absolutely no idea why Tegra30 and Tegra114 are different.
> >>
> >> You mentioned something about this low-level code only manipulating the
> >> IDLE state, and the clock driver needing to restore the other 4 states.
> >> This raises yet more questions:
> >>
> >> 1) Do we not need to restore the other 4 states on Tegra30? If not, why
> >> not? If we do, presumably Tegra30 (and Tegra20?) need to the syscore_op
> >> this patch series adds to Tegra114 only? If we don't, then why does
> >> Tegra114 have to restore them?
> >
> > We need to restore all of them for all Tegra chips. For Tegra20/30, we
> > had done it in the tegra_cpu_car_ops.suspend/resume. For Tegra114, the
> > patch was here.
> > 
> > The other reason is:
> > 1) The PLLX is the main CPU clock source in Tegra20/30. We can restore
> > it ASAP to get a better performance.
> > 2) For Tegra114, the PLLX is the CPU clock source when CPU runs at low
> > rates. When CPU in high rate, it uses DFLL as clock source. So it
> > depends on what the clock source of the CPU when it goes into suspend.
> > And the DFLL has its own resume code, it needs to be restored before the
> > CPU uses it as clock source again. It makes the CPU clock restore
> > sequence like this.
> 
> So, we don't yet support the DFLL upstream. Presumably, the CPU is
> always running off PLLX on Tegra114 upstream right now. As such, we can
> hard-code that into the resume path just like we do on earlier chips,
> i.e. using tegra_cpu_car_ops.resume().
> 
> Once we do get DFLL support, presumably the DFLL resume path can switch
> the register from PLLX to DFLL, and we still won't need a custom
> syscore_ops.
> 
> Will that work?
Hmm. If we can implement DFLL resume code there, that may be work. But I
can't confirm right now. I can move them to tegra_cpu_car_ops for now.
We can re-visit this later when we get more function to support.

Thanks,
Joseph

[3/8] clk: tegra114: add LP1 suspend/resume support

Commit Message

Comments

Patch