Message ID | 1409173419-13076-1-git-send-email-sonnyrao@chromium.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Aug 27, 2014 at 2:03 PM, Sonny Rao <sonnyrao@chromium.org> wrote: > This is a bug fix for using physical arch timers when > the arch_timer_use_virtual boolean is false. It restores the > arch_counter_get_cntpct() function after removal in > > 0d651e4e "clocksource: arch_timer: use virtual counters" > > and completes the implementation of memory mapped access for physical > timers, so if a system is trying to use physical timers, it will > function properly. > > Signed-off-by: Sonny Rao <sonnyrao@chromium.org> Acked-by: Olof Johansson <olof@lixom.net> This should have a: Fixes: 0d651e4e65e9 ("clocksource: arch_timer: use virtual counters") tag too, and possibly cc stable? -Olof
On Wed, Aug 27, 2014 at 2:19 PM, Olof Johansson <olof@lixom.net> wrote: > On Wed, Aug 27, 2014 at 2:03 PM, Sonny Rao <sonnyrao@chromium.org> wrote: >> This is a bug fix for using physical arch timers when >> the arch_timer_use_virtual boolean is false. It restores the >> arch_counter_get_cntpct() function after removal in >> >> 0d651e4e "clocksource: arch_timer: use virtual counters" >> >> and completes the implementation of memory mapped access for physical >> timers, so if a system is trying to use physical timers, it will >> function properly. >> >> Signed-off-by: Sonny Rao <sonnyrao@chromium.org> > > Acked-by: Olof Johansson <olof@lixom.net> > > This should have a: > > Fixes: 0d651e4e65e9 ("clocksource: arch_timer: use virtual counters") > > tag too, and possibly cc stable? Ok, as far as stable goes, this patch wouldn't apply cleanly going all the way back to 0d651e4e65e9 As-is, it would need to go after 220069945b29 "clocksource: arch_timer: Add support for memory mapped timers" and there would need to be another, simpler, version that went between those two commits. So, I'm not sure what to do in this situation regarding stable? > > > -Olof
+Mark (author of change in question) On 08/27/14 14:27, Sonny Rao wrote: > On Wed, Aug 27, 2014 at 2:19 PM, Olof Johansson <olof@lixom.net> wrote: >> On Wed, Aug 27, 2014 at 2:03 PM, Sonny Rao <sonnyrao@chromium.org> wrote: >>> This is a bug fix for using physical arch timers when >>> the arch_timer_use_virtual boolean is false. It restores the >>> arch_counter_get_cntpct() function after removal in >>> >>> 0d651e4e "clocksource: arch_timer: use virtual counters" >>> >>> and completes the implementation of memory mapped access for physical >>> timers, so if a system is trying to use physical timers, it will >>> function properly. >>> >>> Signed-off-by: Sonny Rao <sonnyrao@chromium.org> >> Acked-by: Olof Johansson <olof@lixom.net> >> >> This should have a: >> >> Fixes: 0d651e4e65e9 ("clocksource: arch_timer: use virtual counters") >> >> tag too, and possibly cc stable? > Ok, as far as stable goes, this patch wouldn't apply cleanly going all > the way back to 0d651e4e65e9 > As-is, it would need to go after 220069945b29 "clocksource: > arch_timer: Add support for memory mapped timers" and there would need > to be another, simpler, version that went between those two commits. > > So, I'm not sure what to do in this situation regarding stable? Is there any reason why the virtual counter can't be read? Maybe we're the hyp and we need to make sure we don't use the virtual timer so that the guest can use it, but that doesn't have any effect on the usage of the virtual counter for the clocksource.
On Wed, Aug 27, 2014 at 3:26 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: > +Mark (author of change in question) > > On 08/27/14 14:27, Sonny Rao wrote: >> On Wed, Aug 27, 2014 at 2:19 PM, Olof Johansson <olof@lixom.net> wrote: >>> On Wed, Aug 27, 2014 at 2:03 PM, Sonny Rao <sonnyrao@chromium.org> wrote: >>>> This is a bug fix for using physical arch timers when >>>> the arch_timer_use_virtual boolean is false. It restores the >>>> arch_counter_get_cntpct() function after removal in >>>> >>>> 0d651e4e "clocksource: arch_timer: use virtual counters" >>>> >>>> and completes the implementation of memory mapped access for physical >>>> timers, so if a system is trying to use physical timers, it will >>>> function properly. >>>> >>>> Signed-off-by: Sonny Rao <sonnyrao@chromium.org> >>> Acked-by: Olof Johansson <olof@lixom.net> >>> >>> This should have a: >>> >>> Fixes: 0d651e4e65e9 ("clocksource: arch_timer: use virtual counters") >>> >>> tag too, and possibly cc stable? >> Ok, as far as stable goes, this patch wouldn't apply cleanly going all >> the way back to 0d651e4e65e9 >> As-is, it would need to go after 220069945b29 "clocksource: >> arch_timer: Add support for memory mapped timers" and there would need >> to be another, simpler, version that went between those two commits. >> >> So, I'm not sure what to do in this situation regarding stable? Greg tends to make a best-effort, you'll find out when he looks at backporting and can reply with a "punt" or "here's the right patch" then. > Is there any reason why the virtual counter can't be read? Maybe we're > the hyp and we need to make sure we don't use the virtual timer so that > the guest can use it, but that doesn't have any effect on the usage of > the virtual counter for the clocksource. There are several cases where virtual is unusable -- in particular it might not have been configured properly (i.e. the phys/virt offset is at a bad value). -Olof
On 08/27/14 15:33, Olof Johansson wrote: > On Wed, Aug 27, 2014 at 3:26 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: > >> Is there any reason why the virtual counter can't be read? Maybe we're >> the hyp and we need to make sure we don't use the virtual timer so that >> the guest can use it, but that doesn't have any effect on the usage of >> the virtual counter for the clocksource. > There are several cases where virtual is unusable -- in particular it > might not have been configured properly (i.e. the phys/virt offset is > at a bad value). > Any specifics? It would be nice to say so in the commit text so that others using such devices know they need this patch. I'm guessing the firmware can't be fixed? In this particular case is there actually a virtual interrupt but we've explicitly removed it from the DT so that the driver can be forced into using the physical counter? Or are we getting saved by the hyp check?
On Wed, Aug 27, 2014 at 5:56 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: > On 08/27/14 15:33, Olof Johansson wrote: >> On Wed, Aug 27, 2014 at 3:26 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: >> >>> Is there any reason why the virtual counter can't be read? Maybe we're >>> the hyp and we need to make sure we don't use the virtual timer so that >>> the guest can use it, but that doesn't have any effect on the usage of >>> the virtual counter for the clocksource. >> There are several cases where virtual is unusable -- in particular it >> might not have been configured properly (i.e. the phys/virt offset is >> at a bad value). >> > > Any specifics? It would be nice to say so in the commit text so that > others using such devices know they need this patch. I'm guessing the > firmware can't be fixed? Yeah, there are a few. The big.LITTLE on the Chromebook 2 models have this issue, due to the A7 cluster having an incorrect offset programmed. However, arch timers aren't supported on that SoC in the first place, so it's not a problem in reality. The other known platform is rk3288. It has products out in the wild where firmware updates are unlikely. Essentially, I expect many vendors who use BSP kernels by default to have firmware that forgets to setup the offset, since hardware doesn't come up with a default one, and their older BSP kernels doesn't access the virtual one. > In this particular case is there actually a virtual interrupt but we've > explicitly removed it from the DT so that the driver can be forced into > using the physical counter? Or are we getting saved by the hyp check? The SoC has the virtual timer, and if it has firmware that supports it there's a good reason to still have it there. After all, DT describes hardware. I have a patch I should post that adds a property to make the driver pick the physical timer instead, since right now it'll always use virtual if it's available. I'll try to get that posted later tonight. -Olof
Hi, On Wed, Aug 27, 2014 at 7:58 PM, Olof Johansson <olof@lixom.net> wrote: > On Wed, Aug 27, 2014 at 5:56 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: >> On 08/27/14 15:33, Olof Johansson wrote: >>> On Wed, Aug 27, 2014 at 3:26 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: >>> >>>> Is there any reason why the virtual counter can't be read? Maybe we're >>>> the hyp and we need to make sure we don't use the virtual timer so that >>>> the guest can use it, but that doesn't have any effect on the usage of >>>> the virtual counter for the clocksource. >>> There are several cases where virtual is unusable -- in particular it >>> might not have been configured properly (i.e. the phys/virt offset is >>> at a bad value). >>> >> >> Any specifics? It would be nice to say so in the commit text so that >> others using such devices know they need this patch. I'm guessing the >> firmware can't be fixed? Even if we could change things to use a virtual timer in some cases, Sonny's patch still fixes a bug. The code as written right now makes pretenses about supporting the physical timer, but it doesn't work. That should be fixed. > Yeah, there are a few. The big.LITTLE on the Chromebook 2 models have > this issue, due to the A7 cluster having an incorrect offset > programmed. However, arch timers aren't supported on that SoC in the > first place, so it's not a problem in reality. > > The other known platform is rk3288. It has products out in the wild > where firmware updates are unlikely. One other reason is that (I'm told) that the virtual offset is lost in certain power down conditions (powering down a core, going into S3, etc). When we power back up the offset is effectively reset to a random value. That means we need something to reprogram the virtual timer offset whenever we power things back up. If we've got a hypervisor then the hypervisor will definitely be involved in powering things back up and it can reset the virtual offset. ...but forcing systems to implement a hypervisor (or somehow adding an interface for the kernel to call back into firmware) is a huge effort and it means more hard-to-update code sitting in firmware. Note: having the virtual offset initted to a random value seems like an unfortunate design choice for the virtual timer offset (guaranteeing it was initted to 0 would have avoided the problem), but that's what we seem to have. -Doug
[cc-ing Will for the VDSO bits] Hi Sonny, On 27/08/14 22:03, Sonny Rao wrote: > This is a bug fix for using physical arch timers when > the arch_timer_use_virtual boolean is false. It restores the > arch_counter_get_cntpct() function after removal in > > 0d651e4e "clocksource: arch_timer: use virtual counters" This isn't a bug, but rather a feature. It allows the kernel to consistently use the virtual counter everywhere (including userspace), no matter if it behaves as a host or a guest, without worrying about the accessibility of the physical counter (hint: you have no way of knowing if it is accessible or not). > and completes the implementation of memory mapped access for physical > timers, so if a system is trying to use physical timers, it will > function properly. > > Signed-off-by: Sonny Rao <sonnyrao@chromium.org> > --- > arch/arm/include/asm/arch_timer.h | 9 +++++++++ > arch/arm64/include/asm/arch_timer.h | 10 ++++++++++ > drivers/clocksource/arm_arch_timer.c | 30 ++++++++++++++++++++++++++---- > 3 files changed, 45 insertions(+), 4 deletions(-) > > diff --git a/arch/arm/include/asm/arch_timer.h b/arch/arm/include/asm/arch_timer.h > index 0704e0c..e72aa4d 100644 > --- a/arch/arm/include/asm/arch_timer.h > +++ b/arch/arm/include/asm/arch_timer.h > @@ -78,6 +78,15 @@ static inline u32 arch_timer_get_cntfrq(void) > return val; > } > > +static inline u64 arch_counter_get_cntpct(void) > +{ > + u64 cval; > + > + isb(); > + asm volatile("mrrc p15, 0, %Q0, %R0, c14" : "=r" (cval)); > + return cval; > +} > + > static inline u64 arch_counter_get_cntvct(void) > { > u64 cval; > diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h > index 9400596..58657c4 100644 > --- a/arch/arm64/include/asm/arch_timer.h > +++ b/arch/arm64/include/asm/arch_timer.h > @@ -135,6 +135,16 @@ static inline void arch_timer_evtstrm_enable(int divider) > #endif > } > > +static inline u64 arch_counter_get_cntpct(void) > +{ > + u64 cval; > + > + isb(); > + asm volatile("mrs %0, cntpct_el0" : "=r" (cval)); > + > + return cval; > +} > + > static inline u64 arch_counter_get_cntvct(void) > { > u64 cval; > diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c > index 5163ec1..ad723cb 100644 > --- a/drivers/clocksource/arm_arch_timer.c > +++ b/drivers/clocksource/arm_arch_timer.c > @@ -30,6 +30,8 @@ > #define CNTTIDR 0x08 > #define CNTTIDR_VIRT(n) (BIT(1) << ((n) * 4)) > > +#define CNTPCT_LO 0x00 > +#define CNTPCT_HI 0x04 > #define CNTVCT_LO 0x08 > #define CNTVCT_HI 0x0c > #define CNTFRQ 0x10 > @@ -386,6 +388,19 @@ static u64 arch_counter_get_cntvct_mem(void) > return ((u64) vct_hi << 32) | vct_lo; > } > > +static u64 arch_counter_get_cntpct_mem(void) > +{ > + u32 pct_lo, pct_hi, tmp_hi; > + > + do { > + pct_hi = readl_relaxed(arch_counter_base + CNTPCT_HI); > + pct_lo = readl_relaxed(arch_counter_base + CNTPCT_LO); > + tmp_hi = readl_relaxed(arch_counter_base + CNTPCT_HI); > + } while (pct_hi != tmp_hi); > + > + return ((u64) pct_hi << 32) | pct_lo; > +} > + > /* > * Default to cp15 based access because arm64 uses this function for > * sched_clock() before DT is probed and the cp15 method is guaranteed > @@ -429,10 +444,17 @@ static void __init arch_counter_register(unsigned type) > u64 start_count; > > /* Register the CP15 based counter if we have one */ > - if (type & ARCH_CP15_TIMER) > - arch_timer_read_counter = arch_counter_get_cntvct; > - else > - arch_timer_read_counter = arch_counter_get_cntvct_mem; > + if (type & ARCH_CP15_TIMER) { > + if (arch_timer_use_virtual) > + arch_timer_read_counter = arch_counter_get_cntvct; > + else > + arch_timer_read_counter = arch_counter_get_cntpct; > + } else { > + if (arch_timer_use_virtual) > + arch_timer_read_counter = arch_counter_get_cntvct_mem; > + else > + arch_timer_read_counter = arch_counter_get_cntpct_mem; > + } > > start_count = arch_timer_read_counter(); > clocksource_register_hz(&clocksource_counter, arch_timer_rate); > What will be the effect on arm64 where the VDSO uses the virtual counter to implement gettimeofday? Thanks, M.
On Thu, Aug 28, 2014 at 04:33:31AM +0100, Doug Anderson wrote: > Hi, > > On Wed, Aug 27, 2014 at 7:58 PM, Olof Johansson <olof@lixom.net> wrote: > > On Wed, Aug 27, 2014 at 5:56 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: > >> On 08/27/14 15:33, Olof Johansson wrote: > >>> On Wed, Aug 27, 2014 at 3:26 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: > >>> > >>>> Is there any reason why the virtual counter can't be read? Maybe we're > >>>> the hyp and we need to make sure we don't use the virtual timer so that > >>>> the guest can use it, but that doesn't have any effect on the usage of > >>>> the virtual counter for the clocksource. > >>> There are several cases where virtual is unusable -- in particular it > >>> might not have been configured properly (i.e. the phys/virt offset is > >>> at a bad value). > >>> > >> > >> Any specifics? It would be nice to say so in the commit text so that > >> others using such devices know they need this patch. I'm guessing the > >> firmware can't be fixed? > > Even if we could change things to use a virtual timer in some cases, > Sonny's patch still fixes a bug. The code as written right now makes > pretenses about supporting the physical timer, but it doesn't work. > That should be fixed. The code does support the physical timer. It does not support the physical counter (and makes no pretenses that it does). I had hoped we wouldn't encounter cases where CNTVOFF was hopelessly ill-configured on a platform, but evidently we have. So we need some workaround for that. > > Yeah, there are a few. The big.LITTLE on the Chromebook 2 models have > > this issue, due to the A7 cluster having an incorrect offset > > programmed. However, arch timers aren't supported on that SoC in the > > first place, so it's not a problem in reality. > > > > The other known platform is rk3288. It has products out in the wild > > where firmware updates are unlikely. > > One other reason is that (I'm told) that the virtual offset is lost in > certain power down conditions (powering down a core, going into S3, > etc). When we power back up the offset is effectively reset to a > random value. That means we need something to reprogram the virtual > timer offset whenever we power things back up. > > If we've got a hypervisor then the hypervisor will definitely be > involved in powering things back up and it can reset the virtual > offset. ...but forcing systems to implement a hypervisor (or somehow > adding an interface for the kernel to call back into firmware) is a > huge effort and it means more hard-to-update code sitting in firmware. Not if you boot Linux at hyp, as we've recommended for this precise reason. That doesn't fix other things like CNTFRQ if the secure initialisation doesn't poke that, however. Thanks, Mark.
Hi Mark, On 08/28/2014 05:35 AM, Mark Rutland wrote: > On Thu, Aug 28, 2014 at 04:33:31AM +0100, Doug Anderson wrote: >> Hi, >> >> On Wed, Aug 27, 2014 at 7:58 PM, Olof Johansson <olof@lixom.net> wrote: >>> On Wed, Aug 27, 2014 at 5:56 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: >>>> On 08/27/14 15:33, Olof Johansson wrote: >>>>> On Wed, Aug 27, 2014 at 3:26 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: >>>>> >>>>>> Is there any reason why the virtual counter can't be read? Maybe we're >>>>>> the hyp and we need to make sure we don't use the virtual timer so that >>>>>> the guest can use it, but that doesn't have any effect on the usage of >>>>>> the virtual counter for the clocksource. >>>>> >>>>> There are several cases where virtual is unusable -- in particular it >>>>> might not have been configured properly (i.e. the phys/virt offset is >>>>> at a bad value). >>>> >>>> Any specifics? It would be nice to say so in the commit text so that >>>> others using such devices know they need this patch. I'm guessing the >>>> firmware can't be fixed? >> >> Even if we could change things to use a virtual timer in some cases, >> Sonny's patch still fixes a bug. The code as written right now makes >> pretenses about supporting the physical timer, but it doesn't work. >> That should be fixed. > > The code does support the physical timer. It does not support the > physical counter (and makes no pretenses that it does). I think if you could please explain the following code, that may help clear up some of the confusion. if (arch_timer_use_virtual) { clk->irq = arch_timer_ppi[VIRT_PPI]; clk->set_mode = arch_timer_set_mode_virt; clk->set_next_event = arch_timer_set_next_event_virt; } else { clk->irq = arch_timer_ppi[PHYS_SECURE_PPI]; clk->set_mode = arch_timer_set_mode_phys; clk->set_next_event = arch_timer_set_next_event_phys; } http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/clocksource/arm_arch_timer.c#n272 Perhaps you mean to say the code does not support *non-secure access* to the physical timer? Thanks, Christopher
On Thu, Aug 28, 2014 at 06:09:32PM +0100, Christopher Covington wrote: > Hi Mark, Hi Christopher, > On 08/28/2014 05:35 AM, Mark Rutland wrote: > > On Thu, Aug 28, 2014 at 04:33:31AM +0100, Doug Anderson wrote: > >> Hi, > >> > >> On Wed, Aug 27, 2014 at 7:58 PM, Olof Johansson <olof@lixom.net> wrote: > >>> On Wed, Aug 27, 2014 at 5:56 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: > >>>> On 08/27/14 15:33, Olof Johansson wrote: > >>>>> On Wed, Aug 27, 2014 at 3:26 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: > >>>>> > >>>>>> Is there any reason why the virtual counter can't be read? Maybe we're > >>>>>> the hyp and we need to make sure we don't use the virtual timer so that > >>>>>> the guest can use it, but that doesn't have any effect on the usage of > >>>>>> the virtual counter for the clocksource. > >>>>> > >>>>> There are several cases where virtual is unusable -- in particular it > >>>>> might not have been configured properly (i.e. the phys/virt offset is > >>>>> at a bad value). > >>>> > >>>> Any specifics? It would be nice to say so in the commit text so that > >>>> others using such devices know they need this patch. I'm guessing the > >>>> firmware can't be fixed? > >> > >> Even if we could change things to use a virtual timer in some cases, > >> Sonny's patch still fixes a bug. The code as written right now makes > >> pretenses about supporting the physical timer, but it doesn't work. > >> That should be fixed. > > > > The code does support the physical timer. It does not support the > > physical counter (and makes no pretenses that it does). > > I think if you could please explain the following code, that may help clear up > some of the confusion. > > if (arch_timer_use_virtual) { > clk->irq = arch_timer_ppi[VIRT_PPI]; > clk->set_mode = arch_timer_set_mode_virt; > clk->set_next_event = arch_timer_set_next_event_virt; > } else { > clk->irq = arch_timer_ppi[PHYS_SECURE_PPI]; > clk->set_mode = arch_timer_set_mode_phys; > clk->set_next_event = arch_timer_set_next_event_phys; > } > > http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/clocksource/arm_arch_timer.c#n272 > > Perhaps you mean to say the code does not support *non-secure access* to the > physical timer? Not quite, and one issue here is conflating the timer and the counter. We use the physical timer iff we know it is accessible and correct to use (e.g. when we have been booted at Hyp). If we don't know that, arch_timer_use_virtual is set and we use the virtual timer. In that sense, we support the use of the physical timer. If the system is sane (with uniform CNTVOFF), that is fine. The fact that we have a broken system to work around does not mean that the driver is broken nor that it is making false pretenses. The driver is perfectly functional given a sane environment. The use of virtual/physical is not a feature that falls to personal preference; it is a requirement for correctness that we only use the physical timer when we have an absolute guarantee that it is possible and correct to do so. So saying that we "support" the physical timer is also somewhat misleading; we use what we must. We _always_ use the virtual counter rather than the physical counter as this saved on having different paths for host and guest (which is important for the 64-bit VDSO, for example). In that sense we do not support the use of the physical counter. We don't need to use it in a sane system. Thanks, Mark.
On Thu, Aug 28, 2014 at 2:35 AM, Mark Rutland <mark.rutland@arm.com> wrote: > On Thu, Aug 28, 2014 at 04:33:31AM +0100, Doug Anderson wrote: >> Hi, >> >> On Wed, Aug 27, 2014 at 7:58 PM, Olof Johansson <olof@lixom.net> wrote: >> > On Wed, Aug 27, 2014 at 5:56 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: >> >> On 08/27/14 15:33, Olof Johansson wrote: >> >>> On Wed, Aug 27, 2014 at 3:26 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: >> >>> >> >>>> Is there any reason why the virtual counter can't be read? Maybe we're >> >>>> the hyp and we need to make sure we don't use the virtual timer so that >> >>>> the guest can use it, but that doesn't have any effect on the usage of >> >>>> the virtual counter for the clocksource. >> >>> There are several cases where virtual is unusable -- in particular it >> >>> might not have been configured properly (i.e. the phys/virt offset is >> >>> at a bad value). >> >>> >> >> >> >> Any specifics? It would be nice to say so in the commit text so that >> >> others using such devices know they need this patch. I'm guessing the >> >> firmware can't be fixed? >> >> Even if we could change things to use a virtual timer in some cases, >> Sonny's patch still fixes a bug. The code as written right now makes >> pretenses about supporting the physical timer, but it doesn't work. >> That should be fixed. > > The code does support the physical timer. It does not support the > physical counter (and makes no pretenses that it does). > Is there some reason that it should not support it? It seems like the two things are highly related. > I had hoped we wouldn't encounter cases where CNTVOFF was hopelessly > ill-configured on a platform, but evidently we have. So we need some > workaround for that. > >> > Yeah, there are a few. The big.LITTLE on the Chromebook 2 models have >> > this issue, due to the A7 cluster having an incorrect offset >> > programmed. However, arch timers aren't supported on that SoC in the >> > first place, so it's not a problem in reality. >> > >> > The other known platform is rk3288. It has products out in the wild >> > where firmware updates are unlikely. >> >> One other reason is that (I'm told) that the virtual offset is lost in >> certain power down conditions (powering down a core, going into S3, >> etc). When we power back up the offset is effectively reset to a >> random value. That means we need something to reprogram the virtual >> timer offset whenever we power things back up. >> >> If we've got a hypervisor then the hypervisor will definitely be >> involved in powering things back up and it can reset the virtual >> offset. ...but forcing systems to implement a hypervisor (or somehow >> adding an interface for the kernel to call back into firmware) is a >> huge effort and it means more hard-to-update code sitting in firmware. > > Not if you boot Linux at hyp, as we've recommended for this precise > reason. That doesn't fix other things like CNTFRQ if the secure > initialisation doesn't poke that, however. That's interesting, we could look into that. > Thanks, > Mark.
On Fri, Aug 29, 2014 at 01:10:49AM +0100, Sonny Rao wrote: > On Thu, Aug 28, 2014 at 2:35 AM, Mark Rutland <mark.rutland@arm.com> wrote: > > On Thu, Aug 28, 2014 at 04:33:31AM +0100, Doug Anderson wrote: > >> Hi, > >> > >> On Wed, Aug 27, 2014 at 7:58 PM, Olof Johansson <olof@lixom.net> wrote: > >> > On Wed, Aug 27, 2014 at 5:56 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: > >> >> On 08/27/14 15:33, Olof Johansson wrote: > >> >>> On Wed, Aug 27, 2014 at 3:26 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: > >> >>> > >> >>>> Is there any reason why the virtual counter can't be read? Maybe we're > >> >>>> the hyp and we need to make sure we don't use the virtual timer so that > >> >>>> the guest can use it, but that doesn't have any effect on the usage of > >> >>>> the virtual counter for the clocksource. > >> >>> There are several cases where virtual is unusable -- in particular it > >> >>> might not have been configured properly (i.e. the phys/virt offset is > >> >>> at a bad value). > >> >>> > >> >> > >> >> Any specifics? It would be nice to say so in the commit text so that > >> >> others using such devices know they need this patch. I'm guessing the > >> >> firmware can't be fixed? > >> > >> Even if we could change things to use a virtual timer in some cases, > >> Sonny's patch still fixes a bug. The code as written right now makes > >> pretenses about supporting the physical timer, but it doesn't work. > >> That should be fixed. > > > > The code does support the physical timer. It does not support the > > physical counter (and makes no pretenses that it does). > > > > Is there some reason that it should not support it? It seems like the > two things are highly related. While the two are related, in sane systems the use of the physical counters is rendered unnecessary by the ability to write to CNTVOFF at PL2. If an OS is booted at PL1 the physical timers aren't guaranteed to be accessible, so the OS must use the virtual timers. As the OS could be virtualized it must use the virtual counters. If an OS is booted at PL2 it can access the physical counters, and should do so in case something like KVM will be used later. The OS can write to CNTVOFF at PL2, and if it sets CNTVOFF to zero the physical and virtual counters are equivalent. Thus it can use the virtual counters and doesn't need to have additional code in several places (including the VDSO) where it needs to choose to read which counters to read. The problem only exists where PL2 exists and the firmware/bootloader skipped PL2 without initialising the necessary PL2 state. This is in general a stupid thing to do; it introduces a problem that need not exist and throws away the option of using the features PL2 provides. This is a firmware/bootloader bug. > > I had hoped we wouldn't encounter cases where CNTVOFF was hopelessly > > ill-configured on a platform, but evidently we have. So we need some > > workaround for that. > > > >> > Yeah, there are a few. The big.LITTLE on the Chromebook 2 models have > >> > this issue, due to the A7 cluster having an incorrect offset > >> > programmed. However, arch timers aren't supported on that SoC in the > >> > first place, so it's not a problem in reality. > >> > > >> > The other known platform is rk3288. It has products out in the wild > >> > where firmware updates are unlikely. > >> > >> One other reason is that (I'm told) that the virtual offset is lost in > >> certain power down conditions (powering down a core, going into S3, > >> etc). When we power back up the offset is effectively reset to a > >> random value. That means we need something to reprogram the virtual > >> timer offset whenever we power things back up. > >> > >> If we've got a hypervisor then the hypervisor will definitely be > >> involved in powering things back up and it can reset the virtual > >> offset. ...but forcing systems to implement a hypervisor (or somehow > >> adding an interface for the kernel to call back into firmware) is a > >> huge effort and it means more hard-to-update code sitting in firmware. > > > > Not if you boot Linux at hyp, as we've recommended for this precise > > reason. That doesn't fix other things like CNTFRQ if the secure > > initialisation doesn't poke that, however. > > That's interesting, we could look into that. I would strongly recommend that you do. :) Linux has supported booting at Hyp for quite a while now, so I would hope the only issues would be the organisation of the firmware and/or bootloader. Thanks, Mark.
On Fri, Aug 29, 2014 at 3:04 AM, Mark Rutland <mark.rutland@arm.com> wrote: > On Fri, Aug 29, 2014 at 01:10:49AM +0100, Sonny Rao wrote: >> On Thu, Aug 28, 2014 at 2:35 AM, Mark Rutland <mark.rutland@arm.com> wrote: >> > On Thu, Aug 28, 2014 at 04:33:31AM +0100, Doug Anderson wrote: >> >> Hi, >> >> >> >> On Wed, Aug 27, 2014 at 7:58 PM, Olof Johansson <olof@lixom.net> wrote: >> >> > On Wed, Aug 27, 2014 at 5:56 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: >> >> >> On 08/27/14 15:33, Olof Johansson wrote: >> >> >>> On Wed, Aug 27, 2014 at 3:26 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: >> >> >>> >> >> >>>> Is there any reason why the virtual counter can't be read? Maybe we're >> >> >>>> the hyp and we need to make sure we don't use the virtual timer so that >> >> >>>> the guest can use it, but that doesn't have any effect on the usage of >> >> >>>> the virtual counter for the clocksource. >> >> >>> There are several cases where virtual is unusable -- in particular it >> >> >>> might not have been configured properly (i.e. the phys/virt offset is >> >> >>> at a bad value). >> >> >>> >> >> >> >> >> >> Any specifics? It would be nice to say so in the commit text so that >> >> >> others using such devices know they need this patch. I'm guessing the >> >> >> firmware can't be fixed? >> >> >> >> Even if we could change things to use a virtual timer in some cases, >> >> Sonny's patch still fixes a bug. The code as written right now makes >> >> pretenses about supporting the physical timer, but it doesn't work. >> >> That should be fixed. >> > >> > The code does support the physical timer. It does not support the >> > physical counter (and makes no pretenses that it does). >> > >> >> Is there some reason that it should not support it? It seems like the >> two things are highly related. > > While the two are related, in sane systems the use of the physical > counters is rendered unnecessary by the ability to write to CNTVOFF at > PL2. By sane, do you mean a system which starts the kernel in PL2? Or one that has CNTVOFF initialized to the same value on all CPUs? > > If an OS is booted at PL1 the physical timers aren't guaranteed to be > accessible, so the OS must use the virtual timers. As the OS could be > virtualized it must use the virtual counters. I was curious to learn more about these modes and looked at the spec. The spec I have seems to say that in a VMSA implementation without virtualization, then CNTPCT is always available, but if it has virtualization then a bit needs to be set, which I think is what you're referring to. I think the spec also says that virtualization extensions are optional. How do you deal with the case that they are not implemented? Or maybe that simply isn't supported? > If an OS is booted at PL2 it can access the physical counters, and > should do so in case something like KVM will be used later. The OS can > write to CNTVOFF at PL2, and if it sets CNTVOFF to zero the physical and > virtual counters are equivalent. Thus it can use the virtual counters > and doesn't need to have additional code in several places (including > the VDSO) where it needs to choose to read which counters to read. > > The problem only exists where PL2 exists and the firmware/bootloader > skipped PL2 without initialising the necessary PL2 state. This is in > general a stupid thing to do; it introduces a problem that need not > exist and throws away the option of using the features PL2 provides. > This is a firmware/bootloader bug. Well it's not quite that simple, this is actually an issue with the hardware that the CNTVOFF comes up with different values on different cores. This happens not only at boot, but any time the core is powered on, which could include deep sleep or CPU hotplug and suspend to ram. The firmware may not be involved in all these cases, so we cannot rely on it to fix this problem. > >> > I had hoped we wouldn't encounter cases where CNTVOFF was hopelessly >> > ill-configured on a platform, but evidently we have. So we need some >> > workaround for that. >> > >> >> > Yeah, there are a few. The big.LITTLE on the Chromebook 2 models have >> >> > this issue, due to the A7 cluster having an incorrect offset >> >> > programmed. However, arch timers aren't supported on that SoC in the >> >> > first place, so it's not a problem in reality. >> >> > >> >> > The other known platform is rk3288. It has products out in the wild >> >> > where firmware updates are unlikely. >> >> >> >> One other reason is that (I'm told) that the virtual offset is lost in >> >> certain power down conditions (powering down a core, going into S3, >> >> etc). When we power back up the offset is effectively reset to a >> >> random value. That means we need something to reprogram the virtual >> >> timer offset whenever we power things back up. >> >> >> >> If we've got a hypervisor then the hypervisor will definitely be >> >> involved in powering things back up and it can reset the virtual >> >> offset. ...but forcing systems to implement a hypervisor (or somehow >> >> adding an interface for the kernel to call back into firmware) is a >> >> huge effort and it means more hard-to-update code sitting in firmware. >> > >> > Not if you boot Linux at hyp, as we've recommended for this precise >> > reason. That doesn't fix other things like CNTFRQ if the secure >> > initialisation doesn't poke that, however. >> >> That's interesting, we could look into that. > > I would strongly recommend that you do. :) > > Linux has supported booting at Hyp for quite a while now, so I would > hope the only issues would be the organisation of the firmware and/or > bootloader. I'm still investigating this for our firmware (but cannot guarantee that anyone else will do this), but like I mentioned above, it probably won't solve the problem for suspend to RAM or deep sleep/power gating of cpu cores. > Thanks, > Mark.
On Thu, Sep 04, 2014 at 06:01:27PM +0100, Sonny Rao wrote: > On Fri, Aug 29, 2014 at 3:04 AM, Mark Rutland <mark.rutland@arm.com> wrote: > > On Fri, Aug 29, 2014 at 01:10:49AM +0100, Sonny Rao wrote: > >> On Thu, Aug 28, 2014 at 2:35 AM, Mark Rutland <mark.rutland@arm.com> wrote: > >> > On Thu, Aug 28, 2014 at 04:33:31AM +0100, Doug Anderson wrote: > >> >> Hi, > >> >> > >> >> On Wed, Aug 27, 2014 at 7:58 PM, Olof Johansson <olof@lixom.net> wrote: > >> >> > On Wed, Aug 27, 2014 at 5:56 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: > >> >> >> On 08/27/14 15:33, Olof Johansson wrote: > >> >> >>> On Wed, Aug 27, 2014 at 3:26 PM, Stephen Boyd <sboyd@codeaurora.org> wrote: > >> >> >>> > >> >> >>>> Is there any reason why the virtual counter can't be read? Maybe we're > >> >> >>>> the hyp and we need to make sure we don't use the virtual timer so that > >> >> >>>> the guest can use it, but that doesn't have any effect on the usage of > >> >> >>>> the virtual counter for the clocksource. > >> >> >>> There are several cases where virtual is unusable -- in particular it > >> >> >>> might not have been configured properly (i.e. the phys/virt offset is > >> >> >>> at a bad value). > >> >> >>> > >> >> >> > >> >> >> Any specifics? It would be nice to say so in the commit text so that > >> >> >> others using such devices know they need this patch. I'm guessing the > >> >> >> firmware can't be fixed? > >> >> > >> >> Even if we could change things to use a virtual timer in some cases, > >> >> Sonny's patch still fixes a bug. The code as written right now makes > >> >> pretenses about supporting the physical timer, but it doesn't work. > >> >> That should be fixed. > >> > > >> > The code does support the physical timer. It does not support the > >> > physical counter (and makes no pretenses that it does). > >> > > >> > >> Is there some reason that it should not support it? It seems like the > >> two things are highly related. > > > > While the two are related, in sane systems the use of the physical > > counters is rendered unnecessary by the ability to write to CNTVOFF at > > PL2. > > By sane, do you mean a system which starts the kernel in PL2? Or one > that has CNTVOFF initialized to the same value on all CPUs? So long as all CPUs have a uniform view of time (with the same CNTFRQ and no observable offset), things are sane. Fundamentally, everything that can only be configured at a higher privilege level must have been configured. So if booted at PL2, it's not necessary for CNTVOFF to be initialised while for PL1 it's critical that it is. > > If an OS is booted at PL1 the physical timers aren't guaranteed to be > > accessible, so the OS must use the virtual timers. As the OS could be > > virtualized it must use the virtual counters. > > I was curious to learn more about these modes and looked at the spec. > The spec I have seems to say that in a VMSA implementation without > virtualization, then CNTPCT is always available, but if it has > virtualization then a bit needs to be set, which I think is what > you're referring to. I think the spec also says that virtualization > extensions are optional. How do you deal with the case that they are > not implemented? Or maybe that simply isn't supported? Do you mean where the virtualization extensions are not implemented, but the timers are? If so: * CNTVOFF doesn't exist, and there is no distinction between virtual and physical time. * There is no PL2. * The virtual counters and timers exist, and are accessible to PL1. So using the virtual timers and counters in this case (as we do) works. > > If an OS is booted at PL2 it can access the physical counters, and > > should do so in case something like KVM will be used later. The OS can > > write to CNTVOFF at PL2, and if it sets CNTVOFF to zero the physical and > > virtual counters are equivalent. Thus it can use the virtual counters > > and doesn't need to have additional code in several places (including > > the VDSO) where it needs to choose to read which counters to read. > > > > The problem only exists where PL2 exists and the firmware/bootloader > > skipped PL2 without initialising the necessary PL2 state. This is in > > general a stupid thing to do; it introduces a problem that need not > > exist and throws away the option of using the features PL2 provides. > > This is a firmware/bootloader bug. > > Well it's not quite that simple, this is actually an issue with the > hardware that the CNTVOFF comes up with different values on different > cores. This happens not only at boot, but any time the core is > powered on, which could include deep sleep or CPU hotplug and suspend > to ram. The firmware may not be involved in all these cases, so we > cannot rely on it to fix this problem. If the only thing that can restore that state is firmware, then I would take that as an indication that firmware _must_ be involved in those cases. Surely there is other secure or PL2 state that you need to restore in those cases anyhow? > >> > I had hoped we wouldn't encounter cases where CNTVOFF was hopelessly > >> > ill-configured on a platform, but evidently we have. So we need some > >> > workaround for that. > >> > > >> >> > Yeah, there are a few. The big.LITTLE on the Chromebook 2 models have > >> >> > this issue, due to the A7 cluster having an incorrect offset > >> >> > programmed. However, arch timers aren't supported on that SoC in the > >> >> > first place, so it's not a problem in reality. > >> >> > > >> >> > The other known platform is rk3288. It has products out in the wild > >> >> > where firmware updates are unlikely. > >> >> > >> >> One other reason is that (I'm told) that the virtual offset is lost in > >> >> certain power down conditions (powering down a core, going into S3, > >> >> etc). When we power back up the offset is effectively reset to a > >> >> random value. That means we need something to reprogram the virtual > >> >> timer offset whenever we power things back up. > >> >> > >> >> If we've got a hypervisor then the hypervisor will definitely be > >> >> involved in powering things back up and it can reset the virtual > >> >> offset. ...but forcing systems to implement a hypervisor (or somehow > >> >> adding an interface for the kernel to call back into firmware) is a > >> >> huge effort and it means more hard-to-update code sitting in firmware. > >> > > >> > Not if you boot Linux at hyp, as we've recommended for this precise > >> > reason. That doesn't fix other things like CNTFRQ if the secure > >> > initialisation doesn't poke that, however. > >> > >> That's interesting, we could look into that. > > > > I would strongly recommend that you do. :) > > > > Linux has supported booting at Hyp for quite a while now, so I would > > hope the only issues would be the organisation of the firmware and/or > > bootloader. > > I'm still investigating this for our firmware (but cannot guarantee > that anyone else will do this), but like I mentioned above, it > probably won't solve the problem for suspend to RAM or deep > sleep/power gating of cpu cores. That doesn't sound right to me. I assume from those cases you'd return back to the OS at PL2. If state is lost in those cases which an OS cannot restore with PL2 privilege, then you have bigger problems. Thanks, Mark.
On Thu, Sep 04, 2014 at 06:01:27PM +0100, Sonny Rao wrote: [...] > > If an OS is booted at PL2 it can access the physical counters, and > > should do so in case something like KVM will be used later. The OS can > > write to CNTVOFF at PL2, and if it sets CNTVOFF to zero the physical and > > virtual counters are equivalent. Thus it can use the virtual counters > > and doesn't need to have additional code in several places (including > > the VDSO) where it needs to choose to read which counters to read. > > > > The problem only exists where PL2 exists and the firmware/bootloader > > skipped PL2 without initialising the necessary PL2 state. This is in > > general a stupid thing to do; it introduces a problem that need not > > exist and throws away the option of using the features PL2 provides. > > This is a firmware/bootloader bug. > > Well it's not quite that simple, this is actually an issue with the > hardware that the CNTVOFF comes up with different values on different > cores. This happens not only at boot, but any time the core is > powered on, which could include deep sleep or CPU hotplug and suspend > to ram. The firmware may not be involved in all these cases, so we > cannot rely on it to fix this problem. And why isn't firmware involved in those cases if it _is_ involved in cold boot ? Resuming from low-power means resuming the machine/core as it was when it was running before power down, anything that deviates from that behaviour is a programming bug. Lorenzo
Mark, On Thu, Aug 28, 2014 at 2:35 AM, Mark Rutland <mark.rutland@arm.com> wrote: > Not if you boot Linux at hyp, as we've recommended for this precise > reason. That doesn't fix other things like CNTFRQ if the secure > initialisation doesn't poke that, however. I'll freely admit that I'm out of my league and out of my comfort zone here, but... In the theory that firmware ought to be as minimal as possible (because it's hard to update and hard to keep in sync with kernel versions), it seems like firmware ought to start the kernel out in as permissive mode as it's willing to provide, right? If the kernel is started out as permissive as possible then it can do anything it needs to. Future versions of the kernel can be implemented to do any way-cool things that they want to do without an update to firmware, right? ...and current versions of the kernel can just shed permissions if they don't want them. ...so if I understand correctly, "Secure SVC" mode is more permissive than "Non Secure HYP" mode, right? It looks to me as if we currently start the kernel in "Secure SVC" mode. What do you think about the kernel detecting Secure SVC and then dropping down permission levels (to Non Secure HYP). Once it did this, it could update things like the virtual offset and then transition down further into non-secure SVC mode. ...or maybe this has been discussed millions of times already and I'm just clueless. ...or maybe this is just too hard for the kernel to do in a generic way? -Doug
On Fri, Sep 05, 2014 at 11:11:47PM +0100, Doug Anderson wrote: > On Thu, Aug 28, 2014 at 2:35 AM, Mark Rutland <mark.rutland@arm.com> wrote: > > Not if you boot Linux at hyp, as we've recommended for this precise > > reason. That doesn't fix other things like CNTFRQ if the secure > > initialisation doesn't poke that, however. > > I'll freely admit that I'm out of my league and out of my comfort zone > here, but... > > In the theory that firmware ought to be as minimal as possible > (because it's hard to update and hard to keep in sync with kernel > versions), it seems like firmware ought to start the kernel out in as > permissive mode as it's willing to provide, right? Not necessarily (and definitely not for arm64). And we've seen in practice that the actual deployed kernel may run in a different security mode than what's in mainline and used for initial development (you may just not see all the patches upstream). For development, that's indeed simpler, but once you go into production, a customer requesting some secure OS for payments etc. and Linux is moved to the non-secure side (and you end up putting hacks in the kernel because they were not spotted during initial development with Linux running in secure mode). > If the kernel is started out as permissive as possible then it can do > anything it needs to. Future versions of the kernel can be > implemented to do any way-cool things that they want to do without an > update to firmware, right? ...and current versions of the kernel can > just shed permissions if they don't want them. > > ...so if I understand correctly, "Secure SVC" mode is more permissive > than "Non Secure HYP" mode, right? It looks to me as if we currently > start the kernel in "Secure SVC" mode. What do you think about the > kernel detecting Secure SVC and then dropping down permission levels > (to Non Secure HYP). Once it did this, it could update things like > the virtual offset and then transition down further into non-secure > SVC mode. If we talk about ARMv8/AArch64, Secure SVC (aka secure EL1) is not more permissive than Non-secure Hyp (aka non-secure EL2). The only way to go from secure EL1 to non-secure EL2 is via EL3 (and SMC call) which means firmware code. Certain registers like CNTFRQ are only writable in EL3, CNTVOFF in EL2 or EL3.
On 09/05/2014 06:11 PM, Doug Anderson wrote: > Mark, > > On Thu, Aug 28, 2014 at 2:35 AM, Mark Rutland <mark.rutland@arm.com> wrote: >> Not if you boot Linux at hyp, as we've recommended for this precise >> reason. That doesn't fix other things like CNTFRQ if the secure >> initialisation doesn't poke that, however. > > I'll freely admit that I'm out of my league and out of my comfort zone > here, but... > > In the theory that firmware ought to be as minimal as possible > (because it's hard to update and hard to keep in sync with kernel > versions), it seems like firmware ought to start the kernel out in as > permissive mode as it's willing to provide, right? > > If the kernel is started out as permissive as possible then it can do > anything it needs to. Future versions of the kernel can be > implemented to do any way-cool things that they want to do without an > update to firmware, right? ...and current versions of the kernel can > just shed permissions if they don't want them. > > ...so if I understand correctly, "Secure SVC" mode is more permissive > than "Non Secure HYP" mode, right? It looks to me as if we currently > start the kernel in "Secure SVC" mode. What do you think about the > kernel detecting Secure SVC and then dropping down permission levels > (to Non Secure HYP). Once it did this, it could update things like > the virtual offset and then transition down further into non-secure > SVC mode. > > ...or maybe this has been discussed millions of times already and I'm > just clueless. ...or maybe this is just too hard for the kernel to do > in a generic way? I think this is a great idea. When running on simulators, it would make (the non-DTB parts of) the bootwrapper and QEMU's built-in bootloader unnecessary. Implementing it on AArch64 should be trivial as you can just read CurrentEL and work from whatever EL/PL you're at. Is there an easy way to check whether you're in secure or nonsecure mode in AArch32? I seem to recall discussion about putting this information into the DTB, which makes me think there isn't. Christopher
On Wed, Sep 10, 2014 at 03:58:15PM +0100, Christopher Covington wrote: > On 09/05/2014 06:11 PM, Doug Anderson wrote: > > On Thu, Aug 28, 2014 at 2:35 AM, Mark Rutland <mark.rutland@arm.com> wrote: > >> Not if you boot Linux at hyp, as we've recommended for this precise > >> reason. That doesn't fix other things like CNTFRQ if the secure > >> initialisation doesn't poke that, however. > > > > I'll freely admit that I'm out of my league and out of my comfort zone > > here, but... > > > > In the theory that firmware ought to be as minimal as possible > > (because it's hard to update and hard to keep in sync with kernel > > versions), it seems like firmware ought to start the kernel out in as > > permissive mode as it's willing to provide, right? > > > > If the kernel is started out as permissive as possible then it can do > > anything it needs to. Future versions of the kernel can be > > implemented to do any way-cool things that they want to do without an > > update to firmware, right? ...and current versions of the kernel can > > just shed permissions if they don't want them. > > > > ...so if I understand correctly, "Secure SVC" mode is more permissive > > than "Non Secure HYP" mode, right? It looks to me as if we currently > > start the kernel in "Secure SVC" mode. What do you think about the > > kernel detecting Secure SVC and then dropping down permission levels > > (to Non Secure HYP). Once it did this, it could update things like > > the virtual offset and then transition down further into non-secure > > SVC mode. > > > > ...or maybe this has been discussed millions of times already and I'm > > just clueless. ...or maybe this is just too hard for the kernel to do > > in a generic way? > > I think this is a great idea. When running on simulators, it would make (the > non-DTB parts of) the bootwrapper and QEMU's built-in bootloader unnecessary. > > Implementing it on AArch64 should be trivial as you can just read CurrentEL > and work from whatever EL/PL you're at. Is there an easy way to check whether > you're in secure or nonsecure mode in AArch32? I seem to recall discussion > about putting this information into the DTB, which makes me think there isn't. (I replied couple of days ago but the SMTP server I was using still hasn't delivered it) Anyway, for AArch64 you get the CurrentEL but does not tell you whether it's secure or not (the same as the mode bits in CPSR basically). Secure EL1 is not more permissive than non-secure EL2, you still have to go through EL3 firmware using an SMC call to be able to switch to EL2. So for AArch64, you need to rely on firmware to do at least the correct initialisation (unless you start the kernel at EL3).
On Wed, Sep 10, 2014 at 03:58:15PM +0100, Christopher Covington wrote: > On 09/05/2014 06:11 PM, Doug Anderson wrote: > > Mark, > > > > On Thu, Aug 28, 2014 at 2:35 AM, Mark Rutland <mark.rutland@arm.com> wrote: > >> Not if you boot Linux at hyp, as we've recommended for this precise > >> reason. That doesn't fix other things like CNTFRQ if the secure > >> initialisation doesn't poke that, however. > > > > I'll freely admit that I'm out of my league and out of my comfort zone > > here, but... > > > > In the theory that firmware ought to be as minimal as possible > > (because it's hard to update and hard to keep in sync with kernel > > versions), it seems like firmware ought to start the kernel out in as > > permissive mode as it's willing to provide, right? > > > > If the kernel is started out as permissive as possible then it can do > > anything it needs to. Future versions of the kernel can be > > implemented to do any way-cool things that they want to do without an > > update to firmware, right? ...and current versions of the kernel can > > just shed permissions if they don't want them. > > > > ...so if I understand correctly, "Secure SVC" mode is more permissive > > than "Non Secure HYP" mode, right? It looks to me as if we currently > > start the kernel in "Secure SVC" mode. What do you think about the > > kernel detecting Secure SVC and then dropping down permission levels > > (to Non Secure HYP). Once it did this, it could update things like > > the virtual offset and then transition down further into non-secure > > SVC mode. > > > > ...or maybe this has been discussed millions of times already and I'm > > just clueless. ...or maybe this is just too hard for the kernel to do > > in a generic way? > > I think this is a great idea. When running on simulators, it would make (the > non-DTB parts of) the bootwrapper and QEMU's built-in bootloader unnecessary. That's not true in general as other secure initialization will still be necessary, and the extent and character of that initialization is going to be implementation specific. If you have the infrastructure necessary to load and boot an arbitrary kernel at the most highly privileged exception level, then you necessarily have all the plumbing in place for loading an updateable firmware into that level. That latter option is preferable. The issue at hand is not whether Linux should be in charge of secure world state. The issue at hand is that firmware booted Linux at PL1 without taking ownership of all PL2 state (CNTVOFF included). Were either the firmware or Linux to manage PL2 (configuring CNTVOFF with a consistent value on all CPUs) that problem would not exist. > Implementing it on AArch64 should be trivial as you can just read > CurrentEL > and work from whatever EL/PL you're at. While it's trivial to identify which EL software is executing in, doing the right thing based on that is going to be far harder, especially in a general purpose OS. A loadable piece of firmware can know what is best for a particular platform (e.g. which errata apply which require poking implementation defined registers with magic implementation defined values) as it is tied to that piece of hardware. Mark.
On Wed, Sep 10, 2014 at 8:55 AM, Mark Rutland <mark.rutland@arm.com> wrote: > On Wed, Sep 10, 2014 at 03:58:15PM +0100, Christopher Covington wrote: >> On 09/05/2014 06:11 PM, Doug Anderson wrote: >> > Mark, >> > >> > On Thu, Aug 28, 2014 at 2:35 AM, Mark Rutland <mark.rutland@arm.com> wrote: >> >> Not if you boot Linux at hyp, as we've recommended for this precise >> >> reason. That doesn't fix other things like CNTFRQ if the secure >> >> initialisation doesn't poke that, however. >> > >> > I'll freely admit that I'm out of my league and out of my comfort zone >> > here, but... >> > >> > In the theory that firmware ought to be as minimal as possible >> > (because it's hard to update and hard to keep in sync with kernel >> > versions), it seems like firmware ought to start the kernel out in as >> > permissive mode as it's willing to provide, right? >> > >> > If the kernel is started out as permissive as possible then it can do >> > anything it needs to. Future versions of the kernel can be >> > implemented to do any way-cool things that they want to do without an >> > update to firmware, right? ...and current versions of the kernel can >> > just shed permissions if they don't want them. >> > >> > ...so if I understand correctly, "Secure SVC" mode is more permissive >> > than "Non Secure HYP" mode, right? It looks to me as if we currently >> > start the kernel in "Secure SVC" mode. What do you think about the >> > kernel detecting Secure SVC and then dropping down permission levels >> > (to Non Secure HYP). Once it did this, it could update things like >> > the virtual offset and then transition down further into non-secure >> > SVC mode. >> > >> > ...or maybe this has been discussed millions of times already and I'm >> > just clueless. ...or maybe this is just too hard for the kernel to do >> > in a generic way? >> >> I think this is a great idea. When running on simulators, it would make (the >> non-DTB parts of) the bootwrapper and QEMU's built-in bootloader unnecessary. > > That's not true in general as other secure initialization will still be > necessary, and the extent and character of that initialization is going > to be implementation specific. > > If you have the infrastructure necessary to load and boot an arbitrary > kernel at the most highly privileged exception level, then you > necessarily have all the plumbing in place for loading an updateable > firmware into that level. That latter option is preferable. It might be preferable by you since it makes the kernel simpler, but for the rest of us who are busy shipping product, it means extra work. Why would we have to write a brand new firmware layer for this? No fricking way. > The issue at hand is not whether Linux should be in charge of secure > world state. The issue at hand is that firmware booted Linux at PL1 > without taking ownership of all PL2 state (CNTVOFF included). Were > either the firmware or Linux to manage PL2 (configuring CNTVOFF with a > consistent value on all CPUs) that problem would not exist. No, the issue at hand is that Linux is now expecting PL2 resources to always have been set up. That is a regression in the true sense. It didn't use to be the case, and now it is. -Olof
Catalin, On Mon, Sep 8, 2014 at 6:54 AM, Catalin Marinas <catalin.marinas@arm.com> wrote: > On Fri, Sep 05, 2014 at 11:11:47PM +0100, Doug Anderson wrote: >> On Thu, Aug 28, 2014 at 2:35 AM, Mark Rutland <mark.rutland@arm.com> wrote: >> > Not if you boot Linux at hyp, as we've recommended for this precise >> > reason. That doesn't fix other things like CNTFRQ if the secure >> > initialisation doesn't poke that, however. >> >> I'll freely admit that I'm out of my league and out of my comfort zone >> here, but... >> >> In the theory that firmware ought to be as minimal as possible >> (because it's hard to update and hard to keep in sync with kernel >> versions), it seems like firmware ought to start the kernel out in as >> permissive mode as it's willing to provide, right? > > Not necessarily (and definitely not for arm64). And we've seen in > practice that the actual deployed kernel may run in a different security > mode than what's in mainline and used for initial development (you may > just not see all the patches upstream). For development, that's indeed > simpler, but once you go into production, a customer requesting some > secure OS for payments etc. and Linux is moved to the non-secure side > (and you end up putting hacks in the kernel because they were not > spotted during initial development with Linux running in secure mode). I guess the important part of my statement is "as it's willing to provide". In your case your firmware isn't willing to provide "secure SVC". In our case it is. We've actually shipped products where the firmware boots the kernel in "secure" mode. These products have a very different security model than you're envisioning, I guess. In these products, less firmware is better and adding firmware code to do the whole transition to "non secure hyp mode" just isn't worth it. It's not just a one-time blob of code in the firmware before booting the kernel. (I'm told) it means somehow keeping firmware code around to help out with turning processors off/on and to help with resume from S3. That's infrastructure that we don't want to add. To put it another way: if you're already architecting your system to have firmware manage the secure state then everything you have said makes sense. You've got to solve all the problems (processor bringup, suspend/resume, etc) anyway, so being consistent about having the kernel in nonsecure HYP makes sense. ...but if you have no need to ever limit access to "secure mode" then adding a whole lot of code to the firmware doesn't make sense to me. >> If the kernel is started out as permissive as possible then it can do >> anything it needs to. Future versions of the kernel can be >> implemented to do any way-cool things that they want to do without an >> update to firmware, right? ...and current versions of the kernel can >> just shed permissions if they don't want them. >> >> ...so if I understand correctly, "Secure SVC" mode is more permissive >> than "Non Secure HYP" mode, right? It looks to me as if we currently >> start the kernel in "Secure SVC" mode. What do you think about the >> kernel detecting Secure SVC and then dropping down permission levels >> (to Non Secure HYP). Once it did this, it could update things like >> the virtual offset and then transition down further into non-secure >> SVC mode. > > If we talk about ARMv8/AArch64, Secure SVC (aka secure EL1) is not more > permissive than Non-secure Hyp (aka non-secure EL2). The only way to go > from secure EL1 to non-secure EL2 is via EL3 (and SMC call) which means > firmware code. Certain registers like CNTFRQ are only writable in EL3, > CNTVOFF in EL2 or EL3. Hmmm, I guess I was under the impression that if you were in Secure SVC mode then you are guaranteed to have enough access that you could control where the SMC call would go. Thus you could install your own SMC handler and effectively transition yourself from Secure SVC to Monitor mode, then to Hypervisor mode. I was also under the impression that if you were in Nonsecure HYP mode that you couldn't guarantee that you could control the transition back to Secure SVC mode. If the above is incorrect then I guess my statement is incorrect. Can you confirm? ...however, even if "Secure SVC" isn't guaranteed to be more permissive than "Nonsecure HYP", it is often the case that you can transition from "Secure SVC" to "Nonsecure HYP" (by setting up handlers, etc) because the system hasn't been locked down. Given that if we "do nothing" in firmware we seem to boot the kernel in Secure SVC w/ the ability to transition to Nonsecure HYP, that seems like a good reason for the kernel to be able to handle that switching. -Doug
Mark, On Wed, Sep 10, 2014 at 8:55 AM, Mark Rutland <mark.rutland@arm.com> wrote: > That's not true in general as other secure initialization will still be > necessary, and the extent and character of that initialization is going > to be implementation specific. Can you give more examples of what you mean by implementation specific? Does this mean that secure initialization will be different for every SoC out there? ...or is it just different for different ARM cores? -Doug
Hi Sonny, On Wed, Aug 27, 2014 at 10:03:39PM +0100, Sonny Rao wrote: > This is a bug fix for using physical arch timers when > the arch_timer_use_virtual boolean is false. It restores the > arch_counter_get_cntpct() function after removal in > > 0d651e4e "clocksource: arch_timer: use virtual counters" > > and completes the implementation of memory mapped access for physical > timers, so if a system is trying to use physical timers, it will > function properly. To get back to the topic at hand: Which platform is this required by? Why exactly is arch_timer_use_virtual false in this case? Thanks, Mark. > > Signed-off-by: Sonny Rao <sonnyrao@chromium.org> > --- > arch/arm/include/asm/arch_timer.h | 9 +++++++++ > arch/arm64/include/asm/arch_timer.h | 10 ++++++++++ > drivers/clocksource/arm_arch_timer.c | 30 ++++++++++++++++++++++++++---- > 3 files changed, 45 insertions(+), 4 deletions(-) > > diff --git a/arch/arm/include/asm/arch_timer.h b/arch/arm/include/asm/arch_timer.h > index 0704e0c..e72aa4d 100644 > --- a/arch/arm/include/asm/arch_timer.h > +++ b/arch/arm/include/asm/arch_timer.h > @@ -78,6 +78,15 @@ static inline u32 arch_timer_get_cntfrq(void) > return val; > } > > +static inline u64 arch_counter_get_cntpct(void) > +{ > + u64 cval; > + > + isb(); > + asm volatile("mrrc p15, 0, %Q0, %R0, c14" : "=r" (cval)); > + return cval; > +} > + > static inline u64 arch_counter_get_cntvct(void) > { > u64 cval; > diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h > index 9400596..58657c4 100644 > --- a/arch/arm64/include/asm/arch_timer.h > +++ b/arch/arm64/include/asm/arch_timer.h > @@ -135,6 +135,16 @@ static inline void arch_timer_evtstrm_enable(int divider) > #endif > } > > +static inline u64 arch_counter_get_cntpct(void) > +{ > + u64 cval; > + > + isb(); > + asm volatile("mrs %0, cntpct_el0" : "=r" (cval)); > + > + return cval; > +} > + > static inline u64 arch_counter_get_cntvct(void) > { > u64 cval; > diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c > index 5163ec1..ad723cb 100644 > --- a/drivers/clocksource/arm_arch_timer.c > +++ b/drivers/clocksource/arm_arch_timer.c > @@ -30,6 +30,8 @@ > #define CNTTIDR 0x08 > #define CNTTIDR_VIRT(n) (BIT(1) << ((n) * 4)) > > +#define CNTPCT_LO 0x00 > +#define CNTPCT_HI 0x04 > #define CNTVCT_LO 0x08 > #define CNTVCT_HI 0x0c > #define CNTFRQ 0x10 > @@ -386,6 +388,19 @@ static u64 arch_counter_get_cntvct_mem(void) > return ((u64) vct_hi << 32) | vct_lo; > } > > +static u64 arch_counter_get_cntpct_mem(void) > +{ > + u32 pct_lo, pct_hi, tmp_hi; > + > + do { > + pct_hi = readl_relaxed(arch_counter_base + CNTPCT_HI); > + pct_lo = readl_relaxed(arch_counter_base + CNTPCT_LO); > + tmp_hi = readl_relaxed(arch_counter_base + CNTPCT_HI); > + } while (pct_hi != tmp_hi); > + > + return ((u64) pct_hi << 32) | pct_lo; > +} > + > /* > * Default to cp15 based access because arm64 uses this function for > * sched_clock() before DT is probed and the cp15 method is guaranteed > @@ -429,10 +444,17 @@ static void __init arch_counter_register(unsigned type) > u64 start_count; > > /* Register the CP15 based counter if we have one */ > - if (type & ARCH_CP15_TIMER) > - arch_timer_read_counter = arch_counter_get_cntvct; > - else > - arch_timer_read_counter = arch_counter_get_cntvct_mem; > + if (type & ARCH_CP15_TIMER) { > + if (arch_timer_use_virtual) > + arch_timer_read_counter = arch_counter_get_cntvct; > + else > + arch_timer_read_counter = arch_counter_get_cntpct; > + } else { > + if (arch_timer_use_virtual) > + arch_timer_read_counter = arch_counter_get_cntvct_mem; > + else > + arch_timer_read_counter = arch_counter_get_cntpct_mem; > + } > > start_count = arch_timer_read_counter(); > clocksource_register_hz(&clocksource_counter, arch_timer_rate); > -- > 1.8.3.2 > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel >
On Wed, Sep 10, 2014 at 06:17:23PM +0100, Doug Anderson wrote: > On Mon, Sep 8, 2014 at 6:54 AM, Catalin Marinas <catalin.marinas@arm.com> wrote: > > On Fri, Sep 05, 2014 at 11:11:47PM +0100, Doug Anderson wrote: > >> On Thu, Aug 28, 2014 at 2:35 AM, Mark Rutland <mark.rutland@arm.com> wrote: > >> > Not if you boot Linux at hyp, as we've recommended for this precise > >> > reason. That doesn't fix other things like CNTFRQ if the secure > >> > initialisation doesn't poke that, however. > >> > >> I'll freely admit that I'm out of my league and out of my comfort zone > >> here, but... > >> > >> In the theory that firmware ought to be as minimal as possible > >> (because it's hard to update and hard to keep in sync with kernel > >> versions), it seems like firmware ought to start the kernel out in as > >> permissive mode as it's willing to provide, right? > > > > Not necessarily (and definitely not for arm64). And we've seen in > > practice that the actual deployed kernel may run in a different security > > mode than what's in mainline and used for initial development (you may > > just not see all the patches upstream). For development, that's indeed > > simpler, but once you go into production, a customer requesting some > > secure OS for payments etc. and Linux is moved to the non-secure side > > (and you end up putting hacks in the kernel because they were not > > spotted during initial development with Linux running in secure mode). > > I guess the important part of my statement is "as it's willing to > provide". In your case your firmware isn't willing to provide "secure > SVC". In our case it is. > > We've actually shipped products where the firmware boots the kernel in > "secure" mode. These products have a very different security model > than you're envisioning, I guess. In these products, less firmware is > better and adding firmware code to do the whole transition to "non > secure hyp mode" just isn't worth it. It's not just a one-time blob > of code in the firmware before booting the kernel. (I'm told) it > means somehow keeping firmware code around to help out with turning > processors off/on and to help with resume from S3. That's > infrastructure that we don't want to add. Setting aside the security model, booting in secure mode can also have a significant impact on the programming model of system IP, not just the CPU. For example, the SMMU register file suddenly looks different and the way in which it is programmed also changes. The GIC (v3+) is similar. Linux doesn't have drivers that know about this so, whilst not impossible, there's a non-trivial amount of work (and then maintenance overhead) if you want to support booting Linux on the secure side. The secure world may be more permissive in some regards, but it's actually a burden in others. It's not a simple superset of non-secure, and this is more evident than ever in ARMv8. As I understand it, this conversation started because we're booting at non-secure EL1 with a badly configured EL2. That sounds like something we may have to fix/work around in Linux (Olof points out that it used to work), but we shouldn't conflate that with booting on the secure side to get away from broken firmware. Will
Mark, On Wed, Sep 10, 2014 at 10:27 AM, Mark Rutland <mark.rutland@arm.com> wrote: > Hi Sonny, > > On Wed, Aug 27, 2014 at 10:03:39PM +0100, Sonny Rao wrote: >> This is a bug fix for using physical arch timers when >> the arch_timer_use_virtual boolean is false. It restores the >> arch_counter_get_cntpct() function after removal in >> >> 0d651e4e "clocksource: arch_timer: use virtual counters" >> >> and completes the implementation of memory mapped access for physical >> timers, so if a system is trying to use physical timers, it will >> function properly. > > To get back to the topic at hand: > > Which platform is this required by? I've seen similar problems on the A7s on exynos5420 / exynos5800 and on rk3288. I can't say what other platforms might be affected. Note that the arch timers on exyno5420/exynos5800 are not supported anyway, so I guess that means we're just worrying about the rk3288. > Why exactly is arch_timer_use_virtual false in this case? To re-summarize my understanding of everything (many of the below is secondhand knowledge, so correct if wrong): 1. The initial problem is that the virtual offset is not initialized by anyone and is per core (each core gets a different, random offset). That makes the virtual counter useless. ...but the kernel only uses virtual counters. 2. As far as I know, we don't have any particular need for HYP mode nor for limiting access to secure mode. 3. You can only change the virtual offset from HYP mode. That means someone needs to transition to HYP mode if we want to use virtual counters. 4. If the kernel happens to be in HYP mode it will init the virtual offset. 5. We could transition to HYP mode once in the firmware and boot the kernel like that, but since firmware is gone after we've booted the kernel, we run into the same problem when we power off a processor and when we resume from S3. The firmware is not involved in these cases. In these cases the processors have an uninitialized virtual offset again. These processors don't appear to magically come up in HYP mode. Thus it would be up to the kernel to transition to HYP mode, init the offset, and get out of HYP mode. ...or just use the physical counter. If you can suggest something that doesn't require us to involve the firmware in processor bringup and in resume, I'm all ears. We have a desire not to involve the firmware there because of all the issues of keeping kernel/firmware in sync and because of the extra difficultly of shipping firmware updates (understandably the QA needed to validate a new firmware is much higher than the QA needed to validate a new kernel). -Doug
On Wed, Sep 10, 2014 at 10:52 AM, Doug Anderson <dianders@chromium.org> wrote: > Mark, > > On Wed, Sep 10, 2014 at 10:27 AM, Mark Rutland <mark.rutland@arm.com> wrote: >> Hi Sonny, >> >> On Wed, Aug 27, 2014 at 10:03:39PM +0100, Sonny Rao wrote: >>> This is a bug fix for using physical arch timers when >>> the arch_timer_use_virtual boolean is false. It restores the >>> arch_counter_get_cntpct() function after removal in >>> >>> 0d651e4e "clocksource: arch_timer: use virtual counters" >>> >>> and completes the implementation of memory mapped access for physical >>> timers, so if a system is trying to use physical timers, it will >>> function properly. >> >> To get back to the topic at hand: >> >> Which platform is this required by? > > I've seen similar problems on the A7s on exynos5420 / exynos5800 and > on rk3288. I can't say what other platforms might be affected. Note > that the arch timers on exyno5420/exynos5800 are not supported anyway, > so I guess that means we're just worrying about the rk3288. Yeah, given that this problem has manifest on at least two different SoCs using ARM's cores, it would have been nice if the offset were specified to start out as zero when in reset by the architecture (and was implemented that way in ARM's core IP), but it looks like it wasn't. > > >> Why exactly is arch_timer_use_virtual false in this case? > > To re-summarize my understanding of everything (many of the below is > secondhand knowledge, so correct if wrong): > > 1. The initial problem is that the virtual offset is not initialized > by anyone and is per core (each core gets a different, random offset). > That makes the virtual counter useless. ...but the kernel only uses > virtual counters. > > 2. As far as I know, we don't have any particular need for HYP mode > nor for limiting access to secure mode. > > 3. You can only change the virtual offset from HYP mode. That means > someone needs to transition to HYP mode if we want to use virtual > counters. > > 4. If the kernel happens to be in HYP mode it will init the virtual offset. > > 5. We could transition to HYP mode once in the firmware and boot the > kernel like that, but since firmware is gone after we've booted the > kernel, we run into the same problem when we power off a processor and > when we resume from S3. The firmware is not involved in these cases. > In these cases the processors have an uninitialized virtual offset > again. These processors don't appear to magically come up in HYP > mode. Thus it would be up to the kernel to transition to HYP mode, > init the offset, and get out of HYP mode. ...or just use the physical > counter. > > > If you can suggest something that doesn't require us to involve the > firmware in processor bringup and in resume, I'm all ears. We have a > desire not to involve the firmware there because of all the issues of > keeping kernel/firmware in sync and because of the extra difficultly > of shipping firmware updates (understandably the QA needed to validate > a new firmware is much higher than the QA needed to validate a new > kernel). One thing that was used in the past was to have the kernel load a blob from /lib/firmware/ which did some re-initialization when coming out of suspend or deep sleep. We could do something similar here and have it either fix virtual offset or put it into hyp mode. That would help solve our issue of making it easier to update and avoid QA hassles. Is such a solution acceptable to upstream? > > > -Doug
Will, On Wed, Sep 10, 2014 at 10:34 AM, Will Deacon <will.deacon@arm.com> wrote: > Setting aside the security model, booting in secure mode can also have a > significant impact on the programming model of system IP, not just the CPU. > For example, the SMMU register file suddenly looks different and the way in > which it is programmed also changes. The GIC (v3+) is similar. Linux doesn't > have drivers that know about this so, whilst not impossible, there's a > non-trivial amount of work (and then maintenance overhead) if you want to > support booting Linux on the secure side. Hrm. I could have sworn someone told me that exynos5250-snow is booted in Secure SVC mode and has been that way forever. I could certainly be wrong. Stupid question, but should any of your statements apply to an A15 or an A12? Maybe things change with newer CPUs? On the rk3288 I have in front of me I tried running the following at (semi early) boot to help confirm we were in Secure SVC. Standard "I don't know what I'm doing" caveats apply: asm volatile("mrc p15, 0, %0, c1, c1, 0" : "=r" (val)); pr_info("DOUG: val is %#010x", val); val |= (1 << 2); asm volatile("mcr p15, 0, %0, c1, c1, 0" : : "r" (val)); val = 0xffffffff; asm volatile("mrc p15, 0, %0, c1, c1, 0" : "=r" (val)); pr_info("DOUG: val is %#010x", val); When I did that I found that I was able to read back 0x00000004. OK, and I just replicated that same behavior on exynos5250-snow. Since I think the SCR register is only accessible from "Secure PL1 modes" and I showed that I could change it, to me that meant that I must be in "Secure PL1". Please yell if I'm just being an idiot and I can look for some other way to test for secure mode. Also: I wasn't actually suggesting running Linux in secure mode, I was only suggesting _booting_ Linux in secure mode. The idea was that Linux would realize this and immediately switch to nonsecure HYP mode. Linux doesn't normally run at HYP mode, so it would eventually drop down into nonsecure SVC mode, too. One (admittedly weak) justification was that if someone later wanted to do some whizbang feature in the Linux kernel that required "secure svc" mode then it would be possible on this hardware. Obviously this person would need to implement a whole crapload of code to handle this, but they wouldn't need to add custom firmware. The other (important) justification was to keep bootloaders simple. All of the "I've never even thought about secure mode" bootloaders that I've seen just happen to boot the kernel in Secure SVC. That makes me think that must be the easiest thing for them to do. We should support them. > The secure world may be more permissive in some regards, but it's actually > a burden in others. It's not a simple superset of non-secure, and this is > more evident than ever in ARMv8. > > As I understand it, this conversation started because we're booting at > non-secure EL1 with a badly configured EL2. That sounds like something > we may have to fix/work around in Linux (Olof points out that it used to > work), but we shouldn't conflate that with booting on the secure side to > get away from broken firmware. See above and also my summary to Mark Rutland. I don't think your statement above is quite where we're at, but hopefully we're on the same page now. :) -Doug
Hi, On Wed, Sep 10, 2014 at 10:52 AM, Doug Anderson <dianders@chromium.org> wrote: > Mark, > > On Wed, Sep 10, 2014 at 10:27 AM, Mark Rutland <mark.rutland@arm.com> wrote: >> Hi Sonny, >> >> On Wed, Aug 27, 2014 at 10:03:39PM +0100, Sonny Rao wrote: >>> This is a bug fix for using physical arch timers when >>> the arch_timer_use_virtual boolean is false. It restores the >>> arch_counter_get_cntpct() function after removal in >>> >>> 0d651e4e "clocksource: arch_timer: use virtual counters" >>> >>> and completes the implementation of memory mapped access for physical >>> timers, so if a system is trying to use physical timers, it will >>> function properly. >> >> To get back to the topic at hand: >> >> Which platform is this required by? > > I've seen similar problems on the A7s on exynos5420 / exynos5800 and > on rk3288. I can't say what other platforms might be affected. Note > that the arch timers on exyno5420/exynos5800 are not supported anyway, > so I guess that means we're just worrying about the rk3288. > > >> Why exactly is arch_timer_use_virtual false in this case? > > To re-summarize my understanding of everything (many of the below is > secondhand knowledge, so correct if wrong): > > 1. The initial problem is that the virtual offset is not initialized > by anyone and is per core (each core gets a different, random offset). > That makes the virtual counter useless. ...but the kernel only uses > virtual counters. > > 2. As far as I know, we don't have any particular need for HYP mode > nor for limiting access to secure mode. > > 3. You can only change the virtual offset from HYP mode. That means > someone needs to transition to HYP mode if we want to use virtual > counters. > > 4. If the kernel happens to be in HYP mode it will init the virtual offset. > > 5. We could transition to HYP mode once in the firmware and boot the > kernel like that, but since firmware is gone after we've booted the > kernel, we run into the same problem when we power off a processor and > when we resume from S3. The firmware is not involved in these cases. > In these cases the processors have an uninitialized virtual offset > again. These processors don't appear to magically come up in HYP > mode. Thus it would be up to the kernel to transition to HYP mode, > init the offset, and get out of HYP mode. ...or just use the physical > counter. > > > If you can suggest something that doesn't require us to involve the > firmware in processor bringup and in resume, I'm all ears. We have a > desire not to involve the firmware there because of all the issues of > keeping kernel/firmware in sync and because of the extra difficultly > of shipping firmware updates (understandably the QA needed to validate > a new firmware is much higher than the QA needed to validate a new > kernel). Note: we can avoid all of this mess by just using physical counters and otherwise leaving the system alone. That would mean landing Sonny's patch plus another one that allows us to manually chose physical counters. If everyone can agree that's not terrible then we could probably end the conversation there. -Doug
On Wed, Sep 10, 2014 at 07:09:59PM +0100, Doug Anderson wrote: > On Wed, Sep 10, 2014 at 10:34 AM, Will Deacon <will.deacon@arm.com> wrote: > > Setting aside the security model, booting in secure mode can also have a > > significant impact on the programming model of system IP, not just the CPU. > > For example, the SMMU register file suddenly looks different and the way in > > which it is programmed also changes. The GIC (v3+) is similar. Linux doesn't > > have drivers that know about this so, whilst not impossible, there's a > > non-trivial amount of work (and then maintenance overhead) if you want to > > support booting Linux on the secure side. > > Hrm. I could have sworn someone told me that exynos5250-snow is > booted in Secure SVC mode and has been that way forever. I could > certainly be wrong. Stupid question, but should any of your > statements apply to an A15 or an A12? Maybe things change with newer > CPUs? Snow does boot secure, which is why you don't get KVM without hacking the bootloader. However, snow also doesn't use: - An ARMv8 CPU - GICv3 - ARM SMMU - Virtualisation (in the supported software image) - Secure services (by the architectural definition) > On the rk3288 I have in front of me I tried running the following at > (semi early) boot to help confirm we were in Secure SVC. Standard "I > don't know what I'm doing" caveats apply: > > asm volatile("mrc p15, 0, %0, c1, c1, 0" : "=r" (val)); > pr_info("DOUG: val is %#010x", val); > val |= (1 << 2); > asm volatile("mcr p15, 0, %0, c1, c1, 0" : : "r" (val)); > val = 0xffffffff; > asm volatile("mrc p15, 0, %0, c1, c1, 0" : "=r" (val)); > pr_info("DOUG: val is %#010x", val); > > When I did that I found that I was able to read back 0x00000004. OK, > and I just replicated that same behavior on exynos5250-snow. > > Since I think the SCR register is only accessible from "Secure PL1 > modes" and I showed that I could change it, to me that meant that I > must be in "Secure PL1". Please yell if I'm just being an idiot and I > can look for some other way to test for secure mode. Again, this isn't the kind of platform I was referring to. Until recently, (32-bit) Linux has booted in secure svc just fine and there's no reason to break platforms relying on that. > Also: I wasn't actually suggesting running Linux in secure mode, I was > only suggesting _booting_ Linux in secure mode. The idea was that > Linux would realize this and immediately switch to nonsecure HYP mode. > Linux doesn't normally run at HYP mode, so it would eventually drop > down into nonsecure SVC mode, too. For ARMv7, fill your boots. For ARMv8, this doesn't work unless you enter at EL3, in which case the kernel is going to need to grow an awful lot of knowledge to initialise arbitrary systems. > The other (important) justification was to keep bootloaders simple. > All of the "I've never even thought about secure mode" bootloaders > that I've seen just happen to boot the kernel in Secure SVC. That > makes me think that must be the easiest thing for them to do. We > should support them. Just to reiterate, you're talking ARMv7 and I'm talking ARMv8. For ARMv7, there is a separate argument about whether we should make bootloaders simple and the OS more complex, which has been had on the mailing list before (I don't think there was a good conclusion though -- Nico and Catalin were discussing about MCPM and PSCI, which is very much related to this topic). > > The secure world may be more permissive in some regards, but it's actually > > a burden in others. It's not a simple superset of non-secure, and this is > > more evident than ever in ARMv8. > > > > As I understand it, this conversation started because we're booting at > > non-secure EL1 with a badly configured EL2. That sounds like something > > we may have to fix/work around in Linux (Olof points out that it used to > > work), but we shouldn't conflate that with booting on the secure side to > > get away from broken firmware. > > See above and also my summary to Mark Rutland. I don't think your > statement above is quite where we're at, but hopefully we're on the > same page now. :) If `where we're at' is trying to boot an ARMv7 product, then you can boot in secure svc and lose virtualisation support. Looking forward to ARMv8, this isn't going to work, so I'd encourage you to start thinking about getting a working bootloader as a requirement. Will
Will, On Wed, Sep 10, 2014 at 11:46 AM, Will Deacon <will.deacon@arm.com> wrote: > On Wed, Sep 10, 2014 at 07:09:59PM +0100, Doug Anderson wrote: >> On Wed, Sep 10, 2014 at 10:34 AM, Will Deacon <will.deacon@arm.com> wrote: >> > Setting aside the security model, booting in secure mode can also have a >> > significant impact on the programming model of system IP, not just the CPU. >> > For example, the SMMU register file suddenly looks different and the way in >> > which it is programmed also changes. The GIC (v3+) is similar. Linux doesn't >> > have drivers that know about this so, whilst not impossible, there's a >> > non-trivial amount of work (and then maintenance overhead) if you want to >> > support booting Linux on the secure side. >> >> Hrm. I could have sworn someone told me that exynos5250-snow is >> booted in Secure SVC mode and has been that way forever. I could >> certainly be wrong. Stupid question, but should any of your >> statements apply to an A15 or an A12? Maybe things change with newer >> CPUs? > > Snow does boot secure, which is why you don't get KVM without hacking the > bootloader. However, snow also doesn't use: Well, you could get KVM if the kernel knew to switch from "secure SVC" to "nonsecure HYP", right? ...so one bonus of implementing this (for 32-bit ARM only, of course) is that legacy systems (like snow) would magically get KVM without replacing the bootloader... > Again, this isn't the kind of platform I was referring to. Until recently, > (32-bit) Linux has booted in secure svc just fine and there's no reason to > break platforms relying on that. Yup, exactly. I'm definitely not on the cutting edge here. Just working on plain old boring 32-bit systems. ;) I have even less knowledge about 64-bit systems than 32 bit if that can be believed. ;) > Just to reiterate, you're talking ARMv7 and I'm talking ARMv8. For ARMv7, > there is a separate argument about whether we should make bootloaders > simple and the OS more complex, which has been had on the mailing list > before (I don't think there was a good conclusion though -- Nico and Catalin > were discussing about MCPM and PSCI, which is very much related to this > topic). Yup. I'm super happy that folks are figuring this out. :) I think my viewpoint / position is adequately clear at this point, so I'll step back and let the experts work something out and focus my attention on kitchens that have fewer chefs. If things need to be different on ARMv8 then the folks working on those system will need to solve these problems. ...but it would be nice if we weren't forced to solve them on 32-bit. >> > The secure world may be more permissive in some regards, but it's actually >> > a burden in others. It's not a simple superset of non-secure, and this is >> > more evident than ever in ARMv8. >> > >> > As I understand it, this conversation started because we're booting at >> > non-secure EL1 with a badly configured EL2. That sounds like something >> > we may have to fix/work around in Linux (Olof points out that it used to >> > work), but we shouldn't conflate that with booting on the secure side to >> > get away from broken firmware. >> >> See above and also my summary to Mark Rutland. I don't think your >> statement above is quite where we're at, but hopefully we're on the >> same page now. :) > > If `where we're at' is trying to boot an ARMv7 product, then you can boot in > secure svc and lose virtualisation support. Looking forward to ARMv8, this > isn't going to work, so I'd encourage you to start thinking about getting > a working bootloader as a requirement. Yup, definitely on the same page now. With everyone working on this I'd imagine that there will be some nice standards worked out by the time real ARMv8 is ready to ship? ...so would you say that you're in support of landing the patch to allow physical counters? I know Olof has Acked the patch above, but it's nice if there's general agreement that it's OK. -Doug
On Wed, Sep 10, 2014 at 08:50:16PM +0100, Doug Anderson wrote: > On Wed, Sep 10, 2014 at 11:46 AM, Will Deacon <will.deacon@arm.com> wrote: > > If `where we're at' is trying to boot an ARMv7 product, then you can boot in > > secure svc and lose virtualisation support. Looking forward to ARMv8, this > > isn't going to work, so I'd encourage you to start thinking about getting > > a working bootloader as a requirement. > > Yup, definitely on the same page now. With everyone working on this > I'd imagine that there will be some nice standards worked out by the > time real ARMv8 is ready to ship? > > ...so would you say that you're in support of landing the patch to > allow physical counters? I know Olof has Acked the patch above, but > it's nice if there's general agreement that it's OK. I'm in favour of fixing the regression, yes. What I didn't understand from the patch is where arch_timer_use_virtual is set to false for your machine, as we need to be careful not to regress arm64 (the vdso uses the virtual counter there). Will
Will, On Thu, Sep 11, 2014 at 2:57 AM, Will Deacon <will.deacon@arm.com> wrote: > On Wed, Sep 10, 2014 at 08:50:16PM +0100, Doug Anderson wrote: >> On Wed, Sep 10, 2014 at 11:46 AM, Will Deacon <will.deacon@arm.com> wrote: >> > If `where we're at' is trying to boot an ARMv7 product, then you can boot in >> > secure svc and lose virtualisation support. Looking forward to ARMv8, this >> > isn't going to work, so I'd encourage you to start thinking about getting >> > a working bootloader as a requirement. >> >> Yup, definitely on the same page now. With everyone working on this >> I'd imagine that there will be some nice standards worked out by the >> time real ARMv8 is ready to ship? >> >> ...so would you say that you're in support of landing the patch to >> allow physical counters? I know Olof has Acked the patch above, but >> it's nice if there's general agreement that it's OK. > > I'm in favour of fixing the regression, yes. What I didn't understand from > the patch is where arch_timer_use_virtual is set to false for your machine, > as we need to be careful not to regress arm64 (the vdso uses the virtual > counter there). See <https://patchwork.kernel.org/patch/4889311/> -Doug
diff --git a/arch/arm/include/asm/arch_timer.h b/arch/arm/include/asm/arch_timer.h index 0704e0c..e72aa4d 100644 --- a/arch/arm/include/asm/arch_timer.h +++ b/arch/arm/include/asm/arch_timer.h @@ -78,6 +78,15 @@ static inline u32 arch_timer_get_cntfrq(void) return val; } +static inline u64 arch_counter_get_cntpct(void) +{ + u64 cval; + + isb(); + asm volatile("mrrc p15, 0, %Q0, %R0, c14" : "=r" (cval)); + return cval; +} + static inline u64 arch_counter_get_cntvct(void) { u64 cval; diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h index 9400596..58657c4 100644 --- a/arch/arm64/include/asm/arch_timer.h +++ b/arch/arm64/include/asm/arch_timer.h @@ -135,6 +135,16 @@ static inline void arch_timer_evtstrm_enable(int divider) #endif } +static inline u64 arch_counter_get_cntpct(void) +{ + u64 cval; + + isb(); + asm volatile("mrs %0, cntpct_el0" : "=r" (cval)); + + return cval; +} + static inline u64 arch_counter_get_cntvct(void) { u64 cval; diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c index 5163ec1..ad723cb 100644 --- a/drivers/clocksource/arm_arch_timer.c +++ b/drivers/clocksource/arm_arch_timer.c @@ -30,6 +30,8 @@ #define CNTTIDR 0x08 #define CNTTIDR_VIRT(n) (BIT(1) << ((n) * 4)) +#define CNTPCT_LO 0x00 +#define CNTPCT_HI 0x04 #define CNTVCT_LO 0x08 #define CNTVCT_HI 0x0c #define CNTFRQ 0x10 @@ -386,6 +388,19 @@ static u64 arch_counter_get_cntvct_mem(void) return ((u64) vct_hi << 32) | vct_lo; } +static u64 arch_counter_get_cntpct_mem(void) +{ + u32 pct_lo, pct_hi, tmp_hi; + + do { + pct_hi = readl_relaxed(arch_counter_base + CNTPCT_HI); + pct_lo = readl_relaxed(arch_counter_base + CNTPCT_LO); + tmp_hi = readl_relaxed(arch_counter_base + CNTPCT_HI); + } while (pct_hi != tmp_hi); + + return ((u64) pct_hi << 32) | pct_lo; +} + /* * Default to cp15 based access because arm64 uses this function for * sched_clock() before DT is probed and the cp15 method is guaranteed @@ -429,10 +444,17 @@ static void __init arch_counter_register(unsigned type) u64 start_count; /* Register the CP15 based counter if we have one */ - if (type & ARCH_CP15_TIMER) - arch_timer_read_counter = arch_counter_get_cntvct; - else - arch_timer_read_counter = arch_counter_get_cntvct_mem; + if (type & ARCH_CP15_TIMER) { + if (arch_timer_use_virtual) + arch_timer_read_counter = arch_counter_get_cntvct; + else + arch_timer_read_counter = arch_counter_get_cntpct; + } else { + if (arch_timer_use_virtual) + arch_timer_read_counter = arch_counter_get_cntvct_mem; + else + arch_timer_read_counter = arch_counter_get_cntpct_mem; + } start_count = arch_timer_read_counter(); clocksource_register_hz(&clocksource_counter, arch_timer_rate);
This is a bug fix for using physical arch timers when the arch_timer_use_virtual boolean is false. It restores the arch_counter_get_cntpct() function after removal in 0d651e4e "clocksource: arch_timer: use virtual counters" and completes the implementation of memory mapped access for physical timers, so if a system is trying to use physical timers, it will function properly. Signed-off-by: Sonny Rao <sonnyrao@chromium.org> --- arch/arm/include/asm/arch_timer.h | 9 +++++++++ arch/arm64/include/asm/arch_timer.h | 10 ++++++++++ drivers/clocksource/arm_arch_timer.c | 30 ++++++++++++++++++++++++++---- 3 files changed, 45 insertions(+), 4 deletions(-)