diff mbox

[2/2] arm: apply more __ro_after_init

Message ID 1464979224-2085-3-git-send-email-keescook@chromium.org (mailing list archive)
State New, archived
Headers show

Commit Message

Kees Cook June 3, 2016, 6:40 p.m. UTC
Guided by grsecurity's analogous __read_only markings in arch/arm,
this applies several uses of __ro_after_init to structures that are
only updated during __init.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/arm/kernel/cpuidle.c |  2 +-
 arch/arm/kernel/setup.c   | 10 +++++-----
 arch/arm/kernel/smp.c     |  2 +-
 arch/arm/lib/delay.c      |  2 +-
 arch/arm/mm/mmu.c         |  9 ++-------
 arch/x86/mm/ioremap.c     |  3 +--
 6 files changed, 11 insertions(+), 17 deletions(-)

Comments

Greg Kroah-Hartman June 3, 2016, 6:51 p.m. UTC | #1
On Fri, Jun 03, 2016 at 11:40:24AM -0700, Kees Cook wrote:
> Guided by grsecurity's analogous __read_only markings in arch/arm,
> this applies several uses of __ro_after_init to structures that are
> only updated during __init.
> 
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  arch/arm/kernel/cpuidle.c |  2 +-
>  arch/arm/kernel/setup.c   | 10 +++++-----
>  arch/arm/kernel/smp.c     |  2 +-
>  arch/arm/lib/delay.c      |  2 +-
>  arch/arm/mm/mmu.c         |  9 ++-------
>  arch/x86/mm/ioremap.c     |  3 +--

I don't think this x86 file is an arm-specific one :)

That minor nit aside, these patches are a great step forward, are you
going to take them and work to push them upstream, or do you want/need
others to do this?

thanks,

greg k-h
Kees Cook June 3, 2016, 9:26 p.m. UTC | #2
On Fri, Jun 3, 2016 at 11:51 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Fri, Jun 03, 2016 at 11:40:24AM -0700, Kees Cook wrote:
>> Guided by grsecurity's analogous __read_only markings in arch/arm,
>> this applies several uses of __ro_after_init to structures that are
>> only updated during __init.
>>
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>> ---
>>  arch/arm/kernel/cpuidle.c |  2 +-
>>  arch/arm/kernel/setup.c   | 10 +++++-----
>>  arch/arm/kernel/smp.c     |  2 +-
>>  arch/arm/lib/delay.c      |  2 +-
>>  arch/arm/mm/mmu.c         |  9 ++-------
>>  arch/x86/mm/ioremap.c     |  3 +--
>
> I don't think this x86 file is an arm-specific one :)

Hah, whooops. :)

> That minor nit aside, these patches are a great step forward, are you
> going to take them and work to push them upstream, or do you want/need
> others to do this?

I'll collect more like these and carry a tree for -next and push them for v4.8.

-Kees
Greg Kroah-Hartman June 3, 2016, 9:54 p.m. UTC | #3
On Fri, Jun 03, 2016 at 02:26:54PM -0700, Kees Cook wrote:
> On Fri, Jun 3, 2016 at 11:51 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
> > On Fri, Jun 03, 2016 at 11:40:24AM -0700, Kees Cook wrote:
> >> Guided by grsecurity's analogous __read_only markings in arch/arm,
> >> this applies several uses of __ro_after_init to structures that are
> >> only updated during __init.
> >>
> >> Signed-off-by: Kees Cook <keescook@chromium.org>
> >> ---
> >>  arch/arm/kernel/cpuidle.c |  2 +-
> >>  arch/arm/kernel/setup.c   | 10 +++++-----
> >>  arch/arm/kernel/smp.c     |  2 +-
> >>  arch/arm/lib/delay.c      |  2 +-
> >>  arch/arm/mm/mmu.c         |  9 ++-------
> >>  arch/x86/mm/ioremap.c     |  3 +--
> >
> > I don't think this x86 file is an arm-specific one :)
> 
> Hah, whooops. :)
> 
> > That minor nit aside, these patches are a great step forward, are you
> > going to take them and work to push them upstream, or do you want/need
> > others to do this?
> 
> I'll collect more like these and carry a tree for -next and push them for v4.8.

Sounds good!

Is there any "problem" with applying these markings to code that could
be built as a module?  I'm thinking of lots of buses and drivers that
have structures like this, but can be a module or not, depending on the
configuration selected.  It would be nice to get the "benefit" of
protection if the code is built into the kernel image.

thanks,

greg k-h
Kees Cook June 3, 2016, 10:01 p.m. UTC | #4
On Fri, Jun 3, 2016 at 2:54 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Fri, Jun 03, 2016 at 02:26:54PM -0700, Kees Cook wrote:
>> On Fri, Jun 3, 2016 at 11:51 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
>> > On Fri, Jun 03, 2016 at 11:40:24AM -0700, Kees Cook wrote:
>> >> Guided by grsecurity's analogous __read_only markings in arch/arm,
>> >> this applies several uses of __ro_after_init to structures that are
>> >> only updated during __init.
>> >>
>> >> Signed-off-by: Kees Cook <keescook@chromium.org>
>> >> ---
>> >>  arch/arm/kernel/cpuidle.c |  2 +-
>> >>  arch/arm/kernel/setup.c   | 10 +++++-----
>> >>  arch/arm/kernel/smp.c     |  2 +-
>> >>  arch/arm/lib/delay.c      |  2 +-
>> >>  arch/arm/mm/mmu.c         |  9 ++-------
>> >>  arch/x86/mm/ioremap.c     |  3 +--
>> >
>> > I don't think this x86 file is an arm-specific one :)
>>
>> Hah, whooops. :)
>>
>> > That minor nit aside, these patches are a great step forward, are you
>> > going to take them and work to push them upstream, or do you want/need
>> > others to do this?
>>
>> I'll collect more like these and carry a tree for -next and push them for v4.8.
>
> Sounds good!
>
> Is there any "problem" with applying these markings to code that could
> be built as a module?  I'm thinking of lots of buses and drivers that
> have structures like this, but can be a module or not, depending on the
> configuration selected.  It would be nice to get the "benefit" of
> protection if the code is built into the kernel image.

There's no operational problem, it will just currently offer no
protections, and once the module side of things HAS been fixed, if any
got marked incorrectly, it'll be discovered then instead of when they
were added.

-Kees
Russell King (Oracle) Aug. 10, 2016, 9:43 a.m. UTC | #5
On Fri, Jun 03, 2016 at 11:40:24AM -0700, Kees Cook wrote:
> @@ -1309,16 +1309,11 @@ void __init arm_mm_memblock_reserve(void)
>   * Any other function or debugging method which may touch any device _will_
>   * crash the kernel.
>   */
> +static char vectors[PAGE_SIZE * 2] __ro_after_init __aligned(PAGE_SIZE);
>  static void __init devicemaps_init(const struct machine_desc *mdesc)
>  {
>  	struct map_desc map;
>  	unsigned long addr;
> -	void *vectors;
> -
> -	/*
> -	 * Allocate the vector page early.
> -	 */
> -	vectors = early_alloc(PAGE_SIZE * 2);

This one is not appropriate.  We _do_ write to these pages after init
for FIQ handler updates.  See set_fiq_handler().
Arnd Bergmann Aug. 10, 2016, 10 a.m. UTC | #6
On Wednesday, August 10, 2016 10:43:39 AM CEST Russell King - ARM Linux wrote:
> On Fri, Jun 03, 2016 at 11:40:24AM -0700, Kees Cook wrote:
> > @@ -1309,16 +1309,11 @@ void __init arm_mm_memblock_reserve(void)
> >   * Any other function or debugging method which may touch any device _will_
> >   * crash the kernel.
> >   */
> > +static char vectors[PAGE_SIZE * 2] __ro_after_init __aligned(PAGE_SIZE);
> >  static void __init devicemaps_init(const struct machine_desc *mdesc)
> >  {
> >       struct map_desc map;
> >       unsigned long addr;
> > -     void *vectors;
> > -
> > -     /*
> > -      * Allocate the vector page early.
> > -      */
> > -     vectors = early_alloc(PAGE_SIZE * 2);
> 
> This one is not appropriate.  We _do_ write to these pages after init
> for FIQ handler updates.  See set_fiq_handler().

Is that the only thing that modifies the page? If we think this is a
valuable change, we could make it depend on the absence of FIQ
support, as very few platforms (rpc, omap1, s3c24xx and possibly
imx) seem to even use it.

	Arnd
Russell King (Oracle) Aug. 10, 2016, 10:12 a.m. UTC | #7
On Wed, Aug 10, 2016 at 12:00:53PM +0200, Arnd Bergmann wrote:
> On Wednesday, August 10, 2016 10:43:39 AM CEST Russell King - ARM Linux wrote:
> > On Fri, Jun 03, 2016 at 11:40:24AM -0700, Kees Cook wrote:
> > > @@ -1309,16 +1309,11 @@ void __init arm_mm_memblock_reserve(void)
> > >   * Any other function or debugging method which may touch any device _will_
> > >   * crash the kernel.
> > >   */
> > > +static char vectors[PAGE_SIZE * 2] __ro_after_init __aligned(PAGE_SIZE);
> > >  static void __init devicemaps_init(const struct machine_desc *mdesc)
> > >  {
> > >       struct map_desc map;
> > >       unsigned long addr;
> > > -     void *vectors;
> > > -
> > > -     /*
> > > -      * Allocate the vector page early.
> > > -      */
> > > -     vectors = early_alloc(PAGE_SIZE * 2);
> > 
> > This one is not appropriate.  We _do_ write to these pages after init
> > for FIQ handler updates.  See set_fiq_handler().
> 
> Is that the only thing that modifies the page? If we think this is a
> valuable change, we could make it depend on the absence of FIQ
> support, as very few platforms (rpc, omap1, s3c24xx and possibly
> imx) seem to even use it.

There's the TLS emulation too, but that writes via the vectors mapping
at 0xffff0ff0.
Daniel Micay Aug. 10, 2016, 5:06 p.m. UTC | #8
On Wed, 2016-08-10 at 10:43 +0100, Russell King - ARM Linux wrote:
> On Fri, Jun 03, 2016 at 11:40:24AM -0700, Kees Cook wrote:
> > 
> > @@ -1309,16 +1309,11 @@ void __init arm_mm_memblock_reserve(void)
> >   * Any other function or debugging method which may touch any
> > device _will_
> >   * crash the kernel.
> >   */
> > +static char vectors[PAGE_SIZE * 2] __ro_after_init
> > __aligned(PAGE_SIZE);
> >  static void __init devicemaps_init(const struct machine_desc
> > *mdesc)
> >  {
> >  	struct map_desc map;
> >  	unsigned long addr;
> > -	void *vectors;
> > -
> > -	/*
> > -	 * Allocate the vector page early.
> > -	 */
> > -	vectors = early_alloc(PAGE_SIZE * 2);
> 
> This one is not appropriate.  We _do_ write to these pages after init
> for FIQ handler updates.  See set_fiq_handler().

This is one of the many cases where pax_open_kernel/pax_close_kernel are
needed to temporarily toggle it read-only. From grsecurity:

@@ -95,7 +95,10 @@ void set_fiq_handler(void *start, unsigned int
length)
 	void *base = vectors_page;
 	unsigned offset = FIQ_OFFSET;
 
+	pax_open_kernel();
 	memcpy(base + offset, start, length);
+	pax_close_kernel();
Kees Cook Aug. 10, 2016, 6:32 p.m. UTC | #9
On Wed, Aug 10, 2016 at 2:43 AM, Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
> On Fri, Jun 03, 2016 at 11:40:24AM -0700, Kees Cook wrote:
>> @@ -1309,16 +1309,11 @@ void __init arm_mm_memblock_reserve(void)
>>   * Any other function or debugging method which may touch any device _will_
>>   * crash the kernel.
>>   */
>> +static char vectors[PAGE_SIZE * 2] __ro_after_init __aligned(PAGE_SIZE);
>>  static void __init devicemaps_init(const struct machine_desc *mdesc)
>>  {
>>       struct map_desc map;
>>       unsigned long addr;
>> -     void *vectors;
>> -
>> -     /*
>> -      * Allocate the vector page early.
>> -      */
>> -     vectors = early_alloc(PAGE_SIZE * 2);
>
> This one is not appropriate.  We _do_ write to these pages after init
> for FIQ handler updates.  See set_fiq_handler().

Ah, interesting. I guess none of that hardware is being tested on
linux-next. I'll drop that chunk and resubmit.

Thanks!

-Kees
Arnd Bergmann Aug. 10, 2016, 7:31 p.m. UTC | #10
On Wednesday, August 10, 2016 11:12:53 AM CEST Russell King - ARM Linux wrote:
> On Wed, Aug 10, 2016 at 12:00:53PM +0200, Arnd Bergmann wrote:
> > On Wednesday, August 10, 2016 10:43:39 AM CEST Russell King - ARM Linux wrote:
> > > On Fri, Jun 03, 2016 at 11:40:24AM -0700, Kees Cook wrote:
> > > > @@ -1309,16 +1309,11 @@ void __init arm_mm_memblock_reserve(void)
> > > >   * Any other function or debugging method which may touch any device _will_
> > > >   * crash the kernel.
> > > >   */
> > > > +static char vectors[PAGE_SIZE * 2] __ro_after_init __aligned(PAGE_SIZE);
> > > >  static void __init devicemaps_init(const struct machine_desc *mdesc)
> > > >  {
> > > >       struct map_desc map;
> > > >       unsigned long addr;
> > > > -     void *vectors;
> > > > -
> > > > -     /*
> > > > -      * Allocate the vector page early.
> > > > -      */
> > > > -     vectors = early_alloc(PAGE_SIZE * 2);
> > > 
> > > This one is not appropriate.  We _do_ write to these pages after init
> > > for FIQ handler updates.  See set_fiq_handler().
> > 
> > Is that the only thing that modifies the page? If we think this is a
> > valuable change, we could make it depend on the absence of FIQ
> > support, as very few platforms (rpc, omap1, s3c24xx and possibly
> > imx) seem to even use it.
> 
> There's the TLS emulation too, but that writes via the vectors mapping
> at 0xffff0ff0.

Ok, so that should be safe. Can we change the fiq code to also use the
high mapping and then take the __ro_after_init patch on top?

	Arnd
Arnd Bergmann Aug. 10, 2016, 7:41 p.m. UTC | #11
On Wednesday, August 10, 2016 11:32:07 AM CEST Kees Cook wrote:
> On Wed, Aug 10, 2016 at 2:43 AM, Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:
> > On Fri, Jun 03, 2016 at 11:40:24AM -0700, Kees Cook wrote:
> >> @@ -1309,16 +1309,11 @@ void __init arm_mm_memblock_reserve(void)
> >>   * Any other function or debugging method which may touch any device _will_
> >>   * crash the kernel.
> >>   */
> >> +static char vectors[PAGE_SIZE * 2] __ro_after_init __aligned(PAGE_SIZE);
> >>  static void __init devicemaps_init(const struct machine_desc *mdesc)
> >>  {
> >>       struct map_desc map;
> >>       unsigned long addr;
> >> -     void *vectors;
> >> -
> >> -     /*
> >> -      * Allocate the vector page early.
> >> -      */
> >> -     vectors = early_alloc(PAGE_SIZE * 2);
> >
> > This one is not appropriate.  We _do_ write to these pages after init
> > for FIQ handler updates.  See set_fiq_handler().
> 
> Ah, interesting. I guess none of that hardware is being tested on
> linux-next.

Right. The OMAP1 Amstrad Delta is a somewhat obscure machine, and that
would be the most likely candidate to run into this.

RiscPC also has FIQ support, but I have not heard of anyone other
than Russell still using one with a modern kernel, and I doubt he
tests linux-next on it.

The s3c24xx and imx machines that could use FIQ probably don't
use it in practice, last time I checked, I didn't see any DTS file
or platform data definition in the kernel that activated that
code path.

> I'll drop that chunk and resubmit.

Good enough for now, but it may be worth revisiting this, as the
vector page might be a good target for an attack if you have a
way to overwrite a few bytes in the kernel.

Note that there are two mappings for the pages, and as Russell
mentioned, the TLS emulation writes to the other one that is
at a fixed virtual address.

It might be better to start by making the fixed mapping readonly,
as KASLR doesn't protect that one at all, and change the TLS
code accordingly.

	Arnd
Kees Cook Aug. 10, 2016, 9:40 p.m. UTC | #12
On Wed, Aug 10, 2016 at 12:41 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday, August 10, 2016 11:32:07 AM CEST Kees Cook wrote:
>> On Wed, Aug 10, 2016 at 2:43 AM, Russell King - ARM Linux
>> <linux@armlinux.org.uk> wrote:
>> > On Fri, Jun 03, 2016 at 11:40:24AM -0700, Kees Cook wrote:
>> >> @@ -1309,16 +1309,11 @@ void __init arm_mm_memblock_reserve(void)
>> >>   * Any other function or debugging method which may touch any device _will_
>> >>   * crash the kernel.
>> >>   */
>> >> +static char vectors[PAGE_SIZE * 2] __ro_after_init __aligned(PAGE_SIZE);
>> >>  static void __init devicemaps_init(const struct machine_desc *mdesc)
>> >>  {
>> >>       struct map_desc map;
>> >>       unsigned long addr;
>> >> -     void *vectors;
>> >> -
>> >> -     /*
>> >> -      * Allocate the vector page early.
>> >> -      */
>> >> -     vectors = early_alloc(PAGE_SIZE * 2);
>> >
>> > This one is not appropriate.  We _do_ write to these pages after init
>> > for FIQ handler updates.  See set_fiq_handler().
>>
>> Ah, interesting. I guess none of that hardware is being tested on
>> linux-next.
>
> Right. The OMAP1 Amstrad Delta is a somewhat obscure machine, and that
> would be the most likely candidate to run into this.
>
> RiscPC also has FIQ support, but I have not heard of anyone other
> than Russell still using one with a modern kernel, and I doubt he
> tests linux-next on it.
>
> The s3c24xx and imx machines that could use FIQ probably don't
> use it in practice, last time I checked, I didn't see any DTS file
> or platform data definition in the kernel that activated that
> code path.
>
>> I'll drop that chunk and resubmit.
>
> Good enough for now, but it may be worth revisiting this, as the
> vector page might be a good target for an attack if you have a
> way to overwrite a few bytes in the kernel.
>
> Note that there are two mappings for the pages, and as Russell
> mentioned, the TLS emulation writes to the other one that is
> at a fixed virtual address.
>
> It might be better to start by making the fixed mapping readonly,
> as KASLR doesn't protect that one at all, and change the TLS
> code accordingly.

That sounds good (anyone want to work on this?). Does arm64 need a
similar vector page protection?

-Kees
Russell King (Oracle) Aug. 10, 2016, 11:02 p.m. UTC | #13
On Wed, Aug 10, 2016 at 09:31:05PM +0200, Arnd Bergmann wrote:
> On Wednesday, August 10, 2016 11:12:53 AM CEST Russell King - ARM Linux wrote:
> > There's the TLS emulation too, but that writes via the vectors mapping
> > at 0xffff0ff0.
> 
> Ok, so that should be safe. Can we change the fiq code to also use the
> high mapping and then take the __ro_after_init patch on top?

We can't - if the kernel is configured without the kuser helpers in
the vectors page, it's mapped read-only.  I'm not sure what the
intersection is between platforms that can have FIQs and platforms
that can disable the kuser helpers.
Russell King (Oracle) Aug. 10, 2016, 11:06 p.m. UTC | #14
On Wed, Aug 10, 2016 at 09:41:23PM +0200, Arnd Bergmann wrote:
> It might be better to start by making the fixed mapping readonly,
> as KASLR doesn't protect that one at all, and change the TLS
> code accordingly.

I think that's impossible, because we gave userspace permission to
read 0xffff0ff0 directly without using __kuser_get_tls.  You're
talking about potentially breaking userspace.

If you disable kuser helpers, then the page becomes read-only and
invisible to userspace anyway.  So, everything is being done there
which can be done - if you have kuser helpers enabled, then you
lose some opportunities for these security improvements.
Arnd Bergmann Aug. 11, 2016, 3:54 p.m. UTC | #15
On Thursday, August 11, 2016 12:06:45 AM CEST Russell King - ARM Linux wrote:
> On Wed, Aug 10, 2016 at 09:41:23PM +0200, Arnd Bergmann wrote:
> > It might be better to start by making the fixed mapping readonly,
> > as KASLR doesn't protect that one at all, and change the TLS
> > code accordingly.
> 
> I think that's impossible, because we gave userspace permission to
> read 0xffff0ff0 directly without using __kuser_get_tls.  You're
> talking about potentially breaking userspace.
> 
> If you disable kuser helpers, then the page becomes read-only and
> invisible to userspace anyway.  So, everything is being done there
> which can be done - if you have kuser helpers enabled, then you
> lose some opportunities for these security improvements.

What I meant was writing to the page through the linear mapping
rather than the virtual mapping at 0xffff0000 so we can leave that
one read-only (I did not consider whether that might cause cache
aliasing problems when reading from the other address).

Your other point is more important though: if one really cares
about optimizing security here, they probably should disable
kuser helpers completely anyway.

Kees, is that something you have on your radar already?

	Arnd
Arnd Bergmann Aug. 11, 2016, 4:02 p.m. UTC | #16
On Thursday, August 11, 2016 12:02:42 AM CEST Russell King - ARM Linux wrote:
> On Wed, Aug 10, 2016 at 09:31:05PM +0200, Arnd Bergmann wrote:
> > On Wednesday, August 10, 2016 11:12:53 AM CEST Russell King - ARM Linux wrote:
> > > There's the TLS emulation too, but that writes via the vectors mapping
> > > at 0xffff0ff0.
> > 
> > Ok, so that should be safe. Can we change the fiq code to also use the
> > high mapping and then take the __ro_after_init patch on top?
> 
> We can't - if the kernel is configured without the kuser helpers in
> the vectors page, it's mapped read-only. I'm not sure what the
> intersection is between platforms that can have FIQs and platforms
> that can disable the kuser helpers.

From Kconfig logic and callers of set_fiq_handler(), theoretically
there is just i.MX3, but I think they never use fiq in their
audio drivers in practice already, and Mark Brown mentioned
that we could remove fiq support in the imx audio driver (don't
remember the details at the moment).

If we can prove that i.MX3 PCM FIQ support is never used, then the
intersection is empty, and all machines that use FIQ require kuser
helpers.

This may change with Daniel Thompson's patches that use the FIQ
for NMI backtrace.

	Arnd
Kees Cook Aug. 11, 2016, 10:16 p.m. UTC | #17
On Thu, Aug 11, 2016 at 8:54 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Thursday, August 11, 2016 12:06:45 AM CEST Russell King - ARM Linux wrote:
>> On Wed, Aug 10, 2016 at 09:41:23PM +0200, Arnd Bergmann wrote:
>> > It might be better to start by making the fixed mapping readonly,
>> > as KASLR doesn't protect that one at all, and change the TLS
>> > code accordingly.
>>
>> I think that's impossible, because we gave userspace permission to
>> read 0xffff0ff0 directly without using __kuser_get_tls.  You're
>> talking about potentially breaking userspace.
>>
>> If you disable kuser helpers, then the page becomes read-only and
>> invisible to userspace anyway.  So, everything is being done there
>> which can be done - if you have kuser helpers enabled, then you
>> lose some opportunities for these security improvements.
>
> What I meant was writing to the page through the linear mapping
> rather than the virtual mapping at 0xffff0000 so we can leave that
> one read-only (I did not consider whether that might cause cache
> aliasing problems when reading from the other address).
>
> Your other point is more important though: if one really cares
> about optimizing security here, they probably should disable
> kuser helpers completely anyway.
>
> Kees, is that something you have on your radar already?

It wasn't no. I will add it. :)

-Kees
Daniel Thompson Aug. 12, 2016, 11:34 a.m. UTC | #18
On 11/08/16 17:02, Arnd Bergmann wrote:
> On Thursday, August 11, 2016 12:02:42 AM CEST Russell King - ARM Linux wrote:
>> On Wed, Aug 10, 2016 at 09:31:05PM +0200, Arnd Bergmann wrote:
>>> On Wednesday, August 10, 2016 11:12:53 AM CEST Russell King - ARM Linux wrote:
>>>> There's the TLS emulation too, but that writes via the vectors mapping
>>>> at 0xffff0ff0.
>>>
>>> Ok, so that should be safe. Can we change the fiq code to also use the
>>> high mapping and then take the __ro_after_init patch on top?
>>
>> We can't - if the kernel is configured without the kuser helpers in
>> the vectors page, it's mapped read-only. I'm not sure what the
>> intersection is between platforms that can have FIQs and platforms
>> that can disable the kuser helpers.
>
> From Kconfig logic and callers of set_fiq_handler(), theoretically
> there is just i.MX3, but I think they never use fiq in their
> audio drivers in practice already, and Mark Brown mentioned
> that we could remove fiq support in the imx audio driver (don't
> remember the details at the moment).
>
> If we can prove that i.MX3 PCM FIQ support is never used, then the
> intersection is empty, and all machines that use FIQ require kuser
> helpers.
>
> This may change with Daniel Thompson's patches that use the FIQ
> for NMI backtrace.

It shouldn't do!

All the work I did (and am, very slowly, still doing) worked by using 
the default FIQ handler provided at boot time to jump into the perf code.

Nothing I have done or plan to do needs set_fiq_handler() to remain 
functional.

Likewise, nothing I have done should cause set_fiq_handler() to stop 
working for people who do still use it. FWIW I got the impression over 
the last few years that the most significant uses of FIQ on modern 
systems are out-of-tree uses who have designed custom FPGA hardware (and 
presumably designed them with very short FIFOs).


Daniel.
Russell King (Oracle) Aug. 12, 2016, 4:24 p.m. UTC | #19
On Thu, Aug 11, 2016 at 05:54:08PM +0200, Arnd Bergmann wrote:
> On Thursday, August 11, 2016 12:06:45 AM CEST Russell King - ARM Linux wrote:
> > On Wed, Aug 10, 2016 at 09:41:23PM +0200, Arnd Bergmann wrote:
> > > It might be better to start by making the fixed mapping readonly,
> > > as KASLR doesn't protect that one at all, and change the TLS
> > > code accordingly.
> > 
> > I think that's impossible, because we gave userspace permission to
> > read 0xffff0ff0 directly without using __kuser_get_tls.  You're
> > talking about potentially breaking userspace.
> > 
> > If you disable kuser helpers, then the page becomes read-only and
> > invisible to userspace anyway.  So, everything is being done there
> > which can be done - if you have kuser helpers enabled, then you
> > lose some opportunities for these security improvements.
> 
> What I meant was writing to the page through the linear mapping
> rather than the virtual mapping at 0xffff0000 so we can leave that
> one read-only (I did not consider whether that might cause cache
> aliasing problems when reading from the other address).

Kees original patch was about moving the vector pages into the
read-only area after init, so the linear mapping of them becomes
read-only as well.  So that won't work.  We need at least one
read-write mapping for FIQ, and for context switching for kuser
helpers.

> Your other point is more important though: if one really cares
> about optimizing security here, they probably should disable
> kuser helpers completely anyway.

We could probably predicate moving the vectors page into the RO
section when kuser helpers are enabled.

> Kees, is that something you have on your radar already?

I believe Android already disable kuser helpers as of a few years ago.
diff mbox

Patch

diff --git a/arch/arm/kernel/cpuidle.c b/arch/arm/kernel/cpuidle.c
index a44b268e12e1..d574708ea20c 100644
--- a/arch/arm/kernel/cpuidle.c
+++ b/arch/arm/kernel/cpuidle.c
@@ -19,7 +19,7 @@  extern struct of_cpuidle_method __cpuidle_method_of_table[];
 static const struct of_cpuidle_method __cpuidle_method_of_table_sentinel
 	__used __section(__cpuidle_method_of_table_end);
 
-static struct cpuidle_ops cpuidle_ops[NR_CPUS];
+static struct cpuidle_ops cpuidle_ops[NR_CPUS] __ro_after_init;
 
 /**
  * arm_cpuidle_simple_enter() - a wrapper to cpu_do_idle()
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 7b5350060612..38196e581d0a 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -115,19 +115,19 @@  EXPORT_SYMBOL(elf_hwcap2);
 
 
 #ifdef MULTI_CPU
-struct processor processor __read_mostly;
+struct processor processor __ro_after_init;
 #endif
 #ifdef MULTI_TLB
-struct cpu_tlb_fns cpu_tlb __read_mostly;
+struct cpu_tlb_fns cpu_tlb __ro_after_init;
 #endif
 #ifdef MULTI_USER
-struct cpu_user_fns cpu_user __read_mostly;
+struct cpu_user_fns cpu_user __ro_after_init;
 #endif
 #ifdef MULTI_CACHE
-struct cpu_cache_fns cpu_cache __read_mostly;
+struct cpu_cache_fns cpu_cache __ro_after_init;
 #endif
 #ifdef CONFIG_OUTER_CACHE
-struct outer_cache_fns outer_cache __read_mostly;
+struct outer_cache_fns outer_cache __ro_after_init;
 EXPORT_SYMBOL(outer_cache);
 #endif
 
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index df90bc59bfce..9902e76fce0d 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -82,7 +82,7 @@  enum ipi_msg_type {
 
 static DECLARE_COMPLETION(cpu_running);
 
-static struct smp_operations smp_ops;
+static struct smp_operations smp_ops __ro_after_init;
 
 void __init smp_set_ops(const struct smp_operations *ops)
 {
diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
index 8044591dca72..2cef11884857 100644
--- a/arch/arm/lib/delay.c
+++ b/arch/arm/lib/delay.c
@@ -29,7 +29,7 @@ 
 /*
  * Default to the loop-based delay implementation.
  */
-struct arm_delay_ops arm_delay_ops = {
+struct arm_delay_ops arm_delay_ops __ro_after_init = {
 	.delay		= __loop_delay,
 	.const_udelay	= __loop_const_udelay,
 	.udelay		= __loop_udelay,
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 62f4d01941f7..5e2acc78175e 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -243,7 +243,7 @@  __setup("noalign", noalign_setup);
 #define PROT_PTE_S2_DEVICE	PROT_PTE_DEVICE
 #define PROT_SECT_DEVICE	PMD_TYPE_SECT|PMD_SECT_AP_WRITE
 
-static struct mem_type mem_types[] = {
+static struct mem_type mem_types[] __ro_after_init = {
 	[MT_DEVICE] = {		  /* Strongly ordered / ARMv6 shared device */
 		.prot_pte	= PROT_PTE_DEVICE | L_PTE_MT_DEV_SHARED |
 				  L_PTE_SHARED,
@@ -1309,16 +1309,11 @@  void __init arm_mm_memblock_reserve(void)
  * Any other function or debugging method which may touch any device _will_
  * crash the kernel.
  */
+static char vectors[PAGE_SIZE * 2] __ro_after_init __aligned(PAGE_SIZE);
 static void __init devicemaps_init(const struct machine_desc *mdesc)
 {
 	struct map_desc map;
 	unsigned long addr;
-	void *vectors;
-
-	/*
-	 * Allocate the vector page early.
-	 */
-	vectors = early_alloc(PAGE_SIZE * 2);
 
 	early_trap_init(vectors);
 
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 4e5821896eec..f0894910bdd7 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -419,8 +419,7 @@  void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr)
 	iounmap((void __iomem *)((unsigned long)addr & PAGE_MASK));
 }
 
-static pte_t bm_pte[PAGE_SIZE/sizeof(pte_t)] __ro_after_init
-					     __aligned(PAGE_SIZE);
+static pte_t bm_pte[PAGE_SIZE/sizeof(pte_t)] __page_aligned_bss;
 
 static inline pmd_t * __init early_ioremap_pmd(unsigned long addr)
 {