
arm64: Make arch_randomize_brk avoid stack area

Message ID 1462370994.2895.11.camel@linaro.org (mailing list archive)
State New, archived

Commit Message

Jon Medhurst (Tixy) May 4, 2016, 2:09 p.m. UTC
When a process is created, various address randomisations can combine
to place the address of brk in the stack memory. This would mean that
processes making small heap-based memory allocations are in danger of
having them overwrite, or be overwritten by, the stack.

Another consequence is that, even for processes that make no use of
brk, the output of /proc/*/maps may show the stack area listed as
'[heap]' rather than '[stack]'. Apart from being misleading, this causes
fatal errors with the Android run-time like:
"No [stack] line found in /proc/self/task/*/maps"

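As a rough illustration of why a mislabelled stack region is fatal for such a consumer, here is a minimal userspace sketch that scans a maps-style stream for a tagged line. The helper name find_mapping() is hypothetical, and matching the tag by substring is a simplification; it is not how the Android run-time is implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan a /proc/<pid>/maps-style stream for the first line containing `tag`
 * (for example "[stack]" or "[heap]"). On success, store the address range
 * in *start and *end and return 1. Return 0 if no such line is found.
 */
static int find_mapping(FILE *maps, const char *tag,
                        unsigned long *start, unsigned long *end)
{
    char line[512];

    assert(maps && tag && start && end);

    while (fgets(line, sizeof(line), maps)) {
        unsigned long s, e;

        if (!strstr(line, tag))
            continue;
        /* Each maps line begins with "<start>-<end>" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &s, &e) != 2)
            continue;
        *start = s;
        *end = e;
        return 1;
    }
    return 0;
}
```

Called on fopen("/proc/self/maps", "r") with the tag "[stack]", a return value of 0 corresponds to the "No [stack] line found" condition described above.
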
To prevent this problem, pick a limit for brk that leaves room for the
stack's memory. At the same time, remove randomize_base(), as it was
only used by arch_randomize_brk().
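
The limit calculation can be sketched in userspace as follows. This mirrors the arithmetic in the patch but is only a model: brk_range_end() and SKETCH_MAX_STACK are hypothetical names, and rnd_span stands in for (STACK_RND_MASK << PAGE_SHIFT), whose value depends on the kernel configuration.

```c
#include <assert.h>

/* Stand-in for the SZ_128M cap on the room reserved for the stack. */
#define SKETCH_MAX_STACK (128UL << 20)

/*
 * Return the upper bound of the range from which a randomised brk may be
 * picked: the usual randomisation window above `base`, clamped so that it
 * stops short of the room left for the stack below `start_stack`.
 */
static unsigned long brk_range_end(unsigned long base, unsigned long rnd_span,
                                   unsigned long start_stack,
                                   unsigned long stack_rlimit)
{
    unsigned long range_end = base + rnd_span + 1;
    unsigned long max_stack = stack_rlimit;
    unsigned long range_limit;

    /* Very large or unlimited stack limits are capped. */
    if (max_stack > SKETCH_MAX_STACK)
        max_stack = SKETCH_MAX_STACK;

    range_limit = start_stack - max_stack - 1;
    if (range_end > range_limit)
        range_end = range_limit;

    return range_end;
}
```

If the clamped end is not above base, there is no room left to randomise; in the patch that case is covered by the "? : base" fallback on the result of randomize_range().
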

Note that, in practice, since commit d1fd836dcf00 ("mm: split ET_DYN ASLR
from mmap ASLR") this problem shouldn't occur, because the address chosen
for loading binaries is well clear of the stack. Prior to that commit,
however, the problem does occur, for the following reasons.

The memory layout of a task is determined by arch_pick_mmap_layout. If
address randomisation is enabled (task has flag PF_RANDOMIZE) this sets
mmap_base to a random address at the top of task memory just below a
region calculated to allow for a stack which itself may have a random
base address. Any subsequent mmap operations that need an address to be
allocated will use the topdown allocation method, i.e. the first
allocated memory will be at the top of memory, just below the area set
aside for the stack.

When a relocatable binary is loaded into a new process by
load_elf_binary and randomised addresses are enabled, it uses a
'load_bias' of zero, so that when mmap is called to create a memory
region for it, a new address is picked (address zero not being
available). As this is the first memory region in the task, it gets the
region just below the stack, as described previously.

The loader then sets brk to the end of the ELF data section, which will
be near the end of the loaded binary, and then calls
arch_randomize_brk. As it currently stands, this adds a random amount
to brk, which unfortunately may take it into the address range where the
stack lies.
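
The collision described above amounts to an interval overlap, which can be modelled with a small sketch. The struct and function names are hypothetical, the ranges are half-open [start, end), and rnd_span again stands in for (STACK_RND_MASK << PAGE_SHIFT).

```c
#include <assert.h>

/* Half-open address range [start, end). */
struct addr_range {
    unsigned long start;
    unsigned long end;
};

/*
 * Window from which the unpatched code may pick brk: it starts at the brk
 * set by the loader (end of the ELF data section) and extends upwards by
 * the randomisation span.
 */
static struct addr_range unpatched_brk_window(unsigned long elf_brk,
                                              unsigned long rnd_span)
{
    struct addr_range w = { elf_brk, elf_brk + rnd_span + 1 };
    return w;
}

/* Two half-open ranges overlap when each starts before the other ends. */
static int ranges_overlap(struct addr_range a, struct addr_range b)
{
    return a.start < b.end && b.start < a.end;
}
```

When the binary has been mapped just below the stack, the window returned by unpatched_brk_window() overlaps the stack range, so a randomised brk can land inside it.
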

Testing:

These changes have been tested on Linux v4.6-rc4 using 100000
invocations of a program [1] that can display the offset of a process's
brk...

$ for i in $(seq 100000); do ./aslr --report brk ; done

This shows values of brk are evenly distributed over a 1GB range, both
before and after this change.

On Linux version 3.18, where brk and the stack can collide and this
change therefore limits brk, the distribution of brk values after the
change is sloped: lower values for brk are more common, and the upper
values occur with about half the frequency of the lower ones.
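
One way to inspect such a distribution is to bucket the reported offsets into a histogram. The sketch below assumes the offsets have already been parsed into an array; it makes no assumption about the output format of the aslr tool, and bucket_offsets() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count how many offsets fall into each of `nbuckets` equal-width buckets
 * covering [0, span). Offsets at or beyond `span` are ignored.
 */
static void bucket_offsets(const unsigned long *offsets, size_t n,
                           unsigned long span,
                           unsigned long *counts, size_t nbuckets)
{
    size_t i;

    assert(offsets && counts && span > 0 && nbuckets > 0);

    for (i = 0; i < nbuckets; i++)
        counts[i] = 0;

    for (i = 0; i < n; i++) {
        unsigned long long scaled;

        if (offsets[i] >= span)
            continue;
        /* Scale in 64 bits so offset * nbuckets cannot overflow for spans
         * in the gigabyte range. */
        scaled = (unsigned long long)offsets[i] * nbuckets;
        counts[(size_t)(scaled / span)]++;
    }
}
```

An even distribution gives roughly equal counts per bucket, whereas the sloped case described above would show counts falling off towards the last buckets.
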

[1] http://bazaar.launchpad.net/~ubuntu-bugcontrol/qa-regression-testing/master/files/2499/scripts/kernel-security/aslr/

Signed-off-by: Jon Medhurst <tixy@linaro.org>
Cc: <stable@vger.kernel.org> # 4.0 and earlier

---

Changes since RFC.
- Fixed compilation errors (finger trouble preparing original email)
- Updated commit message to include notes on testing
- Added CC stable for Linux '4.0 and earlier'

 arch/arm64/kernel/process.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

Comments

Kees Cook May 4, 2016, 5:20 p.m. UTC | #1
On Wed, May 4, 2016 at 7:09 AM, Jon Medhurst (Tixy) <tixy@linaro.org> wrote:
> When a process is created, various address randomisations could end up
> colluding to place the address of brk in the stack memory. This would
> mean processes making small heap based memory allocations are in danger
> of having them overwriting, or overwritten by, the stack.
>
> Another consequence, is that even for processes that make no use of
> brk, the output of /proc/*/maps may show the stack area listed as
> '[heap]' rather than '[stack]'. Apart from being misleading this causes
> fatal errors with the Android run-time like:
> "No [stack] line found in /proc/self/task/*/maps"
>
> To prevent this problem pick a limit for brk that allows for the stack's
> memory. At the same time we remove randomize_base() as that was only
> used by arch_randomize_brk().
>
> Note, in practice, since commit d1fd836dcf00 ("mm: split ET_DYN ASLR
> from mmap ASLR") this problem shouldn't occur because the address chosen
> for loading binaries is well clear of the stack, however, prior to that
> the problem does occur because of the following...
>
> The memory layout of a task is determined by arch_pick_mmap_layout. If
> address randomisation is enabled (task has flag PF_RANDOMIZE) this sets
> mmap_base to a random address at the top of task memory just below a
> region calculated to allow for a stack which itself may have a random
> base address. Any mmap operations that then happen which require an
> address allocating will use the topdown allocation method, i.e. the
> first allocated memory will be at the top of memory, just below the
> area set aside for the stack.
>
> When a relocatable binary is loaded into a new process by
> load_elf_binary and randomised address are enabled, it uses a
> 'load_bias' of zero, so that when mmap is called to create a memory
> region for it, a new address is picked (address zero not being
> available). As this is the first memory region in the task, it gets the
> region just below the stack, as described previously.
>
> The loader then set's brk to the end of the elf data section, which will
> be near the end of the loaded binary and then it calls
> arch_randomize_brk. As this currently stands, this adds a random amount
> to brk, which unfortunately may take it into the address range where the
> stack lies.
>
> Testing:
>
> These changes have been tested on Linux v4.6-rc4 using 100000
> invocations of a program [1] that can display the offset of a process's
> brk...
>
> $for i in $(seq 100000); do ./aslr --report brk ; done
>
> This shows values of brk are evenly distributed over a 1GB range, both
> before and after this change.
>
> With Linux version 3.18 (where the collision of brk and stack can happen
> and this change limits brk to avoid that) the distribution of brk values
> after the change shows a slope where lower values for brk are more
> common and upper values have about half the frequency of those.
>
> [1] http://bazaar.launchpad.net/~ubuntu-bugcontrol/qa-regression-testing/master/files/2499/scripts/kernel-security/aslr/
>
> Signed-off-by: Jon Medhurst <tixy@linaro.org>
> Cc: <stable@vger.kernel.org> # 4.0 and earlier

Acked-by: Kees Cook <keescook@chromium.org>

-Kees

>
> ---
>
> Changes since RFC.
> - Fixed compilation errors (finger trouble preparing original email)
> - Updated commit message to included notes on testing
> - Added CC stable for Linux '4.0 and earlier'
>
>  arch/arm64/kernel/process.c | 24 ++++++++++++++++++------
>  1 file changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 8062482..26429a0 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -382,13 +382,25 @@ unsigned long arch_align_stack(unsigned long sp)
>         return sp & ~0xf;
>  }
>
> -static unsigned long randomize_base(unsigned long base)
> +unsigned long arch_randomize_brk(struct mm_struct *mm)
>  {
> +       unsigned long base = mm->brk;
>         unsigned long range_end = base + (STACK_RND_MASK << PAGE_SHIFT) + 1;
> -       return randomize_range(base, range_end, 0) ? : base;
> -}
> +       unsigned long max_stack, range_limit;
>
> -unsigned long arch_randomize_brk(struct mm_struct *mm)
> -{
> -       return randomize_base(mm->brk);
> +       /*
> +        * Determine how much room we need to leave available for the stack.
> +        * We limit this to a reasonable value, because extremely large or
> +        * unlimited stacks are always going to bump up against brk at some
> +        * point and we don't want to fail to randomise brk in those cases.
> +        */
> +       max_stack = rlimit(RLIMIT_STACK);
> +       if (max_stack > SZ_128M)
> +               max_stack = SZ_128M;
> +
> +       range_limit = mm->start_stack - max_stack - 1;
> +       if (range_end > range_limit)
> +               range_end = range_limit;
> +
> +       return randomize_range(base, range_end, 0) ? : base;
>  }
> --
> 2.1.4
>
>

Catalin Marinas May 6, 2016, 11:19 a.m. UTC | #2
Hi Tixy,

On Wed, May 04, 2016 at 03:09:54PM +0100, Jon Medhurst (Tixy) wrote:
> Note, in practice, since commit d1fd836dcf00 ("mm: split ET_DYN ASLR
> from mmap ASLR") this problem shouldn't occur because the address chosen
> for loading binaries is well clear of the stack, however, prior to that
> the problem does occur because of the following...

[...]

> These changes have been tested on Linux v4.6-rc4 using 100000
> invocations of a program [1] that can display the offset of a process's
> brk...

[...]

> Signed-off-by: Jon Medhurst <tixy@linaro.org>
> Cc: <stable@vger.kernel.org> # 4.0 and earlier

I don't fully understand what we are supposed to do with this patch.
Should it only be applied to stable kernels prior to 4.0? Do we need it
in mainline? As you stated above, this problem does not exist in recent
kernels.

Jon Medhurst (Tixy) May 6, 2016, 11:51 a.m. UTC | #3
On Fri, 2016-05-06 at 12:19 +0100, Catalin Marinas wrote:
> Hi Tixy,
> 
> On Wed, May 04, 2016 at 03:09:54PM +0100, Jon Medhurst (Tixy) wrote:
> > Note, in practice, since commit d1fd836dcf00 ("mm: split ET_DYN ASLR
> > from mmap ASLR") this problem shouldn't occur because the address chosen
> > for loading binaries is well clear of the stack, however, prior to that
> > the problem does occur because of the following...
> 
> [...]
> 
> > These changes have been tested on Linux v4.6-rc4 using 100000
> > invocations of a program [1] that can display the offset of a process's
> > brk...
> 
> [...]
> 
> > Signed-off-by: Jon Medhurst <tixy@linaro.org>
> > Cc: <stable@vger.kernel.org> # 4.0 and earlier
> 
> I don't fully understand what we are supposed to do with this patch.
> Should it only be applied to stable kernels prior to 4.0? Do we need it
> in mainline? As you stated above, this problem does not exist in recent
> kernels.

Well, if you think it's worthwhile defensive programming against future
changes to the ELF loader, then it could go into the latest kernels.
Otherwise, yes, it's for Linux 4.0 and earlier. What's the process for
that, email it to stable@vger.kernel.org direct? Is that OK without an
Ack from an arm64 maintainer?

Catalin Marinas May 10, 2016, 3:55 p.m. UTC | #4
On Fri, May 06, 2016 at 12:51:00PM +0100, Jon Medhurst (Tixy) wrote:
> On Fri, 2016-05-06 at 12:19 +0100, Catalin Marinas wrote:
> > On Wed, May 04, 2016 at 03:09:54PM +0100, Jon Medhurst (Tixy) wrote:
> > > Note, in practice, since commit d1fd836dcf00 ("mm: split ET_DYN ASLR
> > > from mmap ASLR") this problem shouldn't occur because the address chosen
> > > for loading binaries is well clear of the stack, however, prior to that
> > > the problem does occur because of the following...
> > 
> > [...]
> > 
> > > These changes have been tested on Linux v4.6-rc4 using 100000
> > > invocations of a program [1] that can display the offset of a process's
> > > brk...
> > 
> > [...]
> > 
> > > Signed-off-by: Jon Medhurst <tixy@linaro.org>
> > > Cc: <stable@vger.kernel.org> # 4.0 and earlier
> > 
> > I don't fully understand what we are supposed to do with this patch.
> > Should it only be applied to stable kernels prior to 4.0? Do we need it
> > in mainline? As you stated above, this problem does not exist in recent
> > kernels.
> 
> Well, if you think it's worthwhile defensive programming against future
> changes to elf loader, then it could go into latest kernels.

I don't think we should bother with latest upstream. AFAICT, with
splitting ET_DYN ASLR from the mmap one, we shouldn't hit this issue.
And I wouldn't expect the two regions to be unified again in the future.

> Otherwise, then yes, it's for Linux 4.0 and earlier. What's the
> process for that, email it to stable@vger.kernel.org direct?

Usually emailing stable@vger.kernel.org with an explanation of why it is
not needed in mainline since it is not a back-port.

> Is that OK without an Ack from an arm64 maintainer?

I guess it's up to the stable maintainers but in any case:

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

Patch

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 8062482..26429a0 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -382,13 +382,25 @@  unsigned long arch_align_stack(unsigned long sp)
 	return sp & ~0xf;
 }
 
-static unsigned long randomize_base(unsigned long base)
+unsigned long arch_randomize_brk(struct mm_struct *mm)
 {
+	unsigned long base = mm->brk;
 	unsigned long range_end = base + (STACK_RND_MASK << PAGE_SHIFT) + 1;
-	return randomize_range(base, range_end, 0) ? : base;
-}
+	unsigned long max_stack, range_limit;
 
-unsigned long arch_randomize_brk(struct mm_struct *mm)
-{
-	return randomize_base(mm->brk);
+	/*
+	 * Determine how much room we need to leave available for the stack.
+	 * We limit this to a reasonable value, because extremely large or
+	 * unlimited stacks are always going to bump up against brk at some
+	 * point and we don't want to fail to randomise brk in those cases.
+	 */
+	max_stack = rlimit(RLIMIT_STACK);
+	if (max_stack > SZ_128M)
+		max_stack = SZ_128M;
+
+	range_limit = mm->start_stack - max_stack - 1;
+	if (range_end > range_limit)
+		range_end = range_limit;
+
+	return randomize_range(base, range_end, 0) ? : base;
 }