[1/2] scs: switch to vmapped shadow stacks

Message ID	20201022202355.3529836-2-samitolvanen@google.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=FtUi=D5=lists.infradead.org=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6F87224654 Date: Thu, 22 Oct 2020 13:23:54 -0700 In-Reply-To: <20201022202355.3529836-1-samitolvanen@google.com> Message-Id: <20201022202355.3529836-2-samitolvanen@google.com> Mime-Version: 1.0 References: <20201022202355.3529836-1-samitolvanen@google.com> Subject: [PATCH 1/2] scs: switch to vmapped shadow stacks From: Sami Tolvanen <samitolvanen@google.com> To: Will Deacon <will@kernel.org>, Catalin Marinas <catalin.marinas@arm.com> Precedence: list Cc: Mark Rutland <mark.rutland@arm.com>, Kees Cook <keescook@chromium.org>, Ard Biesheuvel <ard.biesheuvel@linaro.org>, linux-kernel@vger.kernel.org, James Morse <james.morse@arm.com>, Sami Tolvanen <samitolvanen@google.com>, linux-arm-kernel@lists.infradead.org Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org> Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org
Series	scs: switch to vmapped shadow stacks \| expand [0/2] scs: switch to vmapped shadow stacks [1/2] scs: switch to vmapped shadow stacks [2/2] arm64: scs: use vmapped IRQ and SDEI shadow stacks

Message ID

20201022202355.3529836-2-samitolvanen@google.com (mailing list archive)

State

New, archived

Headers

DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6F87224654
Date: Thu, 22 Oct 2020 13:23:54 -0700
In-Reply-To: <20201022202355.3529836-1-samitolvanen@google.com>
Message-Id: <20201022202355.3529836-2-samitolvanen@google.com>
Mime-Version: 1.0
References: <20201022202355.3529836-1-samitolvanen@google.com>
Subject: [PATCH 1/2] scs: switch to vmapped shadow stacks
From: Sami Tolvanen <samitolvanen@google.com>
To: Will Deacon <will@kernel.org>, Catalin Marinas <catalin.marinas@arm.com>
Precedence: list
Cc: Mark Rutland <mark.rutland@arm.com>, Kees Cook <keescook@chromium.org>,
 Ard Biesheuvel <ard.biesheuvel@linaro.org>, linux-kernel@vger.kernel.org,
 James Morse <james.morse@arm.com>, Sami Tolvanen <samitolvanen@google.com>,
 linux-arm-kernel@lists.infradead.org
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org>
Errors-To: 
 linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org

Series

scs: switch to vmapped shadow stacks | expand

Commit Message

Sami Tolvanen Oct. 22, 2020, 8:23 p.m. UTC

The kernel currently uses kmem_cache to allocate shadow call stacks,
which means an overflow may not be immediately detected and can
potentially result in another task's shadow stack to be overwritten.

This change switches SCS to use virtually mapped shadow stacks,
which increases shadow stack size to a full page and provides more
robust overflow detection similarly to VMAP_STACK.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 include/linux/scs.h |  7 +----
 kernel/scs.c        | 63 ++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 55 insertions(+), 15 deletions(-)

Comments

Kees Cook Oct. 22, 2020, 10:38 p.m. UTC | #1

On Thu, Oct 22, 2020 at 01:23:54PM -0700, Sami Tolvanen wrote:
> The kernel currently uses kmem_cache to allocate shadow call stacks,
> which means an overflow may not be immediately detected and can
> potentially result in another task's shadow stack to be overwritten.
> 
> This change switches SCS to use virtually mapped shadow stacks,
> which increases shadow stack size to a full page and provides more
> robust overflow detection similarly to VMAP_STACK.
> 
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>

Thanks! I much prefer this to kmem. :)

Reviewed-by: Kees Cook <keescook@chromium.org>

Will Deacon Nov. 19, 2020, 1 p.m. UTC | #2

Hi Sami,

On Thu, Oct 22, 2020 at 01:23:54PM -0700, Sami Tolvanen wrote:
> The kernel currently uses kmem_cache to allocate shadow call stacks,
> which means an overflow may not be immediately detected and can
> potentially result in another task's shadow stack to be overwritten.
> 
> This change switches SCS to use virtually mapped shadow stacks,
> which increases shadow stack size to a full page and provides more
> robust overflow detection similarly to VMAP_STACK.
> 
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> ---
>  include/linux/scs.h |  7 +----
>  kernel/scs.c        | 63 ++++++++++++++++++++++++++++++++++++++-------
>  2 files changed, 55 insertions(+), 15 deletions(-)

Cheers for posting this. I _much_ prefer handling the SCS this way, but I
have some comments on the implementation below.

> diff --git a/include/linux/scs.h b/include/linux/scs.h
> index 6dec390cf154..86e3c4b7b714 100644
> --- a/include/linux/scs.h
> +++ b/include/linux/scs.h
> @@ -15,12 +15,7 @@
>  
>  #ifdef CONFIG_SHADOW_CALL_STACK
>  
> -/*
> - * In testing, 1 KiB shadow stack size (i.e. 128 stack frames on a 64-bit
> - * architecture) provided ~40% safety margin on stack usage while keeping
> - * memory allocation overhead reasonable.
> - */
> -#define SCS_SIZE		SZ_1K
> +#define SCS_SIZE		PAGE_SIZE

We could make this SCS_ORDER and then forget about alignment etc.

>  #define GFP_SCS			(GFP_KERNEL | __GFP_ZERO)
>  
>  /* An illegal pointer value to mark the end of the shadow stack. */
> diff --git a/kernel/scs.c b/kernel/scs.c
> index 4ff4a7ba0094..2136edba548d 100644
> --- a/kernel/scs.c
> +++ b/kernel/scs.c
> @@ -5,50 +5,95 @@
>   * Copyright (C) 2019 Google LLC
>   */
>  
> +#include <linux/cpuhotplug.h>
>  #include <linux/kasan.h>
>  #include <linux/mm.h>
>  #include <linux/scs.h>
> -#include <linux/slab.h>
> +#include <linux/vmalloc.h>
>  #include <linux/vmstat.h>
>  
> -static struct kmem_cache *scs_cache;
> -
>  static void __scs_account(void *s, int account)
>  {
> -	struct page *scs_page = virt_to_page(s);
> +	struct page *scs_page = vmalloc_to_page(s);
>  
>  	mod_node_page_state(page_pgdat(scs_page), NR_KERNEL_SCS_KB,
>  			    account * (SCS_SIZE / SZ_1K));
>  }
>  
> +/* Matches NR_CACHED_STACKS for VMAP_STACK */
> +#define NR_CACHED_SCS 2
> +static DEFINE_PER_CPU(void *, scs_cache[NR_CACHED_SCS]);
> +
>  static void *scs_alloc(int node)
>  {
> -	void *s = kmem_cache_alloc_node(scs_cache, GFP_SCS, node);
> +	int i;
> +	void *s;
> +
> +	for (i = 0; i < NR_CACHED_SCS; i++) {
> +		s = this_cpu_xchg(scs_cache[i], NULL);
> +		if (s) {
> +			memset(s, 0, SCS_SIZE);
> +			goto out;
> +		}
> +	}
> +
> +	/*
> +	 * We allocate a full page for the shadow stack, which should be
> +	 * more than we need. Check the assumption nevertheless.
> +	 */
> +	BUILD_BUG_ON(SCS_SIZE > PAGE_SIZE);i

With SCS_ORDER, you can drop this.

> +
> +	s = __vmalloc_node_range(PAGE_SIZE, SCS_SIZE,
> +				 VMALLOC_START, VMALLOC_END,
> +				 GFP_SCS, PAGE_KERNEL, 0,
> +				 node, __builtin_return_address(0));

Do we actually need vmalloc here? If we used alloc_pages() + vmap()
instead, then we could avoid the expensive call to vmalloc_to_page()
in __scs_account().

>  
>  	if (!s)
>  		return NULL;
>  
> +out:
>  	*__scs_magic(s) = SCS_END_MAGIC;
>  
>  	/*
>  	 * Poison the allocation to catch unintentional accesses to
>  	 * the shadow stack when KASAN is enabled.
>  	 */
> -	kasan_poison_object_data(scs_cache, s);
> +	kasan_poison_vmalloc(s, SCS_SIZE);
>  	__scs_account(s, 1);
>  	return s;
>  }
>  
>  static void scs_free(void *s)
>  {
> +	int i;
> +
>  	__scs_account(s, -1);
> -	kasan_unpoison_object_data(scs_cache, s);
> -	kmem_cache_free(scs_cache, s);
> +	kasan_unpoison_vmalloc(s, SCS_SIZE);

I don't see the point in unpoisoning here tbh; vfree_atomic() re-poisons
almost immediately, so we should probably defer this to scs_alloc() and
only when picking the stack out of the cache.

> +
> +	for (i = 0; i < NR_CACHED_SCS; i++)

Can you add a comment about the re-entrancy here and why we're using
this_cpu_cmpxchg() please?

Tnanks,

Will

Sami Tolvanen Nov. 20, 2020, 5 p.m. UTC | #3

On Thu, Nov 19, 2020 at 5:00 AM Will Deacon <will@kernel.org> wrote:
>
> Hi Sami,
>
> On Thu, Oct 22, 2020 at 01:23:54PM -0700, Sami Tolvanen wrote:
> > The kernel currently uses kmem_cache to allocate shadow call stacks,
> > which means an overflow may not be immediately detected and can
> > potentially result in another task's shadow stack to be overwritten.
> >
> > This change switches SCS to use virtually mapped shadow stacks,
> > which increases shadow stack size to a full page and provides more
> > robust overflow detection similarly to VMAP_STACK.
> >
> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> > ---
> >  include/linux/scs.h |  7 +----
> >  kernel/scs.c        | 63 ++++++++++++++++++++++++++++++++++++++-------
> >  2 files changed, 55 insertions(+), 15 deletions(-)
>
> Cheers for posting this. I _much_ prefer handling the SCS this way, but I
> have some comments on the implementation below.
>
> > diff --git a/include/linux/scs.h b/include/linux/scs.h
> > index 6dec390cf154..86e3c4b7b714 100644
> > --- a/include/linux/scs.h
> > +++ b/include/linux/scs.h
> > @@ -15,12 +15,7 @@
> >
> >  #ifdef CONFIG_SHADOW_CALL_STACK
> >
> > -/*
> > - * In testing, 1 KiB shadow stack size (i.e. 128 stack frames on a 64-bit
> > - * architecture) provided ~40% safety margin on stack usage while keeping
> > - * memory allocation overhead reasonable.
> > - */
> > -#define SCS_SIZE             SZ_1K
> > +#define SCS_SIZE             PAGE_SIZE
>
> We could make this SCS_ORDER and then forget about alignment etc.

It's still convenient to have SCS_SIZE defined, I think. I can
certainly define SCS_ORDER and use that to define SCS_SIZE, but do you
think we'll need an order >0 here at some point in future?

> >  #define GFP_SCS                      (GFP_KERNEL | __GFP_ZERO)
> >
> >  /* An illegal pointer value to mark the end of the shadow stack. */
> > diff --git a/kernel/scs.c b/kernel/scs.c
> > index 4ff4a7ba0094..2136edba548d 100644
> > --- a/kernel/scs.c
> > +++ b/kernel/scs.c
> > @@ -5,50 +5,95 @@
> >   * Copyright (C) 2019 Google LLC
> >   */
> >
> > +#include <linux/cpuhotplug.h>
> >  #include <linux/kasan.h>
> >  #include <linux/mm.h>
> >  #include <linux/scs.h>
> > -#include <linux/slab.h>
> > +#include <linux/vmalloc.h>
> >  #include <linux/vmstat.h>
> >
> > -static struct kmem_cache *scs_cache;
> > -
> >  static void __scs_account(void *s, int account)
> >  {
> > -     struct page *scs_page = virt_to_page(s);
> > +     struct page *scs_page = vmalloc_to_page(s);
> >
> >       mod_node_page_state(page_pgdat(scs_page), NR_KERNEL_SCS_KB,
> >                           account * (SCS_SIZE / SZ_1K));
> >  }
> >
> > +/* Matches NR_CACHED_STACKS for VMAP_STACK */
> > +#define NR_CACHED_SCS 2
> > +static DEFINE_PER_CPU(void *, scs_cache[NR_CACHED_SCS]);
> > +
> >  static void *scs_alloc(int node)
> >  {
> > -     void *s = kmem_cache_alloc_node(scs_cache, GFP_SCS, node);
> > +     int i;
> > +     void *s;
> > +
> > +     for (i = 0; i < NR_CACHED_SCS; i++) {
> > +             s = this_cpu_xchg(scs_cache[i], NULL);
> > +             if (s) {
> > +                     memset(s, 0, SCS_SIZE);
> > +                     goto out;
> > +             }
> > +     }
> > +
> > +     /*
> > +      * We allocate a full page for the shadow stack, which should be
> > +      * more than we need. Check the assumption nevertheless.
> > +      */
> > +     BUILD_BUG_ON(SCS_SIZE > PAGE_SIZE);i
>
> With SCS_ORDER, you can drop this.
>
> > +
> > +     s = __vmalloc_node_range(PAGE_SIZE, SCS_SIZE,
> > +                              VMALLOC_START, VMALLOC_END,
> > +                              GFP_SCS, PAGE_KERNEL, 0,
> > +                              node, __builtin_return_address(0));
>
> Do we actually need vmalloc here? If we used alloc_pages() + vmap()

Does it matter that vmap() always uses NUMA_NO_NODE? We'll also lose
the ability to use vfree_atomic() in scs_release() unless we use
VM_MAP_PUT_PAGES and allocate the page array passed to vmap() with
kvmalloc(), which I think we need to do to avoid sleeping in
scs_free().

> instead, then we could avoid the expensive call to vmalloc_to_page()
> in __scs_account().

We still need vmalloc_to_page() in scs_release(). I suppose we could
alternatively follow the example in kernel/fork.c and cache the
vm_struct from find_vm_area() and use vm->pages[0] for the accounting.
Thoughts?

>
> >
> >       if (!s)
> >               return NULL;
> >
> > +out:
> >       *__scs_magic(s) = SCS_END_MAGIC;
> >
> >       /*
> >        * Poison the allocation to catch unintentional accesses to
> >        * the shadow stack when KASAN is enabled.
> >        */
> > -     kasan_poison_object_data(scs_cache, s);
> > +     kasan_poison_vmalloc(s, SCS_SIZE);
> >       __scs_account(s, 1);
> >       return s;
> >  }
> >
> >  static void scs_free(void *s)
> >  {
> > +     int i;
> > +
> >       __scs_account(s, -1);
> > -     kasan_unpoison_object_data(scs_cache, s);
> > -     kmem_cache_free(scs_cache, s);
> > +     kasan_unpoison_vmalloc(s, SCS_SIZE);
>
> I don't see the point in unpoisoning here tbh; vfree_atomic() re-poisons
> almost immediately, so we should probably defer this to scs_alloc() and
> only when picking the stack out of the cache.

Sure, I'll change this in v2.

>
> > +
> > +     for (i = 0; i < NR_CACHED_SCS; i++)
>
> Can you add a comment about the re-entrancy here and why we're using
> this_cpu_cmpxchg() please?

I'll add a comment.

Sami

Will Deacon Nov. 23, 2020, 11:08 a.m. UTC | #4

Hi Sami,

On Fri, Nov 20, 2020 at 09:00:17AM -0800, Sami Tolvanen wrote:
> On Thu, Nov 19, 2020 at 5:00 AM Will Deacon <will@kernel.org> wrote:
> > On Thu, Oct 22, 2020 at 01:23:54PM -0700, Sami Tolvanen wrote:
> > > The kernel currently uses kmem_cache to allocate shadow call stacks,
> > > which means an overflow may not be immediately detected and can
> > > potentially result in another task's shadow stack to be overwritten.
> > >
> > > This change switches SCS to use virtually mapped shadow stacks,
> > > which increases shadow stack size to a full page and provides more
> > > robust overflow detection similarly to VMAP_STACK.
> > >
> > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> > > ---
> > >  include/linux/scs.h |  7 +----
> > >  kernel/scs.c        | 63 ++++++++++++++++++++++++++++++++++++++-------
> > >  2 files changed, 55 insertions(+), 15 deletions(-)
> >
> > Cheers for posting this. I _much_ prefer handling the SCS this way, but I
> > have some comments on the implementation below.
> >
> > > diff --git a/include/linux/scs.h b/include/linux/scs.h
> > > index 6dec390cf154..86e3c4b7b714 100644
> > > --- a/include/linux/scs.h
> > > +++ b/include/linux/scs.h
> > > @@ -15,12 +15,7 @@
> > >
> > >  #ifdef CONFIG_SHADOW_CALL_STACK
> > >
> > > -/*
> > > - * In testing, 1 KiB shadow stack size (i.e. 128 stack frames on a 64-bit
> > > - * architecture) provided ~40% safety margin on stack usage while keeping
> > > - * memory allocation overhead reasonable.
> > > - */
> > > -#define SCS_SIZE             SZ_1K
> > > +#define SCS_SIZE             PAGE_SIZE
> >
> > We could make this SCS_ORDER and then forget about alignment etc.
> 
> It's still convenient to have SCS_SIZE defined, I think. I can
> certainly define SCS_ORDER and use that to define SCS_SIZE, but do you
> think we'll need an order >0 here at some point in future?

I'm not daft enough to comment on SCS size again ;)
Let's define SCS_ORDER 0 and then SCS_SIZE in terms of that.

> 
> > >  #define GFP_SCS                      (GFP_KERNEL | __GFP_ZERO)
> > >
> > >  /* An illegal pointer value to mark the end of the shadow stack. */
> > > diff --git a/kernel/scs.c b/kernel/scs.c
> > > index 4ff4a7ba0094..2136edba548d 100644
> > > --- a/kernel/scs.c
> > > +++ b/kernel/scs.c
> > > @@ -5,50 +5,95 @@
> > >   * Copyright (C) 2019 Google LLC
> > >   */
> > >
> > > +#include <linux/cpuhotplug.h>
> > >  #include <linux/kasan.h>
> > >  #include <linux/mm.h>
> > >  #include <linux/scs.h>
> > > -#include <linux/slab.h>
> > > +#include <linux/vmalloc.h>
> > >  #include <linux/vmstat.h>
> > >
> > > -static struct kmem_cache *scs_cache;
> > > -
> > >  static void __scs_account(void *s, int account)
> > >  {
> > > -     struct page *scs_page = virt_to_page(s);
> > > +     struct page *scs_page = vmalloc_to_page(s);
> > >
> > >       mod_node_page_state(page_pgdat(scs_page), NR_KERNEL_SCS_KB,
> > >                           account * (SCS_SIZE / SZ_1K));
> > >  }
> > >
> > > +/* Matches NR_CACHED_STACKS for VMAP_STACK */
> > > +#define NR_CACHED_SCS 2
> > > +static DEFINE_PER_CPU(void *, scs_cache[NR_CACHED_SCS]);
> > > +
> > >  static void *scs_alloc(int node)
> > >  {
> > > -     void *s = kmem_cache_alloc_node(scs_cache, GFP_SCS, node);
> > > +     int i;
> > > +     void *s;
> > > +
> > > +     for (i = 0; i < NR_CACHED_SCS; i++) {
> > > +             s = this_cpu_xchg(scs_cache[i], NULL);
> > > +             if (s) {
> > > +                     memset(s, 0, SCS_SIZE);
> > > +                     goto out;
> > > +             }
> > > +     }
> > > +
> > > +     /*
> > > +      * We allocate a full page for the shadow stack, which should be
> > > +      * more than we need. Check the assumption nevertheless.
> > > +      */
> > > +     BUILD_BUG_ON(SCS_SIZE > PAGE_SIZE);i
> >
> > With SCS_ORDER, you can drop this.
> >
> > > +
> > > +     s = __vmalloc_node_range(PAGE_SIZE, SCS_SIZE,
> > > +                              VMALLOC_START, VMALLOC_END,
> > > +                              GFP_SCS, PAGE_KERNEL, 0,
> > > +                              node, __builtin_return_address(0));
> >
> > Do we actually need vmalloc here? If we used alloc_pages() + vmap()
> 
> Does it matter that vmap() always uses NUMA_NO_NODE? We'll also lose
> the ability to use vfree_atomic() in scs_release() unless we use
> VM_MAP_PUT_PAGES and allocate the page array passed to vmap() with
> kvmalloc(), which I think we need to do to avoid sleeping in
> scs_free().

Huh, I didn't realise we didn't have vunmap_atomic(). In which case, I take
that back -- let's stick with vmalloc() for now.

Cheers,

Will

diff --git a/include/linux/scs.h b/include/linux/scs.h
index 6dec390cf154..86e3c4b7b714 100644
--- a/include/linux/scs.h
+++ b/include/linux/scs.h
@@ -15,12 +15,7 @@ 
 
 #ifdef CONFIG_SHADOW_CALL_STACK
 
-/*
- * In testing, 1 KiB shadow stack size (i.e. 128 stack frames on a 64-bit
- * architecture) provided ~40% safety margin on stack usage while keeping
- * memory allocation overhead reasonable.
- */
-#define SCS_SIZE		SZ_1K
+#define SCS_SIZE		PAGE_SIZE
 #define GFP_SCS			(GFP_KERNEL | __GFP_ZERO)
 
 /* An illegal pointer value to mark the end of the shadow stack. */
diff --git a/kernel/scs.c b/kernel/scs.c
index 4ff4a7ba0094..2136edba548d 100644
--- a/kernel/scs.c
+++ b/kernel/scs.c
@@ -5,50 +5,95 @@ 
  * Copyright (C) 2019 Google LLC
  */
 
+#include <linux/cpuhotplug.h>
 #include <linux/kasan.h>
 #include <linux/mm.h>
 #include <linux/scs.h>
-#include <linux/slab.h>
+#include <linux/vmalloc.h>
 #include <linux/vmstat.h>
 
-static struct kmem_cache *scs_cache;
-
 static void __scs_account(void *s, int account)
 {
-	struct page *scs_page = virt_to_page(s);
+	struct page *scs_page = vmalloc_to_page(s);
 
 	mod_node_page_state(page_pgdat(scs_page), NR_KERNEL_SCS_KB,
 			    account * (SCS_SIZE / SZ_1K));
 }
 
+/* Matches NR_CACHED_STACKS for VMAP_STACK */
+#define NR_CACHED_SCS 2
+static DEFINE_PER_CPU(void *, scs_cache[NR_CACHED_SCS]);
+
 static void *scs_alloc(int node)
 {
-	void *s = kmem_cache_alloc_node(scs_cache, GFP_SCS, node);
+	int i;
+	void *s;
+
+	for (i = 0; i < NR_CACHED_SCS; i++) {
+		s = this_cpu_xchg(scs_cache[i], NULL);
+		if (s) {
+			memset(s, 0, SCS_SIZE);
+			goto out;
+		}
+	}
+
+	/*
+	 * We allocate a full page for the shadow stack, which should be
+	 * more than we need. Check the assumption nevertheless.
+	 */
+	BUILD_BUG_ON(SCS_SIZE > PAGE_SIZE);
+
+	s = __vmalloc_node_range(PAGE_SIZE, SCS_SIZE,
+				 VMALLOC_START, VMALLOC_END,
+				 GFP_SCS, PAGE_KERNEL, 0,
+				 node, __builtin_return_address(0));
 
 	if (!s)
 		return NULL;
 
+out:
 	*__scs_magic(s) = SCS_END_MAGIC;
 
 	/*
 	 * Poison the allocation to catch unintentional accesses to
 	 * the shadow stack when KASAN is enabled.
 	 */
-	kasan_poison_object_data(scs_cache, s);
+	kasan_poison_vmalloc(s, SCS_SIZE);
 	__scs_account(s, 1);
 	return s;
 }
 
 static void scs_free(void *s)
 {
+	int i;
+
 	__scs_account(s, -1);
-	kasan_unpoison_object_data(scs_cache, s);
-	kmem_cache_free(scs_cache, s);
+	kasan_unpoison_vmalloc(s, SCS_SIZE);
+
+	for (i = 0; i < NR_CACHED_SCS; i++)
+		if (this_cpu_cmpxchg(scs_cache[i], 0, s) == NULL)
+			return;
+
+	vfree_atomic(s);
+}
+
+static int scs_cleanup(unsigned int cpu)
+{
+	int i;
+	void **cache = per_cpu_ptr(scs_cache, cpu);
+
+	for (i = 0; i < NR_CACHED_SCS; i++) {
+		vfree(cache[i]);
+		cache[i] = NULL;
+	}
+
+	return 0;
 }
 
 void __init scs_init(void)
 {
-	scs_cache = kmem_cache_create("scs_cache", SCS_SIZE, 0, 0, NULL);
+	cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "scs:scs_cache", NULL,
+			  scs_cleanup);
 }
 
 int scs_prepare(struct task_struct *tsk, int node)

[1/2] scs: switch to vmapped shadow stacks

Commit Message

Comments

Patch