diff mbox series

kfence: unpoison pool region before use

Message ID 20210403051325.683071-1-pcc@google.com (mailing list archive)
State New, archived
Headers show
Series kfence: unpoison pool region before use | expand

Commit Message

Peter Collingbourne April 3, 2021, 5:13 a.m. UTC
If the memory region allocated by KFENCE had previously been poisoned,
any validity checks done using kasan_byte_accessible() will fail. Fix
it by unpoisoning the memory before using it as the pool region.

Link: https://linux-review.googlesource.com/id/I0af99e9f1c25eaf7e1ec295836b5d148d76940c5
Signed-off-by: Peter Collingbourne <pcc@google.com>
---
 mm/kfence/core.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

Comments

Marco Elver April 3, 2021, 10:03 a.m. UTC | #1
On Sat, 3 Apr 2021 at 07:13, Peter Collingbourne <pcc@google.com> wrote:
> If the memory region allocated by KFENCE had previously been poisoned,
> any validity checks done using kasan_byte_accessible() will fail. Fix
> it by unpoisoning the memory before using it as the pool region.
>
> Link: https://linux-review.googlesource.com/id/I0af99e9f1c25eaf7e1ec295836b5d148d76940c5
> Signed-off-by: Peter Collingbourne <pcc@google.com>

Thanks, at a high level this seems reasonable, because we always want
to ensure that KFENCE memory remains unpoisoned with KASAN on. FWIW I
subjected a config with KFENCE+KASAN (generic, SW_TAGS, and HW_TAGS)
to syzkaller testing and ran kfence_test:

  Tested-by: Marco Elver <elver@google.com>


However, it is unclear to me under which circumstances we actually
need this, i.e. something would grab some memblock memory, somehow
poison it, and then release the memory back during early boot (note,
kfence_alloc_pool() is called before slab setup). If we can somehow
understand what actually did this, perhaps it'd help tell us if this
actually needs fixing in KFENCE or it's the other thing that needs a
fix.

Given all this is happening during really early boot, I'd expect no or
very few calls to kasan_poison() until kfence_alloc_pool() is called.
We can probably debug it more by having kasan_poison() do a "if
(!__kfence_pool) dump_stack();" somewhere. Can you try this on the
system where you can repro the problem? I tried this just now on the
latest mainline kernel, and saw 0 calls until kfence_alloc_pool().

Thanks,
-- Marco

> ---
>  mm/kfence/core.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> index d53c91f881a4..bb22b0cf77aa 100644
> --- a/mm/kfence/core.c
> +++ b/mm/kfence/core.c
> @@ -633,13 +633,19 @@ static DECLARE_DELAYED_WORK(kfence_timer, toggle_allocation_gate);
>
>  void __init kfence_alloc_pool(void)
>  {
> +       void *pool;
> +
>         if (!kfence_sample_interval)
>                 return;
>
> -       __kfence_pool = memblock_alloc(KFENCE_POOL_SIZE, PAGE_SIZE);
> -
> -       if (!__kfence_pool)
> +       pool = memblock_alloc(KFENCE_POOL_SIZE, PAGE_SIZE);
> +       if (!pool) {
>                 pr_err("failed to allocate pool\n");
> +               return;
> +       }
> +
> +       kasan_unpoison_range(pool, KFENCE_POOL_SIZE);
> +       __kfence_pool = pool;
>  }
>
>  void __init kfence_init(void)
> --
> 2.31.0.208.g409f899ff0-goog
>
Andrey Konovalov April 3, 2021, 2:05 p.m. UTC | #2
On Sat, Apr 3, 2021 at 7:13 AM Peter Collingbourne <pcc@google.com> wrote:
>
> If the memory region allocated by KFENCE had previously been poisoned,
> any validity checks done using kasan_byte_accessible() will fail. Fix
> it by unpoisoning the memory before using it as the pool region.

Which kasan_byte_accessible() call fails?

KASAN checks shouldn't be performed for KFENCE objects. We have a
number of is_kfence_address() checks in KASAN runtime, but maybe we're
missing some. Perhaps, we should even move those checks into the
high-level wrappers in include/linux/kasan.h.

> Link: https://linux-review.googlesource.com/id/I0af99e9f1c25eaf7e1ec295836b5d148d76940c5
> Signed-off-by: Peter Collingbourne <pcc@google.com>
> ---
>  mm/kfence/core.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> index d53c91f881a4..bb22b0cf77aa 100644
> --- a/mm/kfence/core.c
> +++ b/mm/kfence/core.c
> @@ -633,13 +633,19 @@ static DECLARE_DELAYED_WORK(kfence_timer, toggle_allocation_gate);
>
>  void __init kfence_alloc_pool(void)
>  {
> +       void *pool;
> +
>         if (!kfence_sample_interval)
>                 return;
>
> -       __kfence_pool = memblock_alloc(KFENCE_POOL_SIZE, PAGE_SIZE);
> -
> -       if (!__kfence_pool)
> +       pool = memblock_alloc(KFENCE_POOL_SIZE, PAGE_SIZE);
> +       if (!pool) {
>                 pr_err("failed to allocate pool\n");
> +               return;
> +       }
> +
> +       kasan_unpoison_range(pool, KFENCE_POOL_SIZE);
> +       __kfence_pool = pool;
>  }
>
>  void __init kfence_init(void)
> --
> 2.31.0.208.g409f899ff0-goog
>
Marco Elver April 3, 2021, 2:45 p.m. UTC | #3
On Sat, 3 Apr 2021 at 16:05, Andrey Konovalov <andreyknvl@gmail.com> wrote:
...
> Which kasan_byte_accessible() call fails?
>
> KASAN checks shouldn't be performed for KFENCE objects. We have a
> number of is_kfence_address() checks in KASAN runtime, but maybe we're
> missing some. Perhaps, we should even move those checks into the
> high-level wrappers in include/linux/kasan.h.

Moving them into include/linux/kasan.h seems unnecessary and an easy
way to introduce unnecessary overhead. AFAIK, there should be no
difference between having them in the high-level wrappers and the
inner runtime functions. I think until we understand what is actually
going on and could thoroughly justify, I'd be opposed to larger
changes. The small patch here is innocent enough, but it'd still be
good to understand. (FWIW, I believe the issue was encountered with
SW_TAGS on a downstream kernel.)
Peter Collingbourne April 3, 2021, 8:40 p.m. UTC | #4
On Sat, Apr 3, 2021 at 3:03 AM Marco Elver <elver@google.com> wrote:
>
> On Sat, 3 Apr 2021 at 07:13, Peter Collingbourne <pcc@google.com> wrote:
> > If the memory region allocated by KFENCE had previously been poisoned,
> > any validity checks done using kasan_byte_accessible() will fail. Fix
> > it by unpoisoning the memory before using it as the pool region.
> >
> > Link: https://linux-review.googlesource.com/id/I0af99e9f1c25eaf7e1ec295836b5d148d76940c5
> > Signed-off-by: Peter Collingbourne <pcc@google.com>
>
> Thanks, at a high level this seems reasonable, because we always want
> to ensure that KFENCE memory remains unpoisoned with KASAN on. FWIW I
> subjected a config with KFENCE+KASAN (generic, SW_TAGS, and HW_TAGS)
> to syzkaller testing and ran kfence_test:
>
>   Tested-by: Marco Elver <elver@google.com>
>
>
> However, it is unclear to me under which circumstances we actually
> need this, i.e. something would grab some memblock memory, somehow
> poison it, and then release the memory back during early boot (note,
> kfence_alloc_pool() is called before slab setup). If we can somehow
> understand what actually did this, perhaps it'd help tell us if this
> actually needs fixing in KFENCE or it's the other thing that needs a
> fix.
>
> Given all this is happening during really early boot, I'd expect no or
> very few calls to kasan_poison() until kfence_alloc_pool() is called.
> We can probably debug it more by having kasan_poison() do a "if
> (!__kfence_pool) dump_stack();" somewhere. Can you try this on the
> system where you can repro the problem? I tried this just now on the
> latest mainline kernel, and saw 0 calls until kfence_alloc_pool().

I looked into the issue some more, and it turned out that the memory
wasn't getting poisoned by kasan_poison() but rather by the calls to
kasan_map_populate() in kasan_init_shadow(). Starting with the patch
"kasan: initialize shadow to TAG_INVALID for SW_TAGS",
KASAN_SHADOW_INIT is set to 0xFE rather than 0xFF, which caused the
failure. The Android kernel branch for 5.10 (and the downstream kernel
I was working with) already have this patch, but it isn't in the
mainline kernel yet.

Now that I understand the cause of the issue, I can reproduce it using
the KFENCE unit tests on a db845c board, using both the Android 5.10
and mainline branches if I cherry-pick that change. Here's an example
crash from the unit tests (the failure was originally also observed
from ksize in the downstream kernel):

[   46.692195][  T175] BUG: KASAN: invalid-access in test_krealloc+0x1c4/0xf98
[   46.699282][  T175] Read of size 1 at addr ffffff80e9e7b000 by task
kunit_try_catch/175
[   46.707400][  T175] Pointer tag: [ff], memory tag: [fe]
[   46.712710][  T175]
[   46.714955][  T175] CPU: 4 PID: 175 Comm: kunit_try_catch Tainted:
G    B             5.12.0-rc5-mainline-09505-ga2ab5b26d445-dirty #1
[   46.727193][  T175] Hardware name: Thundercomm Dragonboard 845c (DT)
[   46.733636][  T175] Call trace:
[   46.736841][  T175]  dump_backtrace+0x0/0x2f8
[   46.741295][  T175]  show_stack+0x2c/0x3c
[   46.745388][  T175]  dump_stack+0x124/0x1bc
[   46.749668][  T175]  print_address_description+0x7c/0x308
[   46.755178][  T175]  __kasan_report+0x1a8/0x398
[   46.759816][  T175]  kasan_report+0x50/0x7c
[   46.764103][  T175]  __kasan_check_byte+0x3c/0x54
[   46.768916][  T175]  ksize+0x4c/0x94
[   46.772573][  T175]  test_krealloc+0x1c4/0xf98
[   46.777108][  T175]  kunit_try_run_case+0x94/0x1c4
[   46.781990][  T175]  kunit_generic_run_threadfn_adapter+0x30/0x44
[   46.788196][  T175]  kthread+0x20c/0x234
[   46.792213][  T175]  ret_from_fork+0x10/0x30

Since "kasan: initialize shadow to TAG_INVALID for SW_TAGS" hasn't
landed in mainline yet, it seems like we should insert this patch
before that one rather than adding a Fixes: tag.

Peter
Marco Elver April 3, 2021, 10:30 p.m. UTC | #5
On Sat, 3 Apr 2021 at 22:40, Peter Collingbourne <pcc@google.com> wrote:
> On Sat, Apr 3, 2021 at 3:03 AM Marco Elver <elver@google.com> wrote:
> > On Sat, 3 Apr 2021 at 07:13, Peter Collingbourne <pcc@google.com> wrote:
> > > If the memory region allocated by KFENCE had previously been poisoned,
> > > any validity checks done using kasan_byte_accessible() will fail. Fix
> > > it by unpoisoning the memory before using it as the pool region.
> > >
> > > Link: https://linux-review.googlesource.com/id/I0af99e9f1c25eaf7e1ec295836b5d148d76940c5
> > > Signed-off-by: Peter Collingbourne <pcc@google.com>
> >
> > Thanks, at a high level this seems reasonable, because we always want
> > to ensure that KFENCE memory remains unpoisoned with KASAN on. FWIW I
> > subjected a config with KFENCE+KASAN (generic, SW_TAGS, and HW_TAGS)
> > to syzkaller testing and ran kfence_test:
> >
> >   Tested-by: Marco Elver <elver@google.com>
> >
> >
> > However, it is unclear to me under which circumstances we actually
> > need this, i.e. something would grab some memblock memory, somehow
> > poison it, and then release the memory back during early boot (note,
> > kfence_alloc_pool() is called before slab setup). If we can somehow
> > understand what actually did this, perhaps it'd help tell us if this
> > actually needs fixing in KFENCE or it's the other thing that needs a
> > fix.
> >
> > Given all this is happening during really early boot, I'd expect no or
> > very few calls to kasan_poison() until kfence_alloc_pool() is called.
> > We can probably debug it more by having kasan_poison() do a "if
> > (!__kfence_pool) dump_stack();" somewhere. Can you try this on the
> > system where you can repro the problem? I tried this just now on the
> > latest mainline kernel, and saw 0 calls until kfence_alloc_pool().
>
> I looked into the issue some more, and it turned out that the memory
> wasn't getting poisoned by kasan_poison() but rather by the calls to
> kasan_map_populate() in kasan_init_shadow(). Starting with the patch
> "kasan: initialize shadow to TAG_INVALID for SW_TAGS",
> KASAN_SHADOW_INIT is set to 0xFE rather than 0xFF, which caused the
> failure. The Android kernel branch for 5.10 (and the downstream kernel
> I was working with) already have this patch, but it isn't in the
> mainline kernel yet.
>
> Now that I understand the cause of the issue, I can reproduce it using
> the KFENCE unit tests on a db845c board, using both the Android 5.10
> and mainline branches if I cherry-pick that change. Here's an example
> crash from the unit tests (the failure was originally also observed
> from ksize in the downstream kernel):
>
> [   46.692195][  T175] BUG: KASAN: invalid-access in test_krealloc+0x1c4/0xf98
> [   46.699282][  T175] Read of size 1 at addr ffffff80e9e7b000 by task
> kunit_try_catch/175
> [   46.707400][  T175] Pointer tag: [ff], memory tag: [fe]
> [   46.712710][  T175]
> [   46.714955][  T175] CPU: 4 PID: 175 Comm: kunit_try_catch Tainted:
> G    B             5.12.0-rc5-mainline-09505-ga2ab5b26d445-dirty #1
> [   46.727193][  T175] Hardware name: Thundercomm Dragonboard 845c (DT)
> [   46.733636][  T175] Call trace:
> [   46.736841][  T175]  dump_backtrace+0x0/0x2f8
> [   46.741295][  T175]  show_stack+0x2c/0x3c
> [   46.745388][  T175]  dump_stack+0x124/0x1bc
> [   46.749668][  T175]  print_address_description+0x7c/0x308
> [   46.755178][  T175]  __kasan_report+0x1a8/0x398
> [   46.759816][  T175]  kasan_report+0x50/0x7c
> [   46.764103][  T175]  __kasan_check_byte+0x3c/0x54
> [   46.768916][  T175]  ksize+0x4c/0x94
> [   46.772573][  T175]  test_krealloc+0x1c4/0xf98
> [   46.777108][  T175]  kunit_try_run_case+0x94/0x1c4
> [   46.781990][  T175]  kunit_generic_run_threadfn_adapter+0x30/0x44
> [   46.788196][  T175]  kthread+0x20c/0x234
> [   46.792213][  T175]  ret_from_fork+0x10/0x30
>
> Since "kasan: initialize shadow to TAG_INVALID for SW_TAGS" hasn't
> landed in mainline yet, it seems like we should insert this patch
> before that one rather than adding a Fixes: tag.

Thanks for getting to the bottom of it.

However, given the above, I think we need to explain this in the
commit message (which also makes the dependency between these 2
patches clear) and add a comment above the new kasan_unpoison_range().
That is, if we still think this is the right fix -- I'm not entirely
sure it is.

Because what I gather from "kasan: initialize shadow to TAG_INVALID
for SW_TAGS", is the requirement that "0xFF pointer tag is a match-all
tag, it doesn't matter what tag the accessed memory has".

While KFENCE memory is accessible through the slab API, and in this
case ksize() calling kasan_check_byte() leading to a failure, the
kasan_check_byte() call is part of the public KASAN API. Which means
that if some subsystem decides to memblock_alloc() some memory, and
wishes to use kasan_check_byte() on that memory but with an untagged
pointer, will get the same problem as KFENCE: with generic and HW_TAGS
mode everything is fine, but with SW_TAGS mode things break.

To me this indicates the fix is not with KFENCE, but should be in
mm/kasan/sw_tags.c:kasan_byte_accessible(), which should not load the
shadow when the pointer is untagged.

Thanks,
-- Marco
Andrey Konovalov April 3, 2021, 11:52 p.m. UTC | #6
On Sun, Apr 4, 2021 at 12:31 AM Marco Elver <elver@google.com> wrote:
>
> However, given the above, I think we need to explain this in the
> commit message (which also makes the dependency between these 2
> patches clear) and add a comment above the new kasan_unpoison_range().
> That is, if we still think this is the right fix -- I'm not entirely
> sure it is.
>
> Because what I gather from "kasan: initialize shadow to TAG_INVALID
> for SW_TAGS", is the requirement that "0xFF pointer tag is a match-all
> tag, it doesn't matter what tag the accessed memory has".
>
> While KFENCE memory is accessible through the slab API, and in this
> case ksize() calling kasan_check_byte() leading to a failure, the
> kasan_check_byte() call is part of the public KASAN API. Which means
> that if some subsystem decides to memblock_alloc() some memory, and
> wishes to use kasan_check_byte() on that memory but with an untagged
> pointer, will get the same problem as KFENCE: with generic and HW_TAGS
> mode everything is fine, but with SW_TAGS mode things break.

It makes sense to allow this function to operate on any kind of
memory, including memory that hasn't been previously marked by KASAN.

> To me this indicates the fix is not with KFENCE, but should be in
> mm/kasan/sw_tags.c:kasan_byte_accessible(), which should not load the
> shadow when the pointer is untagged.

The problem isn't in accessing shadow per se. Looking at
kasan_byte_accessible() (in both sw_tags.c and kasan.h), the return
statement there seems just wrong and redundant. The KASAN_TAG_KERNEL
check should come first:

return tag == KASAN_TAG_KERNEL || (shadow_byte != KASAN_TAG_INVALID &&
tag == shadow_byte);

This way, if the pointer tag is KASAN_TAG_KERNEL, the memory is
accessible no matter what the memory tag is.

But then the KASAN_TAG_INVALID check isn't needed, as this value is
never assigned to a pointer tag. Which brings us to:

return tag == KASAN_TAG_KERNEL || tag == shadow_byte;

Which is essentially the same check that kasan_check_range() performs.

Although, kasan_check_range() also checks that the shadow is <
KASAN_SHADOW_START. It makes makes sense to add this check into
kasan_byte_accessible() as well, before accessing shadow.

Thanks!
Andrey Konovalov April 4, 2021, 12:09 p.m. UTC | #7
On Sun, Apr 4, 2021 at 1:52 AM Andrey Konovalov <andreyknvl@gmail.com> wrote:
>
> On Sun, Apr 4, 2021 at 12:31 AM Marco Elver <elver@google.com> wrote:
> >
> > However, given the above, I think we need to explain this in the
> > commit message (which also makes the dependency between these 2
> > patches clear) and add a comment above the new kasan_unpoison_range().
> > That is, if we still think this is the right fix -- I'm not entirely
> > sure it is.
> >
> > Because what I gather from "kasan: initialize shadow to TAG_INVALID
> > for SW_TAGS", is the requirement that "0xFF pointer tag is a match-all
> > tag, it doesn't matter what tag the accessed memory has".
> >
> > While KFENCE memory is accessible through the slab API, and in this
> > case ksize() calling kasan_check_byte() leading to a failure, the
> > kasan_check_byte() call is part of the public KASAN API. Which means
> > that if some subsystem decides to memblock_alloc() some memory, and
> > wishes to use kasan_check_byte() on that memory but with an untagged
> > pointer, will get the same problem as KFENCE: with generic and HW_TAGS
> > mode everything is fine, but with SW_TAGS mode things break.
>
> It makes sense to allow this function to operate on any kind of
> memory, including memory that hasn't been previously marked by KASAN.
>
> > To me this indicates the fix is not with KFENCE, but should be in
> > mm/kasan/sw_tags.c:kasan_byte_accessible(), which should not load the
> > shadow when the pointer is untagged.
>
> The problem isn't in accessing shadow per se. Looking at
> kasan_byte_accessible() (in both sw_tags.c and kasan.h), the return
> statement there seems just wrong and redundant.

(Technically, it's not wrong. But it was written under the assumption
that no accesses to KASAN_TAG_INVALID memory are valid. It's not the
case with KFENCE (without built-in KASAN annotations). And there might
be other places where this isn't the case as well, though we haven't
encountered any yet.)
Peter Collingbourne April 5, 2021, 7:04 p.m. UTC | #8
On Sat, Apr 3, 2021 at 4:52 PM Andrey Konovalov <andreyknvl@gmail.com> wrote:
>
> On Sun, Apr 4, 2021 at 12:31 AM Marco Elver <elver@google.com> wrote:
> >
> > However, given the above, I think we need to explain this in the
> > commit message (which also makes the dependency between these 2
> > patches clear) and add a comment above the new kasan_unpoison_range().
> > That is, if we still think this is the right fix -- I'm not entirely
> > sure it is.
> >
> > Because what I gather from "kasan: initialize shadow to TAG_INVALID
> > for SW_TAGS", is the requirement that "0xFF pointer tag is a match-all
> > tag, it doesn't matter what tag the accessed memory has".
> >
> > While KFENCE memory is accessible through the slab API, and in this
> > case ksize() calling kasan_check_byte() leading to a failure, the
> > kasan_check_byte() call is part of the public KASAN API. Which means
> > that if some subsystem decides to memblock_alloc() some memory, and
> > wishes to use kasan_check_byte() on that memory but with an untagged
> > pointer, will get the same problem as KFENCE: with generic and HW_TAGS
> > mode everything is fine, but with SW_TAGS mode things break.
>
> It makes sense to allow this function to operate on any kind of
> memory, including memory that hasn't been previously marked by KASAN.
>
> > To me this indicates the fix is not with KFENCE, but should be in
> > mm/kasan/sw_tags.c:kasan_byte_accessible(), which should not load the
> > shadow when the pointer is untagged.
>
> The problem isn't in accessing shadow per se. Looking at
> kasan_byte_accessible() (in both sw_tags.c and kasan.h), the return
> statement there seems just wrong and redundant. The KASAN_TAG_KERNEL
> check should come first:
>
> return tag == KASAN_TAG_KERNEL || (shadow_byte != KASAN_TAG_INVALID &&
> tag == shadow_byte);
>
> This way, if the pointer tag is KASAN_TAG_KERNEL, the memory is
> accessible no matter what the memory tag is.
>
> But then the KASAN_TAG_INVALID check isn't needed, as this value is
> never assigned to a pointer tag. Which brings us to:
>
> return tag == KASAN_TAG_KERNEL || tag == shadow_byte;
>
> Which is essentially the same check that kasan_check_range() performs.
>
> Although, kasan_check_range() also checks that the shadow is <
> KASAN_SHADOW_START. It makes makes sense to add this check into
> kasan_byte_accessible() as well, before accessing shadow.
>
> Thanks!

Okay, if the intent is that kasan_byte_accessible() should work on any
memory, not just slab memory, then I agree that it should do the same
thing as kasan_check_range().

Peter
diff mbox series

Patch

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index d53c91f881a4..bb22b0cf77aa 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -633,13 +633,19 @@  static DECLARE_DELAYED_WORK(kfence_timer, toggle_allocation_gate);
 
 void __init kfence_alloc_pool(void)
 {
+	void *pool;
+
 	if (!kfence_sample_interval)
 		return;
 
-	__kfence_pool = memblock_alloc(KFENCE_POOL_SIZE, PAGE_SIZE);
-
-	if (!__kfence_pool)
+	pool = memblock_alloc(KFENCE_POOL_SIZE, PAGE_SIZE);
+	if (!pool) {
 		pr_err("failed to allocate pool\n");
+		return;
+	}
+
+	kasan_unpoison_range(pool, KFENCE_POOL_SIZE);
+	__kfence_pool = pool;
 }
 
 void __init kfence_init(void)