Message ID | 20220628113714.7792-2-yee.lee@mediatek.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2,1/1] mm: kfence: apply kmemleak_ignore_phys on early allocated pool | expand |
On Tue, Jun 28, 2022 at 07:37:11PM +0800, yee.lee@mediatek.com wrote: > From: Yee Lee <yee.lee@mediatek.com> > > This patch solves two issues. > > (1) The pool allocated by memblock needs to unregister from > kmemleak scanning. Apply kmemleak_ignore_phys to replace the > original kmemleak_free as its address now is stored in the phys tree. > > (2) The pool late allocated by page-alloc doesn't need to unregister. > Move out the freeing operation from its call path. > > Suggested-by: Catalin Marinas <catalin.marinas@arm.com> > Suggested-by: Marco Elver <elver@google.com> > Signed-off-by: Yee Lee <yee.lee@mediatek.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Hi Yee, On Tue, Jun 28, 2022 at 1:42 PM <yee.lee@mediatek.com> wrote: > From: Yee Lee <yee.lee@mediatek.com> > > This patch solves two issues. > > (1) The pool allocated by memblock needs to unregister from > kmemleak scanning. Apply kmemleak_ignore_phys to replace the > original kmemleak_free as its address now is stored in the phys tree. > > (2) The pool late allocated by page-alloc doesn't need to unregister. > Move out the freeing operation from its call path. > > Suggested-by: Catalin Marinas <catalin.marinas@arm.com> > Suggested-by: Marco Elver <elver@google.com> > Signed-off-by: Yee Lee <yee.lee@mediatek.com> Thank you, this fixes the storm of BUG: KFENCE: invalid read in scan_block+0x78/0x130 BUG: KFENCE: use-after-free read in scan_block+0x78/0x130 BUG: KFENCE: out-of-bounds read in scan_block+0x78/0x130 messages I was seeing on arm64. Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Fri, 15 Jul 2022 10:17:43 +0200 Geert Uytterhoeven <geert@linux-m68k.org> wrote: > On Tue, Jun 28, 2022 at 1:42 PM <yee.lee@mediatek.com> wrote: > > From: Yee Lee <yee.lee@mediatek.com> > > > > This patch solves two issues. > > > > (1) The pool allocated by memblock needs to unregister from > > kmemleak scanning. Apply kmemleak_ignore_phys to replace the > > original kmemleak_free as its address now is stored in the phys tree. > > > > (2) The pool late allocated by page-alloc doesn't need to unregister. > > Move out the freeing operation from its call path. > > > > Suggested-by: Catalin Marinas <catalin.marinas@arm.com> > > Suggested-by: Marco Elver <elver@google.com> > > Signed-off-by: Yee Lee <yee.lee@mediatek.com> > > Thank you, this fixes the storm of > > BUG: KFENCE: invalid read in scan_block+0x78/0x130 > BUG: KFENCE: use-after-free read in scan_block+0x78/0x130 > BUG: KFENCE: out-of-bounds read in scan_block+0x78/0x130 > > messages I was seeing on arm64. Thanks, but... - It would be great if we could identify a Fixes: for this. - This patch has been accused of crashing the kernel: https://lkml.kernel.org/r/YsFeUHkrFTQ7T51Q@xsang-OptiPlex-9020 Do we think that report is bogus?
Hi Andrew, On Sat, Jul 16, 2022 at 1:33 AM Andrew Morton <akpm@linux-foundation.org> wrote: > On Fri, 15 Jul 2022 10:17:43 +0200 Geert Uytterhoeven <geert@linux-m68k.org> wrote: > > On Tue, Jun 28, 2022 at 1:42 PM <yee.lee@mediatek.com> wrote: > > > From: Yee Lee <yee.lee@mediatek.com> > > > > > > This patch solves two issues. > > > > > > (1) The pool allocated by memblock needs to unregister from > > > kmemleak scanning. Apply kmemleak_ignore_phys to replace the > > > original kmemleak_free as its address now is stored in the phys tree. > > > > > > (2) The pool late allocated by page-alloc doesn't need to unregister. > > > Move out the freeing operation from its call path. > > > > > > Suggested-by: Catalin Marinas <catalin.marinas@arm.com> > > > Suggested-by: Marco Elver <elver@google.com> > > > Signed-off-by: Yee Lee <yee.lee@mediatek.com> > > > > Thank you, this fixes the storm of > > > > BUG: KFENCE: invalid read in scan_block+0x78/0x130 > > BUG: KFENCE: use-after-free read in scan_block+0x78/0x130 > > BUG: KFENCE: out-of-bounds read in scan_block+0x78/0x130 > > > > messages I was seeing on arm64. > > Thanks, but... > > - It would be great if we could identify a Fixes: for this. IIRC, I started seeing the issue with "[PATCH v4 3/4] mm: kmemleak: add rbtree and store physical address for objects allocated with PA" (i.e. commit 0c24e061196c21d5 ("mm: kmemleak: add rbtree and store physical address for objects allocated with PA")) of series "[PATCH v4 0/4] mm: kmemleak: store objects allocated with physical address separately and check when scan" (https://lore.kernel.org/all/20220611035551.1823303-1-patrick.wang.shcn@gmail.com), in an arm64 config that had enabled kfence. So I think this patch is sort of a dependency for that series. I had cherry-picked that series after bisecting a regression to commit 23c2d497de21f258 ("mm: kmemleak: take a full lowmem check in kmemleak_*_phys()") in v5.18-rc3, and having a look around. > - This patch has been accused of crashing the kernel: > > https://lkml.kernel.org/r/YsFeUHkrFTQ7T51Q@xsang-OptiPlex-9020 > > Do we think that report is bogus? I think all of this is highly architecture-specific... Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Sat, 16 Jul 2022 at 20:43, Geert Uytterhoeven <geert@linux-m68k.org> wrote: [...] > > - This patch has been accused of crashing the kernel: > > > > https://lkml.kernel.org/r/YsFeUHkrFTQ7T51Q@xsang-OptiPlex-9020 > > > > Do we think that report is bogus? > > I think all of this is highly architecture-specific... The report can be reproduced on i386 with CONFIG_X86_PAE=y. But e.g. mm/memblock.c:memblock_free() is also guilty of using __pa() on previously memblock_alloc()'d addresses. Looking at the phys addr before memblock_alloc() does virt_to_phys(), the result of __pa() looks correct even on PAE, at least for the purpose of passing it on to kmemleak(). So I don't know what that BUG_ON(slow_virt_to_phys() != phys_addr) is supposed to tell us here. Ideas?
On Sat, Jul 16, 2022 at 08:43:06PM +0200, Geert Uytterhoeven wrote: > On Sat, Jul 16, 2022 at 1:33 AM Andrew Morton <akpm@linux-foundation.org> wrote: > > On Fri, 15 Jul 2022 10:17:43 +0200 Geert Uytterhoeven <geert@linux-m68k.org> wrote: > > > On Tue, Jun 28, 2022 at 1:42 PM <yee.lee@mediatek.com> wrote: > > > > From: Yee Lee <yee.lee@mediatek.com> > > > > > > > > This patch solves two issues. > > > > > > > > (1) The pool allocated by memblock needs to unregister from > > > > kmemleak scanning. Apply kmemleak_ignore_phys to replace the > > > > original kmemleak_free as its address now is stored in the phys tree. > > > > > > > > (2) The pool late allocated by page-alloc doesn't need to unregister. > > > > Move out the freeing operation from its call path. > > > > > > > > Suggested-by: Catalin Marinas <catalin.marinas@arm.com> > > > > Suggested-by: Marco Elver <elver@google.com> > > > > Signed-off-by: Yee Lee <yee.lee@mediatek.com> > > > > > > Thank you, this fixes the storm of > > > > > > BUG: KFENCE: invalid read in scan_block+0x78/0x130 > > > BUG: KFENCE: use-after-free read in scan_block+0x78/0x130 > > > BUG: KFENCE: out-of-bounds read in scan_block+0x78/0x130 > > > > > > messages I was seeing on arm64. > > > > Thanks, but... > > > > - It would be great if we could identify a Fixes: for this. > > IIRC, I started seeing the issue with "[PATCH v4 3/4] mm: > kmemleak: add rbtree and store physical address for objects > allocated with PA" (i.e. commit 0c24e061196c21d5 ("mm: kmemleak: > add rbtree and store physical address for objects allocated > with PA")) of series "[PATCH v4 0/4] mm: kmemleak: store objects > allocated with physical address separately and check when scan" > (https://lore.kernel.org/all/20220611035551.1823303-1-patrick.wang.shcn@gmail.com), > in an arm64 config that had enabled kfence. Yes, I think it fixes 0c24e061196c21d5 since after that commit, the kmemleak_free() no longer worked as expected on physically allocated objects.
On Mon, 18 Jul 2022 16:26:25 +0200 Marco Elver <elver@google.com> wrote: > On Sat, 16 Jul 2022 at 20:43, Geert Uytterhoeven <geert@linux-m68k.org> wrote: > [...] > > > - This patch has been accused of crashing the kernel: > > > > > > https://lkml.kernel.org/r/YsFeUHkrFTQ7T51Q@xsang-OptiPlex-9020 > > > > > > Do we think that report is bogus? > > > > I think all of this is highly architecture-specific... > > The report can be reproduced on i386 with CONFIG_X86_PAE=y. But e.g. > mm/memblock.c:memblock_free() is also guilty of using __pa() on > previously memblock_alloc()'d addresses. Looking at the phys addr > before memblock_alloc() does virt_to_phys(), the result of __pa() > looks correct even on PAE, at least for the purpose of passing it on > to kmemleak(). So I don't know what that BUG_ON(slow_virt_to_phys() != > phys_addr) is supposed to tell us here. > It's only been nine years, so I'm sure Dave can remember why he added it ;) BUG_ON(slow_virt_to_phys((void *)x) != phys_addr); in arch/x86/mm/physaddr.c:__phys_addr(). This kfence patch does seem to be desirable, but we can't proceed if it's resulting in kernel crashes.
On 7/19/22 16:13, Andrew Morton wrote: > On Mon, 18 Jul 2022 16:26:25 +0200 Marco Elver <elver@google.com> wrote: > >> On Sat, 16 Jul 2022 at 20:43, Geert Uytterhoeven <geert@linux-m68k.org> wrote: >> [...] >>>> - This patch has been accused of crashing the kernel: >>>> >>>> https://lkml.kernel.org/r/YsFeUHkrFTQ7T51Q@xsang-OptiPlex-9020 >>>> >>>> Do we think that report is bogus? >>> I think all of this is highly architecture-specific... >> The report can be reproduced on i386 with CONFIG_X86_PAE=y. But e.g. >> mm/memblock.c:memblock_free() is also guilty of using __pa() on >> previously memblock_alloc()'d addresses. Looking at the phys addr >> before memblock_alloc() does virt_to_phys(), the result of __pa() >> looks correct even on PAE, at least for the purpose of passing it on >> to kmemleak(). So I don't know what that BUG_ON(slow_virt_to_phys() != >> phys_addr) is supposed to tell us here. >> > It's only been nine years, so I'm sure Dave can remember why he added > it ;) > > BUG_ON(slow_virt_to_phys((void *)x) != phys_addr); > > in arch/x86/mm/physaddr.c:__phys_addr(). I think I intended it to double check that the linear map is *actually* a linear map for 'x'. Sure, we can use the "x - PAGE_OFFSET" shortcut, but did it turn out to be actually accurate for the address it was handed? I'd be curious what the page tables actually say for the address that's causing problems.
[+x86 maintainers ...] On Wed, 20 Jul 2022 at 01:22, Dave Hansen <dave.hansen@intel.com> wrote: > On 7/19/22 16:13, Andrew Morton wrote: > > On Mon, 18 Jul 2022 16:26:25 +0200 Marco Elver <elver@google.com> wrote: > > > >> On Sat, 16 Jul 2022 at 20:43, Geert Uytterhoeven <geert@linux-m68k.org> wrote: > >> [...] > >>>> - This patch has been accused of crashing the kernel: > >>>> > >>>> https://lkml.kernel.org/r/YsFeUHkrFTQ7T51Q@xsang-OptiPlex-9020 > >>>> > >>>> Do we think that report is bogus? > >>> I think all of this is highly architecture-specific... > >> The report can be reproduced on i386 with CONFIG_X86_PAE=y. But e.g. > >> mm/memblock.c:memblock_free() is also guilty of using __pa() on > >> previously memblock_alloc()'d addresses. Looking at the phys addr > >> before memblock_alloc() does virt_to_phys(), the result of __pa() > >> looks correct even on PAE, at least for the purpose of passing it on > >> to kmemleak(). So I don't know what that BUG_ON(slow_virt_to_phys() != > >> phys_addr) is supposed to tell us here. > >> > > It's only been nine years, so I'm sure Dave can remember why he added > > it ;) > > > > BUG_ON(slow_virt_to_phys((void *)x) != phys_addr); > > > > in arch/x86/mm/physaddr.c:__phys_addr(). > > I think I intended it to double check that the linear map is *actually* > a linear map for 'x'. Sure, we can use the "x - PAGE_OFFSET" shortcut, > but did it turn out to be actually accurate for the address it was handed? > > I'd be curious what the page tables actually say for the address that's > causing problems. test robot just reminded us again: https://lore.kernel.org/all/YufXncrWhJZH0ifB@xsang-OptiPlex-9020/T/#u Few things I noticed: * mm/memblock.c's memblock_free() also uses __pa() to convert back to physical address. Presumably that's also wrong. What should be used instead? * kmemleak happily converts phys_addr_t to unsigned long everywhere, but with i386 PAE, this will narrow a 64-bit address to a 32-bit address. Is that correct? Does kmemleak need a "depends on 64BIT || !PHYS_ADDR_T_64BIT"?
diff --git a/mm/kfence/core.c b/mm/kfence/core.c index 4e7cd4c8e687..32a4a75e820c 100644 --- a/mm/kfence/core.c +++ b/mm/kfence/core.c @@ -600,14 +600,6 @@ static unsigned long kfence_init_pool(void) addr += 2 * PAGE_SIZE; } - /* - * The pool is live and will never be deallocated from this point on. - * Remove the pool object from the kmemleak object tree, as it would - * otherwise overlap with allocations returned by kfence_alloc(), which - * are registered with kmemleak through the slab post-alloc hook. - */ - kmemleak_free(__kfence_pool); - return 0; } @@ -620,8 +612,16 @@ static bool __init kfence_init_pool_early(void) addr = kfence_init_pool(); - if (!addr) + if (!addr) { + /* + * The pool is live and will never be deallocated from this point on. + * Ignore the pool object from the kmemleak phys object tree, as it would + * otherwise overlap with allocations returned by kfence_alloc(), which + * are registered with kmemleak through the slab post-alloc hook. + */ + kmemleak_ignore_phys(__pa(__kfence_pool)); return true; + } /* * Only release unprotected pages, and do not try to go back and change