Message ID | 20220505101337.1997819-1-42.hyeyoo@gmail.com (mailing list archive) |
---|---|
State | New |
Series | [v3] mm/kfence: reset PG_slab and memcg_data before freeing __kfence_pool |
On Thu, May 05, 2022 at 07:13:37PM +0900, Hyeonggon Yoo wrote:
> When kfence fails to initialize kfence pool, it frees the pool.
> But it does not reset PG_slab flag and memcg_data of struct page.
>
> Below is a BUG because of this. Let's fix it by resetting PG_slab
> and memcg_data before free.
>
> [ 0.089149] BUG: Bad page state in process swapper/0 pfn:3d8e06
> [ 0.089149] page:ffffea46cf638180 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x3d8e06
> [ 0.089150] memcg:ffffffff94a475d1
> [ 0.089150] flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
> [ 0.089151] raw: 0017ffffc0000200 ffffea46cf638188 ffffea46cf638188 0000000000000000
> [ 0.089152] raw: 0000000000000000 0000000000000000 00000000ffffffff ffffffff94a475d1
> [ 0.089152] page dumped because: page still charged to cgroup
> [ 0.089153] Modules linked in:
> [ 0.089153] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B W 5.18.0-rc1+ #965
> [ 0.089154] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> [ 0.089154] Call Trace:
> [ 0.089155] <TASK>
> [ 0.089155] dump_stack_lvl+0x49/0x5f
> [ 0.089157] dump_stack+0x10/0x12
> [ 0.089158] bad_page.cold+0x63/0x94
> [ 0.089159] check_free_page_bad+0x66/0x70
> [ 0.089160] __free_pages_ok+0x423/0x530
> [ 0.089161] __free_pages_core+0x8e/0xa0
> [ 0.089162] memblock_free_pages+0x10/0x12
> [ 0.089164] memblock_free_late+0x8f/0xb9
> [ 0.089165] kfence_init+0x68/0x92
> [ 0.089166] start_kernel+0x789/0x992
> [ 0.089167] x86_64_start_reservations+0x24/0x26
> [ 0.089168] x86_64_start_kernel+0xa9/0xaf
> [ 0.089170] secondary_startup_64_no_verify+0xd5/0xdb
> [ 0.089171] </TASK>
>
> Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
> Fixes: 8f0b36497303 ("mm: kfence: fix objcgs vector allocation")
> Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> Reviewed-by: Marco Elver <elver@google.com>
> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
> ---
>
> v2 -> v3:
> - Add Reviewed-by: tags from Marco and Muchun. Thanks!
> - Initialize folio where it is defined.
>
> mm/kfence/core.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> index a203747ad2c0..b7d3a9667f00 100644
> --- a/mm/kfence/core.c
> +++ b/mm/kfence/core.c
> @@ -642,6 +642,14 @@ static bool __init kfence_init_pool_early(void)
>  	 * fails for the first page, and therefore expect addr==__kfence_pool in
>  	 * most failure cases.
>  	 */
> +	for (char *p = (char *)addr; p < __kfence_pool + KFENCE_POOL_SIZE; p += PAGE_SIZE) {
> +		struct folio *folio = virt_to_folio(p);
> +

After more thinking, I think it is better to use 'struct slab *'
to define a local variable since we already use this struct
throughout slab core. What do you think?

Thanks.

> +		__folio_clear_slab(folio);
> +#ifdef CONFIG_MEMCG
> +		folio->memcg_data = 0;
> +#endif
> +	}
>  	memblock_free_late(__pa(addr), KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool));
>  	__kfence_pool = NULL;
>  	return false;
> --
> 2.32.0
>

On Thu, May 05, 2022 at 06:54:18PM +0800, Muchun Song wrote:
> On Thu, May 05, 2022 at 07:13:37PM +0900, Hyeonggon Yoo wrote:
> > When kfence fails to initialize kfence pool, it frees the pool.
> > But it does not reset PG_slab flag and memcg_data of struct page.
> >
> > Below is a BUG because of this. Let's fix it by resetting PG_slab
> > and memcg_data before free.
> >
> > [ 0.089149] BUG: Bad page state in process swapper/0 pfn:3d8e06
> > [ 0.089149] page:ffffea46cf638180 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x3d8e06
> > [ 0.089150] memcg:ffffffff94a475d1
> > [ 0.089150] flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
> > [ 0.089151] raw: 0017ffffc0000200 ffffea46cf638188 ffffea46cf638188 0000000000000000
> > [ 0.089152] raw: 0000000000000000 0000000000000000 00000000ffffffff ffffffff94a475d1
> > [ 0.089152] page dumped because: page still charged to cgroup
> > [ 0.089153] Modules linked in:
> > [ 0.089153] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B W 5.18.0-rc1+ #965
> > [ 0.089154] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> > [ 0.089154] Call Trace:
> > [ 0.089155] <TASK>
> > [ 0.089155] dump_stack_lvl+0x49/0x5f
> > [ 0.089157] dump_stack+0x10/0x12
> > [ 0.089158] bad_page.cold+0x63/0x94
> > [ 0.089159] check_free_page_bad+0x66/0x70
> > [ 0.089160] __free_pages_ok+0x423/0x530
> > [ 0.089161] __free_pages_core+0x8e/0xa0
> > [ 0.089162] memblock_free_pages+0x10/0x12
> > [ 0.089164] memblock_free_late+0x8f/0xb9
> > [ 0.089165] kfence_init+0x68/0x92
> > [ 0.089166] start_kernel+0x789/0x992
> > [ 0.089167] x86_64_start_reservations+0x24/0x26
> > [ 0.089168] x86_64_start_kernel+0xa9/0xaf
> > [ 0.089170] secondary_startup_64_no_verify+0xd5/0xdb
> > [ 0.089171] </TASK>
> >
> > Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
> > Fixes: 8f0b36497303 ("mm: kfence: fix objcgs vector allocation")
> > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> > Reviewed-by: Marco Elver <elver@google.com>
> > Reviewed-by: Muchun Song <songmuchun@bytedance.com>
> > ---
> >
> > v2 -> v3:
> > - Add Reviewed-by: tags from Marco and Muchun. Thanks!
> > - Initialize folio where it is defined.
> >
> > mm/kfence/core.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> > index a203747ad2c0..b7d3a9667f00 100644
> > --- a/mm/kfence/core.c
> > +++ b/mm/kfence/core.c
> > @@ -642,6 +642,14 @@ static bool __init kfence_init_pool_early(void)
> >  	 * fails for the first page, and therefore expect addr==__kfence_pool in
> >  	 * most failure cases.
> >  	 */
> > +	for (char *p = (char *)addr; p < __kfence_pool + KFENCE_POOL_SIZE; p += PAGE_SIZE) {
> > +		struct folio *folio = virt_to_folio(p);
> > +
>
> After more thinking, I think it is better to use 'struct slab *'
> to define a local variable since we already use this struct
> throughout slab core. What do you think?
>

I think that may not be better.

In the code we're freeing folios (so not going to reuse it again in slab/kfence).
And it may not be Slab depending on why kfence_init_pool() failed.

> Thanks.
>
> > +		__folio_clear_slab(folio);
> > +#ifdef CONFIG_MEMCG
> > +		folio->memcg_data = 0;
> > +#endif
> > +	}
> >  	memblock_free_late(__pa(addr), KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool));
> >  	__kfence_pool = NULL;
> >  	return false;
> > --
> > 2.32.0
> >
> >

On Thu, May 05, 2022 at 08:33:36PM +0900, Hyeonggon Yoo wrote:
> On Thu, May 05, 2022 at 06:54:18PM +0800, Muchun Song wrote:
> > On Thu, May 05, 2022 at 07:13:37PM +0900, Hyeonggon Yoo wrote:
> > > When kfence fails to initialize kfence pool, it frees the pool.
> > > But it does not reset PG_slab flag and memcg_data of struct page.
> > >
> > > Below is a BUG because of this. Let's fix it by resetting PG_slab
> > > and memcg_data before free.
> > >
> > > [ 0.089149] BUG: Bad page state in process swapper/0 pfn:3d8e06
> > > [ 0.089149] page:ffffea46cf638180 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x3d8e06
> > > [ 0.089150] memcg:ffffffff94a475d1
> > > [ 0.089150] flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
> > > [ 0.089151] raw: 0017ffffc0000200 ffffea46cf638188 ffffea46cf638188 0000000000000000
> > > [ 0.089152] raw: 0000000000000000 0000000000000000 00000000ffffffff ffffffff94a475d1
> > > [ 0.089152] page dumped because: page still charged to cgroup
> > > [ 0.089153] Modules linked in:
> > > [ 0.089153] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B W 5.18.0-rc1+ #965
> > > [ 0.089154] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> > > [ 0.089154] Call Trace:
> > > [ 0.089155] <TASK>
> > > [ 0.089155] dump_stack_lvl+0x49/0x5f
> > > [ 0.089157] dump_stack+0x10/0x12
> > > [ 0.089158] bad_page.cold+0x63/0x94
> > > [ 0.089159] check_free_page_bad+0x66/0x70
> > > [ 0.089160] __free_pages_ok+0x423/0x530
> > > [ 0.089161] __free_pages_core+0x8e/0xa0
> > > [ 0.089162] memblock_free_pages+0x10/0x12
> > > [ 0.089164] memblock_free_late+0x8f/0xb9
> > > [ 0.089165] kfence_init+0x68/0x92
> > > [ 0.089166] start_kernel+0x789/0x992
> > > [ 0.089167] x86_64_start_reservations+0x24/0x26
> > > [ 0.089168] x86_64_start_kernel+0xa9/0xaf
> > > [ 0.089170] secondary_startup_64_no_verify+0xd5/0xdb
> > > [ 0.089171] </TASK>
> > >
> > > Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
> > > Fixes: 8f0b36497303 ("mm: kfence: fix objcgs vector allocation")
> > > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> > > Reviewed-by: Marco Elver <elver@google.com>
> > > Reviewed-by: Muchun Song <songmuchun@bytedance.com>
> > > ---
> > >
> > > v2 -> v3:
> > > - Add Reviewed-by: tags from Marco and Muchun. Thanks!
> > > - Initialize folio where it is defined.
> > >
> > > mm/kfence/core.c | 8 ++++++++
> > > 1 file changed, 8 insertions(+)
> > >
> > > diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> > > index a203747ad2c0..b7d3a9667f00 100644
> > > --- a/mm/kfence/core.c
> > > +++ b/mm/kfence/core.c
> > > @@ -642,6 +642,14 @@ static bool __init kfence_init_pool_early(void)
> > >  	 * fails for the first page, and therefore expect addr==__kfence_pool in
> > >  	 * most failure cases.
> > >  	 */
> > > +	for (char *p = (char *)addr; p < __kfence_pool + KFENCE_POOL_SIZE; p += PAGE_SIZE) {
> > > +		struct folio *folio = virt_to_folio(p);
> > > +
> >
> > After more thinking, I think it is better to use 'struct slab *'
> > to define a local variable since we already use this struct
> > throughout slab core. What do you think?
> >
>
> I think that may not be better.
>
> In the code we're freeing folios (so not going to reuse it again in slab/kfence).
> And it may not be Slab depending on why kfence_init_pool() failed.
>

If it is not a slab, then virt_to_slab() returns NULL; in that case it is
unnecessary to clear PG_slab and reset its ->memcg_data. Right? Like
the following changes:

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 6e69986c3f0d..d90fe82dc752 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -627,6 +627,16 @@ static bool __init kfence_init_pool_early(void)
 	 * fails for the first page, and therefore expect addr==__kfence_pool in
 	 * most failure cases.
 	 */
+	for (char *p = (char *)addr; p < __kfence_pool + KFENCE_POOL_SIZE; p += PAGE_SIZE) {
+		struct slab *slab = virt_to_slab(p);
+
+		if (!slab)
+			continue;
+		__folio_clear_slab(slab_folio(slab));
+#ifdef CONFIG_MEMCG
+		slab->memcg_data = 0;
+#endif
+	}
 	memblock_free_late(__pa(addr), KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool));
 	__kfence_pool = NULL;
 	return false;

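
The two loop variants discussed in this thread can be compared with a small userspace model. This is only a sketch: `struct sketch_page`, `SKETCH_PG_SLAB` and the `sketch_*` helpers are hypothetical stand-ins, not kernel APIs, and the model assumes memcg_data is only ever set on pages that also carry the slab flag. Under that assumption both variants should leave the pool in the same state; the 'struct slab *' variant merely skips pages that never became slab pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the slab page flag (bit 0x200 in the report's flags line). */
#define SKETCH_PG_SLAB (1UL << 9)

/* Hypothetical stand-in for the two struct page fields the patch touches. */
struct sketch_page {
	unsigned long flags;
	unsigned long memcg_data;
};

/* Stand-in for virt_to_slab(): NULL when the page is not marked slab. */
static struct sketch_page *sketch_to_slab(struct sketch_page *page)
{
	return (page->flags & SKETCH_PG_SLAB) ? page : NULL;
}

/* Folio-style loop: scrub every page unconditionally. */
static void sketch_reset_all(struct sketch_page *pages, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		pages[i].flags &= ~SKETCH_PG_SLAB;
		pages[i].memcg_data = 0;
	}
}

/* Slab-style loop: skip pages that are not marked slab. */
static void sketch_reset_slab_only(struct sketch_page *pages, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		struct sketch_page *slab = sketch_to_slab(&pages[i]);

		if (!slab)
			continue;
		slab->flags &= ~SKETCH_PG_SLAB;
		slab->memcg_data = 0;
	}
}

/* True when both arrays hold identical flags and memcg_data. */
static bool sketch_same_state(const struct sketch_page *a,
			      const struct sketch_page *b, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (a[i].flags != b[i].flags || a[i].memcg_data != b[i].memcg_data)
			return false;
	return true;
}
```

Running both loops over identical arrays and comparing them with sketch_same_state() is one way to check the equivalence inside this model. It says nothing about which variant reads better in mm/kfence/core.c, which is the actual question in the thread.
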
diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index a203747ad2c0..b7d3a9667f00 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -642,6 +642,14 @@ static bool __init kfence_init_pool_early(void)
 	 * fails for the first page, and therefore expect addr==__kfence_pool in
 	 * most failure cases.
 	 */
+	for (char *p = (char *)addr; p < __kfence_pool + KFENCE_POOL_SIZE; p += PAGE_SIZE) {
+		struct folio *folio = virt_to_folio(p);
+
+		__folio_clear_slab(folio);
+#ifdef CONFIG_MEMCG
+		folio->memcg_data = 0;
+#endif
+	}
 	memblock_free_late(__pa(addr), KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool));
 	__kfence_pool = NULL;
 	return false;
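
The failure mode this patch addresses can be illustrated with a userspace model. It is a sketch only: `struct model_page`, `MODEL_PG_SLAB` and the `model_*` functions are hypothetical names that mimic, but are not, the kernel's page-free checks. The model assumes a page may be freed only if its slab flag is clear and its memcg_data is zero, which matches the "page still charged to cgroup" report in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the slab page flag (bit 0x200 in the report's flags line). */
#define MODEL_PG_SLAB (1UL << 9)

/* Hypothetical stand-in for the two struct page fields involved. */
struct model_page {
	unsigned long flags;
	unsigned long memcg_data;
};

/* Mimics the free-time sanity check: the page must carry no stale state. */
static bool model_page_ok_to_free(const struct model_page *page)
{
	return !(page->flags & MODEL_PG_SLAB) && page->memcg_data == 0;
}

/* Mimics the loop added by the patch: scrub every page before freeing. */
static void model_reset_pool(struct model_page *pages, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		pages[i].flags &= ~MODEL_PG_SLAB;
		pages[i].memcg_data = 0;
	}
}

/* Counts pages that would be reported as being in a bad state when freed. */
static size_t model_count_bad(const struct model_page *pages, size_t nr)
{
	size_t bad = 0;

	for (size_t i = 0; i < nr; i++)
		if (!model_page_ok_to_free(&pages[i]))
			bad++;
	return bad;
}
```

In this model, calling model_count_bad() before model_reset_pool() counts every page that still has the slab flag or a non-zero memcg_data; after the reset the count drops to zero, which is the state memblock_free_late() needs to see.
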