Message ID | dbbcd36ca1f045ec81f49c7657928a1cdf24872b.1550065120.git.robin.murphy@arm.com (mailing list archive) |
---|---|
State | New, archived |
Series | mm: Fix __dump_page() for poisoned pages |
On Wed 13-02-19 13:40:49, Robin Murphy wrote:
> Evaluating page_mapping() on a poisoned page ends up dereferencing junk
> and making PF_POISONED_CHECK() considerably crashier than intended. Fix
> that by not inspecting the mapping until we've determined that it's
> likely to be valid.

Has this ever triggered? I am mainly asking because there is no usage of
mapping so I would expect that the compiler wouldn't really call
page_mapping until it is really used.

> Fixes: 1c6fb1d89e73 ("mm: print more information about mapping in __dump_page")
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  mm/debug.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/mm/debug.c b/mm/debug.c
> index 0abb987dad9b..1611cf00a137 100644
> --- a/mm/debug.c
> +++ b/mm/debug.c
> @@ -44,7 +44,7 @@ const struct trace_print_flags vmaflag_names[] = {
>
>  void __dump_page(struct page *page, const char *reason)
>  {
> -	struct address_space *mapping = page_mapping(page);
> +	struct address_space *mapping;
>  	bool page_poisoned = PagePoisoned(page);
>  	int mapcount;
>
> @@ -58,6 +58,8 @@ void __dump_page(struct page *page, const char *reason)
>  		goto hex_only;
>  	}
>
> +	mapping = page_mapping(page);
> +
>  	/*
>  	 * Avoid VM_BUG_ON() in page_mapcount().
>  	 * page->_mapcount space in struct page is used by sl[aou]b pages to
> --
> 2.20.1.dirty
>
On 13/02/2019 14:23, Michal Hocko wrote:
> On Wed 13-02-19 13:40:49, Robin Murphy wrote:
>> Evaluating page_mapping() on a poisoned page ends up dereferencing junk
>> and making PF_POISONED_CHECK() considerably crashier than intended. Fix
>> that by not inspecting the mapping until we've determined that it's
>> likely to be valid.
>
> Has this ever triggered? I am mainly asking because there is no usage of
> mapping so I would expect that the compiler wouldn't really call
> page_mapping until it is really used.

A function call is a sequence point, so any compiler that did that would be
totally broken.

The crash looks like this (now from an explicit dump_page() call before it
happens naturally deep within pfn_to_nid()):

-----
[ 107.147056] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000006
[ 107.155774] Mem abort info:
[ 107.158546] ESR = 0x96000005
[ 107.161572] Exception class = DABT (current EL), IL = 32 bits
[ 107.167437] SET = 0, FnV = 0
[ 107.170460] EA = 0, S1PTW = 0
[ 107.173568] Data abort info:
[ 107.176419] ISV = 0, ISS = 0x00000005
[ 107.180218] CM = 0, WnR = 0
[ 107.183151] user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000c2f6ac38
[ 107.189702] [0000000000000006] pgd=0000000000000000, pud=0000000000000000
[ 107.196430] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[ 107.201942] Modules linked in:
[ 107.204962] CPU: 2 PID: 491 Comm: bash Not tainted 5.0.0-rc1+ #1
[ 107.210903] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Dec 17 2018
[ 107.221576] pstate: 00000005 (nzcv daif -PAN -UAO)
[ 107.226321] pc : page_mapping+0x18/0x118
[ 107.230200] lr : __dump_page+0x1c/0x398
[ 107.233990] sp : ffffff8011a53c30
[ 107.237265] x29: ffffff8011a53c30 x28: ffffffc039b6ec00
[ 107.242520] x27: 0000000000000000 x26: 0000000000000000
[ 107.247775] x25: 0000000056000000 x24: 0000000000000015
[ 107.253029] x23: ffffff80114d8b18 x22: 0000000000000022
[ 107.258283] x21: ffffffc03538ec38 x20: ffffff8011082e78
[ 107.263537] x19: ffffffbf20000000 x18: 0000000000000000
[ 107.268790] x17: 0000000000000000 x16: 0000000000000000
[ 107.274044] x15: 0000000000000000 x14: 0000000000000000
[ 107.279297] x13: 0000000000000000 x12: 0000000000000030
[ 107.284550] x11: 0000000000000030 x10: 0101010101010101
[ 107.289804] x9 : ff7274615e68726c x8 : 7f7f7f7f7f7f7f7f
[ 107.295057] x7 : feff64756e6c6471 x6 : 0000000000008080
[ 107.300310] x5 : 0000000000000000 x4 : 0000000000000000
[ 107.305564] x3 : ffffffc039b6ec00 x2 : fffffffffffffffe
[ 107.310817] x1 : ffffffffffffffff x0 : fffffffffffffffe
[ 107.316072] Process bash (pid: 491, stack limit = 0x000000004ebd4ecd)
[ 107.322442] Call trace:
[ 107.324858] page_mapping+0x18/0x118
[ 107.328392] __dump_page+0x1c/0x398
[ 107.331840] dump_page+0xc/0x18
[ 107.334945] remove_store+0xbc/0x120
[ 107.338479] dev_attr_store+0x18/0x28
[ 107.342103] sysfs_kf_write+0x40/0x50
[ 107.345722] kernfs_fop_write+0x130/0x1d8
[ 107.349687] __vfs_write+0x30/0x180
[ 107.353134] vfs_write+0xb4/0x1a0
[ 107.356410] ksys_write+0x60/0xd0
[ 107.359686] __arm64_sys_write+0x18/0x20
[ 107.363565] el0_svc_common+0x94/0xf8
[ 107.367184] el0_svc_handler+0x68/0x70
[ 107.370890] el0_svc+0x8/0xc
[ 107.373737] Code: f9400401 d1000422 f240003f 9a801040 (f9400402)
[ 107.379766] ---[ end trace cdb5eb5bf435cecb ]---
-----

While after this patch, DEBUG_VM works as intended:

-----
[ 46.835963] page:ffffffbf20000000 is uninitialized and poisoned
[ 46.835970] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
[ 46.849520] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
[ 46.857194] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
[ 46.863170] ------------[ cut here ]------------
[ 46.867736] kernel BUG at ./include/linux/mm.h:1006!
[ 46.872646] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 46.878071] Modules linked in:
[ 46.881092] CPU: 1 PID: 483 Comm: bash Not tainted 5.0.0-rc1+ #3
[ 46.887032] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Dec 17 2018
[ 46.897704] pstate: 40000005 (nZcv daif -PAN -UAO)
[ 46.902449] pc : remove_store+0xbc/0x120
...
-----

Robin.

>> Fixes: 1c6fb1d89e73 ("mm: print more information about mapping in __dump_page")
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>  mm/debug.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/debug.c b/mm/debug.c
>> index 0abb987dad9b..1611cf00a137 100644
>> --- a/mm/debug.c
>> +++ b/mm/debug.c
>> @@ -44,7 +44,7 @@ const struct trace_print_flags vmaflag_names[] = {
>>
>>  void __dump_page(struct page *page, const char *reason)
>>  {
>> -	struct address_space *mapping = page_mapping(page);
>> +	struct address_space *mapping;
>>  	bool page_poisoned = PagePoisoned(page);
>>  	int mapcount;
>>
>> @@ -58,6 +58,8 @@ void __dump_page(struct page *page, const char *reason)
>>  		goto hex_only;
>>  	}
>>
>> +	mapping = page_mapping(page);
>> +
>>  	/*
>>  	 * Avoid VM_BUG_ON() in page_mapcount().
>>  	 * page->_mapcount space in struct page is used by sl[aou]b pages to
>> --
>> 2.20.1.dirty
>>
>
On Wed 13-02-19 14:38:37, Robin Murphy wrote:
> On 13/02/2019 14:23, Michal Hocko wrote:
> > On Wed 13-02-19 13:40:49, Robin Murphy wrote:
> > > Evaluating page_mapping() on a poisoned page ends up dereferencing junk
> > > and making PF_POISONED_CHECK() considerably crashier than intended. Fix
> > > that by not inspecting the mapping until we've determined that it's
> > > likely to be valid.
> >
> > Has this ever triggered? I am mainly asking because there is no usage of
> > mapping so I would expect that the compiler wouldn't really call
> > page_mapping until it is really used.
>
> A function call is a sequence point, so any compiler that did that would be
> totally broken.

Ohh, right I thought this is a static inline.

> The crash looks like this (now from an explicit dump_page() call before it
> happens naturally deep within pfn_to_nid()):

please add this to the changelog.

Acked-by: Michal Hocko <mhocko@suse.com>

> -----
> [ 107.147056] Unable to handle kernel NULL pointer dereference at virtual
> address 0000000000000006
> [ 107.155774] Mem abort info:
> [ 107.158546] ESR = 0x96000005
> [ 107.161572] Exception class = DABT (current EL), IL = 32 bits
> [ 107.167437] SET = 0, FnV = 0
> [ 107.170460] EA = 0, S1PTW = 0
> [ 107.173568] Data abort info:
> [ 107.176419] ISV = 0, ISS = 0x00000005
> [ 107.180218] CM = 0, WnR = 0
> [ 107.183151] user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000c2f6ac38
> [ 107.189702] [0000000000000006] pgd=0000000000000000, pud=0000000000000000
> [ 107.196430] Internal error: Oops: 96000005 [#1] PREEMPT SMP
> [ 107.201942] Modules linked in:
> [ 107.204962] CPU: 2 PID: 491 Comm: bash Not tainted 5.0.0-rc1+ #1
> [ 107.210903] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno
> Development Platform, BIOS EDK II Dec 17 2018
> [ 107.221576] pstate: 00000005 (nzcv daif -PAN -UAO)
> [ 107.226321] pc : page_mapping+0x18/0x118
> [ 107.230200] lr : __dump_page+0x1c/0x398
> [ 107.233990] sp : ffffff8011a53c30
> [ 107.237265] x29: ffffff8011a53c30 x28: ffffffc039b6ec00
> [ 107.242520] x27: 0000000000000000 x26: 0000000000000000
> [ 107.247775] x25: 0000000056000000 x24: 0000000000000015
> [ 107.253029] x23: ffffff80114d8b18 x22: 0000000000000022
> [ 107.258283] x21: ffffffc03538ec38 x20: ffffff8011082e78
> [ 107.263537] x19: ffffffbf20000000 x18: 0000000000000000
> [ 107.268790] x17: 0000000000000000 x16: 0000000000000000
> [ 107.274044] x15: 0000000000000000 x14: 0000000000000000
> [ 107.279297] x13: 0000000000000000 x12: 0000000000000030
> [ 107.284550] x11: 0000000000000030 x10: 0101010101010101
> [ 107.289804] x9 : ff7274615e68726c x8 : 7f7f7f7f7f7f7f7f
> [ 107.295057] x7 : feff64756e6c6471 x6 : 0000000000008080
> [ 107.300310] x5 : 0000000000000000 x4 : 0000000000000000
> [ 107.305564] x3 : ffffffc039b6ec00 x2 : fffffffffffffffe
> [ 107.310817] x1 : ffffffffffffffff x0 : fffffffffffffffe
> [ 107.316072] Process bash (pid: 491, stack limit = 0x000000004ebd4ecd)
> [ 107.322442] Call trace:
> [ 107.324858] page_mapping+0x18/0x118
> [ 107.328392] __dump_page+0x1c/0x398
> [ 107.331840] dump_page+0xc/0x18
> [ 107.334945] remove_store+0xbc/0x120
> [ 107.338479] dev_attr_store+0x18/0x28
> [ 107.342103] sysfs_kf_write+0x40/0x50
> [ 107.345722] kernfs_fop_write+0x130/0x1d8
> [ 107.349687] __vfs_write+0x30/0x180
> [ 107.353134] vfs_write+0xb4/0x1a0
> [ 107.356410] ksys_write+0x60/0xd0
> [ 107.359686] __arm64_sys_write+0x18/0x20
> [ 107.363565] el0_svc_common+0x94/0xf8
> [ 107.367184] el0_svc_handler+0x68/0x70
> [ 107.370890] el0_svc+0x8/0xc
> [ 107.373737] Code: f9400401 d1000422 f240003f 9a801040 (f9400402)
> [ 107.379766] ---[ end trace cdb5eb5bf435cecb ]---
> -----
>
> While after this patch, DEBUG_VM works as intended:
> -----
> [ 46.835963] page:ffffffbf20000000 is uninitialized and poisoned
> [ 46.835970] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff
> ffffffffffffffff
> [ 46.849520] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff
> ffffffffffffffff
> [ 46.857194] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> [ 46.863170] ------------[ cut here ]------------
> [ 46.867736] kernel BUG at ./include/linux/mm.h:1006!
> [ 46.872646] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> [ 46.878071] Modules linked in:
> [ 46.881092] CPU: 1 PID: 483 Comm: bash Not tainted 5.0.0-rc1+ #3
> [ 46.887032] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno
> Development Platform, BIOS EDK II Dec 17 2018
> [ 46.897704] pstate: 40000005 (nZcv daif -PAN -UAO)
> [ 46.902449] pc : remove_store+0xbc/0x120
> ...
> -----
>
> Robin.
>
> > > Fixes: 1c6fb1d89e73 ("mm: print more information about mapping in __dump_page")
> > > Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> > > ---
> > >  mm/debug.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/debug.c b/mm/debug.c
> > > index 0abb987dad9b..1611cf00a137 100644
> > > --- a/mm/debug.c
> > > +++ b/mm/debug.c
> > > @@ -44,7 +44,7 @@ const struct trace_print_flags vmaflag_names[] = {
> > >  void __dump_page(struct page *page, const char *reason)
> > >  {
> > > -	struct address_space *mapping = page_mapping(page);
> > > +	struct address_space *mapping;
> > >  	bool page_poisoned = PagePoisoned(page);
> > >  	int mapcount;
> > > @@ -58,6 +58,8 @@ void __dump_page(struct page *page, const char *reason)
> > >  		goto hex_only;
> > >  	}
> > > +	mapping = page_mapping(page);
> > > +
> > >  	/*
> > >  	 * Avoid VM_BUG_ON() in page_mapcount().
> > >  	 * page->_mapcount space in struct page is used by sl[aou]b pages to
> > > --
> > > 2.20.1.dirty
> > >
> >
diff --git a/mm/debug.c b/mm/debug.c
index 0abb987dad9b..1611cf00a137 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -44,7 +44,7 @@ const struct trace_print_flags vmaflag_names[] = {
 
 void __dump_page(struct page *page, const char *reason)
 {
-	struct address_space *mapping = page_mapping(page);
+	struct address_space *mapping;
 	bool page_poisoned = PagePoisoned(page);
 	int mapcount;
 
@@ -58,6 +58,8 @@ void __dump_page(struct page *page, const char *reason)
 		goto hex_only;
 	}
 
+	mapping = page_mapping(page);
+
 	/*
 	 * Avoid VM_BUG_ON() in page_mapcount().
 	 * page->_mapcount space in struct page is used by sl[aou]b pages to
Evaluating page_mapping() on a poisoned page ends up dereferencing junk
and making PF_POISONED_CHECK() considerably crashier than intended. Fix
that by not inspecting the mapping until we've determined that it's
likely to be valid.

Fixes: 1c6fb1d89e73 ("mm: print more information about mapping in __dump_page")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 mm/debug.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)