Message ID | 160697689204.605323.17629854984697045602.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive) |
---|---|
State | Accepted |
Commit | d1c5246e08eb64991001d97a3bd119c93edbc79a |
Series | x86/mm: Fix leak of pmd ptlock |
On Wed, Dec 02, 2020 at 10:28:12PM -0800, Dan Williams wrote:
> pmd_free() is close, but it is a messy fit due to requiring an @mm arg.
Hurpm, only parisc and s390 actually use that argument. And s390
_really_ needs it, because they're doing runtime folding per mm.
Might I tempt an x86/mm maintainer to ack this, or a x86-tip maintainer
to apply it outright?

On Wed, Dec 2, 2020 at 10:28 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> Commit 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
> introduced a new location where a pmd was released, but neglected to run
> the pmd page destructor. In fact, this happened previously for a
> different pmd release path and was fixed by commit:
>
>     c283610e44ec ("x86, mm: do not leak page->ptl for pmd page tables").
>
> This issue was hidden until recently because the failure mode is silent,
> but commit:
>
>     b2b29d6d0119 ("mm: account PMD tables like PTE tables")
>
> ...turns the failure mode into this signature:
>
>     BUG: Bad page state in process lt-pmem-ns pfn:15943d
>     page:000000007262ed7b refcount:0 mapcount:-1024 mapping:0000000000000000 index:0x0 pfn:0x15943d
>     flags: 0xaffff800000000()
>     raw: 00affff800000000 dead000000000100 0000000000000000 0000000000000000
>     raw: 0000000000000000 ffff913a029bcc08 00000000fffffbff 0000000000000000
>     page dumped because: nonzero mapcount
>     [..]
>      dump_stack+0x8b/0xb0
>      bad_page.cold+0x63/0x94
>      free_pcp_prepare+0x224/0x270
>      free_unref_page+0x18/0xd0
>      pud_free_pmd_page+0x146/0x160
>      ioremap_pud_range+0xe3/0x350
>      ioremap_page_range+0x108/0x160
>      __ioremap_caller.constprop.0+0x174/0x2b0
>      ? memremap+0x7a/0x110
>      memremap+0x7a/0x110
>      devm_memremap+0x53/0xa0
>      pmem_attach_disk+0x4ed/0x530 [nd_pmem]
>      ? __devm_release_region+0x52/0x80
>      nvdimm_bus_probe+0x85/0x210 [libnvdimm]
>
> Given this is a repeat occurrence it seemed prudent to look for other
> places where this destructor might be missing and whether a better
> helper is needed. try_to_free_pmd_page() looks like a candidate, but
> testing with setting up and tearing down pmd mappings via the dax unit
> tests is thus far not triggering the failure. As for a better helper
> pmd_free() is close, but it is a messy fit due to requiring an @mm arg.
> Also, ___pmd_free_tlb() wants to call paravirt_tlb_remove_table()
> instead of free_page(), so open-coded pgtable_pmd_page_dtor() seems the
> best way forward for now.
>
> Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
> Cc: <stable@vger.kernel.org>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: x86@kernel.org
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Co-debugged-by: Matthew Wilcox <willy@infradead.org>
> Tested-by: Yi Zhang <yi.zhang@redhat.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  arch/x86/mm/pgtable.c |    2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> index dfd82f51ba66..f6a9e2e36642 100644
> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -829,6 +829,8 @@ int pud_free_pmd_page(pud_t *pud, unsigned long addr)
>         }
>
>         free_page((unsigned long)pmd_sv);
> +
> +       pgtable_pmd_page_dtor(virt_to_page(pmd));
>         free_page((unsigned long)pmd);
>
>         return 1;
>
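
For readers wondering where `mapcount:-1024` in the quoted signature comes from, the arithmetic can be sketched as below. This is an illustration under assumptions that are not stated in the thread: that the page-table type flag has the value 0x400 in the affected kernels, that a page with no type set holds an all-ones word, and that the dump prints that word as a signed count plus one. The names `ASSUMED_PG_TABLE`, `stale_page_type()` and `reported_mapcount()` are hypothetical and exist only for this sketch.

```c
#include <stdint.h>

/* Assumption: value of the page-table type flag in kernels of this era. */
#define ASSUMED_PG_TABLE 0x00000400u

/* Raw 32-bit word left behind when the table flag is never cleared. */
static uint32_t stale_page_type(void)
{
	uint32_t page_type = 0xffffffffu;	/* assumed initial value: no type set */

	page_type &= ~ASSUMED_PG_TABLE;		/* assumed effect of marking the page as a table */
	return page_type;
}

/* The same word read as a signed count, plus one, as the dump is assumed to print it. */
static int32_t reported_mapcount(uint32_t raw)
{
	return (int32_t)raw + 1;
}
```

Under those assumptions the stale word is 0xfffffbff, which matches the `00000000fffffbff` field in the second `raw:` line of the dump, and reading it as a count gives -1024.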
Ping, this bug is still present on v5.11-rc2, need a resend?
On 12/2/20 10:28 PM, Dan Williams wrote:
> Commit 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
> introduced a new location where a pmd was released, but neglected to run
> the pmd page destructor. In fact, this happened previously for a
> different pmd release path and was fixed by commit:
>
>     c283610e44ec ("x86, mm: do not leak page->ptl for pmd page tables").
>
> This issue was hidden until recently because the failure mode is silent,
> but commit:

Looks sane. Thanks as always for the thorough changelog and the
investigation into why we're suddenly seeing this now.

I agree that ridding ourselves of open-coded free_page()'s is a good
idea, but this patch itself needs to be around for stable anyway. So,

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index dfd82f51ba66..f6a9e2e36642 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -829,6 +829,8 @@ int pud_free_pmd_page(pud_t *pud, unsigned long addr)
 	}
 
 	free_page((unsigned long)pmd_sv);
+
+	pgtable_pmd_page_dtor(virt_to_page(pmd));
 	free_page((unsigned long)pmd);
 
 	return 1;
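
The constructor/destructor pairing that the patch restores can be modelled in userspace roughly as follows. This is a sketch, not kernel code: `struct model_page`, `model_pmd_ctor()`, `model_pmd_dtor()`, `model_free_page()` and the `live_ptlocks` counter are all hypothetical stand-ins, and the model assumes the configuration in which the page-table lock is allocated separately from the page.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct page; not the kernel's layout. */
struct model_page {
	void *ptl;	/* models a separately allocated page->ptl */
	bool is_table;	/* models the "this page is a page table" marking */
};

static int live_ptlocks;	/* outstanding lock allocations in the model */

/* Models the pmd page constructor: allocate the lock, mark the page. */
static bool model_pmd_ctor(struct model_page *page)
{
	page->ptl = malloc(64);
	if (!page->ptl)
		return false;
	live_ptlocks++;
	page->is_table = true;
	return true;
}

/* Models the pmd page destructor: release the lock, clear the marking. */
static void model_pmd_dtor(struct model_page *page)
{
	free(page->ptl);
	page->ptl = NULL;
	live_ptlocks--;
	page->is_table = false;
}

/* Models the free path's sanity check: false stands for "Bad page state". */
static bool model_free_page(const struct model_page *page)
{
	return !page->is_table;
}

/* Shape of the release path before the patch: the destructor is skipped. */
static bool free_pmd_without_dtor(struct model_page *page)
{
	return model_free_page(page);
}

/* Shape of the release path after the patch: destructor first, then free. */
static bool free_pmd_with_dtor(struct model_page *page)
{
	model_pmd_dtor(page);
	return model_free_page(page);
}
```

In this model, freeing without the destructor both leaves `live_ptlocks` elevated (the silent leak) and fails the free-time check (the loud failure), while the patched ordering does neither.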