| Message ID | 20180607151344.a22a1e7182a2142e6d24e4de@linux-foundation.org (mailing list archive) |
|---|---|
| State | New, archived |
On Thu, Jun 07, 2018 at 03:13:44PM -0700, Andrew Morton wrote:
> This patch is quite urgent and is tagged for -stable backporting, yet
> it remains in an unreviewed state. Any takers?

It looks like a straightforward, safe fix: on x86, hva_to_gfn_memslot would
zap those bits and hide the misalignment caused by the low metadata bits
being erroneously left set in the address, but the arm code notices when
that's the last page in the memslot, the hva_end gets aligned, and the size
ends up below one page.

> [35380.933345] [<ffff000008088f00>] dump_backtrace+0x0/0x22c
> [35380.938723] [<ffff000008089150>] show_stack+0x24/0x2c
> [35380.943759] [<ffff00000893c078>] dump_stack+0x8c/0xb0
> [35380.948794] [<ffff00000820ab50>] bad_page+0xf4/0x154
> [35380.953740] [<ffff000008211ce8>] free_pages_check_bad+0x90/0x9c
> [35380.959642] [<ffff00000820c430>] free_pcppages_bulk+0x464/0x518
> [35380.965545] [<ffff00000820db98>] free_hot_cold_page+0x22c/0x300
> [35380.971448] [<ffff0000082176fc>] __put_page+0x54/0x60
> [35380.976484] [<ffff0000080b1164>] unmap_stage2_range+0x170/0x2b4
> [35380.982385] [<ffff0000080b12d8>] kvm_unmap_hva_handler+0x30/0x40
> [35380.988375] [<ffff0000080b0104>] handle_hva_to_gpa+0xb0/0xec
> [35380.994016] [<ffff0000080b2644>] kvm_unmap_hva_range+0x5c/0xd0
> [35380.999833] [<ffff0000080a8054>]
>
> I even injected a fault on purpose in kvm_unmap_hva_range by setting
> size=size-0x200; the call trace is similar to the above. So I thought the
> panic is similarly caused by the root cause of the WARN_ON.

I think the problem triggers in the addr += PAGE_SIZE of unmap_stage2_ptes,
which never matches end because end is aligned but addr is not:

	} while (pte++, addr += PAGE_SIZE, addr != end);

x86, again, only works on hva_start/hva_end after converting them to
gfn_start/end, and since those are in pfn units the bits are zapped before
they risk causing trouble.

> Link: http://lkml.kernel.org/r/1525403506-6750-1-git-send-email-hejianet@gmail.com
> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
> Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
> Cc: Jia He <hejianet@gmail.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---

Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
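To make the non-termination point above concrete, here is a minimal user-space sketch of the loop shape in unmap_stage2_ptes(). It is not the kernel code; the addresses and the iteration cap are invented for the demo. With a page-aligned `end` but an `addr` that still carries KSM's low flag bits, `addr` steps right over `end` and the `addr != end` test never fires.

```c
#include <stdio.h>

#define PAGE_SIZE   0x1000UL
#define STABLE_FLAG 0x200UL	/* low metadata bit erroneously left set */

int main(void)
{
	unsigned long end  = 0x40003000UL;               /* page aligned        */
	unsigned long addr = 0x40000000UL | STABLE_FLAG; /* hva with stray bits */
	int steps = 0;

	/*
	 * Same shape as the kernel loop:
	 *     } while (pte++, addr += PAGE_SIZE, addr != end);
	 * addr visits ...1200, ...2200, ...3200, never ...3000,
	 * so "addr != end" stays true and the walk overruns the range.
	 */
	do {
		addr += PAGE_SIZE;		/* "unmap one pte" */
		steps++;
	} while (addr != end && steps < 8);	/* cap added only for the demo */

	printf("gave up at %#lx after %d pages (end was %#lx)\n",
	       addr, steps, end);
	return 0;
}
```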
Hi Andrew

On 6/8/2018 6:13 AM, Andrew Morton wrote:
> On Thu, 24 May 2018 13:38:05 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
>>>
>>> Jia, Andrew,
>>>
>>> What is the status of this patch ?
>>>
>>
>> I have it scheduled for 4.18-rc1, with a cc:stable for backporting.
>>
>> I'd normally put such a fix into 4.17-rcX but I'd like to give Hugh
>> time to review it and to generally give it a bit more time for review
>> and test.
>>
>> Have you tested it yourself?
>
> I'll take your silence as a no.

Sorry if the previous question was meant for me. I've tested it myself on
an arm64 server (QDF2400, 46 cpus, 96G mem). Without this patch, the
WARN_ON is very easy to reproduce. After applying this patch, I ran the
same benchmark for a whole day without hitting any WARN_ONs.

Hope that helps.

Cheers,
Jia
Hi Andrea

Thanks for the review.

On 6/8/2018 7:38 AM, Andrea Arcangeli wrote:
> On Thu, Jun 07, 2018 at 03:13:44PM -0700, Andrew Morton wrote:
>> This patch is quite urgent and is tagged for -stable backporting, yet
>> it remains in an unreviewed state. Any takers?
>
> It looks like a straightforward, safe fix: on x86, hva_to_gfn_memslot would
> zap those bits and hide the misalignment caused by the low metadata bits
> being erroneously left set in the address, but the arm code notices when
> that's the last page in the memslot, the hva_end gets aligned, and the size
> ends up below one page.
>
>> [35380.933345] [<ffff000008088f00>] dump_backtrace+0x0/0x22c
>> [35380.938723] [<ffff000008089150>] show_stack+0x24/0x2c
>> [35380.943759] [<ffff00000893c078>] dump_stack+0x8c/0xb0
>> [35380.948794] [<ffff00000820ab50>] bad_page+0xf4/0x154
>> [35380.953740] [<ffff000008211ce8>] free_pages_check_bad+0x90/0x9c
>> [35380.959642] [<ffff00000820c430>] free_pcppages_bulk+0x464/0x518
>> [35380.965545] [<ffff00000820db98>] free_hot_cold_page+0x22c/0x300
>> [35380.971448] [<ffff0000082176fc>] __put_page+0x54/0x60
>> [35380.976484] [<ffff0000080b1164>] unmap_stage2_range+0x170/0x2b4
>> [35380.982385] [<ffff0000080b12d8>] kvm_unmap_hva_handler+0x30/0x40
>> [35380.988375] [<ffff0000080b0104>] handle_hva_to_gpa+0xb0/0xec
>> [35380.994016] [<ffff0000080b2644>] kvm_unmap_hva_range+0x5c/0xd0
>> [35380.999833] [<ffff0000080a8054>]
>>
>> I even injected a fault on purpose in kvm_unmap_hva_range by setting
>> size=size-0x200; the call trace is similar to the above. So I thought the
>> panic is similarly caused by the root cause of the WARN_ON.
>
> I think the problem triggers in the addr += PAGE_SIZE of unmap_stage2_ptes,
> which never matches end because end is aligned but addr is not:
>
> 	} while (pte++, addr += PAGE_SIZE, addr != end);
>
> x86, again, only works on hva_start/hva_end after converting them to
> gfn_start/end, and since those are in pfn units the bits are zapped before
> they risk causing trouble.

For this panic issue on arm64, I started another thread to discuss it:
https://lkml.org/lkml/2018/5/2/61
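For contrast, a tiny sketch of why the same stray bits are harmless on the x86 path mentioned above: converting the hva range to gfn/pfn units shifts away everything below PAGE_SHIFT, flags included. The hva_to_gfn() helper below is a simplified stand-in for illustration only, not the actual hva_to_gfn_memslot() implementation.

```c
#include <stdio.h>

#define PAGE_SHIFT  12
#define STABLE_FLAG 0x200UL

/* Simplified stand-in for the hva->gfn conversion: shifting to pfn/gfn
 * units discards every bit below PAGE_SHIFT, so stray low flag bits can
 * never misalign the resulting range. */
static unsigned long hva_to_gfn(unsigned long hva)
{
	return hva >> PAGE_SHIFT;
}

int main(void)
{
	unsigned long clean = 0x40000000UL;
	unsigned long dirty = clean | STABLE_FLAG;	/* flag bits left set */

	printf("gfn(clean)=%#lx gfn(dirty)=%#lx\n",
	       hva_to_gfn(clean), hva_to_gfn(dirty));	/* both 0x40000 */
	return 0;
}
```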
On 06/07/2018 11:13 PM, Andrew Morton wrote:
> On Thu, 24 May 2018 13:38:05 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
>>>
>>> Jia, Andrew,
>>>
>>> What is the status of this patch ?
>>>
>>
>> I have it scheduled for 4.18-rc1, with a cc:stable for backporting.
>>
>> I'd normally put such a fix into 4.17-rcX but I'd like to give Hugh
>> time to review it and to generally give it a bit more time for review
>> and test.
>>
>> Have you tested it yourself?
>
> I'll take your silence as a no.
>

Hi Andrew,

Sorry, I missed this one. I will give it a try, to reproduce the problem
and verify the fix.

Cheers,
Suzuki
diff -puN mm/ksm.c~mm-ksm-ignore-stable_flag-of-rmap_item-address-in-rmap_walk_ksm mm/ksm.c
--- a/mm/ksm.c~mm-ksm-ignore-stable_flag-of-rmap_item-address-in-rmap_walk_ksm
+++ a/mm/ksm.c
@@ -216,6 +216,8 @@ struct rmap_item {
 #define SEQNR_MASK	0x0ff	/* low bits of unstable tree seqnr */
 #define UNSTABLE_FLAG	0x100	/* is a node of the unstable tree */
 #define STABLE_FLAG	0x200	/* is listed from the stable tree */
+#define KSM_FLAG_MASK	(SEQNR_MASK|UNSTABLE_FLAG|STABLE_FLAG)
+				/* to mask all the flags */
 
 /* The stable and unstable tree heads */
 static struct rb_root one_stable_tree[1] = { RB_ROOT };
@@ -2598,10 +2600,15 @@ again:
 		anon_vma_lock_read(anon_vma);
 		anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
 					       0, ULONG_MAX) {
+			unsigned long addr;
+
 			cond_resched();
 			vma = vmac->vma;
-			if (rmap_item->address < vma->vm_start ||
-			    rmap_item->address >= vma->vm_end)
+
+			/* Ignore the stable/unstable/sqnr flags */
+			addr = rmap_item->address & ~KSM_FLAG_MASK;
+
+			if (addr < vma->vm_start || addr >= vma->vm_end)
 				continue;
 			/*
 			 * Initially we examine only the vma which covers this
@@ -2615,8 +2622,7 @@ again:
 			if (rwc->invalid_vma && rwc->invalid_vma(vma, rwc->arg))
 				continue;
 
-			if (!rwc->rmap_one(page, vma,
-					rmap_item->address, rwc->arg)) {
+			if (!rwc->rmap_one(page, vma, addr, rwc->arg)) {
 				anon_vma_unlock_read(anon_vma);
 				return;
 			}
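For reference, a stand-alone sketch of the bit layout the patch compensates for: KSM keeps the unstable-tree seqnr and the tree flags in the low bits of rmap_item->address, so the vma range check has to mask them off first, exactly as the new KSM_FLAG_MASK does. The flag constants mirror mm/ksm.c; the surrounding main() and the example address are purely illustrative user-space code.

```c
#include <stdio.h>

/* Flag values as in mm/ksm.c (UL suffixes added for this user-space demo) */
#define SEQNR_MASK	0x0ffUL		/* low bits of unstable tree seqnr */
#define UNSTABLE_FLAG	0x100UL		/* is a node of the unstable tree  */
#define STABLE_FLAG	0x200UL		/* is listed from the stable tree  */
#define KSM_FLAG_MASK	(SEQNR_MASK | UNSTABLE_FLAG | STABLE_FLAG)

int main(void)
{
	/* Address as stored in rmap_item->address: page-aligned hva plus flags
	 * (the concrete value is made up for the demo). */
	unsigned long stored = 0x7f2a10000000UL | STABLE_FLAG;

	/* Address as it must be compared against vma->vm_start/vm_end. */
	unsigned long addr = stored & ~KSM_FLAG_MASK;

	printf("stored=%#lx usable=%#lx\n", stored, addr);
	return 0;
}
```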