Message ID | 20200414062050.66644-1-songmuchun@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Series | mm/ksm: Fix kernel NULL pointer dereference at 0000000000000040 |
On 14.04.20 08:20, Muchun Song wrote:
> The find_mergeable_vma can return NULL. In this case, it leads
> to crash when we access vma->vm_mm(which's offset is 0x40) in
> write_protect_page. And this case did happen on our server. The
> following calltrace is captured in kernel 4.19 with ksm enabled.
> So add a vma check to fix it.
>
> --------------------------------------------------------------------------
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP NOPTI
> CPU: 9 PID: 510 Comm: ksmd Kdump: loaded Tainted: G OE 4.19.36.bsk.9-amd64 #4.19.36.bsk.9
> Hardware name: FOXCONN R-5111/GROOT, BIOS IC1B111F 08/17/2019
> RIP: 0010:try_to_merge_one_page+0xc7/0x760
> Code: 24 58 65 48 33 34 25 28 00 00 00 89 e8 0f 85 a3 06 00 00 48 83 c4
> 60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b 46 08 a8 01 75 b8 <49>
> 8b 44 24 40 4c 8d 7c 24 20 b9 07 00 00 00 4c 89 e6 4c 89 ff 48
> RSP: 0018:ffffadbdd9fffdb0 EFLAGS: 00010246
> RAX: ffffda83ffd4be08 RBX: ffffda83ffd4be40 RCX: 0000002c6e800000
> RDX: 0000000000000000 RSI: ffffda83ffd4be40 RDI: 0000000000000000
> RBP: ffffa11939f02ec0 R08: 0000000094e1a447 R09: 00000000abe76577
> R10: 0000000000000962 R11: 0000000000004e6a R12: 0000000000000000
> R13: ffffda83b1e06380 R14: ffffa18f31f072c0 R15: ffffda83ffd4be40
> FS: 0000000000000000(0000) GS:ffffa0da43b80000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000040 CR3: 0000002c77c0a003 CR4: 00000000007626e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
> ? follow_page_pte+0x36d/0x5e0
> ksm_scan_thread+0x115e/0x1960
> ? remove_wait_queue+0x60/0x60
> kthread+0xf5/0x130
> ? try_to_merge_with_ksm_page+0x90/0x90
> ? kthread_create_worker_on_cpu+0x70/0x70
> ret_from_fork+0x1f/0x30
> --------------------------------------------------------------------------
>
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Signed-off-by: Xiongchun duan <duanxiongchun@bytedance.com>

^ why this signed-off ?

> ---
> mm/ksm.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/mm/ksm.c b/mm/ksm.c
> index a558da9e71770..69b2f85e22d5b 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -2112,8 +2112,11 @@ static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item)
>
>  		down_read(&mm->mmap_sem);
>  		vma = find_mergeable_vma(mm, rmap_item->address);
> -		err = try_to_merge_one_page(vma, page,
> -				ZERO_PAGE(rmap_item->address));
> +		if (vma)
> +			err = try_to_merge_one_page(vma, page,
> +					ZERO_PAGE(rmap_item->address));
> +		else
> +			err = -EFAULT;
>  		up_read(&mm->mmap_sem);
>  		/*
>  		 * In case of failure, the page was not really empty, so we
>

Reviewed-by: David Hildenbrand <david@redhat.com>

On Tue, Apr 14, 2020 at 7:48 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 14.04.20 08:20, Muchun Song wrote:
[...]
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > Signed-off-by: Xiongchun duan <duanxiongchun@bytedance.com>
>
> ^ why this signed-off ?
>

Because I have a partner. And I just sent the v2 patch which updates the
commit message and patch subject. Thanks for your review.

[...]
> Reviewed-by: David Hildenbrand <david@redhat.com>
>
> --
> Thanks,
>
> David / dhildenb
>

On 14.04.20 14:09, Muchun Song wrote:
> On Tue, Apr 14, 2020 at 7:48 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 14.04.20 08:20, Muchun Song wrote:
[...]
>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>> Signed-off-by: Xiongchun duan <duanxiongchun@bytedance.com>
>>
>> ^ why this signed-off ?
>>
>
> Because I have a partner. And I just sent the v2 patch which updates the
> commit message and patch subject. Thanks for your review.

Then we use Co-developed-by AFAIK instead.

On Tue, Apr 14, 2020 at 8:23 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 14.04.20 14:09, Muchun Song wrote:
[...]
> >>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> >>> Signed-off-by: Xiongchun duan <duanxiongchun@bytedance.com>
> >>
> >> ^ why this signed-off ?
> >>
> >
> > Because I have a partner. And I just sent the v2 patch which updates the
> > commit message and patch subject. Thanks for your review.
>
> Then we use Co-developed-by AFAIK instead.
>

Thanks a lot. I will fix it.

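For reference, the tag layout David is pointing at is described in Documentation/process/submitting-patches.rst: each Co-developed-by line is immediately followed by that co-author's Signed-off-by, and the last Signed-off-by belongs to whoever submits the patch. The v2 posting is not part of this thread, so the block below is only a sketch of how the trailers could look under that rule, using the names from the v1 posting as-is.

```text
Co-developed-by: Xiongchun duan <duanxiongchun@bytedance.com>
Signed-off-by: Xiongchun duan <duanxiongchun@bytedance.com>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
```
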
diff --git a/mm/ksm.c b/mm/ksm.c
index a558da9e71770..69b2f85e22d5b 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2112,8 +2112,11 @@ static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item)
 
 		down_read(&mm->mmap_sem);
 		vma = find_mergeable_vma(mm, rmap_item->address);
-		err = try_to_merge_one_page(vma, page,
-				ZERO_PAGE(rmap_item->address));
+		if (vma)
+			err = try_to_merge_one_page(vma, page,
+					ZERO_PAGE(rmap_item->address));
+		else
+			err = -EFAULT;
 		up_read(&mm->mmap_sem);
 		/*
 		 * In case of failure, the page was not really empty, so we
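
A note on why the oops reports address 0x40 rather than 0: the faulting bytes in the Code: line, `<49> 8b 44 24 40`, decode to `mov 0x40(%r12),%rax`, and the register dump shows R12 as zero, so the load lands exactly at the member offset the commit message gives for vm_mm. The userspace sketch below reproduces that arithmetic and the shape of the guard. It is not kernel code: `struct vma_stub`, `struct mm_stub`, `vm_mm_load_address()` and `merge_if_mapped()` are hypothetical names, and the 0x40 bytes of padding are an assumption made only to match the offset quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mm_struct; contents are irrelevant here. */
struct mm_stub {
	int dummy;
};

/*
 * Hypothetical stand-in for struct vm_area_struct. Only the layout matters:
 * the padding is an assumption chosen so that vm_mm sits at offset 0x40,
 * the offset stated in the commit message. The real structure differs.
 */
struct vma_stub {
	unsigned char pad[0x40];	/* members that precede vm_mm (assumed) */
	struct mm_stub *vm_mm;
};

/* Address that a load of vma->vm_mm would touch, without dereferencing. */
static uintptr_t vm_mm_load_address(const struct vma_stub *vma)
{
	return (uintptr_t)vma + offsetof(struct vma_stub, vm_mm);
}

/*
 * Same shape as the fix: touch the vma only when the lookup succeeded,
 * otherwise report -EFAULT as the patch does.
 */
static int merge_if_mapped(const struct vma_stub *vma)
{
	struct mm_stub *mm;

	if (!vma)
		return -EFAULT;

	mm = vma->vm_mm;	/* safe: vma is known to be non-NULL here */
	(void)mm;
	return 0;
}
```

With a NULL `vma`, `vm_mm_load_address()` yields 0x40 on mainstream platforms, which is the value the oops reports in CR2, and `merge_if_mapped()` returns `-EFAULT` instead of performing that load.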