mm/ksm: Fix kernel NULL pointer dereference at 0000000000000040

Message ID 20200414062050.66644-1-songmuchun@bytedance.com (mailing list archive)
State New, archived
Series mm/ksm: Fix kernel NULL pointer dereference at 0000000000000040

Commit Message

Muchun Song April 14, 2020, 6:20 a.m. UTC
find_mergeable_vma() can return NULL. When it does, the kernel crashes
on the access to vma->vm_mm (whose offset is 0x40) in
write_protect_page(). This did happen on our server. The following
call trace was captured on kernel 4.19 with KSM enabled.
Add a NULL check on vma to fix it.

--------------------------------------------------------------------------
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP NOPTI
  CPU: 9 PID: 510 Comm: ksmd Kdump: loaded Tainted: G OE 4.19.36.bsk.9-amd64 #4.19.36.bsk.9
  Hardware name: FOXCONN R-5111/GROOT, BIOS IC1B111F 08/17/2019
  RIP: 0010:try_to_merge_one_page+0xc7/0x760
  Code: 24 58 65 48 33 34 25 28 00 00 00 89 e8 0f 85 a3 06 00 00 48 83 c4
        60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b 46 08 a8 01 75 b8 <49>
        8b 44 24 40 4c 8d 7c 24 20 b9 07 00 00 00 4c 89 e6 4c 89 ff 48
  RSP: 0018:ffffadbdd9fffdb0 EFLAGS: 00010246
  RAX: ffffda83ffd4be08 RBX: ffffda83ffd4be40 RCX: 0000002c6e800000
  RDX: 0000000000000000 RSI: ffffda83ffd4be40 RDI: 0000000000000000
  RBP: ffffa11939f02ec0 R08: 0000000094e1a447 R09: 00000000abe76577
  R10: 0000000000000962 R11: 0000000000004e6a R12: 0000000000000000
  R13: ffffda83b1e06380 R14: ffffa18f31f072c0 R15: ffffda83ffd4be40
  FS: 0000000000000000(0000) GS:ffffa0da43b80000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000040 CR3: 0000002c77c0a003 CR4: 00000000007626e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
    ? follow_page_pte+0x36d/0x5e0
    ksm_scan_thread+0x115e/0x1960
    ? remove_wait_queue+0x60/0x60
    kthread+0xf5/0x130
    ? try_to_merge_with_ksm_page+0x90/0x90
    ? kthread_create_worker_on_cpu+0x70/0x70
    ret_from_fork+0x1f/0x30
--------------------------------------------------------------------------

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Xiongchun duan <duanxiongchun@bytedance.com>
---
 mm/ksm.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

David Hildenbrand April 14, 2020, 11:48 a.m. UTC | #1
On 14.04.20 08:20, Muchun Song wrote:
> The find_mergeable_vma can return NULL. In this case, it leads
> to crash when we access vma->vm_mm(which's offset is 0x40) in
> write_protect_page. And this case did happen on our server. The
> following calltrace is captured in kernel 4.19 with ksm enabled.
> So add a vma check to fix it.
> 
> --------------------------------------------------------------------------
>   BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
>   PGD 0 P4D 0
>   Oops: 0000 [#1] SMP NOPTI
>   CPU: 9 PID: 510 Comm: ksmd Kdump: loaded Tainted: G OE 4.19.36.bsk.9-amd64 #4.19.36.bsk.9
>   Hardware name: FOXCONN R-5111/GROOT, BIOS IC1B111F 08/17/2019
>   RIP: 0010:try_to_merge_one_page+0xc7/0x760
>   Code: 24 58 65 48 33 34 25 28 00 00 00 89 e8 0f 85 a3 06 00 00 48 83 c4
>         60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b 46 08 a8 01 75 b8 <49>
>         8b 44 24 40 4c 8d 7c 24 20 b9 07 00 00 00 4c 89 e6 4c 89 ff 48
>   RSP: 0018:ffffadbdd9fffdb0 EFLAGS: 00010246
>   RAX: ffffda83ffd4be08 RBX: ffffda83ffd4be40 RCX: 0000002c6e800000
>   RDX: 0000000000000000 RSI: ffffda83ffd4be40 RDI: 0000000000000000
>   RBP: ffffa11939f02ec0 R08: 0000000094e1a447 R09: 00000000abe76577
>   R10: 0000000000000962 R11: 0000000000004e6a R12: 0000000000000000
>   R13: ffffda83b1e06380 R14: ffffa18f31f072c0 R15: ffffda83ffd4be40
>   FS: 0000000000000000(0000) GS:ffffa0da43b80000(0000) knlGS:0000000000000000
>   CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 0000000000000040 CR3: 0000002c77c0a003 CR4: 00000000007626e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   PKRU: 55555554
>   Call Trace:
>     ? follow_page_pte+0x36d/0x5e0
>     ksm_scan_thread+0x115e/0x1960
>     ? remove_wait_queue+0x60/0x60
>     kthread+0xf5/0x130
>     ? try_to_merge_with_ksm_page+0x90/0x90
>     ? kthread_create_worker_on_cpu+0x70/0x70
>     ret_from_fork+0x1f/0x30
> --------------------------------------------------------------------------
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Signed-off-by: Xiongchun duan <duanxiongchun@bytedance.com>

^ Why is this Signed-off-by here?

> ---
>  mm/ksm.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/ksm.c b/mm/ksm.c
> index a558da9e71770..69b2f85e22d5b 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -2112,8 +2112,11 @@ static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item)
>  
>  		down_read(&mm->mmap_sem);
>  		vma = find_mergeable_vma(mm, rmap_item->address);
> -		err = try_to_merge_one_page(vma, page,
> -					    ZERO_PAGE(rmap_item->address));
> +		if (vma)
> +			err = try_to_merge_one_page(vma, page,
> +					ZERO_PAGE(rmap_item->address));
> +		else
> +			err = -EFAULT;
>  		up_read(&mm->mmap_sem);
>  		/*
>  		 * In case of failure, the page was not really empty, so we
> 

Reviewed-by: David Hildenbrand <david@redhat.com>

Muchun Song April 14, 2020, 12:09 p.m. UTC | #2
On Tue, Apr 14, 2020 at 7:48 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 14.04.20 08:20, Muchun Song wrote:
> > The find_mergeable_vma can return NULL. In this case, it leads
> > to crash when we access vma->vm_mm(which's offset is 0x40) in
> > write_protect_page. And this case did happen on our server. The
> > following calltrace is captured in kernel 4.19 with ksm enabled.
> > So add a vma check to fix it.
> >
> > --------------------------------------------------------------------------
> >   BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
> >   PGD 0 P4D 0
> >   Oops: 0000 [#1] SMP NOPTI
> >   CPU: 9 PID: 510 Comm: ksmd Kdump: loaded Tainted: G OE 4.19.36.bsk.9-amd64 #4.19.36.bsk.9
> >   Hardware name: FOXCONN R-5111/GROOT, BIOS IC1B111F 08/17/2019
> >   RIP: 0010:try_to_merge_one_page+0xc7/0x760
> >   Code: 24 58 65 48 33 34 25 28 00 00 00 89 e8 0f 85 a3 06 00 00 48 83 c4
> >         60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b 46 08 a8 01 75 b8 <49>
> >         8b 44 24 40 4c 8d 7c 24 20 b9 07 00 00 00 4c 89 e6 4c 89 ff 48
> >   RSP: 0018:ffffadbdd9fffdb0 EFLAGS: 00010246
> >   RAX: ffffda83ffd4be08 RBX: ffffda83ffd4be40 RCX: 0000002c6e800000
> >   RDX: 0000000000000000 RSI: ffffda83ffd4be40 RDI: 0000000000000000
> >   RBP: ffffa11939f02ec0 R08: 0000000094e1a447 R09: 00000000abe76577
> >   R10: 0000000000000962 R11: 0000000000004e6a R12: 0000000000000000
> >   R13: ffffda83b1e06380 R14: ffffa18f31f072c0 R15: ffffda83ffd4be40
> >   FS: 0000000000000000(0000) GS:ffffa0da43b80000(0000) knlGS:0000000000000000
> >   CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >   CR2: 0000000000000040 CR3: 0000002c77c0a003 CR4: 00000000007626e0
> >   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >   PKRU: 55555554
> >   Call Trace:
> >     ? follow_page_pte+0x36d/0x5e0
> >     ksm_scan_thread+0x115e/0x1960
> >     ? remove_wait_queue+0x60/0x60
> >     kthread+0xf5/0x130
> >     ? try_to_merge_with_ksm_page+0x90/0x90
> >     ? kthread_create_worker_on_cpu+0x70/0x70
> >     ret_from_fork+0x1f/0x30
> > --------------------------------------------------------------------------
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > Signed-off-by: Xiongchun duan <duanxiongchun@bytedance.com>
>
> ^ why this signed-off ?
>

Because I have a partner. I have also just sent a v2 patch that updates
the commit message and the patch subject. Thanks for your review.

> > ---
> >  mm/ksm.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/ksm.c b/mm/ksm.c
> > index a558da9e71770..69b2f85e22d5b 100644
> > --- a/mm/ksm.c
> > +++ b/mm/ksm.c
> > @@ -2112,8 +2112,11 @@ static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item)
> >
> >               down_read(&mm->mmap_sem);
> >               vma = find_mergeable_vma(mm, rmap_item->address);
> > -             err = try_to_merge_one_page(vma, page,
> > -                                         ZERO_PAGE(rmap_item->address));
> > +             if (vma)
> > +                     err = try_to_merge_one_page(vma, page,
> > +                                     ZERO_PAGE(rmap_item->address));
> > +             else
> > +                     err = -EFAULT;
> >               up_read(&mm->mmap_sem);
> >               /*
> >                * In case of failure, the page was not really empty, so we
> >
>
> Reviewed-by: David Hildenbrand <david@redhat.com>
>
> --
> Thanks,
>
> David / dhildenb
>

David Hildenbrand April 14, 2020, 12:23 p.m. UTC | #3
On 14.04.20 14:09, Muchun Song wrote:
> On Tue, Apr 14, 2020 at 7:48 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 14.04.20 08:20, Muchun Song wrote:
>>> The find_mergeable_vma can return NULL. In this case, it leads
>>> to crash when we access vma->vm_mm(which's offset is 0x40) in
>>> write_protect_page. And this case did happen on our server. The
>>> following calltrace is captured in kernel 4.19 with ksm enabled.
>>> So add a vma check to fix it.
>>>
>>> --------------------------------------------------------------------------
>>>   BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
>>>   PGD 0 P4D 0
>>>   Oops: 0000 [#1] SMP NOPTI
>>>   CPU: 9 PID: 510 Comm: ksmd Kdump: loaded Tainted: G OE 4.19.36.bsk.9-amd64 #4.19.36.bsk.9
>>>   Hardware name: FOXCONN R-5111/GROOT, BIOS IC1B111F 08/17/2019
>>>   RIP: 0010:try_to_merge_one_page+0xc7/0x760
>>>   Code: 24 58 65 48 33 34 25 28 00 00 00 89 e8 0f 85 a3 06 00 00 48 83 c4
>>>         60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b 46 08 a8 01 75 b8 <49>
>>>         8b 44 24 40 4c 8d 7c 24 20 b9 07 00 00 00 4c 89 e6 4c 89 ff 48
>>>   RSP: 0018:ffffadbdd9fffdb0 EFLAGS: 00010246
>>>   RAX: ffffda83ffd4be08 RBX: ffffda83ffd4be40 RCX: 0000002c6e800000
>>>   RDX: 0000000000000000 RSI: ffffda83ffd4be40 RDI: 0000000000000000
>>>   RBP: ffffa11939f02ec0 R08: 0000000094e1a447 R09: 00000000abe76577
>>>   R10: 0000000000000962 R11: 0000000000004e6a R12: 0000000000000000
>>>   R13: ffffda83b1e06380 R14: ffffa18f31f072c0 R15: ffffda83ffd4be40
>>>   FS: 0000000000000000(0000) GS:ffffa0da43b80000(0000) knlGS:0000000000000000
>>>   CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>   CR2: 0000000000000040 CR3: 0000002c77c0a003 CR4: 00000000007626e0
>>>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>   PKRU: 55555554
>>>   Call Trace:
>>>     ? follow_page_pte+0x36d/0x5e0
>>>     ksm_scan_thread+0x115e/0x1960
>>>     ? remove_wait_queue+0x60/0x60
>>>     kthread+0xf5/0x130
>>>     ? try_to_merge_with_ksm_page+0x90/0x90
>>>     ? kthread_create_worker_on_cpu+0x70/0x70
>>>     ret_from_fork+0x1f/0x30
>>> --------------------------------------------------------------------------
>>>
>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>> Signed-off-by: Xiongchun duan <duanxiongchun@bytedance.com>
>>
>> ^ why this signed-off ?
>>
> 
> Because I have a partner. And I just sent the v2 patch which updates the
> commit message and patch subject. Thanks for your review.

In that case, AFAIK, we use Co-developed-by instead.

Muchun Song April 14, 2020, 1:01 p.m. UTC | #4
On Tue, Apr 14, 2020 at 8:23 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 14.04.20 14:09, Muchun Song wrote:
> > On Tue, Apr 14, 2020 at 7:48 PM David Hildenbrand <david@redhat.com> wrote:
> >>
> >> On 14.04.20 08:20, Muchun Song wrote:
> >>> The find_mergeable_vma can return NULL. In this case, it leads
> >>> to crash when we access vma->vm_mm(which's offset is 0x40) in
> >>> write_protect_page. And this case did happen on our server. The
> >>> following calltrace is captured in kernel 4.19 with ksm enabled.
> >>> So add a vma check to fix it.
> >>>
> >>> --------------------------------------------------------------------------
> >>>   BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
> >>>   PGD 0 P4D 0
> >>>   Oops: 0000 [#1] SMP NOPTI
> >>>   CPU: 9 PID: 510 Comm: ksmd Kdump: loaded Tainted: G OE 4.19.36.bsk.9-amd64 #4.19.36.bsk.9
> >>>   Hardware name: FOXCONN R-5111/GROOT, BIOS IC1B111F 08/17/2019
> >>>   RIP: 0010:try_to_merge_one_page+0xc7/0x760
> >>>   Code: 24 58 65 48 33 34 25 28 00 00 00 89 e8 0f 85 a3 06 00 00 48 83 c4
> >>>         60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b 46 08 a8 01 75 b8 <49>
> >>>         8b 44 24 40 4c 8d 7c 24 20 b9 07 00 00 00 4c 89 e6 4c 89 ff 48
> >>>   RSP: 0018:ffffadbdd9fffdb0 EFLAGS: 00010246
> >>>   RAX: ffffda83ffd4be08 RBX: ffffda83ffd4be40 RCX: 0000002c6e800000
> >>>   RDX: 0000000000000000 RSI: ffffda83ffd4be40 RDI: 0000000000000000
> >>>   RBP: ffffa11939f02ec0 R08: 0000000094e1a447 R09: 00000000abe76577
> >>>   R10: 0000000000000962 R11: 0000000000004e6a R12: 0000000000000000
> >>>   R13: ffffda83b1e06380 R14: ffffa18f31f072c0 R15: ffffda83ffd4be40
> >>>   FS: 0000000000000000(0000) GS:ffffa0da43b80000(0000) knlGS:0000000000000000
> >>>   CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>   CR2: 0000000000000040 CR3: 0000002c77c0a003 CR4: 00000000007626e0
> >>>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >>>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >>>   PKRU: 55555554
> >>>   Call Trace:
> >>>     ? follow_page_pte+0x36d/0x5e0
> >>>     ksm_scan_thread+0x115e/0x1960
> >>>     ? remove_wait_queue+0x60/0x60
> >>>     kthread+0xf5/0x130
> >>>     ? try_to_merge_with_ksm_page+0x90/0x90
> >>>     ? kthread_create_worker_on_cpu+0x70/0x70
> >>>     ret_from_fork+0x1f/0x30
> >>> --------------------------------------------------------------------------
> >>>
> >>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> >>> Signed-off-by: Xiongchun duan <duanxiongchun@bytedance.com>
> >>
> >> ^ why this signed-off ?
> >>
> >
> > Because I have a partner. And I just sent the v2 patch which updates the
> > commit message and patch subject. Thanks for your review.
>
> Then we use Co-developed-by AFAIK instead.
>

Thanks a lot. I will fix it.

Patch

diff --git a/mm/ksm.c b/mm/ksm.c
index a558da9e71770..69b2f85e22d5b 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2112,8 +2112,11 @@  static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item)
 
 		down_read(&mm->mmap_sem);
 		vma = find_mergeable_vma(mm, rmap_item->address);
-		err = try_to_merge_one_page(vma, page,
-					    ZERO_PAGE(rmap_item->address));
+		if (vma)
+			err = try_to_merge_one_page(vma, page,
+					ZERO_PAGE(rmap_item->address));
+		else
+			err = -EFAULT;
 		up_read(&mm->mmap_sem);
 		/*
 		 * In case of failure, the page was not really empty, so we