Message ID | 20241120201151.9518-1-david@redhat.com (mailing list archive) |
---|---|
State | New |
Series | [v1] mm/mempolicy: fix migrate_to_node() assuming there is at least one VMA in a MM |
* David Hildenbrand <david@redhat.com> [241120 15:12]:
> We currently assume that there is at least one VMA in a MM, which isn't
> true.
>
> So we might end up having find_vma() return NULL, to then de-reference
> NULL. So properly handle find_vma() returning NULL.
>
> This fixes the report:
>
> Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
> KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
> CPU: 1 UID: 0 PID: 6021 Comm: syz-executor284 Not tainted 6.12.0-rc7-syzkaller-00187-gf868cd251776 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024
> RIP: 0010:migrate_to_node mm/mempolicy.c:1090 [inline]
> RIP: 0010:do_migrate_pages+0x403/0x6f0 mm/mempolicy.c:1194
> Code: ...
> RSP: 0018:ffffc9000375fd08 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffffc9000375fd78 RCX: 0000000000000000
> RDX: ffff88807e171300 RSI: dffffc0000000000 RDI: ffff88803390c044
> RBP: ffff88807e171428 R08: 0000000000000014 R09: fffffbfff2039ef1
> R10: ffffffff901cf78f R11: 0000000000000000 R12: 0000000000000003
> R13: ffffc9000375fe90 R14: ffffc9000375fe98 R15: ffffc9000375fdf8
> FS: 00005555919e1380(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00005555919e1ca8 CR3: 000000007f12a000 CR4: 00000000003526f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  kernel_migrate_pages+0x5b2/0x750 mm/mempolicy.c:1709
>  __do_sys_migrate_pages mm/mempolicy.c:1727 [inline]
>  __se_sys_migrate_pages mm/mempolicy.c:1723 [inline]
>  __x64_sys_migrate_pages+0x96/0x100 mm/mempolicy.c:1723
>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> Fixes: 39743889aaf7 ("[PATCH] Swap Migration V5: sys_migrate_pages interface")
> Reported-by: syzbot+3511625422f7aa637f0d@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/lkml/673d2696.050a0220.3c9d61.012f.GAE@google.com/T/
> Cc: <stable@vger.kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

I hate the extra check because syzbot can cause this as this should
basically never happen in real life, but it seems we have to add it.

I wonder where else this could show up.

Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>

> ---
>  mm/mempolicy.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index b646fab3e45e1..fbb6127e4595a 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1080,6 +1080,10 @@ static long migrate_to_node(struct mm_struct *mm, int source, int dest,
>
>  	mmap_read_lock(mm);
>  	vma = find_vma(mm, 0);
> +	if (!vma) {
> +		mmap_read_unlock(mm);
> +		return 0;
> +	}
>
>  	/*
>  	 * This does not migrate the range, but isolates all pages that
> --
> 2.47.0
>
On 20.11.24 21:27, Liam R. Howlett wrote:
> * David Hildenbrand <david@redhat.com> [241120 15:12]:
>> We currently assume that there is at least one VMA in a MM, which isn't
>> true.
>>
>> So we might end up having find_vma() return NULL, to then de-reference
>> NULL. So properly handle find_vma() returning NULL.
>>
>> This fixes the report:
>>
>> Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
>> KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
>> CPU: 1 UID: 0 PID: 6021 Comm: syz-executor284 Not tainted 6.12.0-rc7-syzkaller-00187-gf868cd251776 #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024
>> RIP: 0010:migrate_to_node mm/mempolicy.c:1090 [inline]
>> RIP: 0010:do_migrate_pages+0x403/0x6f0 mm/mempolicy.c:1194
>> Code: ...
>> RSP: 0018:ffffc9000375fd08 EFLAGS: 00010246
>> RAX: 0000000000000000 RBX: ffffc9000375fd78 RCX: 0000000000000000
>> RDX: ffff88807e171300 RSI: dffffc0000000000 RDI: ffff88803390c044
>> RBP: ffff88807e171428 R08: 0000000000000014 R09: fffffbfff2039ef1
>> R10: ffffffff901cf78f R11: 0000000000000000 R12: 0000000000000003
>> R13: ffffc9000375fe90 R14: ffffc9000375fe98 R15: ffffc9000375fdf8
>> FS: 00005555919e1380(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00005555919e1ca8 CR3: 000000007f12a000 CR4: 00000000003526f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>>  <TASK>
>>  kernel_migrate_pages+0x5b2/0x750 mm/mempolicy.c:1709
>>  __do_sys_migrate_pages mm/mempolicy.c:1727 [inline]
>>  __se_sys_migrate_pages mm/mempolicy.c:1723 [inline]
>>  __x64_sys_migrate_pages+0x96/0x100 mm/mempolicy.c:1723
>>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>>  do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
>>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>
>> Fixes: 39743889aaf7 ("[PATCH] Swap Migration V5: sys_migrate_pages interface")
>> Reported-by: syzbot+3511625422f7aa637f0d@syzkaller.appspotmail.com
>> Closes: https://lore.kernel.org/lkml/673d2696.050a0220.3c9d61.012f.GAE@google.com/T/
>> Cc: <stable@vger.kernel.org>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Christoph Lameter <cl@linux.com>
>> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>
> I hate the extra check because syzbot can cause this as this should
> basically never happen in real life, but it seems we have to add it.

I think the reproducer achieves it by doing an MADV_DONTFORK on all VMAs
and then fork'ing. Likely it doesn't make sense to have a new MM without
any VMAs, because it cannot do anything reasonable.

But then, I'm not 100% sure if there are other creative ways to
obtain/achieve the same.

$ git grep "find_vma(mm, 0)"
mm/mempolicy.c: vma = find_vma(mm, 0);

Apart from that there seems to be kernel/bpf/task_iter.c where we do

curr_vma = find_vma(curr_mm, 0);

and properly check for NULL later.

So this one sticks out, and is not on anything that I consider a fast
path ... easy fix. :)

Thanks!
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index b646fab3e45e1..fbb6127e4595a 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1080,6 +1080,10 @@ static long migrate_to_node(struct mm_struct *mm, int source, int dest,
 
 	mmap_read_lock(mm);
 	vma = find_vma(mm, 0);
+	if (!vma) {
+		mmap_read_unlock(mm);
+		return 0;
+	}
 
 	/*
 	 * This does not migrate the range, but isolates all pages that
We currently assume that there is at least one VMA in a MM, which isn't
true.

So we might end up having find_vma() return NULL, to then de-reference
NULL. So properly handle find_vma() returning NULL.

This fixes the report:

Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 UID: 0 PID: 6021 Comm: syz-executor284 Not tainted 6.12.0-rc7-syzkaller-00187-gf868cd251776 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024
RIP: 0010:migrate_to_node mm/mempolicy.c:1090 [inline]
RIP: 0010:do_migrate_pages+0x403/0x6f0 mm/mempolicy.c:1194
Code: ...
RSP: 0018:ffffc9000375fd08 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffc9000375fd78 RCX: 0000000000000000
RDX: ffff88807e171300 RSI: dffffc0000000000 RDI: ffff88803390c044
RBP: ffff88807e171428 R08: 0000000000000014 R09: fffffbfff2039ef1
R10: ffffffff901cf78f R11: 0000000000000000 R12: 0000000000000003
R13: ffffc9000375fe90 R14: ffffc9000375fe98 R15: ffffc9000375fdf8
FS: 00005555919e1380(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005555919e1ca8 CR3: 000000007f12a000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 kernel_migrate_pages+0x5b2/0x750 mm/mempolicy.c:1709
 __do_sys_migrate_pages mm/mempolicy.c:1727 [inline]
 __se_sys_migrate_pages mm/mempolicy.c:1723 [inline]
 __x64_sys_migrate_pages+0x96/0x100 mm/mempolicy.c:1723
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: 39743889aaf7 ("[PATCH] Swap Migration V5: sys_migrate_pages interface")
Reported-by: syzbot+3511625422f7aa637f0d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/673d2696.050a0220.3c9d61.012f.GAE@google.com/T/
Cc: <stable@vger.kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/mempolicy.c | 4 ++++
 1 file changed, 4 insertions(+)
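
The pattern described in the commit message can be modelled in user space. The following is a minimal sketch under stated assumptions, not kernel code: struct mm_struct, struct vm_area_struct, find_vma() and the mmap_read_lock()/mmap_read_unlock() pair are simplified stand-ins (a linked list and a lock counter), and migrate_to_node_model() is a hypothetical name. It only shows that the early-return path has to drop the lock before returning 0 when the MM has no VMA.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumptions). */
struct vm_area_struct {
	unsigned long vm_start;
	unsigned long vm_end;
	struct vm_area_struct *vm_next;
};

struct mm_struct {
	struct vm_area_struct *vmas;	/* sorted by address, NULL if empty */
	int read_locks;			/* stand-in for mmap_lock */
};

void mmap_read_lock(struct mm_struct *mm)
{
	mm->read_locks++;
}

void mmap_read_unlock(struct mm_struct *mm)
{
	assert(mm->read_locks > 0);	/* catch unbalanced unlocks */
	mm->read_locks--;
}

/* First VMA whose end lies above addr, or NULL if there is none. */
struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
{
	struct vm_area_struct *vma;

	for (vma = mm->vmas; vma; vma = vma->vm_next)
		if (vma->vm_end > addr)
			return vma;
	return NULL;
}

/*
 * Model of the fixed control flow: counts the VMAs it walks instead of
 * migrating pages, and returns 0 early when the MM is empty.
 */
long migrate_to_node_model(struct mm_struct *mm)
{
	struct vm_area_struct *vma;
	long visited = 0;

	mmap_read_lock(mm);
	vma = find_vma(mm, 0);
	if (!vma) {
		/* Nothing to migrate; the lock must still be released. */
		mmap_read_unlock(mm);
		return 0;
	}

	/* vma is non-NULL here, so dereferencing it is safe. */
	for (; vma; vma = vma->vm_next)
		visited++;

	mmap_read_unlock(mm);
	return visited;
}
```

Calling migrate_to_node_model() on an mm_struct whose vmas pointer is NULL returns 0 and leaves read_locks at zero, which is the property the four added lines in the patch provide.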