Message ID | 20191029042059.28541-5-dja@axtens.net (mailing list archive) |
---|---|
State | New, archived |
Series | kasan: support backing vmalloc space with real shadow memory |
On 10/29/19 7:20 AM, Daniel Axtens wrote:
> In the case where KASAN directly allocates memory to back vmalloc
> space, don't map the early shadow page over it.
>
> We prepopulate pgds/p4ds for the range that would otherwise be empty.
> This is required to get it synced to hardware on boot, allowing the
> lower levels of the page tables to be filled dynamically.
>
> Acked-by: Dmitry Vyukov <dvyukov@google.com>
> Signed-off-by: Daniel Axtens <dja@axtens.net>
>
> ---

> +static void __init kasan_shallow_populate_pgds(void *start, void *end)
> +{
> +	unsigned long addr, next;
> +	pgd_t *pgd;
> +	void *p;
> +	int nid = early_pfn_to_nid((unsigned long)start);

This doesn't make sense. start is not even a pfn. With linear mapping
we try to identify nid to have the shadow on the same node as memory. But
in this case we don't have memory or the corresponding shadow (yet),
we only install pgd/p4d.
I guess we could just use NUMA_NO_NODE.

The rest looks ok, so with that fixed:

Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
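The objection above turns on the difference between an address and a page frame number (pfn). Here is a minimal user-space sketch of that difference, assuming 4 KiB pages. `PAGE_SHIFT`, `PFN_DOWN` and `PFN_PHYS` are re-declared locally as stand-ins for the kernel macros, and `phys_to_pfn()` is a hypothetical helper used only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86-64. */
#define PAGE_SHIFT 12

/* Local stand-ins for the kernel macros of the same names. */
#define PFN_DOWN(x) ((uint64_t)(x) >> PAGE_SHIFT) /* address -> pfn */
#define PFN_PHYS(x) ((uint64_t)(x) << PAGE_SHIFT) /* pfn -> address of page start */

/* Hypothetical helper: pfn of the page that contains phys_addr. */
static uint64_t phys_to_pfn(uint64_t phys_addr)
{
	uint64_t pfn = PFN_DOWN(phys_addr);

	/* The page found must actually contain the address. */
	assert(PFN_PHYS(pfn) <= phys_addr);
	return pfn;
}
```

An interface that expects a pfn but is handed a raw address sees a value 4096 times too large, which is why a lookup keyed on it cannot return a meaningful node.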
Andrey Ryabinin <aryabinin@virtuozzo.com> writes:

> On 10/29/19 7:20 AM, Daniel Axtens wrote:
>> In the case where KASAN directly allocates memory to back vmalloc
>> space, don't map the early shadow page over it.
>>
>> We prepopulate pgds/p4ds for the range that would otherwise be empty.
>> This is required to get it synced to hardware on boot, allowing the
>> lower levels of the page tables to be filled dynamically.
>>
>> Acked-by: Dmitry Vyukov <dvyukov@google.com>
>> Signed-off-by: Daniel Axtens <dja@axtens.net>
>>
>> ---
>
>> +static void __init kasan_shallow_populate_pgds(void *start, void *end)
>> +{
>> +	unsigned long addr, next;
>> +	pgd_t *pgd;
>> +	void *p;
>> +	int nid = early_pfn_to_nid((unsigned long)start);
>
> This doesn't make sense. start is not even a pfn. With linear mapping
> we try to identify nid to have the shadow on the same node as memory. But
> in this case we don't have memory or the corresponding shadow (yet),
> we only install pgd/p4d.
> I guess we could just use NUMA_NO_NODE.

Ah wow, that's quite the clanger on my part.

There are a couple of other invocations of early_pfn_to_nid in that file
that use an address directly, but at least they reference actual memory.
I'll send a separate patch to fix those up.

> The rest looks ok, so with that fixed:
>
> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>

Thanks heaps! I've fixed up the nit you identified in the first patch,
and I agree that the last patch probably isn't needed. I'll respin the
series shortly.

Regards,
Daniel
On 10/30/19 4:50 PM, Daniel Axtens wrote:
> Andrey Ryabinin <aryabinin@virtuozzo.com> writes:
>
>> On 10/29/19 7:20 AM, Daniel Axtens wrote:
>>> In the case where KASAN directly allocates memory to back vmalloc
>>> space, don't map the early shadow page over it.
>>>
>>> We prepopulate pgds/p4ds for the range that would otherwise be empty.
>>> This is required to get it synced to hardware on boot, allowing the
>>> lower levels of the page tables to be filled dynamically.
>>>
>>> Acked-by: Dmitry Vyukov <dvyukov@google.com>
>>> Signed-off-by: Daniel Axtens <dja@axtens.net>
>>>
>>> ---
>>
>>> +static void __init kasan_shallow_populate_pgds(void *start, void *end)
>>> +{
>>> +	unsigned long addr, next;
>>> +	pgd_t *pgd;
>>> +	void *p;
>>> +	int nid = early_pfn_to_nid((unsigned long)start);
>>
>> This doesn't make sense. start is not even a pfn. With linear mapping
>> we try to identify nid to have the shadow on the same node as memory. But
>> in this case we don't have memory or the corresponding shadow (yet),
>> we only install pgd/p4d.
>> I guess we could just use NUMA_NO_NODE.
>
> Ah wow, that's quite the clanger on my part.
>
> There are a couple of other invocations of early_pfn_to_nid in that file
> that use an address directly, but at least they reference actual memory.
> I'll send a separate patch to fix those up.

I see only one incorrect use, in kasan_init(): early_pfn_to_nid(__pa(_stext)).
It should be wrapped with PFN_DOWN().
The other usages in map_range() seem to be correct; range->start and
range->end are pfns.

>
>> The rest looks ok, so with that fixed:
>>
>> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
>
> Thanks heaps! I've fixed up the nit you identified in the first patch,
> and I agree that the last patch probably isn't needed. I'll respin the
> series shortly.
>

Hold on a sec, just spotted another thing to fix.

> @@ -352,9 +397,24 @@ void __init kasan_init(void)
>  	shadow_cpu_entry_end = (void *)round_up(
>  			(unsigned long)shadow_cpu_entry_end, PAGE_SIZE);
>
> +	/*
> +	 * If we're in full vmalloc mode, don't back vmalloc space with early
> +	 * shadow pages. Instead, prepopulate pgds/p4ds so they are synced to
> +	 * the global table and we can populate the lower levels on demand.
> +	 */
> +#ifdef CONFIG_KASAN_VMALLOC
> +	kasan_shallow_populate_pgds(
> +		kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),

This should be VMALLOC_START. There is no point in allocating pgds for the
hole between the linear mapping and vmalloc; it is just a waste of memory.
It makes sense to map the early shadow for that hole, because if code
dereferences an address in that hole we will see the page fault on that
address instead of a fault on the shadow.

So something like this might work:

	kasan_populate_early_shadow(
		kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
		kasan_mem_to_shadow((void *)VMALLOC_START));

	if (IS_ENABLED(CONFIG_KASAN_VMALLOC))
		kasan_shallow_populate_pgds(kasan_mem_to_shadow((void *)VMALLOC_START),
					    kasan_mem_to_shadow((void *)VMALLOC_END));
	else
		kasan_populate_early_shadow(kasan_mem_to_shadow((void *)VMALLOC_START),
					    kasan_mem_to_shadow((void *)VMALLOC_END));

	kasan_populate_early_shadow(
		kasan_mem_to_shadow((void *)VMALLOC_END + 1),
		shadow_cpu_entry_begin);
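To make the address arithmetic behind that range split concrete, here is a small user-space sketch of the memory-to-shadow translation. The constants are assumptions taken from the usual x86-64 4-level layout without KASLR (shadow scale shift 3, shadow offset 0xdffffc0000000000, vmalloc at 0xffffc90000000000 to 0xffffe8ffffffffff) and may differ on a given kernel. `mem_to_shadow()` and `shadow_size()` are illustrative stand-ins, not the kernel functions:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: x86-64, 4-level paging, KASLR disabled. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET 0xdffffc0000000000ULL
#define VMALLOC_START 0xffffc90000000000ULL
#define VMALLOC_END   0xffffe8ffffffffffULL

/* One shadow byte describes 8 bytes of memory. */
static uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Bytes of shadow needed to cover the memory range [start, end). */
static uint64_t shadow_size(uint64_t start, uint64_t end)
{
	assert(start <= end);
	return mem_to_shadow(end) - mem_to_shadow(start);
}
```

Under these assumptions the 32 TiB vmalloc area maps to 4 TiB of shadow address space, which is the region the suggestion above leaves for on-demand population instead of covering it with the early shadow page.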
Andrey Ryabinin <aryabinin@virtuozzo.com> writes:

> On 10/30/19 4:50 PM, Daniel Axtens wrote:
>> Andrey Ryabinin <aryabinin@virtuozzo.com> writes:
>>
>>> On 10/29/19 7:20 AM, Daniel Axtens wrote:
>>>> In the case where KASAN directly allocates memory to back vmalloc
>>>> space, don't map the early shadow page over it.
>>>>
>>>> We prepopulate pgds/p4ds for the range that would otherwise be empty.
>>>> This is required to get it synced to hardware on boot, allowing the
>>>> lower levels of the page tables to be filled dynamically.
>>>>
>>>> Acked-by: Dmitry Vyukov <dvyukov@google.com>
>>>> Signed-off-by: Daniel Axtens <dja@axtens.net>
>>>>
>>>> ---
>>>
>>>> +static void __init kasan_shallow_populate_pgds(void *start, void *end)
>>>> +{
>>>> +	unsigned long addr, next;
>>>> +	pgd_t *pgd;
>>>> +	void *p;
>>>> +	int nid = early_pfn_to_nid((unsigned long)start);
>>>
>>> This doesn't make sense. start is not even a pfn. With linear mapping
>>> we try to identify nid to have the shadow on the same node as memory. But
>>> in this case we don't have memory or the corresponding shadow (yet),
>>> we only install pgd/p4d.
>>> I guess we could just use NUMA_NO_NODE.
>>
>> Ah wow, that's quite the clanger on my part.
>>
>> There are a couple of other invocations of early_pfn_to_nid in that file
>> that use an address directly, but at least they reference actual memory.
>> I'll send a separate patch to fix those up.
>
> I see only one incorrect use, in kasan_init(): early_pfn_to_nid(__pa(_stext)).
> It should be wrapped with PFN_DOWN().
> The other usages in map_range() seem to be correct; range->start and
> range->end are pfns.
>

Oh, right, I didn't realise map_range was already using pfns.

>
>>
>>> The rest looks ok, so with that fixed:
>>>
>>> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
>>
>> Thanks heaps! I've fixed up the nit you identified in the first patch,
>> and I agree that the last patch probably isn't needed. I'll respin the
>> series shortly.
>>
>
> Hold on a sec, just spotted another thing to fix.
>
>> @@ -352,9 +397,24 @@ void __init kasan_init(void)
>>  	shadow_cpu_entry_end = (void *)round_up(
>>  			(unsigned long)shadow_cpu_entry_end, PAGE_SIZE);
>>
>> +	/*
>> +	 * If we're in full vmalloc mode, don't back vmalloc space with early
>> +	 * shadow pages. Instead, prepopulate pgds/p4ds so they are synced to
>> +	 * the global table and we can populate the lower levels on demand.
>> +	 */
>> +#ifdef CONFIG_KASAN_VMALLOC
>> +	kasan_shallow_populate_pgds(
>> +		kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
>
> This should be VMALLOC_START. There is no point in allocating pgds for the
> hole between the linear mapping and vmalloc; it is just a waste of memory.
> It makes sense to map the early shadow for that hole, because if code
> dereferences an address in that hole we will see the page fault on that
> address instead of a fault on the shadow.
>
> So something like this might work:
>
> 	kasan_populate_early_shadow(
> 		kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
> 		kasan_mem_to_shadow((void *)VMALLOC_START));
>
> 	if (IS_ENABLED(CONFIG_KASAN_VMALLOC))
> 		kasan_shallow_populate_pgds(kasan_mem_to_shadow((void *)VMALLOC_START),
> 					    kasan_mem_to_shadow((void *)VMALLOC_END));
> 	else
> 		kasan_populate_early_shadow(kasan_mem_to_shadow((void *)VMALLOC_START),
> 					    kasan_mem_to_shadow((void *)VMALLOC_END));
>
> 	kasan_populate_early_shadow(
> 		kasan_mem_to_shadow((void *)VMALLOC_END + 1),
> 		shadow_cpu_entry_begin);

Sounds good. It's getting late for me so I'll change and test that and
send a respin tomorrow my time.

Regards,
Daniel
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 45699e458057..d65b0fcc9bc0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -135,6 +135,7 @@ config X86
 	select HAVE_ARCH_JUMP_LABEL
 	select HAVE_ARCH_JUMP_LABEL_RELATIVE
 	select HAVE_ARCH_KASAN			if X86_64
+	select HAVE_ARCH_KASAN_VMALLOC		if X86_64
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_MMAP_RND_BITS		if MMU
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if MMU && COMPAT
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 296da58f3013..8f00f462709e 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -245,6 +245,51 @@ static void __init kasan_map_early_shadow(pgd_t *pgd)
 	} while (pgd++, addr = next, addr != end);
 }

+static void __init kasan_shallow_populate_p4ds(pgd_t *pgd,
+		unsigned long addr,
+		unsigned long end,
+		int nid)
+{
+	p4d_t *p4d;
+	unsigned long next;
+	void *p;
+
+	p4d = p4d_offset(pgd, addr);
+	do {
+		next = p4d_addr_end(addr, end);
+
+		if (p4d_none(*p4d)) {
+			p = early_alloc(PAGE_SIZE, nid, true);
+			p4d_populate(&init_mm, p4d, p);
+		}
+	} while (p4d++, addr = next, addr != end);
+}
+
+static void __init kasan_shallow_populate_pgds(void *start, void *end)
+{
+	unsigned long addr, next;
+	pgd_t *pgd;
+	void *p;
+	int nid = early_pfn_to_nid((unsigned long)start);
+
+	addr = (unsigned long)start;
+	pgd = pgd_offset_k(addr);
+	do {
+		next = pgd_addr_end(addr, (unsigned long)end);
+
+		if (pgd_none(*pgd)) {
+			p = early_alloc(PAGE_SIZE, nid, true);
+			pgd_populate(&init_mm, pgd, p);
+		}
+
+		/*
+		 * we need to populate p4ds to be synced when running in
+		 * four level mode - see sync_global_pgds_l4()
+		 */
+		kasan_shallow_populate_p4ds(pgd, addr, next, nid);
+	} while (pgd++, addr = next, addr != (unsigned long)end);
+}
+
 #ifdef CONFIG_KASAN_INLINE
 static int kasan_die_handler(struct notifier_block *self,
 			     unsigned long val,
@@ -352,9 +397,24 @@ void __init kasan_init(void)
 	shadow_cpu_entry_end = (void *)round_up(
 			(unsigned long)shadow_cpu_entry_end, PAGE_SIZE);

+	/*
+	 * If we're in full vmalloc mode, don't back vmalloc space with early
+	 * shadow pages. Instead, prepopulate pgds/p4ds so they are synced to
+	 * the global table and we can populate the lower levels on demand.
+	 */
+#ifdef CONFIG_KASAN_VMALLOC
+	kasan_shallow_populate_pgds(
+		kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
+		kasan_mem_to_shadow((void *)VMALLOC_END));
+
+	kasan_populate_early_shadow(
+		kasan_mem_to_shadow((void *)VMALLOC_END + 1),
+		shadow_cpu_entry_begin);
+#else
 	kasan_populate_early_shadow(
 		kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
 		shadow_cpu_entry_begin);
+#endif

 	kasan_populate_shadow((unsigned long)shadow_cpu_entry_begin,
 			      (unsigned long)shadow_cpu_entry_end, 0);
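The `do { ... } while (addr = next, addr != end)` walk used by the new helpers can be modelled in user space. The sketch below is not kernel code. It assumes 4-level paging (each top-level entry spans 2^39 bytes), re-implements `pgd_addr_end()` and `pgd_index()` locally as stand-ins, and uses a hypothetical `shallow_populate()` that marks slots in a plain boolean array instead of allocating page-table pages:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4-level paging, 512 top-level entries of 2^39 bytes each. */
#define PGDIR_SHIFT 39
#define PGDIR_SIZE (1ULL << PGDIR_SHIFT)
#define PGDIR_MASK (~(PGDIR_SIZE - 1))
#define PTRS_PER_PGD 512

/* Stand-in for pgd_addr_end(): next entry boundary, clamped to end. */
static uint64_t pgd_addr_end(uint64_t addr, uint64_t end)
{
	uint64_t boundary = (addr + PGDIR_SIZE) & PGDIR_MASK;

	/* The "- 1" keeps the comparison correct if boundary wraps to 0. */
	return (boundary - 1 < end - 1) ? boundary : end;
}

/* Stand-in for pgd_index(): which top-level slot covers addr. */
static unsigned int pgd_index(uint64_t addr)
{
	return (unsigned int)((addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/*
 * Hypothetical model of a shallow populate: mark every empty top-level
 * slot that covers [start, end) and return how many were newly marked.
 */
static unsigned int shallow_populate(bool *slots, uint64_t start, uint64_t end)
{
	uint64_t addr = start, next;
	unsigned int filled = 0;

	assert(start < end);
	do {
		next = pgd_addr_end(addr, end);
		if (!slots[pgd_index(addr)]) {
			slots[pgd_index(addr)] = true;
			filled++;
		}
	} while (addr = next, addr != end);

	return filled;
}
```

Only empty slots are filled, so running the walk twice over the same range changes nothing the second time. This mirrors the `pgd_none()` and `p4d_none()` checks in the patch.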