Message ID | 20220414195914.1648345-4-song@kernel.org (mailing list archive) |
---|---|
State | New |
Series | vmalloc: bpf: introduce VM_ALLOW_HUGE_VMAP |

On Thu, Apr 14, 2022 at 12:59:13PM -0700, Song Liu wrote:
> Introduce module_alloc_huge, which allocates huge page backed memory in
> module memory space. The primary user of this memory is bpf_prog_pack
> (multiple BPF programs sharing a huge page).
>
> Signed-off-by: Song Liu <song@kernel.org>

See modules-next [0], as modules.c has been chopped up as of late.
So if you want this to go through modules this will need to be rebased
on that tree. Fortunately the amount of code in question does not
seem like much.

[0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=modules-next

  Luis

Hi Luis,

On Thu, Apr 14, 2022 at 1:34 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Thu, Apr 14, 2022 at 12:59:13PM -0700, Song Liu wrote:
> > Introduce module_alloc_huge, which allocates huge page backed memory in
> > module memory space. The primary user of this memory is bpf_prog_pack
> > (multiple BPF programs sharing a huge page).
> >
> > Signed-off-by: Song Liu <song@kernel.org>
>
> See modules-next [0], as modules.c has been chopped up as of late.
> So if you want this to go through modules this will need to be rebased
> on that tree. Fortunately the amount of code in question does not
> seem like much.
>
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=modules-next

We are hoping to ship this with 5.18, as the set addresses some issues with
huge page backed vmalloc. I guess we cannot ship it via modules-next branch.

How about we ship module_alloc_huge() to 5.18 in module.c for now, and once
we update modules-next branch, I will send another patch to clean it up?

Thanks,
Song

On Thu, Apr 14, 2022 at 02:03:17PM -0700, Song Liu wrote:
> Hi Luis,
>
> On Thu, Apr 14, 2022 at 1:34 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> >
> > On Thu, Apr 14, 2022 at 12:59:13PM -0700, Song Liu wrote:
> > > Introduce module_alloc_huge, which allocates huge page backed memory in
> > > module memory space. The primary user of this memory is bpf_prog_pack
> > > (multiple BPF programs sharing a huge page).
> > >
> > > Signed-off-by: Song Liu <song@kernel.org>
> >
> > See modules-next [0], as modules.c has been chopped up as of late.
> > So if you want this to go through modules this will need to be rebased
> > on that tree. Fortunately the amount of code in question does not
> > seem like much.
> >
> > [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=modules-next
>
> We are hoping to ship this with 5.18, as the set addresses some issues with
> huge page backed vmalloc. I guess we cannot ship it via modules-next branch.
>

Huh, you intend this to go in as a fix for v5.18 (already released) once
properly reviewed? This seems quite large... for a fix.

> How about we ship module_alloc_huge() to 5.18 in module.c for now, and once
> we update modules-next branch, I will send another patch to clean it up?

I'd rather set the expectations right about getting such a large fix in
for v5.18. I haven't even sat down to review all the changes in light of
this, but at a cursory glance it seems to me rather "large" for a fix.

  Luis

On Thu, Apr 14, 2022 at 2:11 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Thu, Apr 14, 2022 at 02:03:17PM -0700, Song Liu wrote:
> > Hi Luis,
> >
> > On Thu, Apr 14, 2022 at 1:34 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> > >
> > > On Thu, Apr 14, 2022 at 12:59:13PM -0700, Song Liu wrote:
> > > > Introduce module_alloc_huge, which allocates huge page backed memory in
> > > > module memory space. The primary user of this memory is bpf_prog_pack
> > > > (multiple BPF programs sharing a huge page).
> > > >
> > > > Signed-off-by: Song Liu <song@kernel.org>
> > >
> > > See modules-next [0], as modules.c has been chopped up as of late.
> > > So if you want this to go through modules this will need to be rebased
> > > on that tree. Fortunately the amount of code in question does not
> > > seem like much.
> > >
> > > [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=modules-next
> >
> > We are hoping to ship this with 5.18, as the set addresses some issues with
> > huge page backed vmalloc. I guess we cannot ship it via modules-next branch.
> >
>
> Huh, you intend this to go in as a fix for v5.18 (already released) once
> properly reviewed? This seems quite large... for a fix.
>
> > How about we ship module_alloc_huge() to 5.18 in module.c for now, and once
> > we update modules-next branch, I will send another patch to clean it up?
>
> I'd rather set the expectations right about getting such a large fix in
> for v5.18. I haven't even sat down to review all the changes in light of
> this, but at a cursory glance it seems to me rather "large" for a fix.

Yes, I agree this is a little too big for a fix. I guess we can discuss whether
some of the set needs to wait until 5.19.

Thanks,
Song

On Thu, Apr 14, 2022 at 12:59:13PM -0700, Song Liu wrote:
> Introduce module_alloc_huge, which allocates huge page backed memory in
> module memory space. The primary user of this memory is bpf_prog_pack
> (multiple BPF programs sharing a huge page).
>
> Signed-off-by: Song Liu <song@kernel.org>
> ---
>  arch/x86/kernel/module.c     | 21 +++++++++++++++++++++
>  include/linux/moduleloader.h |  5 +++++
>  kernel/module.c              |  5 +++++
>  3 files changed, 31 insertions(+)
>
> diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
> index b98ffcf4d250..63f6a16c70dc 100644
> --- a/arch/x86/kernel/module.c
> +++ b/arch/x86/kernel/module.c
> @@ -86,6 +86,27 @@ void *module_alloc(unsigned long size)
>  	return p;
>  }
>
> +void *module_alloc_huge(unsigned long size)
> +{
> +	gfp_t gfp_mask = GFP_KERNEL;
> +	void *p;
> +
> +	if (PAGE_ALIGN(size) > MODULES_LEN)
> +		return NULL;
> +
> +	p = __vmalloc_node_range(size, MODULE_ALIGN,
> +				 MODULES_VADDR + get_module_load_offset(),
> +				 MODULES_END, gfp_mask, PAGE_KERNEL,
> +				 VM_DEFER_KMEMLEAK | VM_ALLOW_HUGE_VMAP,
> +				 NUMA_NO_NODE, __builtin_return_address(0));
> +	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
> +		vfree(p);
> +		return NULL;
> +	}
> +
> +	return p;
> +}
> +
>  #ifdef CONFIG_X86_32
>  int apply_relocate(Elf32_Shdr *sechdrs,
>  		   const char *strtab,
> diff --git a/include/linux/moduleloader.h b/include/linux/moduleloader.h
> index 9e09d11ffe5b..d34743a88938 100644
> --- a/include/linux/moduleloader.h
> +++ b/include/linux/moduleloader.h
> @@ -26,6 +26,11 @@ unsigned int arch_mod_section_prepend(struct module *mod, unsigned int section);
>     sections. Returns NULL on failure. */
>  void *module_alloc(unsigned long size);
>
> +/* Allocator used for allocating memory in module memory space. If size is
> + * greater than PMD_SIZE, allow using huge pages. Returns NULL on failure.
> + */
> +void *module_alloc_huge(unsigned long size);
> +
>  /* Free memory returned from module_alloc. */
>  void module_memfree(void *module_region);
>
> diff --git a/kernel/module.c b/kernel/module.c
> index 6cea788fd965..b2c6cb682a7d 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -2839,6 +2839,11 @@ void * __weak module_alloc(unsigned long size)
>  			NUMA_NO_NODE, __builtin_return_address(0));
>  }
>
> +void * __weak module_alloc_huge(unsigned long size)
> +{
> +	return vmalloc_huge(size);
> +}

Umm. This should use the same parameters as module_alloc except for
also passing the new huge page flag.

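The review comment above asks that the generic fallback differ from module_alloc only by the extra huge page flag. A minimal user-space sketch of that idea follows. It is not the v4 patch and does not call any kernel API: `model_vmalloc_node_range()`, `MODEL_BASE_VM_FLAGS`, `MODEL_VM_ALLOW_HUGE_VMAP` and `last_vm_flags` are hypothetical stand-ins invented for illustration, and the flag values are arbitrary.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kernel vm flags; the values are arbitrary. */
#define MODEL_BASE_VM_FLAGS      0x1UL	/* whatever module_alloc already passes */
#define MODEL_VM_ALLOW_HUGE_VMAP 0x2UL	/* models the new VM_ALLOW_HUGE_VMAP */

/* Flags seen by the most recent allocation, kept so callers can inspect them. */
static unsigned long last_vm_flags;

/*
 * Stub modelling the shape of a range allocator call. Only the size and the
 * vm flags are modelled; address range, alignment, gfp mask and page
 * protection are assumed to be identical in both callers and are left out.
 */
static void *model_vmalloc_node_range(unsigned long size, unsigned long vm_flags)
{
	assert(size > 0);
	last_vm_flags = vm_flags;
	return malloc(size);
}

/* Models the existing allocator. */
static void *model_module_alloc(unsigned long size)
{
	return model_vmalloc_node_range(size, MODEL_BASE_VM_FLAGS);
}

/* Models the requested variant: same call, one additional flag. */
static void *model_module_alloc_huge(unsigned long size)
{
	return model_vmalloc_node_range(size,
					MODEL_BASE_VM_FLAGS | MODEL_VM_ALLOW_HUGE_VMAP);
}
```

Calling both helpers and comparing `last_vm_flags` after each call shows that the only difference between the two paths is the huge page bit, which is the property the review asks for.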
> On Apr 14, 2022, at 11:32 PM, Christoph Hellwig <hch@infradead.org> wrote:
>
> On Thu, Apr 14, 2022 at 12:59:13PM -0700, Song Liu wrote:
>> Introduce module_alloc_huge, which allocates huge page backed memory in
>> module memory space. The primary user of this memory is bpf_prog_pack
>> (multiple BPF programs sharing a huge page).
>>
>> Signed-off-by: Song Liu <song@kernel.org>
>> ---
>>  arch/x86/kernel/module.c     | 21 +++++++++++++++++++++
>>  include/linux/moduleloader.h |  5 +++++
>>  kernel/module.c              |  5 +++++
>>  3 files changed, 31 insertions(+)
>>
>> diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
>> index b98ffcf4d250..63f6a16c70dc 100644
>> --- a/arch/x86/kernel/module.c
>> +++ b/arch/x86/kernel/module.c
>> @@ -86,6 +86,27 @@ void *module_alloc(unsigned long size)
>>  	return p;
>>  }
>>
>> +void *module_alloc_huge(unsigned long size)
>> +{
>> +	gfp_t gfp_mask = GFP_KERNEL;
>> +	void *p;
>> +
>> +	if (PAGE_ALIGN(size) > MODULES_LEN)
>> +		return NULL;
>> +
>> +	p = __vmalloc_node_range(size, MODULE_ALIGN,
>> +				 MODULES_VADDR + get_module_load_offset(),
>> +				 MODULES_END, gfp_mask, PAGE_KERNEL,
>> +				 VM_DEFER_KMEMLEAK | VM_ALLOW_HUGE_VMAP,
>> +				 NUMA_NO_NODE, __builtin_return_address(0));
>> +	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
>> +		vfree(p);
>> +		return NULL;
>> +	}
>> +
>> +	return p;
>> +}
>> +
>>  #ifdef CONFIG_X86_32
>>  int apply_relocate(Elf32_Shdr *sechdrs,
>>  		   const char *strtab,
>> diff --git a/include/linux/moduleloader.h b/include/linux/moduleloader.h
>> index 9e09d11ffe5b..d34743a88938 100644
>> --- a/include/linux/moduleloader.h
>> +++ b/include/linux/moduleloader.h
>> @@ -26,6 +26,11 @@ unsigned int arch_mod_section_prepend(struct module *mod, unsigned int section);
>>     sections. Returns NULL on failure. */
>>  void *module_alloc(unsigned long size);
>>
>> +/* Allocator used for allocating memory in module memory space. If size is
>> + * greater than PMD_SIZE, allow using huge pages. Returns NULL on failure.
>> + */
>> +void *module_alloc_huge(unsigned long size);
>> +
>>  /* Free memory returned from module_alloc. */
>>  void module_memfree(void *module_region);
>>
>> diff --git a/kernel/module.c b/kernel/module.c
>> index 6cea788fd965..b2c6cb682a7d 100644
>> --- a/kernel/module.c
>> +++ b/kernel/module.c
>> @@ -2839,6 +2839,11 @@ void * __weak module_alloc(unsigned long size)
>>  			NUMA_NO_NODE, __builtin_return_address(0));
>>  }
>>
>> +void * __weak module_alloc_huge(unsigned long size)
>> +{
>> +	return vmalloc_huge(size);
>> +}
>
> Umm. This should use the same parameters as module_alloc except for
> also passing the new huge page flag.

Will fix the set and send v4.

Thanks,
Song

On Thu, Apr 14, 2022 at 02:31:18PM -0700, Song Liu wrote:
> On Thu, Apr 14, 2022 at 2:11 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> >
> > On Thu, Apr 14, 2022 at 02:03:17PM -0700, Song Liu wrote:
> > > Hi Luis,
> > >
> > > On Thu, Apr 14, 2022 at 1:34 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> > > >
> > > > On Thu, Apr 14, 2022 at 12:59:13PM -0700, Song Liu wrote:
> > > > > Introduce module_alloc_huge, which allocates huge page backed memory in
> > > > > module memory space. The primary user of this memory is bpf_prog_pack
> > > > > (multiple BPF programs sharing a huge page).
> > > > >
> > > > > Signed-off-by: Song Liu <song@kernel.org>
> > > >
> > > > See modules-next [0], as modules.c has been chopped up as of late.
> > > > So if you want this to go through modules this will need to be rebased
> > > > on that tree. Fortunately the amount of code in question does not
> > > > seem like much.
> > > >
> > > > [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=modules-next
> > >
> > > We are hoping to ship this with 5.18, as the set addresses some issues with
> > > huge page backed vmalloc. I guess we cannot ship it via modules-next branch.
> > >
> >
> > Huh, you intend this to go in as a fix for v5.18 (already released) once
> > properly reviewed? This seems quite large... for a fix.
> >
> > > How about we ship module_alloc_huge() to 5.18 in module.c for now, and once
> > > we update modules-next branch, I will send another patch to clean it up?
> >
> > I'd rather set the expectations right about getting such a large fix in
> > for v5.18. I haven't even sat down to review all the changes in light of
> > this, but at a cursory glance it seems to me rather "large" for a fix.
>
> Yes, I agree this is a little too big for a fix. I guess we can discuss whether
> some of the set needs to wait until 5.19.

Doing a more thorough review of this now, and of when the other changes landed,
it seems this is a *large follow-up fix* for an optimization for when tons
of JIT eBPF programs are used. It's so large I can't be confident this
doesn't also go in with other holes or issues, or that the other stuff
already merged doesn't have some other issues as well. So I can't see anything
screaming for why this needs to go in for v5.18 other than that it'd be nice.

So my preference is for this to go through v5.19 as I see no rush.

  Luis

diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index b98ffcf4d250..63f6a16c70dc 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -86,6 +86,27 @@ void *module_alloc(unsigned long size)
 	return p;
 }
 
+void *module_alloc_huge(unsigned long size)
+{
+	gfp_t gfp_mask = GFP_KERNEL;
+	void *p;
+
+	if (PAGE_ALIGN(size) > MODULES_LEN)
+		return NULL;
+
+	p = __vmalloc_node_range(size, MODULE_ALIGN,
+				 MODULES_VADDR + get_module_load_offset(),
+				 MODULES_END, gfp_mask, PAGE_KERNEL,
+				 VM_DEFER_KMEMLEAK | VM_ALLOW_HUGE_VMAP,
+				 NUMA_NO_NODE, __builtin_return_address(0));
+	if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
+		vfree(p);
+		return NULL;
+	}
+
+	return p;
+}
+
 #ifdef CONFIG_X86_32
 int apply_relocate(Elf32_Shdr *sechdrs,
 		   const char *strtab,
diff --git a/include/linux/moduleloader.h b/include/linux/moduleloader.h
index 9e09d11ffe5b..d34743a88938 100644
--- a/include/linux/moduleloader.h
+++ b/include/linux/moduleloader.h
@@ -26,6 +26,11 @@ unsigned int arch_mod_section_prepend(struct module *mod, unsigned int section);
    sections. Returns NULL on failure. */
 void *module_alloc(unsigned long size);
 
+/* Allocator used for allocating memory in module memory space. If size is
+ * greater than PMD_SIZE, allow using huge pages. Returns NULL on failure.
+ */
+void *module_alloc_huge(unsigned long size);
+
 /* Free memory returned from module_alloc. */
 void module_memfree(void *module_region);
 
diff --git a/kernel/module.c b/kernel/module.c
index 6cea788fd965..b2c6cb682a7d 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2839,6 +2839,11 @@ void * __weak module_alloc(unsigned long size)
 			NUMA_NO_NODE, __builtin_return_address(0));
 }
 
+void * __weak module_alloc_huge(unsigned long size)
+{
+	return vmalloc_huge(size);
+}
+
 bool __weak module_init_section(const char *name)
 {
 	return strstarts(name, ".init");

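The x86 version in the diff rejects a request early when the page-aligned size exceeds the module address range. The arithmetic of that guard can be sketched in user space as below. This is an illustration only: `MODEL_PAGE_SIZE`, `MODEL_PAGE_ALIGN()`, `MODEL_MODULES_LEN` and `model_size_fits()` are hypothetical names, the 4 KiB page size is an assumption, and the 16-page range is an arbitrary small value chosen to keep the example readable, not the real MODULES_LEN.

```c
#include <stddef.h>

/* Assumed page size for the model; real kernels define PAGE_SIZE per arch. */
#define MODEL_PAGE_SIZE 4096UL

/* Round x up to the next multiple of the page size (power-of-two trick). */
#define MODEL_PAGE_ALIGN(x) \
	(((x) + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1))

/* Arbitrary illustrative range length: 16 pages. Not the real MODULES_LEN. */
#define MODEL_MODULES_LEN (16UL * MODEL_PAGE_SIZE)

/*
 * Returns 1 if a request of `size` bytes would pass the guard, 0 if the
 * modelled allocator would return NULL before attempting the allocation.
 */
static int model_size_fits(unsigned long size)
{
	return MODEL_PAGE_ALIGN(size) <= MODEL_MODULES_LEN;
}
```

Because the size is rounded up before the comparison, a request that is a single byte over the range length is rejected just like a much larger one.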
Introduce module_alloc_huge, which allocates huge page backed memory in
module memory space. The primary user of this memory is bpf_prog_pack
(multiple BPF programs sharing a huge page).

Signed-off-by: Song Liu <song@kernel.org>
---
 arch/x86/kernel/module.c     | 21 +++++++++++++++++++++
 include/linux/moduleloader.h |  5 +++++
 kernel/module.c              |  5 +++++
 3 files changed, 31 insertions(+)