Message ID | 20210610154220.529122-2-imbrenda@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Series | mm: add vmalloc_no_huge and use it |
Hello.

See below a small nit:

> The recent patches to add support for hugepage vmalloc mappings added a
> flag for __vmalloc_node_range to allow to request small pages.
> This flag is not accessible when calling vmalloc, the only option is to
> call directly __vmalloc_node_range, which is not exported.
>
> This means that a module can't vmalloc memory with small pages.
>
> Case in point: KVM on s390x needs to vmalloc a large area, and it needs
> to be mapped with small pages, because of a hardware limitation.
>
> This patch adds the function vmalloc_no_huge, which works like vmalloc,
> but it is guaranteed to always back the mapping using small pages. This
> function is exported, therefore it is usable by modules.
>
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> ---
>  include/linux/vmalloc.h |  1 +
>  mm/vmalloc.c            | 16 ++++++++++++++++
>  2 files changed, 17 insertions(+)
>
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index 4d668abb6391..bfaaf0b6fa76 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -135,6 +135,7 @@ extern void *__vmalloc_node_range(unsigned long size, unsigned long align,
>  			const void *caller);
>  void *__vmalloc_node(unsigned long size, unsigned long align, gfp_t gfp_mask,
>  		int node, const void *caller);
> +void *vmalloc_no_huge(unsigned long size);
>
>  extern void vfree(const void *addr);
>  extern void vfree_atomic(const void *addr);
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index a13ac524f6ff..296a2fcc3fbe 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2998,6 +2998,22 @@ void *vmalloc(unsigned long size)
>  }
>  EXPORT_SYMBOL(vmalloc);
>
> +/**
> + * vmalloc_no_huge - allocate virtually contiguous memory using small pages
> + * @size:    allocation size
> + *
You state that it allocates using "small pages". I think that might be
confusing for people because the term is vague. The comment should be
improved, imho, to talk about order-0 pages, which is what we call
"small pages".

> + * Allocate enough non-huge pages to cover @size from the page level
> + * allocator and map them into contiguous kernel virtual space.
> + *
> + * Return: pointer to the allocated memory or %NULL on error
> + */
> +void *vmalloc_no_huge(unsigned long size)
> +{
> +	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END, GFP_KERNEL, PAGE_KERNEL,
> +				    VM_NO_HUGE_VMAP, NUMA_NO_NODE, __builtin_return_address(0));
> +}
> +EXPORT_SYMBOL(vmalloc_no_huge);
> +
>  /**
>   * vzalloc - allocate virtually contiguous memory with zero fill
>   * @size:    allocation size
> --
> 2.31.1
>
Anyway, it looks good to me, please use:

Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

Thanks.

--
Vlad Rezki
Excerpts from Claudio Imbrenda's message of June 11, 2021 1:42 am:
> The recent patches to add support for hugepage vmalloc mappings added a
> flag for __vmalloc_node_range to allow to request small pages.
> This flag is not accessible when calling vmalloc, the only option is to
> call directly __vmalloc_node_range, which is not exported.
>
> This means that a module can't vmalloc memory with small pages.
>
> Case in point: KVM on s390x needs to vmalloc a large area, and it needs
> to be mapped with small pages, because of a hardware limitation.
>
> This patch adds the function vmalloc_no_huge, which works like vmalloc,
> but it is guaranteed to always back the mapping using small pages. This
> function is exported, therefore it is usable by modules.
>
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> ---
>  include/linux/vmalloc.h |  1 +
>  mm/vmalloc.c            | 16 ++++++++++++++++
>  2 files changed, 17 insertions(+)
>
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index 4d668abb6391..bfaaf0b6fa76 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -135,6 +135,7 @@ extern void *__vmalloc_node_range(unsigned long size, unsigned long align,
>  			const void *caller);
>  void *__vmalloc_node(unsigned long size, unsigned long align, gfp_t gfp_mask,
>  		int node, const void *caller);
> +void *vmalloc_no_huge(unsigned long size);
>
>  extern void vfree(const void *addr);
>  extern void vfree_atomic(const void *addr);
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index a13ac524f6ff..296a2fcc3fbe 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2998,6 +2998,22 @@ void *vmalloc(unsigned long size)
>  }
>  EXPORT_SYMBOL(vmalloc);
>
> +/**
> + * vmalloc_no_huge - allocate virtually contiguous memory using small pages
> + * @size:    allocation size
> + *
> + * Allocate enough non-huge pages to cover @size from the page level
> + * allocator and map them into contiguous kernel virtual space.
> + *
> + * Return: pointer to the allocated memory or %NULL on error
> + */
> +void *vmalloc_no_huge(unsigned long size)
> +{
> +	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END, GFP_KERNEL, PAGE_KERNEL,
> +				    VM_NO_HUGE_VMAP, NUMA_NO_NODE, __builtin_return_address(0));
> +}
> +EXPORT_SYMBOL(vmalloc_no_huge);

At some point if the combination of flags becomes too much we will need
a different strategy. A vmalloc API with (size, align, gfp_t, vm_flags,
node) args would help 3/6 of the existing non-arch callers too. And one
more if you had a prot parameter or _exec variant.

But for now I'm okay with this.

Acked-by: Nicholas Piggin <npiggin@gmail.com>

Thanks,
Nick
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 4d668abb6391..bfaaf0b6fa76 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -135,6 +135,7 @@ extern void *__vmalloc_node_range(unsigned long size, unsigned long align,
 			const void *caller);
 void *__vmalloc_node(unsigned long size, unsigned long align, gfp_t gfp_mask,
 		int node, const void *caller);
+void *vmalloc_no_huge(unsigned long size);
 
 extern void vfree(const void *addr);
 extern void vfree_atomic(const void *addr);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a13ac524f6ff..296a2fcc3fbe 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2998,6 +2998,22 @@ void *vmalloc(unsigned long size)
 }
 EXPORT_SYMBOL(vmalloc);
 
+/**
+ * vmalloc_no_huge - allocate virtually contiguous memory using small pages
+ * @size:    allocation size
+ *
+ * Allocate enough non-huge pages to cover @size from the page level
+ * allocator and map them into contiguous kernel virtual space.
+ *
+ * Return: pointer to the allocated memory or %NULL on error
+ */
+void *vmalloc_no_huge(unsigned long size)
+{
+	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END, GFP_KERNEL, PAGE_KERNEL,
+				    VM_NO_HUGE_VMAP, NUMA_NO_NODE, __builtin_return_address(0));
+}
+EXPORT_SYMBOL(vmalloc_no_huge);
+
 /**
  * vzalloc - allocate virtually contiguous memory with zero fill
  * @size:    allocation size
The recent patches to add support for hugepage vmalloc mappings added a
flag for __vmalloc_node_range to allow to request small pages.
This flag is not accessible when calling vmalloc, the only option is to
call directly __vmalloc_node_range, which is not exported.

This means that a module can't vmalloc memory with small pages.

Case in point: KVM on s390x needs to vmalloc a large area, and it needs
to be mapped with small pages, because of a hardware limitation.

This patch adds the function vmalloc_no_huge, which works like vmalloc,
but it is guaranteed to always back the mapping using small pages. This
function is exported, therefore it is usable by modules.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
---
 include/linux/vmalloc.h |  1 +
 mm/vmalloc.c            | 16 ++++++++++++++++
 2 files changed, 17 insertions(+)
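
For readers outside mm, the shape of the change described in the commit message can be modeled in plain userspace C. This is only a rough sketch, not kernel code: every `model_*` / `MODEL_*` name is hypothetical, the 4 KiB and 2 MiB page sizes are assumptions, and the size check is a loose simplification of what __vmalloc_node_range does. It shows a flag-taking internal function, a plain entry point that cannot pass the flag, and a thin wrapper that pins it.

```c
#include <assert.h>

/* Userspace model only: none of these names are kernel symbols. */
#define MODEL_SMALL_PAGE 4096UL               /* assumed base page size */
#define MODEL_HUGE_PAGE  (2UL * 1024 * 1024)  /* assumed huge page size */
#define MODEL_NO_HUGE    0x1UL                /* stands in for VM_NO_HUGE_VMAP */

struct model_mapping {
	unsigned long size;      /* request rounded up to page_size */
	unsigned long page_size; /* granule used to back the mapping */
};

/* Internal, flag-taking entry point (stands in for __vmalloc_node_range). */
static struct model_mapping model_alloc_range(unsigned long size,
					      unsigned long flags)
{
	struct model_mapping m;

	/* Huge pages only if not forbidden and the request is large enough. */
	if (!(flags & MODEL_NO_HUGE) && size >= MODEL_HUGE_PAGE)
		m.page_size = MODEL_HUGE_PAGE;
	else
		m.page_size = MODEL_SMALL_PAGE;

	/* Round the request up to a whole number of backing pages. */
	m.size = (size + m.page_size - 1) / m.page_size * m.page_size;
	return m;
}

/* Plain entry point: callers have no way to pass flags (stands in for vmalloc). */
static struct model_mapping model_alloc(unsigned long size)
{
	return model_alloc_range(size, 0);
}

/* Wrapper that always pins the flag (stands in for vmalloc_no_huge). */
static struct model_mapping model_alloc_no_huge(unsigned long size)
{
	return model_alloc_range(size, MODEL_NO_HUGE);
}

/* Small self-check: the wrapper never yields a huge-page mapping. */
static void model_self_check(void)
{
	unsigned long big = 4UL * 1024 * 1024;

	assert(model_alloc(big).page_size == MODEL_HUGE_PAGE);
	assert(model_alloc_no_huge(big).page_size == MODEL_SMALL_PAGE);
}
```

In this model a caller that can only reach model_alloc() has no way to ask for small pages on a large request, which mirrors the situation of a module before vmalloc_no_huge was exported.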