From patchwork Mon Oct 14 10:58:18 2024
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13834679
From: Ryan Roberts
To: Andrew Morton, Andrey Ryabinin, Anshuman Khandual, Ard Biesheuvel,
    Arnd Bergmann, Catalin Marinas, David Hildenbrand, Greg Marsden,
    Ingo Molnar, Ivan Ivanov, Juri Lelli, Kalesh Singh, Marc Zyngier,
    Mark Rutland, Matthias Brugger, Miroslav Benes, Peter Zijlstra,
    Vincent Guittot, Will Deacon
Cc: Ryan Roberts, kasan-dev@googlegroups.com,
    linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v1 11/57] fork: Permit boot-time THREAD_SIZE determination
Date: Mon, 14 Oct 2024 11:58:18 +0100
Message-ID: <20241014105912.3207374-11-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20241014105912.3207374-1-ryan.roberts@arm.com>
References: <20241014105514.3206191-1-ryan.roberts@arm.com> <20241014105912.3207374-1-ryan.roberts@arm.com>

THREAD_SIZE defines the size of a kernel thread's stack. To date, it has
been set at compile time. However, when using vmap stacks, the size must
be a multiple of PAGE_SIZE, and given that we are in the process of
supporting boot-time page size selection, THREAD_SIZE must also be
determined at boot time.

The alternative would be to define THREAD_SIZE for the largest supported
page size, but this would waste memory when using a smaller page size.
For example, arm64 requires THREAD_SIZE to be 16K, but when using 64K
pages and a vmap stack, we must increase the size to 64K. If we required
64K when a 4K or 16K page size was in use, we would waste 48K per kernel
thread.

So let's refactor to allow THREAD_SIZE to not be a compile-time constant.
THREAD_SIZE_MAX (and THREAD_ALIGN_MAX) are introduced to manage the
limits, as is done for PAGE_SIZE. When THREAD_SIZE is a compile-time
constant, behaviour and code size should be equivalent.
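For illustration only: the following sketch is not part of this patch
(nor taken from elsewhere in this series), and MIN_THREAD_SHIFT,
PAGE_SHIFT_MAX and the values shown are assumptions. It shows how an
architecture header might split the compile-time maximums from the
boot-time value once PAGE_SHIFT/PAGE_SIZE can be selected at boot:

/*
 * Hypothetical arch header fragment. The *_MAX limits must stay
 * compile-time constants so that static storage (init_stack[], the init
 * stack slot in the linker script) can be sized for the worst case,
 * while THREAD_SIZE itself may follow the page size chosen at boot when
 * vmap stacks are in use.
 */
#define MIN_THREAD_SHIFT	14	/* 16K minimum stack */
#define PAGE_SHIFT_MAX		16	/* largest supported page size: 64K */

#define THREAD_SHIFT_MAX	\
	(PAGE_SHIFT_MAX > MIN_THREAD_SHIFT ? PAGE_SHIFT_MAX : MIN_THREAD_SHIFT)
#define THREAD_SIZE_MAX		(1UL << THREAD_SHIFT_MAX)
#define THREAD_ALIGN_MAX	THREAD_SIZE_MAX

#ifdef CONFIG_VMAP_STACK
/* vmap stacks are page-granular: round the stack up to the page size */
#define THREAD_SHIFT	\
	(PAGE_SHIFT > MIN_THREAD_SHIFT ? PAGE_SHIFT : MIN_THREAD_SHIFT)
#else
#define THREAD_SHIFT	MIN_THREAD_SHIFT
#endif
#define THREAD_SIZE		(1UL << THREAD_SHIFT)
#define THREAD_ALIGN		THREAD_SIZE

With definitions along these lines, the generic code changed below keeps
sizing init_stack and the linker-script init stack with THREAD_SIZE_MAX,
while the fork-time allocators consume the (possibly runtime) THREAD_SIZE.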
Signed-off-by: Ryan Roberts
Acked-by: Vlastimil Babka
---

***NOTE*** Any confused maintainers may want to read the cover note here
for context:
https://lore.kernel.org/all/20241014105514.3206191-1-ryan.roberts@arm.com/

 include/asm-generic/vmlinux.lds.h |  6 ++-
 include/linux/sched.h             |  4 +-
 include/linux/thread_info.h       | 10 ++++-
 init/main.c                       |  2 +-
 kernel/fork.c                     | 67 +++++++++++--------------------
 mm/kasan/report.c                 |  3 +-
 6 files changed, 42 insertions(+), 50 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 5727f883001bb..f19bab7a2e8f9 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -56,6 +56,10 @@
 #define LOAD_OFFSET 0
 #endif
 
+#ifndef THREAD_SIZE_MAX
+#define THREAD_SIZE_MAX THREAD_SIZE
+#endif
+
 /*
  * Only some architectures want to have the .notes segment visible in
  * a separate PT_NOTE ELF Program Header. When this happens, it needs
@@ -398,7 +402,7 @@
 	init_stack = .;						\
 	KEEP(*(.data..init_task))				\
 	KEEP(*(.data..init_thread_info))			\
-	. = __start_init_stack + THREAD_SIZE;			\
+	. = __start_init_stack + THREAD_SIZE_MAX;		\
 	__end_init_stack = .;
 
 #define JUMP_TABLE_DATA						\
diff --git a/include/linux/sched.h b/include/linux/sched.h
index f8d150343d42d..3de4f655ee492 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1863,14 +1863,14 @@ union thread_union {
 #ifndef CONFIG_THREAD_INFO_IN_TASK
 	struct thread_info thread_info;
 #endif
-	unsigned long stack[THREAD_SIZE/sizeof(long)];
+	unsigned long stack[THREAD_SIZE_MAX/sizeof(long)];
 };
 
 #ifndef CONFIG_THREAD_INFO_IN_TASK
 extern struct thread_info init_thread_info;
 #endif
 
-extern unsigned long init_stack[THREAD_SIZE / sizeof(unsigned long)];
+extern unsigned long init_stack[THREAD_SIZE_MAX / sizeof(unsigned long)];
 
 #ifdef CONFIG_THREAD_INFO_IN_TASK
 # define task_thread_info(task)	(&(task)->thread_info)
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 9ea0b28068f49..a7ccc448cd298 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -74,7 +74,15 @@ static inline long set_restart_fn(struct restart_block *restart,
 }
 
 #ifndef THREAD_ALIGN
-#define THREAD_ALIGN	THREAD_SIZE
+#define THREAD_ALIGN		THREAD_SIZE
+#endif
+
+#ifndef THREAD_SIZE_MAX
+#define THREAD_SIZE_MAX		THREAD_SIZE
+#endif
+
+#ifndef THREAD_ALIGN_MAX
+#define THREAD_ALIGN_MAX	max(THREAD_ALIGN, THREAD_SIZE_MAX)
 #endif
 
 #define THREADINFO_GFP		(GFP_KERNEL_ACCOUNT | __GFP_ZERO)
diff --git a/init/main.c b/init/main.c
index ba1515eb20b9d..4dc28115fdf57 100644
--- a/init/main.c
+++ b/init/main.c
@@ -797,7 +797,7 @@ void __init __weak smp_prepare_boot_cpu(void)
 {
 }
 
-# if THREAD_SIZE >= PAGE_SIZE
+#ifdef CONFIG_VMAP_STACK
 void __init __weak thread_stack_cache_init(void)
 {
 }
diff --git a/kernel/fork.c b/kernel/fork.c
index ea472566d4fcc..cbc3e73f9b501 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -184,13 +184,7 @@ static inline void free_task_struct(struct task_struct *tsk)
 	kmem_cache_free(task_struct_cachep, tsk);
 }
 
-/*
- * Allocate pages if THREAD_SIZE is >= PAGE_SIZE, otherwise use a
- * kmemcache based allocator.
- */
-# if THREAD_SIZE >= PAGE_SIZE || defined(CONFIG_VMAP_STACK)
-
-# ifdef CONFIG_VMAP_STACK
+#ifdef CONFIG_VMAP_STACK
 /*
  * vmalloc() is a bit slow, and calling vfree() enough times will force a TLB
  * flush.  Try to minimize the number of calls by caching stacks.
@@ -343,46 +337,21 @@ static void free_thread_stack(struct task_struct *tsk)
 	tsk->stack_vm_area = NULL;
 }
 
-# else /* !CONFIG_VMAP_STACK */
+#else /* !CONFIG_VMAP_STACK */
 
-static void thread_stack_free_rcu(struct rcu_head *rh)
-{
-	__free_pages(virt_to_page(rh), THREAD_SIZE_ORDER);
-}
-
-static void thread_stack_delayed_free(struct task_struct *tsk)
-{
-	struct rcu_head *rh = tsk->stack;
-
-	call_rcu(rh, thread_stack_free_rcu);
-}
-
-static int alloc_thread_stack_node(struct task_struct *tsk, int node)
-{
-	struct page *page = alloc_pages_node(node, THREADINFO_GFP,
-					     THREAD_SIZE_ORDER);
-
-	if (likely(page)) {
-		tsk->stack = kasan_reset_tag(page_address(page));
-		return 0;
-	}
-	return -ENOMEM;
-}
-
-static void free_thread_stack(struct task_struct *tsk)
-{
-	thread_stack_delayed_free(tsk);
-	tsk->stack = NULL;
-}
-
-# endif /* CONFIG_VMAP_STACK */
-# else /* !(THREAD_SIZE >= PAGE_SIZE || defined(CONFIG_VMAP_STACK)) */
+/*
+ * Allocate pages if THREAD_SIZE is >= PAGE_SIZE, otherwise use a
+ * kmemcache based allocator.
+ */
 
 static struct kmem_cache *thread_stack_cache;
 
 static void thread_stack_free_rcu(struct rcu_head *rh)
 {
-	kmem_cache_free(thread_stack_cache, rh);
+	if (THREAD_SIZE >= PAGE_SIZE)
+		__free_pages(virt_to_page(rh), THREAD_SIZE_ORDER);
+	else
+		kmem_cache_free(thread_stack_cache, rh);
 }
 
 static void thread_stack_delayed_free(struct task_struct *tsk)
@@ -395,7 +364,16 @@ static void thread_stack_delayed_free(struct task_struct *tsk)
 static int alloc_thread_stack_node(struct task_struct *tsk, int node)
 {
 	unsigned long *stack;
-	stack = kmem_cache_alloc_node(thread_stack_cache, THREADINFO_GFP, node);
+	struct page *page;
+
+	if (THREAD_SIZE >= PAGE_SIZE) {
+		page = alloc_pages_node(node, THREADINFO_GFP, THREAD_SIZE_ORDER);
+		stack = likely(page) ? page_address(page) : NULL;
+	} else {
+		stack = kmem_cache_alloc_node(thread_stack_cache,
+					      THREADINFO_GFP, node);
+	}
+
 	stack = kasan_reset_tag(stack);
 	tsk->stack = stack;
 	return stack ? 0 : -ENOMEM;
@@ -409,13 +387,16 @@ static void free_thread_stack(struct task_struct *tsk)
 
 void thread_stack_cache_init(void)
 {
+	if (THREAD_SIZE >= PAGE_SIZE)
+		return;
+
 	thread_stack_cache = kmem_cache_create_usercopy("thread_stack",
 					THREAD_SIZE, THREAD_SIZE, 0, 0,
 					THREAD_SIZE, NULL);
 	BUG_ON(thread_stack_cache == NULL);
 }
 
-# endif /* THREAD_SIZE >= PAGE_SIZE || defined(CONFIG_VMAP_STACK) */
+#endif /* CONFIG_VMAP_STACK */
 
 /* SLAB cache for signal_struct structures (tsk->signal) */
 static struct kmem_cache *signal_cachep;
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index b48c768acc84d..57c877852dbc6 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -365,8 +365,7 @@ static inline bool kernel_or_module_addr(const void *addr)
 static inline bool init_task_stack_addr(const void *addr)
 {
 	return addr >= (void *)&init_thread_union.stack &&
-		(addr <= (void *)&init_thread_union.stack +
-			sizeof(init_thread_union.stack));
+		(addr <= (void *)&init_thread_union.stack + THREAD_SIZE);
 }
 
 static void print_address_description(void *addr, u8 tag,