From patchwork Thu Mar 14 14:54:57 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pasha Tatashin <pasha.tatashin@soleen.com>
X-Patchwork-Id: 13592503
From: Pasha Tatashin <pasha.tatashin@soleen.com>
To: akpm@linux-foundation.org, jpoimboe@kernel.org,
 pasha.tatashin@soleen.com, kent.overstreet@linux.dev, peterz@infradead.org,
 nphamcs@gmail.com, cerasuolodomenico@gmail.com, surenb@google.com,
 lizhijian@fujitsu.com, willy@infradead.org, shakeel.butt@linux.dev,
 vbabka@suse.cz, ziy@nvidia.com, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
Subject: [PATCH v2] vmstat: Keep count of the maximum page reached by the
 kernel stack
Date: Thu, 14 Mar 2024 14:54:57 +0000
Message-ID: <20240314145457.1106299-1-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.44.0.278.ge034bb2e1d-goog

CONFIG_DEBUG_STACK_USAGE provides a mechanism to determine the minimum
amount of memory left in a stack. Every time a new low-memory record is
reached, a message is printed to the console.

However, this doesn't reveal how many pages within each stack were
actually used. Introduce a mechanism that keeps count of the number of
times each of the stack's pages was reached:

	$ grep kstack /proc/vmstat
	kstack_page_1 19974
	kstack_page_2 94
	kstack_page_3 0
	kstack_page_4 0

In the above example, out of 20,068 threads that exited on this
machine, only 94 reached the second page of their stack, and none
touched pages three or four.

In fleet environments with millions of machines, this data can help
optimize kernel stack sizes.

Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
---
Changelog:
v2:
- Fixed enum name KSTACK_PAGE_5 -> KSTACK_PAGE_REST.
- Improved commit message based on Christophe Leroy's comment.

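Note on reading the counters: the figures quoted in the commit message
(20,068 exited threads, 94 of which reached page two) can be derived
from the /proc/vmstat lines by summing the kstack_page_* entries. The
userspace sketch below is one possible way to do that and is not part
of the patch. The helper name kstack_sum() and the embedded sample text
are assumptions made for illustration; a real tool would read
/proc/vmstat instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sample text copied from the commit message; real input would be /proc/vmstat. */
static const char sample_vmstat[] =
	"kstack_page_1 19974\n"
	"kstack_page_2 94\n"
	"kstack_page_3 0\n"
	"kstack_page_4 0\n";

/*
 * Hypothetical helper (not part of the patch): sum the kstack_page_N
 * counters with N >= min_page found in vmstat-style text. A
 * "kstack_page_rest" line, if present, is always included because it
 * lies beyond every numbered page.
 */
static unsigned long kstack_sum(const char *text, int min_page)
{
	unsigned long sum = 0;
	const char *line = text;

	assert(text != NULL);

	while (line && *line) {
		int page;
		unsigned long val;

		if (sscanf(line, "kstack_page_%d %lu", &page, &val) == 2) {
			if (page >= min_page)
				sum += val;
		} else if (sscanf(line, "kstack_page_rest %lu", &val) == 1) {
			sum += val;
		}

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return sum;
}
```

With the sample above, kstack_sum(sample_vmstat, 1) is the total number
of exited threads and kstack_sum(sample_vmstat, 2) is the number that
went past the first page.
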
 include/linux/sched/task_stack.h | 40 ++++++++++++++++++++++++++++++--
 include/linux/vm_event_item.h    | 29 +++++++++++++++++++++++
 include/linux/vmstat.h           | 16 -------------
 mm/vmstat.c                      | 11 +++++++++
 4 files changed, 78 insertions(+), 18 deletions(-)

diff --git a/include/linux/sched/task_stack.h b/include/linux/sched/task_stack.h
index ccd72b978e1f..09e6874c2ced 100644
--- a/include/linux/sched/task_stack.h
+++ b/include/linux/sched/task_stack.h
@@ -95,9 +95,42 @@ static inline int object_is_on_stack(const void *obj)
 extern void thread_stack_cache_init(void);
 
 #ifdef CONFIG_DEBUG_STACK_USAGE
+#ifdef CONFIG_VM_EVENT_COUNTERS
+#include <linux/vm_event_item.h>
+
+/* Count the maximum pages reached in kernel stacks */
+static inline void count_kstack_page(int stack_max_page)
+{
+	switch (stack_max_page) {
+	case 1:
+		this_cpu_inc(vm_event_states.event[KSTACK_PAGE_1]);
+		break;
+	case 2:
+		this_cpu_inc(vm_event_states.event[KSTACK_PAGE_2]);
+		break;
+#if THREAD_SIZE >= (4 * PAGE_SIZE)
+	case 3:
+		this_cpu_inc(vm_event_states.event[KSTACK_PAGE_3]);
+		break;
+	case 4:
+		this_cpu_inc(vm_event_states.event[KSTACK_PAGE_4]);
+		break;
+#endif
+#if THREAD_SIZE > (4 * PAGE_SIZE)
+	default:
+		this_cpu_inc(vm_event_states.event[KSTACK_PAGE_REST]);
+		break;
+#endif
+	}
+}
+#else /* !CONFIG_VM_EVENT_COUNTERS */
+static inline void count_kstack_page(int stack_max_page) {}
+#endif /* CONFIG_VM_EVENT_COUNTERS */
+
 static inline unsigned long stack_not_used(struct task_struct *p)
 {
 	unsigned long *n = end_of_stack(p);
+	unsigned long unused_stack;
 
 	do { /* Skip over canary */
 # ifdef CONFIG_STACK_GROWSUP
@@ -108,10 +141,13 @@ static inline unsigned long stack_not_used(struct task_struct *p)
 	} while (!*n);
 
 # ifdef CONFIG_STACK_GROWSUP
-	return (unsigned long)end_of_stack(p) - (unsigned long)n;
+	unused_stack = (unsigned long)end_of_stack(p) - (unsigned long)n;
 # else
-	return (unsigned long)n - (unsigned long)end_of_stack(p);
+	unused_stack = (unsigned long)n - (unsigned long)end_of_stack(p);
 # endif
+	count_kstack_page(((THREAD_SIZE - unused_stack) >> PAGE_SHIFT) + 1);
+
+	return unused_stack;
 }
 #endif
 extern void set_task_stack_end_magic(struct task_struct *tsk);
diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 747943bc8cc2..1dbfe47ff048 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -153,10 +153,39 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		VMA_LOCK_ABORT,
 		VMA_LOCK_RETRY,
 		VMA_LOCK_MISS,
+#endif
+#ifdef CONFIG_DEBUG_STACK_USAGE
+		KSTACK_PAGE_1,
+		KSTACK_PAGE_2,
+#if THREAD_SIZE >= (4 * PAGE_SIZE)
+		KSTACK_PAGE_3,
+		KSTACK_PAGE_4,
+#endif
+#if THREAD_SIZE > (4 * PAGE_SIZE)
+		KSTACK_PAGE_REST,
+#endif
 #endif
 		NR_VM_EVENT_ITEMS
 };
 
+#ifdef CONFIG_VM_EVENT_COUNTERS
+/*
+ * Light weight per cpu counter implementation.
+ *
+ * Counters should only be incremented and no critical kernel component
+ * should rely on the counter values.
+ *
+ * Counters are handled completely inline. On many platforms the code
+ * generated will simply be the increment of a global address.
+ */
+
+struct vm_event_state {
+	unsigned long event[NR_VM_EVENT_ITEMS];
+};
+
+DECLARE_PER_CPU(struct vm_event_state, vm_event_states);
+#endif
+
 #ifndef CONFIG_TRANSPARENT_HUGEPAGE
 #define THP_FILE_ALLOC ({ BUILD_BUG(); 0; })
 #define THP_FILE_FALLBACK ({ BUILD_BUG(); 0; })
diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 343906a98d6e..18d4a97d3afd 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -41,22 +41,6 @@ enum writeback_stat_item {
 };
 
 #ifdef CONFIG_VM_EVENT_COUNTERS
-/*
- * Light weight per cpu counter implementation.
- *
- * Counters should only be incremented and no critical kernel component
- * should rely on the counter values.
- *
- * Counters are handled completely inline. On many platforms the code
- * generated will simply be the increment of a global address.
- */
-
-struct vm_event_state {
-	unsigned long event[NR_VM_EVENT_ITEMS];
-};
-
-DECLARE_PER_CPU(struct vm_event_state, vm_event_states);
-
 /*
  * vm counters are allowed to be racy. Use raw_cpu_ops to avoid the
  * local_irq_disable overhead.
diff --git a/mm/vmstat.c b/mm/vmstat.c
index db79935e4a54..737c85689251 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1413,6 +1413,17 @@ const char * const vmstat_text[] = {
 	"vma_lock_retry",
 	"vma_lock_miss",
 #endif
+#ifdef CONFIG_DEBUG_STACK_USAGE
+	"kstack_page_1",
+	"kstack_page_2",
+#if THREAD_SIZE >= (4 * PAGE_SIZE)
+	"kstack_page_3",
+	"kstack_page_4",
+#endif
+#if THREAD_SIZE > (4 * PAGE_SIZE)
+	"kstack_page_rest",
+#endif
+#endif
 #endif /* CONFIG_VM_EVENT_COUNTERS || CONFIG_MEMCG */
 };
 #endif /* CONFIG_PROC_FS || CONFIG_SYSFS || CONFIG_NUMA || CONFIG_MEMCG */
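
Note on the page index: stack_not_used() passes
((THREAD_SIZE - unused_stack) >> PAGE_SHIFT) + 1 to count_kstack_page().
The userspace sketch below mirrors that expression so the mapping from
unused bytes to a page number can be checked by hand. It is an
illustration only, not kernel code: the SKETCH_* constants (4 KiB pages,
16 KiB stacks) and the helper name kstack_max_page() are assumptions,
and real values depend on the architecture and configuration.

```c
#include <assert.h>

/* Assumed values for illustration only: 4 KiB pages, 16 KiB stacks. */
#define SKETCH_PAGE_SHIFT  12
#define SKETCH_PAGE_SIZE   (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_THREAD_SIZE (4 * SKETCH_PAGE_SIZE)

/*
 * Hypothetical mirror of the expression used in the patch:
 *   ((THREAD_SIZE - unused_stack) >> PAGE_SHIFT) + 1
 * unused_stack is the number of never-touched bytes in the stack.
 */
static int kstack_max_page(unsigned long unused_stack)
{
	unsigned long used;

	assert(unused_stack <= SKETCH_THREAD_SIZE);

	used = SKETCH_THREAD_SIZE - unused_stack;
	return (int)((used >> SKETCH_PAGE_SHIFT) + 1);
}
```

Under these assumed sizes, 100 used bytes map to page 1, exactly one
full page of use (4096 bytes) maps to page 2, and a completely used
stack evaluates to 5. When THREAD_SIZE equals four pages, the switch in
count_kstack_page() has no case for that last value, so such a thread
would not be counted.
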