From patchwork Tue Jun 2 14:15:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vlastimil Babka X-Patchwork-Id: 11583929 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3F312739 for ; Tue, 2 Jun 2020 14:16:17 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 09EE3207F7 for ; Tue, 2 Jun 2020 14:16:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 09EE3207F7 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 1A89880010; Tue, 2 Jun 2020 10:16:14 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 13731280004; Tue, 2 Jun 2020 10:16:14 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F144F80010; Tue, 2 Jun 2020 10:16:13 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0167.hostedemail.com [216.40.44.167]) by kanga.kvack.org (Postfix) with ESMTP id DAD41280004 for ; Tue, 2 Jun 2020 10:16:13 -0400 (EDT) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 95AE25C4AF for ; Tue, 2 Jun 2020 14:16:13 +0000 (UTC) X-FDA: 76884471426.28.wood78_362b208d1d43a Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin28.hostedemail.com (Postfix) with ESMTP id A085B4410 for ; Tue, 2 Jun 2020 14:16:12 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,vbabka@suse.cz,,RULES_HIT:30029:30030:30051:30054:30062:30070:30074,0,RBL:195.135.220.15:@suse.cz:.lbl8.mailshell.net-62.14.6.2 64.201.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: wood78_362b208d1d43a X-Filterd-Recvd-Size: 10488 Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by imf37.hostedemail.com (Postfix) with ESMTP for ; Tue, 2 Jun 2020 14:16:12 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id B968AAE67; Tue, 2 Jun 2020 14:16:12 +0000 (UTC) From: Vlastimil Babka To: Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com, vinmenon@codeaurora.org, Kees Cook , Matthew Garrett , Vlastimil Babka , Jann Horn , Vijayanand Jitta Subject: [RFC PATCH 1/5] mm, slub: extend slub_debug syntax for multiple blocks Date: Tue, 2 Jun 2020 16:15:15 +0200 Message-Id: <20200602141519.7099-2-vbabka@suse.cz> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200602141519.7099-1-vbabka@suse.cz> References: <20200602141519.7099-1-vbabka@suse.cz> MIME-Version: 1.0 X-Rspamd-Queue-Id: A085B4410 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam04 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The slub_debug kernel boot parameter can either apply a single set of options to all caches or a list of caches. There is a use case where debugging is applied for all caches and then disabled at runtime for specific caches, for performance and memory consumption reasons [1]. As runtime changes are dangerous, extend the boot parameter syntax so that multiple blocks of either global or slab-specific options can be specified, with blocks delimited by ';'. This will also support the use case of [1] without runtime changes. For details see the updated Documentation/vm/slub.rst [1] https://lore.kernel.org/r/1383cd32-1ddc-4dac-b5f8-9c42282fa81c@codeaurora.org Signed-off-by: Vlastimil Babka Reviewed-by: Kees Cook --- .../admin-guide/kernel-parameters.txt | 2 +- Documentation/vm/slub.rst | 17 ++ mm/slub.c | 177 +++++++++++++----- 3 files changed, 144 insertions(+), 52 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 6253849afac2..6ea00c8dd635 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -4613,7 +4613,7 @@ fragmentation. Defaults to 1 for systems with more than 32MB of RAM, 0 otherwise. - slub_debug[=options[,slabs]] [MM, SLUB] + slub_debug[=options[,slabs][;[options[,slabs]]...] [MM, SLUB] Enabling slub_debug allows one to determine the culprit if slab objects become corrupted. Enabling slub_debug can create guard zones around objects and diff --git a/Documentation/vm/slub.rst b/Documentation/vm/slub.rst index 4eee598555c9..f1154f707e06 100644 --- a/Documentation/vm/slub.rst +++ b/Documentation/vm/slub.rst @@ -41,6 +41,11 @@ slub_debug=,,,... Enable options only for select slabs (no spaces after a comma) +Multiple blocks of options for all slabs or selected slabs can be given, with +blocks of options delimited by ';'. The last of "all slabs" blocks is applied +to all slabs except those that match one of the "select slabs" block. Options +of the first "select slabs" blocks that matches the slab's name are applied. + Possible debug options are:: F Sanity checks on (enables SLAB_DEBUG_CONSISTENCY_CHECKS @@ -83,6 +88,18 @@ in low memory situations or if there's high fragmentation of memory. To slub_debug=O +You can apply different options to different list of slab names, using blocks +of options. This will enable red zoning for dentry and user tracking for +kmalloc. All other slabs will not get any debugging enabled:: + + slub_debug=Z,dentry;U,kmalloc-* + +You can also enable options (e.g. sanity checks and poisoning) for all caches +except some that are deemed too performance critical and don't need to be +debugged:: + + slub_debug=FZ;-,zs_handle,zspage + In case you forgot to enable debugging on the kernel command line: It is possible to enable debugging manually when the kernel is up. Look at the contents of:: diff --git a/mm/slub.c b/mm/slub.c index 03e063cd979f..47430aad9a65 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -499,7 +499,7 @@ static slab_flags_t slub_debug = DEBUG_DEFAULT_FLAGS; static slab_flags_t slub_debug; #endif -static char *slub_debug_slabs; +static char *slub_debug_string; static int disable_higher_order_debug; /* @@ -1262,68 +1262,132 @@ static noinline int free_debug_processing( return ret; } -static int __init setup_slub_debug(char *str) +/* + * Parse a block of slub_debug options. Blocks are delimited by ';' + * + * @str: start of block + * @flags: returns parsed flags, or DEBUG_DEFAULT_FLAGS if none specified + * @slabs: return start of list of slabs, or NULL when there's no list + * @init: assume this is initial parsing and not per-kmem-create parsing + * + * returns the start of next block if there's any, or NULL + */ +char * +parse_slub_debug_flags(char *str, slab_flags_t *flags, char **slabs, bool init) { - slub_debug = DEBUG_DEFAULT_FLAGS; - if (*str++ != '=' || !*str) - /* - * No options specified. Switch on full debugging. - */ - goto out; + bool higher_order_disable = false; - if (*str == ',') + /* Skip any completely empty blocks */ + while (*str && *str == ';') + str++; + + if (*str == ',') { /* * No options but restriction on slabs. This means full * debugging for slabs matching a pattern. */ + *flags = DEBUG_DEFAULT_FLAGS; goto check_slabs; + } + *flags = 0; - slub_debug = 0; - if (*str == '-') - /* - * Switch off all debugging measures. - */ - goto out; - - /* - * Determine which debug features should be switched on - */ - for (; *str && *str != ','; str++) { + /* Determine which debug features should be switched on */ + for (; *str && *str != ',' && *str != ';'; str++) { switch (tolower(*str)) { + case '-': + *flags = 0; + break; case 'f': - slub_debug |= SLAB_CONSISTENCY_CHECKS; + *flags |= SLAB_CONSISTENCY_CHECKS; break; case 'z': - slub_debug |= SLAB_RED_ZONE; + *flags |= SLAB_RED_ZONE; break; case 'p': - slub_debug |= SLAB_POISON; + *flags |= SLAB_POISON; break; case 'u': - slub_debug |= SLAB_STORE_USER; + *flags |= SLAB_STORE_USER; break; case 't': - slub_debug |= SLAB_TRACE; + *flags |= SLAB_TRACE; break; case 'a': - slub_debug |= SLAB_FAILSLAB; + *flags |= SLAB_FAILSLAB; break; case 'o': /* * Avoid enabling debugging on caches if its minimum * order would increase as a result. */ - disable_higher_order_debug = 1; + higher_order_disable = true; break; default: - pr_err("slub_debug option '%c' unknown. skipped\n", - *str); + if (init) + pr_err("slub_debug option '%c' unknown. skipped\n", *str); } } - check_slabs: if (*str == ',') - slub_debug_slabs = str + 1; + *slabs = ++str; + else + *slabs = NULL; + + /* Skip over the slab list */ + while (*str && *str != ';') + str++; + + /* Skip any completely empty blocks */ + while (*str && *str == ';') + str++; + + if (init && higher_order_disable) + disable_higher_order_debug = 1; + + if (*str) + return str; + else + return NULL; +} + +static int __init setup_slub_debug(char *str) +{ + slab_flags_t flags; + char *saved_str; + char *slab_list; + bool global_slub_debug_changed = false; + bool slab_list_specified = false; + + slub_debug = DEBUG_DEFAULT_FLAGS; + if (*str++ != '=' || !*str) + /* + * No options specified. Switch on full debugging. + */ + goto out; + + saved_str = str; + while (str) { + str = parse_slub_debug_flags(str, &flags, &slab_list, true); + + if (!slab_list) { + slub_debug = flags; + global_slub_debug_changed = true; + } else { + slab_list_specified = true; + } + } + + /* + * For backwards compatibility, a single list of flags with list of + * slabs means debugging is only enabled for those slabs, so the global + * slub_debug should be 0. We can extended that to multiple lists as + * long as there is no option specifying flags without a slab list. + */ + if (slab_list_specified) { + if (!global_slub_debug_changed) + slub_debug = 0; + slub_debug_string = saved_str; + } out: if ((static_branch_unlikely(&init_on_alloc) || static_branch_unlikely(&init_on_free)) && @@ -1352,36 +1416,47 @@ slab_flags_t kmem_cache_flags(unsigned int object_size, { char *iter; size_t len; + char *next_block; + slab_flags_t block_flags; /* If slub_debug = 0, it folds into the if conditional. */ - if (!slub_debug_slabs) + if (!slub_debug_string) return flags | slub_debug; len = strlen(name); - iter = slub_debug_slabs; - while (*iter) { - char *end, *glob; - size_t cmplen; - - end = strchrnul(iter, ','); + next_block = slub_debug_string; + /* Go through all blocks of debug options, see if any matches our slab's name */ + while (next_block) { + next_block = parse_slub_debug_flags(next_block, &block_flags, &iter, false); + if (!iter) + continue; + /* Found a block that has a slab list, search it */ + while (*iter) { + char *end, *glob; + size_t cmplen; + + end = strchrnul(iter, ','); + if (next_block && next_block < end) + end = next_block - 1; + + glob = strnchr(iter, end - iter, '*'); + if (glob) + cmplen = glob - iter; + else + cmplen = max_t(size_t, len, (end - iter)); - glob = strnchr(iter, end - iter, '*'); - if (glob) - cmplen = glob - iter; - else - cmplen = max_t(size_t, len, (end - iter)); + if (!strncmp(name, iter, cmplen)) { + flags |= block_flags; + return flags; + } - if (!strncmp(name, iter, cmplen)) { - flags |= slub_debug; - break; + if (!*end || *end == ';') + break; + iter = end + 1; } - - if (!*end) - break; - iter = end + 1; } - return flags; + return slub_debug; } #else /* !CONFIG_SLUB_DEBUG */ static inline void setup_object_debug(struct kmem_cache *s, From patchwork Tue Jun 2 14:15:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vlastimil Babka X-Patchwork-Id: 11583935 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C9902739 for ; Tue, 2 Jun 2020 14:16:24 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9740F20757 for ; Tue, 2 Jun 2020 14:16:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9740F20757 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id C0025280004; Tue, 2 Jun 2020 10:16:17 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id B6E9C28000D; Tue, 2 Jun 2020 10:16:17 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 922FE280004; Tue, 2 Jun 2020 10:16:17 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0025.hostedemail.com [216.40.44.25]) by kanga.kvack.org (Postfix) with ESMTP id 72322280006 for ; Tue, 2 Jun 2020 10:16:17 -0400 (EDT) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 22D7F18037FF9 for ; Tue, 2 Jun 2020 14:16:17 +0000 (UTC) X-FDA: 76884471594.13.baby15_362b04ff8280d Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin13.hostedemail.com (Postfix) with ESMTP id C090F18144D5B for ; Tue, 2 Jun 2020 14:16:12 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,vbabka@suse.cz,,RULES_HIT:30029:30034:30045:30054:30062:30070:30075:30090,0,RBL:195.135.220.15:@suse.cz:.lbl8.mailshell.net-64.201.201.201 62.14.6.2,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:30,LUA_SUMMARY:none X-HE-Tag: baby15_362b04ff8280d X-Filterd-Recvd-Size: 6676 Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by imf13.hostedemail.com (Postfix) with ESMTP for ; Tue, 2 Jun 2020 14:16:12 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id B81F4AE2D; Tue, 2 Jun 2020 14:16:12 +0000 (UTC) From: Vlastimil Babka To: Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com, vinmenon@codeaurora.org, Kees Cook , Matthew Garrett , Vlastimil Babka , Vijayanand Jitta , Jann Horn Subject: [RFC PATCH 2/5] mm, slub: make some slub_debug related attributes read-only Date: Tue, 2 Jun 2020 16:15:16 +0200 Message-Id: <20200602141519.7099-3-vbabka@suse.cz> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200602141519.7099-1-vbabka@suse.cz> References: <20200602141519.7099-1-vbabka@suse.cz> MIME-Version: 1.0 X-Rspamd-Queue-Id: C090F18144D5B X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam05 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: SLUB_DEBUG creates several files under /sys/kernel/slab// that can be read to check if the respective debugging options are enabled for given cache. The options can be also toggled at runtime by writing into the files. Some of those, namely red_zone, poison, and store_user can be toggled only when no objects yet exist in the cache. Vijayanand reports [1] that there is a problem with freelist randomization if changing the debugging option's state results in different number of objects per page, and the random sequence cache needs thus needs to be recomputed. However, another problem is that the check for "no objects yet exist in the cache" is racy, as noted by Jann [2] and fixing that would add overhead or otherwise complicate the allocation/freeing paths. Thus it would be much simpler just to remove the runtime toggling support. The documentation describes it's "In case you forgot to enable debugging on the kernel command line", but the neccessity of having no objects limits its usefulness anyway for many caches. Vijayanand describes an use case [3] where debugging is enabled for all but zram caches for memory overhead reasons, and using the runtime toggles was the only way to achieve such configuration. After the previous patch it's now possible to do that directly from the kernel boot option, so we can remove the dangerous runtime toggles by making the /sys attribute files read-only. While updating it, also improve the documentation of the debugging /sys files. [1] https://lkml.kernel.org/r/1580379523-32272-1-git-send-email-vjitta@codeaurora.org [2] https://lore.kernel.org/r/CAG48ez31PP--h6_FzVyfJ4H86QYczAFPdxtJHUEEan+7VJETAQ@mail.gmail.com [3] https://lore.kernel.org/r/1383cd32-1ddc-4dac-b5f8-9c42282fa81c@codeaurora.org Reported-by: Vijayanand Jitta Reported-by: Jann Horn Signed-off-by: Vlastimil Babka Reviewed-by: Kees Cook Acked-by: Roman Gushchin --- Documentation/vm/slub.rst | 28 ++++++++++++++---------- mm/slub.c | 46 +++------------------------------------ 2 files changed, 20 insertions(+), 54 deletions(-) diff --git a/Documentation/vm/slub.rst b/Documentation/vm/slub.rst index f1154f707e06..61805e984a0d 100644 --- a/Documentation/vm/slub.rst +++ b/Documentation/vm/slub.rst @@ -100,20 +100,26 @@ except some that are deemed too performance critical and don't need to be slub_debug=FZ;-,zs_handle,zspage -In case you forgot to enable debugging on the kernel command line: It is -possible to enable debugging manually when the kernel is up. Look at the -contents of:: +The state of each debug option for a slab can be found in the respective files +under:: /sys/kernel/slab// -Look at the writable files. Writing 1 to them will enable the -corresponding debug option. All options can be set on a slab that does -not contain objects. If the slab already contains objects then sanity checks -and tracing may only be enabled. The other options may cause the realignment -of objects. - -Careful with tracing: It may spew out lots of information and never stop if -used on the wrong slab. +If the file contains 1, the option is enabled, 0 means disabled. The debug +options from the ``slub_debug`` parameter translate to the following files:: + + F sanity_checks + Z red_zone + P poison + U store_user + T trace + A failslab + +The sanity_checks, trace and failslab files are writable, so writing 1 or 0 +will enable or disable the option at runtime. The writes to trace and failslab +may return -EINVAL if the cache is subject to slab merging. Careful with +tracing: It may spew out lots of information and never stop if used on the +wrong slab. Slab merging ============ diff --git a/mm/slub.c b/mm/slub.c index 47430aad9a65..ac198202dbb0 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -5351,61 +5351,21 @@ static ssize_t red_zone_show(struct kmem_cache *s, char *buf) return sprintf(buf, "%d\n", !!(s->flags & SLAB_RED_ZONE)); } -static ssize_t red_zone_store(struct kmem_cache *s, - const char *buf, size_t length) -{ - if (any_slab_objects(s)) - return -EBUSY; - - s->flags &= ~SLAB_RED_ZONE; - if (buf[0] == '1') { - s->flags |= SLAB_RED_ZONE; - } - calculate_sizes(s, -1); - return length; -} -SLAB_ATTR(red_zone); +SLAB_ATTR_RO(red_zone); static ssize_t poison_show(struct kmem_cache *s, char *buf) { return sprintf(buf, "%d\n", !!(s->flags & SLAB_POISON)); } -static ssize_t poison_store(struct kmem_cache *s, - const char *buf, size_t length) -{ - if (any_slab_objects(s)) - return -EBUSY; - - s->flags &= ~SLAB_POISON; - if (buf[0] == '1') { - s->flags |= SLAB_POISON; - } - calculate_sizes(s, -1); - return length; -} -SLAB_ATTR(poison); +SLAB_ATTR_RO(poison); static ssize_t store_user_show(struct kmem_cache *s, char *buf) { return sprintf(buf, "%d\n", !!(s->flags & SLAB_STORE_USER)); } -static ssize_t store_user_store(struct kmem_cache *s, - const char *buf, size_t length) -{ - if (any_slab_objects(s)) - return -EBUSY; - - s->flags &= ~SLAB_STORE_USER; - if (buf[0] == '1') { - s->flags &= ~__CMPXCHG_DOUBLE; - s->flags |= SLAB_STORE_USER; - } - calculate_sizes(s, -1); - return length; -} -SLAB_ATTR(store_user); +SLAB_ATTR_RO(store_user); static ssize_t validate_show(struct kmem_cache *s, char *buf) { From patchwork Tue Jun 2 14:15:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vlastimil Babka X-Patchwork-Id: 11583937 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7261A739 for ; Tue, 2 Jun 2020 14:16:27 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3DBE420757 for ; Tue, 2 Jun 2020 14:16:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3DBE420757 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 57D4328000D; Tue, 2 Jun 2020 10:16:18 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 4B9BD280007; Tue, 2 Jun 2020 10:16:18 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3A36628000D; Tue, 2 Jun 2020 10:16:18 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0060.hostedemail.com [216.40.44.60]) by kanga.kvack.org (Postfix) with ESMTP id 1BC27280007 for ; Tue, 2 Jun 2020 10:16:18 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id CC2675C881 for ; Tue, 2 Jun 2020 14:16:17 +0000 (UTC) X-FDA: 76884471594.11.run31_363345b85d218 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin11.hostedemail.com (Postfix) with ESMTP id C2D6818036FD7 for ; Tue, 2 Jun 2020 14:16:12 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,vbabka@suse.cz,,RULES_HIT:30054:30070,0,RBL:195.135.220.15:@suse.cz:.lbl8.mailshell.net-62.14.6.2 64.201.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: run31_363345b85d218 X-Filterd-Recvd-Size: 3400 Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by imf21.hostedemail.com (Postfix) with ESMTP for ; Tue, 2 Jun 2020 14:16:12 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id B9707AE69; Tue, 2 Jun 2020 14:16:12 +0000 (UTC) From: Vlastimil Babka To: Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com, vinmenon@codeaurora.org, Kees Cook , Matthew Garrett , Vlastimil Babka , Jann Horn , Vijayanand Jitta Subject: [RFC PATCH 3/5] mm, slub: remove runtime allocation order changes Date: Tue, 2 Jun 2020 16:15:17 +0200 Message-Id: <20200602141519.7099-4-vbabka@suse.cz> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200602141519.7099-1-vbabka@suse.cz> References: <20200602141519.7099-1-vbabka@suse.cz> MIME-Version: 1.0 X-Rspamd-Queue-Id: C2D6818036FD7 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam05 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: SLUB allows runtime changing of page allocation order by writing into the /sys/kernel/slab//order file. Jann has reported [1] that this interface allows the order to be set too small, leading to crashes. While it's possible to fix the immediate issue, closer inspection reveals potential races. Storing the new order calls calculate_sizes() which non-atomically updates a lot of kmem_cache fields while the cache is still in use. Unexpected behavior might occur even if the fields are set to the same value as they were. This could be fixed by splitting out the part of calculate_sizes() that depends on forced_order, so that we only update kmem_cache.oo field. This could still race with init_cache_random_seq(), shuffle_freelist(), allocate_slab(). Perhaps it's possible to audit and e.g. add some READ_ONCE/WRITE_ONCE accesses, it might be easier just to remove the runtime order changes, which is what this patch does. If there are valid usecases for per-cache order setting, we could e.g. extend the boot parameters to do that. [1] https://lore.kernel.org/r/CAG48ez31PP--h6_FzVyfJ4H86QYczAFPdxtJHUEEan+7VJETAQ@mail.gmail.com Reported-by: Jann Horn Signed-off-by: Vlastimil Babka Reviewed-by: Kees Cook Acked-by: Roman Gushchin --- mm/slub.c | 19 +------------------ 1 file changed, 1 insertion(+), 18 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index ac198202dbb0..58c1e9e7b3b3 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -5111,28 +5111,11 @@ static ssize_t objs_per_slab_show(struct kmem_cache *s, char *buf) } SLAB_ATTR_RO(objs_per_slab); -static ssize_t order_store(struct kmem_cache *s, - const char *buf, size_t length) -{ - unsigned int order; - int err; - - err = kstrtouint(buf, 10, &order); - if (err) - return err; - - if (order > slub_max_order || order < slub_min_order) - return -EINVAL; - - calculate_sizes(s, order); - return length; -} - static ssize_t order_show(struct kmem_cache *s, char *buf) { return sprintf(buf, "%u\n", oo_order(s->oo)); } -SLAB_ATTR(order); +SLAB_ATTR_RO(order); static ssize_t min_partial_show(struct kmem_cache *s, char *buf) { From patchwork Tue Jun 2 14:15:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vlastimil Babka X-Patchwork-Id: 11583933 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2BC21739 for ; Tue, 2 Jun 2020 14:16:22 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 028DA207F7 for ; Tue, 2 Jun 2020 14:16:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 028DA207F7 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 925E0280006; Tue, 2 Jun 2020 10:16:17 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 85FBE280007; Tue, 2 Jun 2020 10:16:17 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7333A280004; Tue, 2 Jun 2020 10:16:17 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0104.hostedemail.com [216.40.44.104]) by kanga.kvack.org (Postfix) with ESMTP id 5892D280004 for ; Tue, 2 Jun 2020 10:16:17 -0400 (EDT) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 0DF1B824559C for ; Tue, 2 Jun 2020 14:16:17 +0000 (UTC) X-FDA: 76884471594.13.laugh83_362efaf22bc30 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin13.hostedemail.com (Postfix) with ESMTP id C2E7018048260 for ; Tue, 2 Jun 2020 14:16:12 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,vbabka@suse.cz,,RULES_HIT:30054:30062:30070:30075:30090,0,RBL:195.135.220.15:@suse.cz:.lbl8.mailshell.net-64.201.201.201 62.14.6.2,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:58,LUA_SUMMARY:none X-HE-Tag: laugh83_362efaf22bc30 X-Filterd-Recvd-Size: 5415 Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by imf30.hostedemail.com (Postfix) with ESMTP for ; Tue, 2 Jun 2020 14:16:12 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id CD10DAE6F; Tue, 2 Jun 2020 14:16:12 +0000 (UTC) From: Vlastimil Babka To: Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com, vinmenon@codeaurora.org, Kees Cook , Matthew Garrett , Vlastimil Babka , Jann Horn , Vijayanand Jitta Subject: [RFC PATCH 4/5] mm, slub: make remaining slub_debug related attributes read-only Date: Tue, 2 Jun 2020 16:15:18 +0200 Message-Id: <20200602141519.7099-5-vbabka@suse.cz> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200602141519.7099-1-vbabka@suse.cz> References: <20200602141519.7099-1-vbabka@suse.cz> MIME-Version: 1.0 X-Rspamd-Queue-Id: C2E7018048260 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam04 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: SLUB_DEBUG creates several files under /sys/kernel/slab// that can be read to check if the respective debugging options are enabled for given cache. Some options, namely sanity_checks, trace, and failslab can be also enabled and disabled at runtime by writing into the files. The runtime toggling is racy. Some options disable __CMPXCHG_DOUBLE when enabled, which means that in case of concurrent allocations, some can still use __CMPXCHG_DOUBLE and some not, leading to potential corruption. The s->flags field is also not updated or checked atomically. The simplest solution is to remove the runtime toggling. The extended slub_debug boot parameter syntax introduced by earlier patch should allow to fine-tune the debugging configuration during boot with same granularity. Signed-off-by: Vlastimil Babka Reviewed-by: Kees Cook Acked-by: Roman Gushchin --- Documentation/vm/slub.rst | 7 ++--- mm/slub.c | 62 ++------------------------------------- 2 files changed, 5 insertions(+), 64 deletions(-) diff --git a/Documentation/vm/slub.rst b/Documentation/vm/slub.rst index 61805e984a0d..f240292589bd 100644 --- a/Documentation/vm/slub.rst +++ b/Documentation/vm/slub.rst @@ -115,11 +115,8 @@ If the file contains 1, the option is enabled, 0 means disabled. The debug T trace A failslab -The sanity_checks, trace and failslab files are writable, so writing 1 or 0 -will enable or disable the option at runtime. The writes to trace and failslab -may return -EINVAL if the cache is subject to slab merging. Careful with -tracing: It may spew out lots of information and never stop if used on the -wrong slab. +Careful with tracing: It may spew out lots of information and never stop if +used on the wrong slab. Slab merging ============ diff --git a/mm/slub.c b/mm/slub.c index 58c1e9e7b3b3..38dd6f3ebb04 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -5056,20 +5056,6 @@ static ssize_t show_slab_objects(struct kmem_cache *s, return x + sprintf(buf + x, "\n"); } -#ifdef CONFIG_SLUB_DEBUG -static int any_slab_objects(struct kmem_cache *s) -{ - int node; - struct kmem_cache_node *n; - - for_each_kmem_cache_node(s, node, n) - if (atomic_long_read(&n->total_objects)) - return 1; - - return 0; -} -#endif - #define to_slab_attr(n) container_of(n, struct slab_attribute, attr) #define to_slab(n) container_of(n, struct kmem_cache, kobj) @@ -5291,43 +5277,13 @@ static ssize_t sanity_checks_show(struct kmem_cache *s, char *buf) { return sprintf(buf, "%d\n", !!(s->flags & SLAB_CONSISTENCY_CHECKS)); } - -static ssize_t sanity_checks_store(struct kmem_cache *s, - const char *buf, size_t length) -{ - s->flags &= ~SLAB_CONSISTENCY_CHECKS; - if (buf[0] == '1') { - s->flags &= ~__CMPXCHG_DOUBLE; - s->flags |= SLAB_CONSISTENCY_CHECKS; - } - return length; -} -SLAB_ATTR(sanity_checks); +SLAB_ATTR_RO(sanity_checks); static ssize_t trace_show(struct kmem_cache *s, char *buf) { return sprintf(buf, "%d\n", !!(s->flags & SLAB_TRACE)); } - -static ssize_t trace_store(struct kmem_cache *s, const char *buf, - size_t length) -{ - /* - * Tracing a merged cache is going to give confusing results - * as well as cause other issues like converting a mergeable - * cache into an umergeable one. - */ - if (s->refcount > 1) - return -EINVAL; - - s->flags &= ~SLAB_TRACE; - if (buf[0] == '1') { - s->flags &= ~__CMPXCHG_DOUBLE; - s->flags |= SLAB_TRACE; - } - return length; -} -SLAB_ATTR(trace); +SLAB_ATTR_RO(trace); static ssize_t red_zone_show(struct kmem_cache *s, char *buf) { @@ -5391,19 +5347,7 @@ static ssize_t failslab_show(struct kmem_cache *s, char *buf) { return sprintf(buf, "%d\n", !!(s->flags & SLAB_FAILSLAB)); } - -static ssize_t failslab_store(struct kmem_cache *s, const char *buf, - size_t length) -{ - if (s->refcount > 1) - return -EINVAL; - - s->flags &= ~SLAB_FAILSLAB; - if (buf[0] == '1') - s->flags |= SLAB_FAILSLAB; - return length; -} -SLAB_ATTR(failslab); +SLAB_ATTR_RO(failslab); #endif static ssize_t shrink_show(struct kmem_cache *s, char *buf) From patchwork Tue Jun 2 14:15:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vlastimil Babka X-Patchwork-Id: 11583931 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B4C8B739 for ; Tue, 2 Jun 2020 14:16:19 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8C485207F7 for ; Tue, 2 Jun 2020 14:16:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8C485207F7 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 0C257280005; Tue, 2 Jun 2020 10:16:15 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 06E2C280004; Tue, 2 Jun 2020 10:16:14 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E033A280005; Tue, 2 Jun 2020 10:16:14 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0176.hostedemail.com [216.40.44.176]) by kanga.kvack.org (Postfix) with ESMTP id CA629280004 for ; Tue, 2 Jun 2020 10:16:14 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 856A05C4D2 for ; Tue, 2 Jun 2020 14:16:14 +0000 (UTC) X-FDA: 76884471468.09.fog35_366e31f82f358 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin09.hostedemail.com (Postfix) with ESMTP id 69AEF18001746 for ; Tue, 2 Jun 2020 14:16:14 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,vbabka@suse.cz,,RULES_HIT:30051:30054:30070,0,RBL:195.135.220.15:@suse.cz:.lbl8.mailshell.net-62.2.6.2 64.100.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:1,LUA_SUMMARY:none X-HE-Tag: fog35_366e31f82f358 X-Filterd-Recvd-Size: 2601 Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by imf03.hostedemail.com (Postfix) with ESMTP for ; Tue, 2 Jun 2020 14:16:13 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 3D533AE89; Tue, 2 Jun 2020 14:16:13 +0000 (UTC) From: Vlastimil Babka To: Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com, vinmenon@codeaurora.org, Kees Cook , Matthew Garrett , Vlastimil Babka , Jann Horn , Vijayanand Jitta Subject: [RFC PATCH 5/5] mm, slub: make reclaim_account attribute read-only Date: Tue, 2 Jun 2020 16:15:19 +0200 Message-Id: <20200602141519.7099-6-vbabka@suse.cz> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200602141519.7099-1-vbabka@suse.cz> References: <20200602141519.7099-1-vbabka@suse.cz> MIME-Version: 1.0 X-Rspamd-Queue-Id: 69AEF18001746 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam05 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The attribute reflects the SLAB_RECLAIM_ACCOUNT cache flag. It's not clear why this attribute was writable in the first place, as it's tied to how the cache is used by its creator, it's not a user tunable. Furthermore: - it affects slab merging, but that's not being checked while toggled - if affects whether __GFP_RECLAIMABLE flag is used to allocate page, but the runtime toggle doesn't update allocflags - it affects cache_vmstat_idx() so runtime toggle might lead to incosistency of NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE Thus make it read-only. Signed-off-by: Vlastimil Babka Reviewed-by: Kees Cook Acked-by: Roman Gushchin --- mm/slub.c | 11 +---------- 1 file changed, 1 insertion(+), 10 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index 38dd6f3ebb04..d4a9a097da50 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -5223,16 +5223,7 @@ static ssize_t reclaim_account_show(struct kmem_cache *s, char *buf) { return sprintf(buf, "%d\n", !!(s->flags & SLAB_RECLAIM_ACCOUNT)); } - -static ssize_t reclaim_account_store(struct kmem_cache *s, - const char *buf, size_t length) -{ - s->flags &= ~SLAB_RECLAIM_ACCOUNT; - if (buf[0] == '1') - s->flags |= SLAB_RECLAIM_ACCOUNT; - return length; -} -SLAB_ATTR(reclaim_account); +SLAB_ATTR_RO(reclaim_account); static ssize_t hwcache_align_show(struct kmem_cache *s, char *buf) {