From patchwork Fri Aug 26 09:09:11 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vlastimil Babka
X-Patchwork-Id: 12955677
From: Vlastimil Babka
To: Christoph Lameter, Joonsoo Kim, David Rientjes, Pekka Enberg,
 Joel Fernandes
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Roman Gushchin,
 linux-mm@kvack.org, Matthew Wilcox, paulmck@kernel.org,
 rcu@vger.kernel.org, Vlastimil Babka
Subject: [RFC PATCH 1/2] mm/slub: perform free consistency checks before call_rcu
Date: Fri, 26 Aug 2022 11:09:11 +0200
Message-Id: <20220826090912.11292-1-vbabka@suse.cz>
X-Mailer: git-send-email 2.37.2

For SLAB_TYPESAFE_BY_RCU caches we use call_rcu to perform empty slab
freeing. The rcu callback rcu_free_slab() calls __free_slab() that
currently includes checking the slab consistency for caches with
SLAB_CONSISTENCY_CHECKS flags. This check needs the slab->objects field
to be intact.

Because in the next patch we want to allow rcu_head in struct slab to
become larger in debug configurations and thus potentially overwrite
more fields through a union than slab_list, we want to limit the fields
used in rcu_free_slab(). Thus move the consistency checks to
free_slab() before call_rcu(). This can be done safely even for
SLAB_TYPESAFE_BY_RCU caches where accesses to the objects can still
occur after freeing them.

As a result, only the slab->slab_cache field has to be physically
separate from rcu_head for the freeing callback to work. We also save
some cycles in the rcu callback for caches with consistency checks
enabled.

Signed-off-by: Vlastimil Babka
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/slub.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 862dbd9af4f5..d86be1b0d09f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2036,14 +2036,6 @@ static void __free_slab(struct kmem_cache *s, struct slab *slab)
 	int order = folio_order(folio);
 	int pages = 1 << order;
 
-	if (kmem_cache_debug_flags(s, SLAB_CONSISTENCY_CHECKS)) {
-		void *p;
-
-		slab_pad_check(s, slab);
-		for_each_object(p, s, slab_address(slab), slab->objects)
-			check_object(s, slab, p, SLUB_RED_INACTIVE);
-	}
-
 	__slab_clear_pfmemalloc(slab);
 	__folio_clear_slab(folio);
 	folio->mapping = NULL;
@@ -2062,9 +2054,17 @@ static void rcu_free_slab(struct rcu_head *h)
 
 static void free_slab(struct kmem_cache *s, struct slab *slab)
 {
-	if (unlikely(s->flags & SLAB_TYPESAFE_BY_RCU)) {
+	if (kmem_cache_debug_flags(s, SLAB_CONSISTENCY_CHECKS)) {
+		void *p;
+
+		slab_pad_check(s, slab);
+		for_each_object(p, s, slab_address(slab), slab->objects)
+			check_object(s, slab, p, SLUB_RED_INACTIVE);
+	}
+
+	if (unlikely(s->flags & SLAB_TYPESAFE_BY_RCU))
 		call_rcu(&slab->rcu_head, rcu_free_slab);
-	} else
+	else
 		__free_slab(s, slab);
 }
 

From patchwork Fri Aug 26 09:09:12 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vlastimil Babka
X-Patchwork-Id: 12955678
From: Vlastimil Babka
To: Christoph Lameter, Joonsoo Kim, David Rientjes, Pekka Enberg,
 Joel Fernandes
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Roman Gushchin,
 linux-mm@kvack.org, Matthew Wilcox, paulmck@kernel.org,
 rcu@vger.kernel.org, Vlastimil Babka
Subject: [RFC PATCH 2/2] mm/sl[au]b: rearrange struct slab fields to allow larger rcu_head
Date: Fri, 26 Aug 2022 11:09:12 +0200
Message-Id: <20220826090912.11292-2-vbabka@suse.cz>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20220826090912.11292-1-vbabka@suse.cz>
References: <20220826090912.11292-1-vbabka@suse.cz>

Joel reports [1] that increasing the rcu_head size for debugging
purposes used to work before struct slab was split from struct page, but
now runs into the various SLAB_MATCH() sanity checks of the layout.

This is because the rcu_head in struct page is in union with large
sub-structures and has space to grow without exceeding their size, while
in struct slab (for SLAB and SLUB) it's in union only with a list_head.

On closer inspection (and after the previous patch) we can put all
fields except slab_cache to a union with rcu_head, as slab_cache is
sufficient for the rcu freeing callbacks to work and the rest can be
overwritten by rcu_head without causing issues.

This is only somewhat complicated by the need to keep SLUB's
freelist+counters aligned for cmpxchg_double. As a result the fields
need to be reordered so that slab_cache is first (after page flags) and
the union with rcu_head follows. For consistency, do that for SLAB as
well, although not necessary there.

As a result, the rcu_head field in struct page and struct slab is no
longer at the same offset, but that doesn't matter as there is no
casting that would rely on that in the slab freeing callbacks, so we can
just drop the respective SLAB_MATCH() check. Also we need to update the
SLAB_MATCH() for compound_head to reflect the new ordering.

While at it, also add a static_assert to check the alignment needed for
cmpxchg_double so mistakes are found sooner than a runtime GPF.
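
To illustrate the layout constraint described above, here is a minimal
userspace sketch in C11. It is not the kernel's definition: mock_slab,
mock_rcu_head, mock_list_head and the debug_data member are hypothetical
stand-ins, and the size of the enlarged rcu_head is an assumption. The
sketch only shows that slab_cache can sit outside a union whose other
members the rcu_head is free to overwrite, and that the double-word
alignment of freelist can be checked at build time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a debug-enlarged rcu_head (not the kernel's). */
struct mock_rcu_head {
	void *next;
	void (*func)(struct mock_rcu_head *head);
	unsigned long debug_data[2];	/* assumed extra debugging payload */
};

/* Hypothetical stand-in for struct list_head. */
struct mock_list_head {
	struct mock_list_head *next, *prev;
};

/* Simplified mock of the reordered SLUB variant of struct slab. */
struct mock_slab {
	unsigned long page_flags;
	void *slab_cache;		/* kept outside the union */
	union {
		struct {
			struct mock_list_head slab_list;
			/* Double-word boundary */
			void *freelist;
			unsigned long counters;
		};
		struct mock_rcu_head rcu_head;	/* may clobber the struct above */
	};
};

/* freelist+counters must start on a double-word boundary. */
static_assert(offsetof(struct mock_slab, freelist) % (2 * sizeof(void *)) == 0,
	      "freelist must be double-word aligned");
/* slab_cache must end before the union holding rcu_head begins. */
static_assert(offsetof(struct mock_slab, slab_cache) + sizeof(void *) <=
	      offsetof(struct mock_slab, rcu_head),
	      "slab_cache must not overlap rcu_head");

/* Returns 1 if slab_cache is unchanged after rcu_head is overwritten. */
static int slab_cache_survives_rcu_overwrite(void)
{
	static int cache;	/* dummy object standing in for a kmem_cache */
	struct mock_slab slab;

	memset(&slab, 0, sizeof(slab));
	slab.slab_cache = &cache;
	/* Simulate call_rcu() scribbling over every byte of rcu_head. */
	memset(&slab.rcu_head, 0xff, sizeof(slab.rcu_head));
	return slab.slab_cache == &cache;
}
```

Calling slab_cache_survives_rcu_overwrite() from a test harness
exercises the overwrite. If a field is moved so that freelist loses its
alignment, the first static_assert fails at compile time rather than at
runtime.
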

[1] https://lore.kernel.org/all/85afd876-d8bb-0804-b2c5-48ed3055e702@joelfernandes.org/

Reported-by: Joel Fernandes
Signed-off-by: Vlastimil Babka
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/slab.h | 54 ++++++++++++++++++++++++++++++++----------------------
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/mm/slab.h b/mm/slab.h
index 4ec82bec15ec..2c248864ea91 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -11,37 +11,43 @@ struct slab {
 
 #if defined(CONFIG_SLAB)
 
+	struct kmem_cache *slab_cache;
 	union {
-		struct list_head slab_list;
+		struct {
+			struct list_head slab_list;
+			void *freelist;	/* array of free object indexes */
+			void *s_mem;	/* first object */
+		};
 		struct rcu_head rcu_head;
 	};
-	struct kmem_cache *slab_cache;
-	void *freelist;	/* array of free object indexes */
-	void *s_mem;	/* first object */
 	unsigned int active;
 
 #elif defined(CONFIG_SLUB)
 
-	union {
-		struct list_head slab_list;
-		struct rcu_head rcu_head;
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-		struct {
-			struct slab *next;
-			int slabs;	/* Nr of slabs left */
-		};
-#endif
-	};
 	struct kmem_cache *slab_cache;
-	/* Double-word boundary */
-	void *freelist;		/* first free object */
 	union {
-		unsigned long counters;
 		struct {
-			unsigned inuse:16;
-			unsigned objects:15;
-			unsigned frozen:1;
+			union {
+				struct list_head slab_list;
+#ifdef CONFIG_SLUB_CPU_PARTIAL
+				struct {
+					struct slab *next;
+					int slabs;	/* Nr of slabs left */
+				};
+#endif
+			};
+			/* Double-word boundary */
+			void *freelist;		/* first free object */
+			union {
+				unsigned long counters;
+				struct {
+					unsigned inuse:16;
+					unsigned objects:15;
+					unsigned frozen:1;
+				};
+			};
 		};
+		struct rcu_head rcu_head;
 	};
 	unsigned int __unused;
 
@@ -66,9 +72,10 @@ struct slab {
 #define SLAB_MATCH(pg, sl)						\
 	static_assert(offsetof(struct page, pg) == offsetof(struct slab, sl))
 SLAB_MATCH(flags, __page_flags);
-SLAB_MATCH(compound_head, slab_list);	/* Ensure bit 0 is clear */
 #ifndef CONFIG_SLOB
-SLAB_MATCH(rcu_head, rcu_head);
+SLAB_MATCH(compound_head, slab_cache);	/* Ensure bit 0 is clear */
+#else
+SLAB_MATCH(compound_head, slab_list);	/* Ensure bit 0 is clear */
 #endif
 SLAB_MATCH(_refcount, __page_refcount);
 #ifdef CONFIG_MEMCG
@@ -76,6 +83,9 @@ SLAB_MATCH(memcg_data, memcg_data);
 #endif
 #undef SLAB_MATCH
 static_assert(sizeof(struct slab) <= sizeof(struct page));
+#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && defined(CONFIG_SLUB)
+static_assert(IS_ALIGNED(offsetof(struct slab, freelist), 16));
+#endif
 
 /**
  * folio_slab - Converts from folio to slab.