From patchwork Tue Jan 23 09:33:29 2024
X-Patchwork-Submitter: Chengming Zhou
X-Patchwork-Id: 13527036
From: Chengming Zhou
Date: Tue, 23 Jan 2024 09:33:29 +0000
Subject: [PATCH v2 1/3] mm/slub: directly load freelist from cpu partial slab in the likely case
Message-Id: <20240117-slab-misc-v2-1-81766907896e@bytedance.com>
References: <20240117-slab-misc-v2-0-81766907896e@bytedance.com>
In-Reply-To: <20240117-slab-misc-v2-0-81766907896e@bytedance.com>
To: Joonsoo Kim, Vlastimil Babka, David Rientjes, Roman Gushchin, Pekka Enberg, Christoph Lameter, Andrew Morton, Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Vlastimil Babka, linux-kernel@vger.kernel.org, Chengming Zhou, linux-mm@kvack.org, "Christoph Lameter (Ampere)"

The likely case is that we get a usable slab from the cpu partial list;
we can directly load the freelist from it and return, instead of taking
the other path that needs more work, like re-enabling interrupts and
rechecking.

But we need to remove the "VM_BUG_ON(!new.frozen)" in get_freelist() in
order to reuse it, since a cpu partial slab is not frozen. That seems
acceptable since it is only for debugging purposes.

get_freelist() also assumes it can return NULL if the freelist is empty,
which is not possible in the cpu partial slab case, so we add
"VM_BUG_ON(!freelist)" after get_freelist() to make that explicit.
There is also a small performance improvement, shown by:
 perf bench sched messaging -g 5 -t -l 100000

            mm-stable   slub-optimize
Total time    7.473        7.209

Signed-off-by: Chengming Zhou
Reviewed-by: Vlastimil Babka
---
 mm/slub.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 2ef88bbf56a3..fda402b2d649 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3326,7 +3326,6 @@ static inline void *get_freelist(struct kmem_cache *s, struct slab *slab)
 		counters = slab->counters;
 
 		new.counters = counters;
-		VM_BUG_ON(!new.frozen);
 
 		new.inuse = slab->objects;
 		new.frozen = freelist != NULL;
@@ -3498,18 +3497,20 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 
 		slab = slub_percpu_partial(c);
 		slub_set_percpu_partial(c, slab);
-		local_unlock_irqrestore(&s->cpu_slab->lock, flags);
-		stat(s, CPU_PARTIAL_ALLOC);
 
-		if (unlikely(!node_match(slab, node) ||
-			     !pfmemalloc_match(slab, gfpflags))) {
-			slab->next = NULL;
-			__put_partials(s, slab);
-			continue;
+		if (likely(node_match(slab, node) &&
+			   pfmemalloc_match(slab, gfpflags))) {
+			c->slab = slab;
+			freelist = get_freelist(s, slab);
+			VM_BUG_ON(!freelist);
+			stat(s, CPU_PARTIAL_ALLOC);
+			goto load_freelist;
 		}
 
-		freelist = freeze_slab(s, slab);
-		goto retry_load_slab;
+		local_unlock_irqrestore(&s->cpu_slab->lock, flags);
+
+		slab->next = NULL;
+		__put_partials(s, slab);
 	}
 #endif
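
For readers not following the SLUB internals closely, below is a simplified,
self-contained sketch of the control-flow change: in the likely case the slab
popped from the cpu partial list is usable, so it becomes the active slab and
its freelist is handed out directly, rather than unlocking and retrying as the
old flow did. Every identifier here (toy_slab, toy_cpu_cache, node_matches,
etc.) is a made-up stand-in for illustration only; the actual kernel code is
the diff above.

/*
 * Toy model of the reshaped ___slab_alloc() cpu-partial path.
 * All types and helpers are illustrative stand-ins, not kernel code.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct toy_slab {
	void *freelist;		/* objects available in this slab */
	int node;		/* NUMA node the slab lives on */
	struct toy_slab *next;
};

struct toy_cpu_cache {
	struct toy_slab *slab;		/* currently active slab */
	struct toy_slab *partial;	/* per-cpu partial list */
};

static bool node_matches(struct toy_slab *slab, int node)
{
	return node < 0 || slab->node == node;	/* -1 means "any node" */
}

/*
 * Likely case: the slab taken from the cpu partial list is usable, so
 * make it the active slab and return its freelist directly, without
 * dropping the lock and re-entering the slow path.
 */
static void *alloc_from_cpu_partial(struct toy_cpu_cache *c, int node)
{
	struct toy_slab *slab = c->partial;
	void *freelist;

	if (!slab)
		return NULL;
	c->partial = slab->next;		/* pop from the partial list */

	if (node_matches(slab, node)) {		/* the likely, fast path */
		freelist = slab->freelist;
		slab->freelist = NULL;
		c->slab = slab;
		return freelist;		/* caller continues as "load_freelist" */
	}

	/* unlikely: slab unusable for this request, give it back */
	slab->next = NULL;
	/* ...the real code flushes it back to the node partial list here... */
	return NULL;
}

int main(void)
{
	int object;
	struct toy_slab s = { .freelist = &object, .node = 0, .next = NULL };
	struct toy_cpu_cache c = { .slab = NULL, .partial = &s };

	void *freelist = alloc_from_cpu_partial(&c, -1);
	printf("fast path hit: %s\n", freelist ? "yes" : "no");
	return 0;
}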