From patchwork Wed Jan 17 11:45:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chengming Zhou
X-Patchwork-Id: 13521650
From: Chengming Zhou
Date: Wed, 17 Jan 2024 11:45:58 +0000
Subject: [PATCH 1/3] mm/slub: directly load freelist from cpu partial slab in the likely case
Message-Id: <20240117-slab-misc-v1-1-fd1c49ccbe70@bytedance.com>
References: <20240117-slab-misc-v1-0-fd1c49ccbe70@bytedance.com>
In-Reply-To: <20240117-slab-misc-v1-0-fd1c49ccbe70@bytedance.com>
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Joonsoo Kim, Vlastimil Babka, Christoph Lameter, Pekka Enberg, Andrew Morton, Roman Gushchin, David Rientjes
Cc: linux-mm@kvack.org, Chengming Zhou, linux-kernel@vger.kernel.org
The likely case is that we get a usable slab from the cpu partial list: we can directly load the freelist from it and return, instead of taking the other path, which needs more work such as re-enabling interrupts and rechecking.

But we need to remove the "VM_BUG_ON(!new.frozen)" in get_freelist() in order to reuse it, since a cpu partial slab is not frozen. This seems acceptable, since the check is only there for debugging purposes.
There is some small performance improvement too, as shown by:
perf bench sched messaging -g 5 -t -l 100000

             mm-stable   slub-optimize
Total time     7.473        7.209

Signed-off-by: Chengming Zhou
---
 mm/slub.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 2ef88bbf56a3..20c03555c97b 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3326,7 +3326,6 @@ static inline void *get_freelist(struct kmem_cache *s, struct slab *slab)
 		counters = slab->counters;
 
 		new.counters = counters;
-		VM_BUG_ON(!new.frozen);
 
 		new.inuse = slab->objects;
 		new.frozen = freelist != NULL;
@@ -3498,18 +3497,19 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 		slab = slub_percpu_partial(c);
 		slub_set_percpu_partial(c, slab);
-		local_unlock_irqrestore(&s->cpu_slab->lock, flags);
-		stat(s, CPU_PARTIAL_ALLOC);
 
-		if (unlikely(!node_match(slab, node) ||
-			     !pfmemalloc_match(slab, gfpflags))) {
-			slab->next = NULL;
-			__put_partials(s, slab);
-			continue;
+		if (likely(node_match(slab, node) &&
+			   pfmemalloc_match(slab, gfpflags))) {
+			c->slab = slab;
+			freelist = get_freelist(s, slab);
+			stat(s, CPU_PARTIAL_ALLOC);
+			goto load_freelist;
 		}
 
-		freelist = freeze_slab(s, slab);
-		goto retry_load_slab;
+		local_unlock_irqrestore(&s->cpu_slab->lock, flags);
+
+		slab->next = NULL;
+		__put_partials(s, slab);
 	}
 #endif