From patchwork Sun Jul 23 19:09:05 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hyeonggon Yoo <42.hyeyoo@gmail.com>
X-Patchwork-Id: 13323319
From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Vlastimil Babka, Christoph Lameter, Pekka Enberg, Joonsoo Kim,
 David Rientjes, Andrew Morton
Cc: Roman Gushchin, Feng Tang, "Sang, Oliver", Jay Patel, Binder Makin,
 aneesh.kumar@linux.ibm.com, tsahu@linux.ibm.com, piyushs@linux.ibm.com,
 fengwei.yin@intel.com, ying.huang@intel.com, lkp,
 "oe-lkp@lists.linux.dev", linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Hyeonggon Yoo <42.hyeyoo@gmail.com>
Subject: [RFC 1/2] Revert "mm, slub: change percpu partial accounting from objects to pages"
Date: Mon, 24 Jul 2023 04:09:05 +0900
Message-ID: <20230723190906.4082646-2-42.hyeyoo@gmail.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230723190906.4082646-1-42.hyeyoo@gmail.com>
References: <20230723190906.4082646-1-42.hyeyoo@gmail.com>

This is a partial revert of commit b47291ef02b0 ("mm, slub: change percpu
partial accounting from objects to pages")
and a full revert of commit 662188c3a20e ("mm/slub: Simplify struct slab
slabs field definition").

While b47291ef02b0 prevents the percpu partial slab list from becoming
too long, it assumes that the order of slabs is always oo_order(s->oo).
As a result, the current approach can unexpectedly lower the number of
objects cached per cpu when it fails to allocate high order slabs and
falls back to lower order ones.

Instead of accounting the number of slabs, change it back to accounting
objects, but keep the assumption that the slab is always half-full.

With this change, the number of cached objects per cpu no longer drops
unexpectedly even when high order slab allocation fails. It still
prevents large inaccuracy because it does not account based on the
number of free objects when taking slabs.
---
 include/linux/slub_def.h |  2 --
 mm/slab.h                |  6 ++++++
 mm/slub.c                | 31 ++++++++++++-------------------
 3 files changed, 18 insertions(+), 21 deletions(-)

diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index deb90cf4bffb..589ff6a2a23f 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -109,8 +109,6 @@ struct kmem_cache {
 #ifdef CONFIG_SLUB_CPU_PARTIAL
 	/* Number of per cpu partial objects to keep around */
 	unsigned int cpu_partial;
-	/* Number of per cpu partial slabs to keep around */
-	unsigned int cpu_partial_slabs;
 #endif
 	struct kmem_cache_order_objects oo;
 
diff --git a/mm/slab.h b/mm/slab.h
index 799a315695c6..be38a264df16 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -65,7 +65,13 @@ struct slab {
 #ifdef CONFIG_SLUB_CPU_PARTIAL
 				struct {
 					struct slab *next;
+#ifdef CONFIG_64BIT
 					int slabs;	/* Nr of slabs left */
+					int pobjects;	/* Approximate count */
+#else
+					short int slabs;
+					short int pobjects;
+#endif
 				};
 #endif
 			};
diff --git a/mm/slub.c b/mm/slub.c
index f7940048138c..199d3d03d5b9 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -486,18 +486,7 @@ static inline unsigned int oo_objects(struct kmem_cache_order_objects x)
 #ifdef CONFIG_SLUB_CPU_PARTIAL
 static void slub_set_cpu_partial(struct kmem_cache *s, unsigned int nr_objects)
 {
-	unsigned int nr_slabs;
-
 	s->cpu_partial = nr_objects;
-
-	/*
-	 * We take the number of objects but actually limit the number of
-	 * slabs on the per cpu partial list, in order to limit excessive
-	 * growth of the list. For simplicity we assume that the slabs will
-	 * be half-full.
-	 */
-	nr_slabs = DIV_ROUND_UP(nr_objects * 2, oo_objects(s->oo));
-	s->cpu_partial_slabs = nr_slabs;
 }
 #else
 static inline void
@@ -2275,7 +2264,7 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
 	struct slab *slab, *slab2;
 	void *object = NULL;
 	unsigned long flags;
-	unsigned int partial_slabs = 0;
+	int objects_taken = 0;
 
 	/*
 	 * Racy check. If we mistakenly see no partial slabs then we
@@ -2312,11 +2301,11 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
 		} else {
 			put_cpu_partial(s, slab, 0);
 			stat(s, CPU_PARTIAL_NODE);
-			partial_slabs++;
+			objects_taken += slab->objects / 2;
 		}
 #ifdef CONFIG_SLUB_CPU_PARTIAL
 		if (!kmem_cache_has_cpu_partial(s)
-			|| partial_slabs > s->cpu_partial_slabs / 2)
+			|| objects_taken > s->cpu_partial / 2)
 			break;
 #else
 		break;
@@ -2699,13 +2688,14 @@ static void put_cpu_partial(struct kmem_cache *s, struct slab *slab, int drain)
 	struct slab *slab_to_unfreeze = NULL;
 	unsigned long flags;
 	int slabs = 0;
+	int pobjects = 0;
 
 	local_lock_irqsave(&s->cpu_slab->lock, flags);
 
 	oldslab = this_cpu_read(s->cpu_slab->partial);
 
 	if (oldslab) {
-		if (drain && oldslab->slabs >= s->cpu_partial_slabs) {
+		if (drain && oldslab->pobjects >= s->cpu_partial) {
 			/*
 			 * Partial array is full. Move the existing set to the
 			 * per node partial list. Postpone the actual unfreezing
@@ -2714,14 +2704,17 @@ static void put_cpu_partial(struct kmem_cache *s, struct slab *slab, int drain)
 			slab_to_unfreeze = oldslab;
 			oldslab = NULL;
 		} else {
+			pobjects = oldslab->pobjects;
 			slabs = oldslab->slabs;
 		}
 	}
 
 	slabs++;
+	pobjects += slab->objects / 2;
 
 	slab->slabs = slabs;
 	slab->next = oldslab;
+	slab->pobjects = pobjects;
 
 	this_cpu_write(s->cpu_slab->partial, slab);
 
@@ -5653,13 +5646,13 @@ static ssize_t slabs_cpu_partial_show(struct kmem_cache *s, char *buf)
 
 		slab = slub_percpu_partial(per_cpu_ptr(s->cpu_slab, cpu));
 
-		if (slab)
+		if (slab) {
 			slabs += slab->slabs;
+			objects += slab->objects;
+		}
 	}
 #endif
 
-	/* Approximate half-full slabs, see slub_set_cpu_partial() */
-	objects = (slabs * oo_objects(s->oo)) / 2;
 	len += sysfs_emit_at(buf, len, "%d(%d)", objects, slabs);
 
 #ifdef CONFIG_SLUB_CPU_PARTIAL
@@ -5669,7 +5662,7 @@ static ssize_t slabs_cpu_partial_show(struct kmem_cache *s, char *buf)
 		slab = slub_percpu_partial(per_cpu_ptr(s->cpu_slab, cpu));
 		if (slab) {
 			slabs = READ_ONCE(slab->slabs);
-			objects = (slabs * oo_objects(s->oo)) / 2;
+			objects = READ_ONCE(slab->pobjects);
 			len += sysfs_emit_at(buf, len, " C%d=%d(%d)",
 					     cpu, objects, slabs);
 		}
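
The accounting difference described in the commit message can be
illustrated with a small userspace sketch. This is not kernel code: the
helper names (slab_limit, cached_with_slab_accounting,
cached_with_object_accounting) and the example numbers are hypothetical
and chosen only for illustration. The sketch assumes, as the patch does,
that every slab on the percpu partial list is half-full, and that the
list stops growing once the drain condition in put_cpu_partial() is met.

```c
#include <assert.h>

/* Same rounding as the kernel's DIV_ROUND_UP(), redefined for userspace. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper: slab-count limit as computed by the removed code
 * in slub_set_cpu_partial(), based on the preferred order only.
 */
static unsigned int slab_limit(unsigned int cpu_partial,
			       unsigned int objs_per_preferred_slab)
{
	assert(objs_per_preferred_slab > 0);
	return DIV_ROUND_UP(cpu_partial * 2, objs_per_preferred_slab);
}

/*
 * Hypothetical helper: approximate objects cached when the list is
 * limited by slab count but the slabs actually hold objs_per_actual_slab
 * objects each (for example after a fallback to a lower order).
 */
static unsigned int cached_with_slab_accounting(unsigned int cpu_partial,
						unsigned int objs_per_preferred_slab,
						unsigned int objs_per_actual_slab)
{
	return slab_limit(cpu_partial, objs_per_preferred_slab) *
	       (objs_per_actual_slab / 2);
}

/*
 * Hypothetical helper: approximate objects cached when the list grows
 * until pobjects >= cpu_partial, adding objects / 2 per slab.
 */
static unsigned int cached_with_object_accounting(unsigned int cpu_partial,
						  unsigned int objs_per_actual_slab)
{
	unsigned int step = objs_per_actual_slab / 2;
	unsigned int pobjects = 0;

	if (step == 0)	/* sketch only: avoid looping forever */
		return 0;

	while (pobjects < cpu_partial)
		pobjects += step;

	return pobjects;
}
```

With the illustrative values cpu_partial = 30, 32 objects per slab at the
preferred order and 4 objects per slab after a fallback, slab accounting
caches about 4 objects per cpu, while object accounting still caches
about 30.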