From patchwork Wed Jul 19 10:21:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jay Patel
X-Patchwork-Id: 13318739
From: Jay Patel
To: linux-mm@kvack.org
Cc: cl@linux.com, penberg@kernel.org, rientjes@google.com,
 iamjoonsoo.kim@lge.com, akpm@linux-foundation.org, vbabka@suse.cz,
 aneesh.kumar@linux.ibm.com, tsahu@linux.ibm.com, piyushs@linux.ibm.com,
 jaypatel@linux.ibm.com
Subject: [RFC PATCH v3] mm/slub: Optimize slub memory usage
Date: Wed, 19 Jul 2023 15:51:04 +0530
Message-Id: <20230719102104.1954891-1-jaypatel@linux.ibm.com>
X-Mailer: git-send-email 2.39.1
Sender: owner-linux-mm@kvack.org
Precedence: bulk

In the current implementation of the slub memory allocator, the slab
order selection process follows these criteria:

1) Determine the minimum order required to serve the minimum number of
   objects (min_objects). This is the smallest order for which
   (PAGE_SIZE << order) >= min_objects * object_size.

2) If the minimum order is greater than the maximum allowed order
   (slub_max_order), use slub_max_order as the order for this slab.

3) If the minimum order is less than slub_max_order, iterate from the
   minimum order to slub_max_order and check whether the condition
   (rem <= slab_size / fract_leftover) holds. Here, slab_size is
   (PAGE_SIZE << order), rem is (slab_size % object_size), and
   fract_leftover takes the values 16, 8, or 4. If the condition holds,
   select that order for the slab.

However, in point 3, the leftover threshold (slab_size / fract_leftover)
that rem is compared against spans a large range of values: 1 KB down to
256 bytes with a 4K page size, and 16 KB down to 4 KB with a 64K page
size, at order 0, and it keeps growing with higher orders. This can lead
to the selection of an order that results in more memory wastage. To
mitigate such wastage, point 3 is modified as follows:

Adjust the value of fract_leftover based on the page size, while
retaining the current value as the default for a 4K page size.

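The selection rule in point 3 and the page-size adjustment can be
sketched in user-space C as follows. This is only an illustrative model
under stated assumptions, not the kernel code: the helper names
(page_frac_for, leftover_limit, min_order_for, pick_order) are
hypothetical, the fraction is assumed to be halved between passes
(consistent with the values 16, 8, 4 above), and "no order accepted" is
reported as max_order + 1 instead of falling back as the kernel does.

```c
#include <stddef.h>

#define PAGE_SIZE_4K 4096u	/* baseline page size used by the patch */

/* Amount added to the starting fraction: 0 on 4K pages, otherwise
 * page_size / 4K (mirrors the page_frac expression in the patch). */
static unsigned int page_frac_for(unsigned int page_size)
{
	unsigned int ratio = page_size / PAGE_SIZE_4K;

	return ratio == 1 ? 0 : ratio;
}

/* Largest leftover, in bytes, accepted for a slab of the given order. */
static unsigned int leftover_limit(unsigned int page_size,
				   unsigned int order, unsigned int fraction)
{
	return (page_size << order) / fraction;
}

/* Smallest order whose slab holds at least `bytes` bytes. */
static unsigned int min_order_for(unsigned int page_size, unsigned int bytes)
{
	unsigned int order = 0;

	while ((page_size << order) < bytes)
		order++;
	return order;
}

/* Hypothetical model of points 1-3. Returns the first accepted order,
 * or max_order + 1 if no order passes the leftover check. */
static unsigned int pick_order(unsigned int page_size, unsigned int size,
			       unsigned int min_objects,
			       unsigned int max_order,
			       unsigned int start_fraction)
{
	unsigned int first = min_order_for(page_size, min_objects * size);
	unsigned int fraction, order;

	for (fraction = start_fraction; fraction >= 4; fraction /= 2) {
		for (order = first; order <= max_order; order++) {
			unsigned int slab_size = page_size << order;
			unsigned int rem = slab_size % size;

			if (rem <= slab_size / fraction)
				return order;
		}
	}
	return max_order + 1;
}
```

As a usage example, take a hypothetical 10240-byte object on a 64K page
with min_objects = 1 and max_order = 3. With a starting fraction of 16
the model accepts order 0, leaving 4096 of 65536 bytes unused. With
16 + page_frac_for(65536) = 32 it rejects orders 0 and 1 and accepts
order 2, leaving 6144 of 262144 bytes unused.
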
Test results are as follows:

1) On 160 CPUs with 64K Page size

+-----------------+----------------+----------------+
|           Total wastage in slub memory            |
+-----------------+----------------+----------------+
|                 | After Boot     | After Hackbench|
| Normal          | 932 KB         | 1812 KB        |
| With Patch      | 729 KB         | 1636 KB        |
| Wastage reduce  | ~22%           | ~10%           |
+-----------------+----------------+----------------+

+-----------------+----------------+----------------+
|                 Total slub memory                 |
+-----------------+----------------+----------------+
|                 | After Boot     | After Hackbench|
| Normal          | 1855296        | 2944576        |
| With Patch      | 1544576        | 2692032        |
| Memory reduce   | ~17%           | ~9%            |
+-----------------+----------------+----------------+

hackbench-process-sockets
+-------+-----+----------+----------+-----------+
| Amean | 1   | 1.2727   | 1.2450   | ( 2.22%)  |
| Amean | 4   | 1.6063   | 1.5810   | ( 1.60%)  |
| Amean | 7   | 2.4190   | 2.3983   | ( 0.86%)  |
| Amean | 12  | 3.9730   | 3.9347   | ( 0.97%)  |
| Amean | 21  | 6.9823   | 6.8957   | ( 1.26%)  |
| Amean | 30  | 10.1867  | 10.0600  | ( 1.26%)  |
| Amean | 48  | 16.7490  | 16.4853  | ( 1.60%)  |
| Amean | 79  | 28.1870  | 27.8673  | ( 1.15%)  |
| Amean | 110 | 39.8363  | 39.3793  | ( 1.16%)  |
| Amean | 141 | 51.5277  | 51.4907  | ( 0.07%)  |
| Amean | 172 | 62.9700  | 62.7300  | ( 0.38%)  |
| Amean | 203 | 74.5037  | 74.0630  | ( 0.59%)  |
| Amean | 234 | 85.6560  | 85.3587  | ( 0.35%)  |
| Amean | 265 | 96.9883  | 96.3770  | ( 0.63%)  |
| Amean | 296 | 108.6893 | 108.0870 | ( 0.56%)  |
+-------+-----+----------+----------+-----------+

2) On 16 CPUs with 64K Page size

+-----------------+----------------+----------------+
|           Total wastage in slub memory            |
+-----------------+----------------+----------------+
|                 | After Boot     | After Hackbench|
| Normal          | 273 KB         | 544 KB         |
| With Patch      | 260 KB         | 500 KB         |
| Wastage reduce  | ~5%            | ~9%            |
+-----------------+----------------+----------------+

+-----------------+----------------+----------------+
|                 Total slub memory                 |
+-----------------+----------------+----------------+
|                 | After Boot     | After Hackbench|
| Normal          | 275840         | 412480         |
| With Patch      | 272768         | 406208         |
| Memory reduce   | ~1%            | ~2%            |
+-----------------+----------------+----------------+

hackbench-process-sockets
+-------+----+---------+---------+-----------+
| Amean | 1  | 0.9513  | 0.9250  | ( 2.77%)  |
| Amean | 4  | 2.9630  | 2.9570  | ( 0.20%)  |
| Amean | 7  | 5.1780  | 5.1763  | ( 0.03%)  |
| Amean | 12 | 8.8833  | 8.8817  | ( 0.02%)  |
| Amean | 21 | 15.7577 | 15.6883 | ( 0.44%)  |
| Amean | 30 | 22.2063 | 22.2843 | ( -0.35%) |
| Amean | 48 | 36.0587 | 36.1390 | ( -0.22%) |
| Amean | 64 | 49.7803 | 49.3457 | ( 0.87%)  |
+-------+----+---------+---------+-----------+

Signed-off-by: Jay Patel
---
Changes from V2
1) Removed all page order selection logic for slab caches based on
   wastage.
2) Increased the fraction size based on page size (keeping the current
   value as the default for a 4K page).

Changes from V1
1) If min_objects * object_size > PAGE_ALLOC_COSTLY_ORDER, then it will
   return with PAGE_ALLOC_COSTLY_ORDER.
2) Similarly, if min_objects * object_size < PAGE_SIZE, then it will
   return with slub_min_order.
3) Additionally, I changed slub_max_order to 2. There is no specific
   reason for using the value 2, but it provided the best results in
   terms of performance without any noticeable impact.

 arch/powerpc/include/asm/page.h |  2 ++
 mm/slub.c                       | 15 +++++----------
 2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index f2b6bf5687d0..0dc53692d0e1 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -22,6 +22,8 @@
  */
 #define PAGE_SHIFT		CONFIG_PPC_PAGE_SHIFT
 #define PAGE_SIZE		(ASM_CONST(1) << PAGE_SHIFT)
+#define PAGE_SHIFT_4K		12
+#define PAGE_SIZE_4K		(1 << PAGE_SHIFT_4K)
 
 #ifndef __ASSEMBLY__
 #ifndef CONFIG_HUGETLB_PAGE
diff --git a/mm/slub.c b/mm/slub.c
index c87628cd8a9a..058bcc235b63 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4117,6 +4117,7 @@ static inline int calculate_order(unsigned int size)
 	unsigned int min_objects;
 	unsigned int max_objects;
 	unsigned int nr_cpus;
+	unsigned int page_frac;
 
 	/*
 	 * Attempt to find best configuration for a slab. This
@@ -4145,10 +4146,12 @@ static inline int calculate_order(unsigned int size)
 	max_objects = order_objects(slub_max_order, size);
 	min_objects = min(min_objects, max_objects);
 
-	while (min_objects > 1) {
+	page_frac = ((PAGE_SIZE/PAGE_SIZE_4K) == 1) ? 0 : PAGE_SIZE/PAGE_SIZE_4K;
+
+	while (min_objects >= 1) {
 		unsigned int fraction;
 
-		fraction = 16;
+		fraction = 16 + page_frac;
 		while (fraction >= 4) {
 			order = calc_slab_order(size, min_objects,
 					slub_max_order, fraction);
@@ -4159,14 +4162,6 @@ static inline int calculate_order(unsigned int size)
 		min_objects--;
 	}
 
-	/*
-	 * We were unable to place multiple objects in a slab. Now
-	 * lets see if we can place a single object there.
-	 */
-	order = calc_slab_order(size, 1, slub_max_order, 1);
-	if (order <= slub_max_order)
-		return order;
-
 	/*
 	 * Doh this slab cannot be placed using slub_max_order.
 	 */