From patchwork Wed Sep 20 06:18:55 2023
X-Patchwork-Submitter: "Huang, Ying"
X-Patchwork-Id: 13392116
From: Huang Ying
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Arjan Van De Ven, Huang Ying,
    Andrew Morton, Mel Gorman, Vlastimil Babka, David Hildenbrand,
    Johannes Weiner, Dave Hansen, Michal Hocko, Pavel Tatashin,
    Matthew Wilcox, Christoph Lameter
Subject: [PATCH 09/10] mm, pcp: avoid to reduce PCP high unnecessarily
Date: Wed, 20 Sep 2023 14:18:55 +0800
Message-Id: <20230920061856.257597-10-ying.huang@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230920061856.257597-1-ying.huang@intel.com>
References: <20230920061856.257597-1-ying.huang@intel.com>

In the PCP high auto-tuning algorithm, to minimize idle pages in the PCP,
the periodic vmstat updating kworker (via refresh_cpu_vm_stats()) decreases
PCP high to try to free possible idle PCP pages.  One issue is that even if
the page allocation/freeing depth is larger than the maximal PCP high, we
may still reduce PCP high unnecessarily.

To avoid the above issue, this patch tracks the minimal PCP page count.
The periodic PCP high decrement is then capped at this recently observed
minimum, so only pages that have been detected as idle are freed.

On a 2-socket Intel server with 224 logical CPUs, we tested kbuild on one
socket with `make -j 112`.  With the patch, the number of pages allocated
from the zone (instead of from the PCP) decreases by 25.8%.

Signed-off-by: "Huang, Ying"
Cc: Andrew Morton
Cc: Mel Gorman
Cc: Vlastimil Babka
Cc: David Hildenbrand
Cc: Johannes Weiner
Cc: Dave Hansen
Cc: Michal Hocko
Cc: Pavel Tatashin
Cc: Matthew Wilcox
Cc: Christoph Lameter
---
 include/linux/mmzone.h |  1 +
 mm/page_alloc.c        | 15 ++++++++++-----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 8a19e2af89df..35b78c7522a7 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -682,6 +682,7 @@ enum zone_watermarks {
 struct per_cpu_pages {
 	spinlock_t lock;	/* Protects lists field */
 	int count;		/* number of pages in the list */
+	int count_min;		/* minimal number of pages in the list recently */
 	int high;		/* high watermark, emptying needed */
 	int high_min;		/* min high watermark */
 	int high_max;		/* max high watermark */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3f8c7dfeed23..77e9b7b51688 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2166,19 +2166,20 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
  */
 int decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp)
 {
-	int high_min, to_drain, batch;
+	int high_min, decrease, to_drain, batch;
 	int todo = 0;
 
 	high_min = READ_ONCE(pcp->high_min);
 	batch = READ_ONCE(pcp->batch);
 	/*
-	 * Decrease pcp->high periodically to try to free possible
-	 * idle PCP pages.  And, avoid to free too many pages to
-	 * control latency.
+	 * Decrease pcp->high periodically to free idle PCP pages counted
+	 * via pcp->count_min.  And, avoid to free too many pages to
+	 * control latency.  This caps pcp->high decrement too.
 	 */
 	if (pcp->high > high_min) {
+		decrease = min(pcp->count_min, pcp->high / 5);
 		pcp->high = max3(pcp->count - (batch << PCP_BATCH_SCALE_MAX),
-				 pcp->high * 4 / 5, high_min);
+				 pcp->high - decrease, high_min);
 		if (pcp->high > high_min)
 			todo++;
 	}
@@ -2191,6 +2192,8 @@ int decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp)
 		todo++;
 	}
 
+	pcp->count_min = pcp->count;
+
 	return todo;
 }
 
@@ -2828,6 +2831,8 @@ struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order,
 		page = list_first_entry(list, struct page, pcp_list);
 		list_del(&page->pcp_list);
 		pcp->count -= 1 << order;
+		if (pcp->count < pcp->count_min)
+			pcp->count_min = pcp->count;
 	} while (check_new_pages(page, order));
 
 	return page;
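
As an aside for readers following the tuning logic outside the kernel tree,
the effect of the new bound can be sketched in a few lines of plain userspace
C.  The struct, helper names, and numbers below are hypothetical stand-ins
for the per_cpu_pages fields and PCP_BATCH_SCALE_MAX touched by this patch,
not kernel code; the sketch only illustrates how capping the periodic decay
of "high" by the recently observed minimum count keeps an actively cycling
pool from being shrunk.

#include <stdio.h>

#define BATCH_SCALE_MAX 5	/* hypothetical stand-in for PCP_BATCH_SCALE_MAX */

struct pcp_sim {
	int count;	/* pages currently in the PCP list */
	int count_min;	/* minimum of count since the last decay */
	int high;	/* current high watermark */
	int high_min;	/* lower bound for high */
	int batch;
};

static int min_i(int a, int b) { return a < b ? a : b; }

static int max3_i(int a, int b, int c)
{
	int m = a > b ? a : b;
	return m > c ? m : c;
}

/* Mirrors the patched decay logic: never cut high by more than count_min. */
static void decay_high(struct pcp_sim *p)
{
	if (p->high > p->high_min) {
		int decrease = min_i(p->count_min, p->high / 5);

		p->high = max3_i(p->count - (p->batch << BATCH_SCALE_MAX),
				 p->high - decrease, p->high_min);
	}
	/* Restart minimum tracking for the next period. */
	p->count_min = p->count;
}

/* The allocation path keeps count_min in sync, as __rmqueue_pcplist() now does. */
static void alloc_from_pcp(struct pcp_sim *p, int nr)
{
	p->count -= nr;
	if (p->count < p->count_min)
		p->count_min = p->count;
}

int main(void)
{
	struct pcp_sim p = {
		.count = 600, .count_min = 600,
		.high = 600, .high_min = 64, .batch = 63,
	};

	/* Heavy allocation drained the pool down to 10 pages ... */
	alloc_from_pcp(&p, 590);
	/* ... before frees refilled it within the same period. */
	p.count += 590;

	decay_high(&p);
	/*
	 * Only 10 pages stayed idle for the whole period, so high drops
	 * by 10 (to 590) rather than by high / 5 = 120 (to 480).
	 */
	printf("high after decay: %d\n", p.high);
	return 0;
}

Compiled with any C compiler, this prints 590: high shrinks only by the 10
pages that stayed idle, instead of by high / 5 = 120 as before the patch.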