From patchwork Mon Oct 16 05:30:01 2023
X-Patchwork-Submitter: "Huang, Ying"
X-Patchwork-Id: 13422495
E=McAfee;i="6600,9927,10863"; a="389308122" X-IronPort-AV: E=Sophos;i="6.03,228,1694761200"; d="scan'208";a="389308122" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Oct 2023 22:30:48 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10863"; a="899356750" X-IronPort-AV: E=Sophos;i="6.03,228,1694761200"; d="scan'208";a="899356750" Received: from yhuang6-mobl2.sh.intel.com ([10.238.6.133]) by fmsmga001-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Oct 2023 22:28:47 -0700 From: Huang Ying To: Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Arjan Van De Ven , Huang Ying , Mel Gorman , Vlastimil Babka , David Hildenbrand , Johannes Weiner , Dave Hansen , Michal Hocko , Pavel Tatashin , Matthew Wilcox , Christoph Lameter Subject: [PATCH -V3 8/9] mm, pcp: decrease PCP high if free pages < high watermark Date: Mon, 16 Oct 2023 13:30:01 +0800 Message-Id: <20231016053002.756205-9-ying.huang@intel.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231016053002.756205-1-ying.huang@intel.com> References: <20231016053002.756205-1-ying.huang@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 396FEC0013 X-Stat-Signature: o9wp9k74iji4xhxjdib6gzxzgeioeu9t X-Rspam-User: X-HE-Tag: 1697434249-392697 X-HE-Meta: U2FsdGVkX19afi0ZKJpnVhYVco+Rl0J21l+iqnG7iRNgl45ccMrJi56jujiHq9AFn3KZyfUkj6OFRaxvmiI+X7pndCDNU+WtSHJSpzrIfAcP8A81MrXj8S/psRfLBad0ViTue+xLtQpi5eVSvHeAGlshJVhl37cx3qKuGBaSf5/LHRFznvd6VO2fe9u3KuoYukxXECymztGJ2pdmSQeY9q0Y0a9qoAf5ppKE6j3H1qw4MJZs5YQNP7iy84nrXjo633oq5oKZiKIu8znhFaWT1i4yq68K0bTbOC9BdjgLme9QWQbPyYHPBMoZXoOw6FdjqMBrA10nKqgi0zHo1uT9LsWQQk+rhwHGELfUFoWcV0k+9MCre1frjNdc+JgojR6gY6D+enWJlNrgTvEPREMsUPDFB0EevbtPJbOidADX1SyktkgYZlmkGOpjj6YOCQo7R+pq0ONkNw1iPAlZGDV3GUOcnJHJSNIakxzPPc04YVTGX19BYQgQFBR9DK/zY+ZbNNByu6uHHjrfTu/8SoibVdzCPipG8k5ieiLvaw0Phg1sewnNhX9AAWzyxu9S7rJzWxrkdpTOAtqmxmb/hu8AF3rqDcZlKDaq24mrcDhAIDow1xOtW9yfyH3FMKDKhZ33yV9A++srTo1smHomLaOSfKNPyWQC1gBzgAsBTdKs8xHwK/EF1HE91tFWStQRbTChvIU7TMr7F2ZE8xb9po+NzWDxa5/KZL0pGeaNl4yCgym5o9gVJdL4DQWt3EHSZXFx9wlXDP0vtHEu857AZE2g9ZXQbAkclIbUrift64KSpUY4IWlDu/1JVlIVm1wCNkM1VDoXwoU7LOoLtVUDkcT8UBwRgwxw6Y1LESdTT+AGEY0zFaM8L3MJIYBEV+XHFHMHaRsAQUCjeFeseIuB0pamXn0i5au5I8VYX4PJ+6Pd97pAFD7U0QDfErxSJSvd9YBQ/IPJcEA+LjEGKZXkQ9u zpYB7ije 2Mzr5DhaAkhSV3PT9m7yEF/H/ORbhndijbNpotwzeNdRyQcct1F9ZwFerxdEEyomSsoYBZIm/xU/LeiHFcOEvxUaknX1fYD1+8KB1G6d9wFNVOhYZsJPjwUn4aVOxkBg2LGJ1ujWpJXxPUdqeG6EvNAjrI4XZyE/Y6whB4JaivB2ehPW0irWgWxhiXH6+MH0YdNc0/JcD2hPPHiEnQNvuGrLA28HwC8fCpyEwHJ2N003Q/7T5gMp/ffVPJEQU0hkfiiy65fR3I6HrkGZnuifkLov3s2V6dd4T+FdvvbivDZ0P/dWhbG6iL2hn1PNyty6l+0FjqI/8fTwu/R6H09N5aiSTBnFa4zhDNUqfadqWcqBhutOPpareY4e83DCk1+LLUAPX5Ntn4SOSvWD2LaqqpJT72o2y1GT5wVOTh0fOH48yL6WcI/0DMNnOc7bLRltNOk6/ALy8tMkP9mpS+rG4XSeSAXbx2N8FYOEsY5o9oekRqZ9XLUio/mKIama9nzTb6z5KDDesO/EnaI9qzWrX8LPHT9GJtbvhlxwI X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: One target of PCP is to minimize pages in PCP if the system free pages is too few. To reach that target, when page reclaiming is active for the zone (ZONE_RECLAIM_ACTIVE), we will stop increasing PCP high in allocating path, decrease PCP high and free some pages in freeing path. But this may be too late because the background page reclaiming may introduce latency for some workloads. 
So, in this patch, during page allocation we detect whether the
number of free pages of the zone is below the high watermark.  If so,
we stop increasing PCP high in the allocation path, and decrease PCP
high and free some pages in the freeing path.  With this, we can
reduce the possibility of premature background page reclaim caused by
a too-large PCP.  The high watermark check is done in the allocation
path to reduce the overhead in the hotter freeing path.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Andrew Morton
Cc: Mel Gorman
Cc: Vlastimil Babka
Cc: David Hildenbrand
Cc: Johannes Weiner
Cc: Dave Hansen
Cc: Michal Hocko
Cc: Pavel Tatashin
Cc: Matthew Wilcox
Cc: Christoph Lameter
---
 include/linux/mmzone.h |  1 +
 mm/page_alloc.c        | 33 +++++++++++++++++++++++++++++++--
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index ec3f7daedcc7..c88770381aaf 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1018,6 +1018,7 @@ enum zone_flags {
 	 * Cleared when kswapd is woken.
 	 */
 	ZONE_RECLAIM_ACTIVE,	/* kswapd may be scanning the zone. */
+	ZONE_BELOW_HIGH,	/* zone is below high watermark. */
 };
 
 static inline unsigned long zone_managed_pages(struct zone *zone)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8382ad2cdfd4..253fc7d0498e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2407,7 +2407,13 @@ static int nr_pcp_high(struct per_cpu_pages *pcp, struct zone *zone,
 		return min(batch << 2, pcp->high);
 	}
 
-	if (pcp->count >= high && high_min != high_max) {
+	if (high_min == high_max)
+		return high;
+
+	if (test_bit(ZONE_BELOW_HIGH, &zone->flags)) {
+		pcp->high = max(high - (batch << pcp->free_factor), high_min);
+		high = max(pcp->count, high_min);
+	} else if (pcp->count >= high) {
 		int need_high = (batch << pcp->free_factor) + batch;
 
 		/* pcp->high should be large enough to hold batch freed pages */
@@ -2457,6 +2463,10 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 	if (pcp->count >= high) {
 		free_pcppages_bulk(zone, nr_pcp_free(pcp, batch, high, free_high),
 				   pcp, pindex);
+		if (test_bit(ZONE_BELOW_HIGH, &zone->flags) &&
+		    zone_watermark_ok(zone, 0, high_wmark_pages(zone),
+				      ZONE_MOVABLE, 0))
+			clear_bit(ZONE_BELOW_HIGH, &zone->flags);
 	}
 }
 
@@ -2763,7 +2773,7 @@ static int nr_pcp_alloc(struct per_cpu_pages *pcp, struct zone *zone, int order)
 	 * If we had larger pcp->high, we could avoid to allocate from
 	 * zone.
 	 */
-	if (high_min != high_max && !test_bit(ZONE_RECLAIM_ACTIVE, &zone->flags))
+	if (high_min != high_max && !test_bit(ZONE_BELOW_HIGH, &zone->flags))
 		high = pcp->high = min(high + batch, high_max);
 
 	if (!order) {
@@ -3225,6 +3235,25 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 			}
 		}
 
+		/*
+		 * Detect whether the number of free pages is below high
+		 * watermark.  If so, we will decrease pcp->high and free
+		 * PCP pages in free path to reduce the possibility of
+		 * premature page reclaiming.  Detection is done here to
+		 * avoid to do that in hotter free path.
+		 */
+		if (test_bit(ZONE_BELOW_HIGH, &zone->flags))
+			goto check_alloc_wmark;
+
+		mark = high_wmark_pages(zone);
+		if (zone_watermark_fast(zone, order, mark,
+					ac->highest_zoneidx, alloc_flags,
+					gfp_mask))
+			goto try_this_zone;
+		else
+			set_bit(ZONE_BELOW_HIGH, &zone->flags);
+
+check_alloc_wmark:
 		mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
 		if (!zone_watermark_fast(zone, order, mark,
 					ac->highest_zoneidx, alloc_flags,
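
For readers following along outside the kernel tree, here is a
minimal user-space sketch of the two halves of the mechanism: the
allocation-path detection and the freeing-path reaction.  Everything
below (struct zone_model, struct pcp_model, the helper names, and the
numbers in main()) is an illustrative stand-in for the kernel code in
the diff above, not the real kernel API.

	#include <stdio.h>
	#include <stdbool.h>

	/*
	 * Minimal model of this patch.  The structs below stand in
	 * for the kernel's struct zone and struct per_cpu_pages.
	 */
	struct zone_model {
		bool below_high;	/* models the new ZONE_BELOW_HIGH flag */
		long free_pages;
		long high_wmark;	/* models high_wmark_pages(zone) */
	};

	struct pcp_model {
		int count;		/* pages currently on the PCP list */
		int high;		/* current auto-tuned high limit */
		int free_factor;	/* scales how fast "high" moves */
	};

	static int max_int(int a, int b) { return a > b ? a : b; }

	/*
	 * Allocation path (cf. the get_page_from_freelist() hunk):
	 * while the flag is clear, compare free pages against the
	 * high watermark; once the zone drops below it, set the flag
	 * so later allocations skip this check and the freeing path
	 * starts shrinking PCP high.
	 */
	static void detect_below_high(struct zone_model *z)
	{
		if (!z->below_high && z->free_pages < z->high_wmark)
			z->below_high = true;
	}

	/*
	 * Freeing path (cf. the nr_pcp_high() hunk): when the zone is
	 * below the high watermark, step pcp->high down by one
	 * batch-scaled increment and return a target that makes the
	 * caller drain pages above it immediately.
	 */
	static int nr_pcp_high_model(struct pcp_model *pcp, struct zone_model *z,
				     int batch, int high_min, int high_max)
	{
		int high = pcp->high;

		if (high_min == high_max)
			return high;	/* auto-tuning disabled */

		if (z->below_high) {
			pcp->high = max_int(high - (batch << pcp->free_factor),
					    high_min);
			return max_int(pcp->count, high_min);
		}
		return high;
	}

	int main(void)
	{
		struct zone_model z = { .free_pages = 1000, .high_wmark = 2000 };
		struct pcp_model pcp = { .count = 500, .high = 600,
					 .free_factor = 2 };

		detect_below_high(&z);	/* 1000 < 2000, so the flag gets set */

		int target = nr_pcp_high_model(&pcp, &z, 63, 64, 640);
		/* pcp->high: 600 - (63 << 2) = 348; drain down to 500 pages */
		printf("below_high=%d new_high=%d free_target=%d\n",
		       z.below_high, pcp.high, target);
		return 0;
	}

In the real code, the value returned by nr_pcp_high() feeds
nr_pcp_free()/free_pcppages_bulk() in free_unref_page_commit(), which
also clears ZONE_BELOW_HIGH again once the zone is back above the
high watermark (the zone_watermark_ok() check in the second hunk).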