From patchwork Fri Mar 15 07:41:39 2024
X-Patchwork-Submitter: "zhaoyang.huang"
X-Patchwork-Id: 13593138
From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
To: Yu Zhao, liuhailong@oppo.com
Cc: akpm@linux-foundation.org, nathan@kernel.org, ndesaulniers@google.com,
 trix@redhat.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 llvm@lists.linux.dev, surenb@google.com, Charan Teja Kalla, Steve Kang
Subject: reply: [PATCH] Revert "mm: skip CMA pages when they are not available"
Date: Fri, 15 Mar 2024 07:41:39 +0000
Message-ID: <1710488498897.75752@unisoc.com>
References: <20240314141516.31747-1-liuhailong@oppo.com>

On Thu, Mar 14, 2024 at 10:15 AM wrote:
>
> From: "Hailong.Liu"
>
> This reverts commit 5da226dbfce3a2f44978c2c7cf88166e69a6788b.
>
> patch may cause system not responding.
> if cma pages is large in lru_list and system is in lowmemory, many tasks
> would enter direct reclaim and waste cpu time to isolate and return. Test
> this patch on android-5.15 device and tasks call stack as below.
>
> Task name: UsbFfs-worker [affinity: 0xff] pid: 3374 cpu: 7 prio: 120 start: ffffff8897a35c80
> state: 0x0[R] exit_state: 0x0 stack base: 0xffffffc01eaa0000
> Last_enqueued_ts: 0.000000000 Last_sleep_ts: 0.000000000
> Stack:
> [] __switch_to+0x180
> [] __schedule+0x4dc
> [] preempt_schedule+0x5c
> [] _raw_spin_unlock_irq+0x54
> [] shrink_inactive_list+0x1d0
> [] shrink_lruvec+0x1bc
> [] shrink_node_memcgs+0x184
> [] shrink_node+0x2d0
> [] shrink_zones+0x14c
> [] do_try_to_free_pages+0xe8
> [] try_to_free_pages+0x2e0
> [] __alloc_pages_direct_reclaim+0x84
> [] __alloc_pages_slowpath+0x4d0
> [] __alloc_pages_nodemask[jt]+0x124
> [] __vmalloc_area_node+0x188
> [] __vmalloc_node+0x148
> [] vmalloc+0x4c
> [] ffs_epfile_io+0x258
> [] kretprobe_trampoline[jt]+0x0
> [] kretprobe_trampoline[jt]+0x0
> [] __io_submit_one+0x1c0
> [] io_submit_one+0x88
> [] __do_sys_io_submit+0x178
> [] __arm64_sys_io_submit+0x20
> [] el0_svc_common.llvm.9961749221945255377+0xd0
> [] do_el0_svc+0x28
> [] el0_svc+0x14
> [] el0_sync_handler+0x88
> [] el0_sync+0x1b8
>
> Task name: kthreadd [affinity: 0xff] pid: 2 cpu: 7 prio: 120 start: ffffff87808c0000
> state: 0x0[R] exit_state: 0x0 stack base: 0xffffffc008078000
> Last_enqueued_ts: 0.000000000 Last_sleep_ts: 0.000000000
> Stack:
> [] __switch_to+0x180
> [] __schedule+0x4dc
> [] preempt_schedule+0x5c
> [] _raw_spin_unlock_irq+0x54
> [] shrink_inactive_list+0x2cc
> [] shrink_lruvec+0x1bc
> [] shrink_node_memcgs+0x184
> [] shrink_node+0x2d0
> [] shrink_zones+0x14c
> [] do_try_to_free_pages+0xe8
> [] try_to_free_pages+0x2e0
> [] __alloc_pages_direct_reclaim+0x84
> [] __alloc_pages_slowpath+0x4d0
> [] __alloc_pages_nodemask[jt]+0x124
> [] __vmalloc_area_node+0x188
> [] __vmalloc_node_range+0x88
> [] scs_alloc+0x1b8
> [] scs_prepare+0x20
> [] dup_task_struct+0xd4
> [] copy_process+0x144
> [] kernel_clone+0xb4
> [] kernel_thread+0x5c
> [] kthreadd+0x184
>
> without this patch, the tasks will reclaim cma pages and wakeup
> oom-killer or not spin on cpus.
>
> Signed-off-by: Hailong.Liu
> ---
>  mm/vmscan.c | 22 +---------------------
>  1 file changed, 1 insertion(+), 21 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 2fe4a11d63f4..197ddf62019f 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2261,25 +2261,6 @@ static __always_inline void update_lru_sizes(struct lruvec *lruvec,
>
>  }
>
> -#ifdef CONFIG_CMA
> -/*
> - * It is waste of effort to scan and reclaim CMA pages if it is not available
> - * for current allocation context. Kswapd can not be enrolled as it can not
> - * distinguish this scenario by using sc->gfp_mask = GFP_KERNEL
> - */
> -static bool skip_cma(struct folio *folio, struct scan_control *sc)
> -{
> -	return !current_is_kswapd() &&
> -		gfp_migratetype(sc->gfp_mask) != MIGRATE_MOVABLE &&
> -		get_pageblock_migratetype(&folio->page) == MIGRATE_CMA;
> -}
> -#else
> -static bool skip_cma(struct folio *folio, struct scan_control *sc)
> -{
> -	return false;
> -}
> -#endif
> -

> NAK.
>
> +Charan Teja Kalla -- This can cause build errors when CONFIG_LRU_GEN=y.
>
> If you plan to post a v2, please include a reproducer. Thanks.

Could you please retest the case with the below patch, which has not landed
in AOSP yet?

From: Zhaoyang Huang

According to the current CMA utilization policy, an alloc_pages(GFP_USER)
call can 'steal' UNMOVABLE & RECLAIMABLE page blocks with the help of CMA
(it passes zone_watermark_ok by counting CMA pages in, but takes UNMOVABLE
& RECLAIMABLE pages in rmqueue), which can make a subsequent
alloc_pages(GFP_KERNEL) fail. Solve this by introducing a second watermark
check for GFP_MOVABLE allocations, which steers them to CMA when
appropriate.
--
Free_pages(30MB)
|
|
-- WMARK_LOW(25MB)
|
-- Free_CMA(12MB)
|
|
--

Signed-off-by: Zhaoyang Huang

---
v6: update comments
---

 mm/page_alloc.c | 44 ++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 40 insertions(+), 4 deletions(-)

> /*
>  * Isolating page from the lruvec to fill in @dst list by nr_to_scan times.
>  *
> @@ -2326,8 +2307,7 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
>  		nr_pages = folio_nr_pages(folio);
>  		total_scan += nr_pages;
>
> -		if (folio_zonenum(folio) > sc->reclaim_idx ||
> -				skip_cma(folio, sc)) {
> +		if (folio_zonenum(folio) > sc->reclaim_idx) {
>  			nr_skipped[folio_zonenum(folio)] += nr_pages;
>  			move_to = &folios_skipped;
>  			goto move;
> --
> 2.34.1
>

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 452459836b71..5a146aa7c0aa 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2078,6 +2078,43 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
 }
 
+#ifdef CONFIG_CMA
+/*
+ * A GFP_MOVABLE allocation could drain UNMOVABLE & RECLAIMABLE page blocks
+ * with the help of CMA, which can make a later GFP_KERNEL allocation fail.
+ * Check zone_watermark_ok again without ALLOC_CMA to decide whether to use
+ * CMA first.
+ */
+static bool use_cma_first(struct zone *zone, unsigned int order, unsigned int alloc_flags)
+{
+	unsigned long watermark;
+	bool cma_first = false;
+
+	watermark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
+	/* check if GFP_MOVABLE passed the previous zone_watermark_ok only via the help of CMA */
+	if (zone_watermark_ok(zone, order, watermark, 0, alloc_flags & (~ALLOC_CMA))) {
+		/*
+		 * Balance movable allocations between regular and CMA areas by
+		 * allocating from CMA when over half of the zone's free memory
+		 * is in the CMA area.
+		 */
+		cma_first = (zone_page_state(zone, NR_FREE_CMA_PAGES) >
+				zone_page_state(zone, NR_FREE_PAGES) / 2);
+	} else {
+		/*
+		 * A failed watermark check means UNMOVABLE & RECLAIMABLE pages
+		 * are running short now; use CMA first to keep them around the
+		 * corresponding watermark.
+		 */
+		cma_first = true;
+	}
+	return cma_first;
+}
+#else
+static bool use_cma_first(struct zone *zone, unsigned int order, unsigned int alloc_flags)
+{
+	return false;
+}
+#endif
 /*
  * Do the hard work of removing an element from the buddy allocator.
  * Call me with the zone->lock already held.
@@ -2091,12 +2128,11 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
 	if (IS_ENABLED(CONFIG_CMA)) {
 		/*
 		 * Balance movable allocations between regular and CMA areas by
-		 * allocating from CMA when over half of the zone's free memory
-		 * is in the CMA area.
+		 * allocating from CMA based on a second zone_watermark_ok check,
+		 * to see if the first check passed only with the help of CMA.
 		 */
 		if (alloc_flags & ALLOC_CMA &&
-		    zone_page_state(zone, NR_FREE_CMA_PAGES) >
-		    zone_page_state(zone, NR_FREE_PAGES) / 2) {
+		    use_cma_first(zone, order, alloc_flags)) {
 			page = __rmqueue_cma_fallback(zone, order);
 			if (page)
 				return page;