From patchwork Fri Sep 9 08:36:34 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhenhua Huang
X-Patchwork-Id: 12971228
From: Zhenhua Huang
CC: Zhenhua Huang
Subject: [PATCH] mm:page_alloc.c: lower the order requirement of should_reclaim_retry
Date: Fri, 9 Sep 2022 16:36:34 +0800
Message-ID: <1662712594-32344-1-git-send-email-quic_zhenhuah@quicinc.com>

When a driver was continuously allocating order 3 pages, it would very easily OOM
even though there were lots of reclaimable pages. A test module was used to
reproduce this issue; several key ftrace events are shown below:

  insmod-6968 [005] .... 321.306007: reclaim_retry_zone: node=0 zone=Normal order=3 reclaimable=539988 available=592856 min_wmark=21227 no_progress_loops=0 wmark_check=0
  insmod-6968 [005] .... 321.306009: compact_retry: order=3 priority=COMPACT_PRIO_SYNC_LIGHT compaction_result=withdrawn retries=0 max_retries=16 should_retry=1
  insmod-6968 [004] .... 321.308220: mm_compaction_try_to_compact_pages: order=3 gfp_mask=GFP_KERNEL priority=0
  insmod-6968 [004] .... 321.308964: mm_compaction_end: zone_start=0x80000 migrate_pfn=0xaa800 free_pfn=0x80800 zone_end=0x940000, mode=sync status=complete
  insmod-6968 [004] .... 321.308971: reclaim_retry_zone: node=0 zone=Normal order=3 reclaimable=539830 available=592776 min_wmark=21227 no_progress_loops=0 wmark_check=0
  insmod-6968 [004] .... 321.308973: compact_retry: order=3 priority=COMPACT_PRIO_SYNC_FULL compaction_result=failed retries=0 max_retries=16 should_retry=0

There were ~2GB of reclaimable pages (reclaimable=539988), but the VM decided
not to reclaim any more:

  insmod-6968 [005] .... 321.306007: reclaim_retry_zone: node=0 zone=Normal order=3 reclaimable=539988 available=592856 min_wmark=21227 no_progress_loops=0 wmark_check=0

From the meminfo at OOM time, there were NO qualifying pages of order >= 3
(CMA pages do not qualify) that could meet should_reclaim_retry's requirement:

  Normal : 24671*4kB (UMEC) 13807*8kB (UMEC) 8214*16kB (UEC) 190*32kB (C) 94*64kB (C) 28*128kB (C) 16*256kB (C) 7*512kB (C) 5*1024kB (C) 7*2048kB (C) 46*4096kB (C) = 571796kB

should_reclaim_retry aborted early because its check is based on having pages
of the requested order in the free_list. Order 3 pages are easily fragmented.
Since having enough free pages is fundamental to compaction, it may not be
suitable to stop reclaiming while lots of page cache remains. Relax the order
by one to fix this issue.
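To make the free-list argument concrete, here is a minimal user-space sketch of
the high-order part of the watermark check, fed with the per-order counts from
the meminfo line above. It is an illustration under stated assumptions, not the
kernel's __zone_watermark_ok: the struct, the MT_* flags, normal_zone, and the
helpers high_order_page_available() and relaxed_order() are hypothetical names,
the free-page/watermark arithmetic and the highatomic case are omitted, and CMA
is treated as unusable for this allocation, as the message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model only; none of these names exist in the kernel. */
#define NR_ORDERS 11 /* orders 0..10, i.e. 4kB..4096kB in the meminfo line */

/* Migrate-type letters printed in the meminfo line. */
enum {
	MT_UNMOVABLE   = 1 << 0, /* U */
	MT_MOVABLE     = 1 << 1, /* M */
	MT_RECLAIMABLE = 1 << 2, /* E */
	MT_CMA         = 1 << 3, /* C */
};

struct free_area_model {
	unsigned long nr_free; /* free blocks of this order */
	unsigned int types;    /* migrate types that have a free block */
};

/* Per-order data copied from the "Normal :" meminfo line. */
static const struct free_area_model normal_zone[NR_ORDERS] = {
	{ 24671, MT_UNMOVABLE | MT_MOVABLE | MT_RECLAIMABLE | MT_CMA },
	{ 13807, MT_UNMOVABLE | MT_MOVABLE | MT_RECLAIMABLE | MT_CMA },
	{  8214, MT_UNMOVABLE | MT_RECLAIMABLE | MT_CMA },
	{   190, MT_CMA },
	{    94, MT_CMA },
	{    28, MT_CMA },
	{    16, MT_CMA },
	{     7, MT_CMA },
	{     5, MT_CMA },
	{     7, MT_CMA },
	{    46, MT_CMA },
};

/*
 * Return true if at least one free block of 'order' or larger sits on a
 * free list that the allocation is allowed to use.
 */
static bool high_order_page_available(const struct free_area_model *areas,
				      unsigned int order, bool allow_cma)
{
	unsigned int o;

	assert(order < NR_ORDERS);

	/* Order-0 requests skip the free-list scan entirely. */
	if (order == 0)
		return true;

	for (o = order; o < NR_ORDERS; o++) {
		if (!areas[o].nr_free)
			continue;
		if (areas[o].types &
		    (MT_UNMOVABLE | MT_MOVABLE | MT_RECLAIMABLE))
			return true;
		if (allow_cma && (areas[o].types & MT_CMA))
			return true;
	}
	return false;
}

/* The relaxation proposed by the patch: check one order lower. */
static unsigned int relaxed_order(unsigned int order)
{
	return order ? order - 1 : 0;
}
```

With this data, high_order_page_available(normal_zone, 3, false) is false
because every free block of order >= 3 is CMA, while the same call with
relaxed_order(3) is true thanks to the non-CMA order 2 blocks. That is the
difference between giving up on reclaim and retrying.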
With the change, the meminfo output at the first OOM shows that the page cache
was nearly exhausted:

  Normal free: 462956kB min:8644kB low:44672kB high:50844kB reserved_highatomic:4096KB active_anon:48kB inactive_anon:12kB active_file:508kB inactive_file:552kB unevictable:109016kB writepending:160kB present:7111680kB managed:6175004kB mlocked:107784kB pagetables:78732kB bounce:0kB free_pcp:996kB local_pcp:0kB free_cma:376412kB

Signed-off-by: Zhenhua Huang
---
 mm/page_alloc.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 36b2021..b4ca6d1 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4954,8 +4954,11 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 		/*
 		 * Would the allocation succeed if we reclaimed all
 		 * reclaimable pages?
+		 * considering fragmentation, enough free pages are the
+		 * fundamental of compaction:
+		 * lower the order requirement by one
 		 */
-		wmark = __zone_watermark_ok(zone, order, min_wmark,
+		wmark = __zone_watermark_ok(zone, order ? order - 1 : 0, min_wmark,
 				ac->highest_zoneidx, alloc_flags, available);
 		trace_reclaim_retry_zone(z, order, reclaimable,
 				available, min_wmark, *no_progress_loops, wmark);