From patchwork Mon Oct 30 12:39:50 2023
X-Patchwork-Submitter: Charan Teja Kalla
X-Patchwork-Id: 13440548
From: Charan Teja Kalla
Subject: [PATCH] mm: page_alloc: unreserve highatomic page blocks before oom
Date: Mon, 30 Oct 2023 18:09:50 +0530
Message-ID: <1698669590-3193-1-git-send-email-quic_charante@quicinc.com>
X-Mailer: git-send-email 2.7.4
Sender: owner-linux-mm@kvack.org

__alloc_pages_direct_reclaim() is called from the slowpath allocation,
where the high atomic reserves can be unreserved when reclaim makes
progress and yet no suitable page is found. Later, should_reclaim_retry()
is called from the slowpath allocation to decide whether reclaim needs
to be retried before the OOM kill path is taken.

should_reclaim_retry() checks the available (reclaimable + free pages)
memory against the min wmark levels of a zone and returns:
a) true, if it is above the min wmark, so that the slowpath allocation
   will do the reclaim retries.
b) false, so that the slowpath allocation takes the oom kill path.

should_reclaim_retry() can also unreserve the high atomic reserves **but
only after all the reclaim retries are exhausted.**

In a case where there is almost no reclaimable memory and the free pages
consist mostly of high atomic reserves that the allocation context can't
use, the available memory falls below the min wmark levels. Hence false
is returned from should_reclaim_retry(), leading the allocation request
to take the OOM kill path. This is an early oom kill, because the high
atomic reserves are holding a lot of free memory and unreserving them is
not attempted.

An (early) OOM is encountered on a machine in the below state (excerpt
from the oom kill logs):
[ 295.998653] Normal free:7728kB boost:0kB min:804kB low:1004kB high:1204kB reserved_highatomic:8192KB active_anon:4kB inactive_anon:0kB active_file:24kB inactive_file:24kB unevictable:1220kB writepending:0kB present:70732kB managed:49224kB mlocked:0kB bounce:0kB free_pcp:688kB local_pcp:492kB free_cma:0kB
[ 295.998656] lowmem_reserve[]: 0 32
[ 295.998659] Normal: 508*4kB (UMEH) 241*8kB (UMEH) 143*16kB (UMEH) 33*32kB (UH) 7*64kB (UH) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7752kB

Per the above log, the ~7MB of free memory sitting in the high atomic
reserves is not freed up before falling back to the oom kill path.

This fix unreserves these atomic reserves in the OOM path before going
for a kill. The side effect of unreserving in the oom kill path is that
these free pages are checked against the high wmark. If unreserved from
should_reclaim_retry()/__alloc_pages_direct_reclaim(), they are checked
against the min wmark levels.

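The arithmetic behind the log excerpt can be checked with a small
userspace model. This is only a sketch, not the kernel's code:
`struct zone_model` and `model_min_wmark_ok()` are hypothetical names,
values are kept in kB rather than pages, the request is assumed to be
order-0 with no lowmem reserve applied, and reclaimable memory is
assumed to be just the 48kB of file LRU pages shown in the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified per-zone counters, all in kB (the kernel uses pages). */
struct zone_model {
	long free_kb;                /* "free:" in the log */
	long reclaimable_kb;         /* assumption: file LRU pages only */
	long reserved_highatomic_kb; /* "reserved_highatomic:" in the log */
	long min_wmark_kb;           /* "min:" in the log */
};

/*
 * Simplified stand-in for the min-watermark test described above:
 * available = free + reclaimable, minus the highatomic reserve when the
 * allocation context is not allowed to use it. The value is signed and
 * may go negative, which simply fails the comparison.
 */
static bool model_min_wmark_ok(const struct zone_model *z, bool can_use_highatomic)
{
	long available;

	assert(z != NULL);

	available = z->free_kb + z->reclaimable_kb;
	if (!can_use_highatomic)
		available -= z->reserved_highatomic_kb;

	return available > z->min_wmark_kb;
}
```

With the values from the log, 7728 + 48 - 8192 = -416kB, which is below
the 804kB min wmark, so the model returns false and the OOM kill path
would be taken. With the reserve released (reserved_highatomic_kb = 0)
the same zone has 7776kB available and the check passes.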
Signed-off-by: Charan Teja Kalla
---
 mm/page_alloc.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 95546f3..2a2536d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3281,6 +3281,8 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 		.order = order,
 	};
 	struct page *page;
+	struct zone *zone;
+	struct zoneref *z;
 
 	*did_some_progress = 0;
 
@@ -3295,6 +3297,16 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 	}
 
 	/*
+	 * If should_reclaim_retry() encounters a state where:
+	 * reclaimable + free doesn't satisfy the wmark levels,
+	 * it can directly jump to OOM without even unreserving
+	 * the highatomic page blocks. Try them for once here
+	 * before jumping to OOM.
+	 */
+retry:
+	unreserve_highatomic_pageblock(ac, true);
+
+	/*
 	 * Go through the zonelist yet one more time, keep very high watermark
 	 * here, this is only to catch a parallel oom killing, we must fail if
 	 * we're still under heavy pressure. But make sure that this reclaim
@@ -3307,6 +3319,12 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 	if (page)
 		goto out;
 
+	for_each_zone_zonelist_nodemask(zone, z, ac->zonelist, ac->highest_zoneidx,
+								ac->nodemask) {
+		if (zone->nr_reserved_highatomic > 0)
+			goto retry;
+	}
+
 	/* Coredumps can quickly deplete all memory reserves */
 	if (current->flags & PF_DUMPCORE)
 		goto out;
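
The control flow added by the hunks above (unreserve, try the
allocation at the high wmark, retry while any zone still holds a
reserve) can be modelled in userspace. This is a hedged sketch under
several assumptions, not the kernel implementation: all names
(`struct oom_zone`, `model_unreserve_one()`, `model_alloc_at_high_wmark()`,
`model_may_oom()`) are hypothetical, values are in kB, a pageblock is
assumed to be 4096kB, and every unreserve call is assumed to release up
to one pageblock, which the real function is not guaranteed to do.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGEBLOCK_KB 4096L /* assumption: one pageblock is 4MB */

/* Hypothetical per-zone state, in kB. */
struct oom_zone {
	long free_kb;
	long reserved_highatomic_kb;
	long high_wmark_kb;
};

/* Release at most one pageblock from the first zone that still has a reserve. */
static bool model_unreserve_one(struct oom_zone *zones, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		long chunk = zones[i].reserved_highatomic_kb;

		if (chunk <= 0)
			continue;
		if (chunk > MODEL_PAGEBLOCK_KB)
			chunk = MODEL_PAGEBLOCK_KB;
		zones[i].reserved_highatomic_kb -= chunk;
		return true;
	}
	return false;
}

/* The last-chance allocation succeeds if some zone is above its high wmark. */
static bool model_alloc_at_high_wmark(const struct oom_zone *zones, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		long usable = zones[i].free_kb - zones[i].reserved_highatomic_kb;

		if (usable > zones[i].high_wmark_kb)
			return true;
	}
	return false;
}

/*
 * Mirrors the retry loop of the patch: returns true if a page could be
 * allocated without an OOM kill, false if the kill path would continue.
 * *passes counts how many unreserve attempts were made.
 */
static bool model_may_oom(struct oom_zone *zones, size_t n, int *passes)
{
	*passes = 0;
	for (;;) {
		bool any_reserved = false;

		model_unreserve_one(zones, n);
		(*passes)++;

		if (model_alloc_at_high_wmark(zones, n))
			return true;

		for (size_t i = 0; i < n; i++)
			if (zones[i].reserved_highatomic_kb > 0)
				any_reserved = true;
		if (!any_reserved)
			return false;
	}
}
```

Fed with the Normal zone from the log (free 7728kB, reserve 8192kB,
high wmark 1204kB), the model succeeds on the first pass: releasing
4096kB leaves 3632kB usable, which is above the high wmark. The model
terminates only because each pass is assumed to shrink the reserve.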