From patchwork Sun Jun 28 07:43:45 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Song Bao Hua (Barry Song)"
X-Patchwork-Id: 11629939
From: Barry Song
Cc: Barry Song, Jonathan Cameron, Aslan Bakirov, Roman Gushchin,
 Michal Hocko, Andreas Schaufler, Mike Kravetz, Rik van Riel,
 Joonsoo Kim, Robin Murphy
Subject: [PATCH] mm/cma.c: use exact_nid true to fix possible per-numa cma leak
Date: Sun, 28 Jun 2020 19:43:45 +1200
Message-ID: <20200628074345.27228-1-song.bao.hua@hisilicon.com>
X-Mailer: git-send-email 2.21.0.windows.1
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Calling cma_declare_contiguous_nid() with exact_nid=false for per-numa
reservation can easily cause cma leaks and various confusion. For
example, mm/hugetlb.c tries to reserve per-numa cma for gigantic pages,
but it can easily leak cma and confuse users when the system has
memoryless nodes.

Suppose the system has 4 numa nodes and only node0 has memory. If we
set hugetlb_cma=4G in bootargs, mm/hugetlb.c will get 4 cma areas for 4
different numa nodes. Since exact_nid=false in the current code, all 4
numa nodes will get cma successfully from node0, but hugetlb_cma[1] to
hugetlb_cma[3] will never be available to hugepages, as mm/hugetlb.c
will only allocate memory from hugetlb_cma[0].

Suppose instead the system has 4 numa nodes, node0 and node2 have
memory, and the other nodes have none. If we set hugetlb_cma=4G in
bootargs, mm/hugetlb.c will get 4 cma areas for 4 different numa nodes.
Since exact_nid=false in the current code, all 4 numa nodes will get cma
successfully from node0 or node2, but hugetlb_cma[1] and hugetlb_cma[3]
will never be available to hugepages, as mm/hugetlb.c will only allocate
memory from hugetlb_cma[0] and hugetlb_cma[2].

This causes a permanent leak of the cma areas that were supposed to be
used by the memoryless nodes.

Of course we can work around the issue by letting mm/hugetlb.c scan all
cma areas in alloc_gigantic_page() even if node_mask includes node0
only. That means when node_mask includes node0 only, we can get pages
from hugetlb_cma[1] to hugetlb_cma[3].
But this will cause a kernel crash in free_gigantic_page() when it
tries to free the page by:

	cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order)

On the other hand, exact_nid=false doesn't consider numa distance, so
it might not be that useful to leverage cma areas on remote nodes. I
feel it is much simpler to make exact_nid true to make everything
clear. After that, memoryless nodes won't be able to reserve per-numa
CMA from other nodes which have memory.

Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Cc: Jonathan Cameron
Cc: Aslan Bakirov
Cc: Roman Gushchin
Cc: Andrew Morton
Cc: Michal Hocko
Cc: Andreas Schaufler
Cc: Mike Kravetz
Cc: Rik van Riel
Cc: Joonsoo Kim
Cc: Robin Murphy
Signed-off-by: Barry Song
Acked-by: Roman Gushchin
---
 mm/cma.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/cma.c b/mm/cma.c
index b24151fa2101..f472f398026f 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -338,13 +338,13 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 		 */
 		if (base < highmem_start && limit > highmem_start) {
 			addr = memblock_alloc_range_nid(size, alignment,
-					highmem_start, limit, nid, false);
+					highmem_start, limit, nid, true);
 			limit = highmem_start;
 		}
 
 		if (!addr) {
 			addr = memblock_alloc_range_nid(size, alignment, base,
-					limit, nid, false);
+					limit, nid, true);
 			if (!addr) {
 				ret = -ENOMEM;
 				goto err;
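
For illustration only, the two scenarios in the changelog can be
modelled in a few lines of userspace C. This is a rough sketch, not
kernel code: model_reserve() and model_count_leaked() are hypothetical
names, the has_mem[] array stands in for the real node topology, and
the fallback simply picks the first node with memory, which is a
simplification of what memblock may actually do when exact_nid is
false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_NODE (-1)

/*
 * Hypothetical model: return the node that ends up backing a per-node
 * reservation requested for `nid`, or NO_NODE if the reservation fails.
 */
static int model_reserve(const bool *has_mem, size_t nr_nodes,
			 size_t nid, bool exact_nid)
{
	assert(nid < nr_nodes);

	if (has_mem[nid])
		return (int)nid;
	if (exact_nid)
		return NO_NODE;	/* memoryless node: reservation fails */

	/* Assumed fallback: take memory from the first node that has any. */
	for (size_t i = 0; i < nr_nodes; i++)
		if (has_mem[i])
			return (int)i;
	return NO_NODE;
}

/*
 * Count areas that were reserved successfully but are backed by a node
 * other than the one they were requested for. In the changelog's terms
 * these are the areas that hugetlb never allocates from.
 */
static size_t model_count_leaked(const bool *has_mem, size_t nr_nodes,
				 bool exact_nid)
{
	size_t leaked = 0;

	for (size_t nid = 0; nid < nr_nodes; nid++) {
		int backing = model_reserve(has_mem, nr_nodes, nid, exact_nid);

		if (backing != NO_NODE && (size_t)backing != nid)
			leaked++;
	}
	return leaked;
}
```

With has_mem = {true, false, false, false} the model counts 3 leaked
areas for exact_nid=false and 0 for exact_nid=true. With node0 and
node2 populated it counts 2 and 0 respectively, matching the two cases
described above.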