From patchwork Mon May 11 17:03:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sandipan Das X-Patchwork-Id: 11541339 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AD447912 for ; Mon, 11 May 2020 17:04:06 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 74BD320714 for ; Mon, 11 May 2020 17:04:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 74BD320714 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.ibm.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 98E3D90006B; Mon, 11 May 2020 13:04:05 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 93EDE900036; Mon, 11 May 2020 13:04:05 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 853C790006B; Mon, 11 May 2020 13:04:05 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0086.hostedemail.com [216.40.44.86]) by kanga.kvack.org (Postfix) with ESMTP id 6BF5D900036 for ; Mon, 11 May 2020 13:04:05 -0400 (EDT) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 264B1181AEF15 for ; Mon, 11 May 2020 17:04:05 +0000 (UTC) X-FDA: 76805060850.01.year39_5dc9cdbbd7324 X-Spam-Summary: 13,1.2,0,0ad1cdc0483c4f1c,d41d8cd98f00b204,sandipan@linux.ibm.com,,RULES_HIT:41:355:379:541:800:960:966:967:973:988:989:1260:1261:1311:1314:1345:1437:1515:1535:1543:1711:1730:1747:1777:1792:1801:2194:2196:2198:2199:2200:2201:2393:2525:2559:2564:2682:2685:2693:2859:2898:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3355:3622:3865:3866:3867:3868:3870:3871:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4117:4321:4385:4605:5007:6119:6261:7875:7903:8603:9025:9121:10008:11026:11232:11658:11914:12043:12296:12297:12438:12555:12679:12895:12986:13161:13229:13894:14096:14181:14394:14721:14819:21063:21080:21347:21433:21451:21617:21749:21811:30054:30070,0,RBL:148.163.156.1:@linux.ibm.com:.lbl8.mailshell.net-64.100.201.201 62.2.0.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:1:0,LFtime:25,LUA_SUMMARY:none X-HE-Tag: year39_5dc9cdbbd7324 X-Filterd-Recvd-Size: 6047 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by imf45.hostedemail.com (Postfix) with ESMTP for ; Mon, 11 May 2020 17:04:04 +0000 (UTC) Received: from pps.filterd (m0098396.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 04BH3Iuj178280; Mon, 11 May 2020 13:04:03 -0400 Received: from ppma03ams.nl.ibm.com (62.31.33a9.ip4.static.sl-reverse.com [169.51.49.98]) by mx0a-001b2d01.pphosted.com with ESMTP id 30wrvykuru-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 11 May 2020 13:04:03 -0400 Received: from pps.filterd (ppma03ams.nl.ibm.com [127.0.0.1]) by ppma03ams.nl.ibm.com (8.16.0.27/8.16.0.27) with SMTP id 04BH0pmo007223; Mon, 11 May 2020 17:04:01 GMT Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by ppma03ams.nl.ibm.com with ESMTP id 30wm55msme-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 11 May 2020 17:04:00 +0000 Received: from d06av22.portsmouth.uk.ibm.com (d06av22.portsmouth.uk.ibm.com [9.149.105.58]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 04BH3w9h40632330 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 11 May 2020 17:03:58 GMT Received: from d06av22.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C88B84C044; Mon, 11 May 2020 17:03:58 +0000 (GMT) Received: from d06av22.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 55B3E4C040; Mon, 11 May 2020 17:03:57 +0000 (GMT) Received: from fir03.in.ibm.com (unknown [9.121.59.65]) by d06av22.portsmouth.uk.ibm.com (Postfix) with ESMTP; Mon, 11 May 2020 17:03:57 +0000 (GMT) From: Sandipan Das To: akpm@linux-foundation.org Cc: linux-mm@kvack.org, khlebnikov@yandex-team.ru, mhocko@suse.com, vbabka@suse.cz, kirill@shutemov.name, aneesh.kumar@linux.ibm.com, srikar@linux.vnet.ibm.com Subject: [PATCH v3] mm: Reset numa stats for boot pagesets Date: Mon, 11 May 2020 22:33:56 +0530 Message-Id: <20200511170356.162531-1-sandipan@linux.ibm.com> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.216,18.0.676 definitions=2020-05-11_08:2020-05-11,2020-05-11 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 bulkscore=0 adultscore=0 mlxlogscore=999 clxscore=1015 mlxscore=0 impostorscore=0 lowpriorityscore=0 phishscore=0 priorityscore=1501 suspectscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2003020000 definitions=main-2005110135 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Initially, the per-cpu pagesets of each zone are set to the boot pagesets. The real pagesets are allocated later but before that happens, page allocations do occur and the numa stats for the boot pagesets get incremented since they are common to all zones at that point. The real pagesets, however, are allocated for the populated zones only. Unpopulated zones, like those associated with memory-less nodes, continue using the boot pageset and end up skewing the numa stats of the corresponding node. E.g. $ numactl -H available: 2 nodes (0-1) node 0 cpus: 0 1 2 3 node 0 size: 0 MB node 0 free: 0 MB node 1 cpus: 4 5 6 7 node 1 size: 8131 MB node 1 free: 6980 MB node distances: node 0 1 0: 10 40 1: 40 10 $ numastat node0 node1 numa_hit 108 56495 numa_miss 0 0 numa_foreign 0 0 interleave_hit 0 4537 local_node 108 31547 other_node 0 24948 Hence, the boot pageset stats need to be cleared after the real pagesets are allocated. From this point onwards, the stats of the boot pagesets do not change as page allocations requested for a memory-less node will either fail (if __GFP_THISNODE is used) or get fulfilled by a preferred zone of a different node based on the fallback zonelist. Signed-off-by: Sandipan Das Acked-by: Vlastimil Babka --- The previous versions and discussion around them can be found at v2: https://lore.kernel.org/linux-mm/9c9c2d1b15e37f6e6bf32f99e3100035e90c4ac9.1588868430.git.sandipan@linux.ibm.com/ v1: https://lore.kernel.org/linux-mm/20200504070304.127361-1-sandipan@linux.ibm.com/ Changes in v3: - Drop the redundant static key-based check to see if numa stats are enabled as suggested by Vlastimil. Changes in v2: - Reset the stats of the boot pagesets instead of explicitly returning zero as suggested by Vlastimil. - Changed the subject to reflect the above. --- mm/page_alloc.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 13cc653122b7..38b459533cee 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -6261,10 +6261,25 @@ void __init setup_per_cpu_pageset(void) { struct pglist_data *pgdat; struct zone *zone; + int __maybe_unused cpu; for_each_populated_zone(zone) setup_zone_pageset(zone); +#ifdef CONFIG_NUMA + /* + * Unpopulated zones continue using the boot pagesets. + * The numa stats for these pagesets need to be reset. + * Otherwise, they will end up skewing the stats of + * the nodes these zones are associated with. + */ + for_each_possible_cpu(cpu) { + struct per_cpu_pageset *pcp = &per_cpu(boot_pageset, cpu); + memset(pcp->vm_numa_stat_diff, 0, + sizeof(pcp->vm_numa_stat_diff)); + } +#endif + for_each_online_pgdat(pgdat) pgdat->per_cpu_nodestats = alloc_percpu(struct per_cpu_nodestat);