Message ID | 20200218082634.1596727-1-ying.huang@intel.com (mailing list archive) |
---|---|
From | "Huang, Ying" <ying.huang@intel.com> |
To | Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@kernel.org> |
Cc | linux-mm@kvack.org, linux-kernel@vger.kernel.org, Feng Tang <feng.tang@intel.com>, Huang Ying <ying.huang@intel.com>, Andrew Morton <akpm@linux-foundation.org>, Michal Hocko <mhocko@suse.com>, Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>, Dave Hansen <dave.hansen@linux.intel.com>, Dan Williams <dan.j.williams@intel.com> |
Subject | [RFC -V2 0/8] autonuma: Optimize memory placement in memory tiering system |
Date | Tue, 18 Feb 2020 16:26:26 +0800 |
Series | autonuma: Optimize memory placement in memory tiering system |

From: Huang Ying <ying.huang@intel.com>

With the advent of various new memory types, one system may contain multiple memory types, e.g. DRAM and PMEM (persistent memory). Because these memory types differ in performance and cost, the memory subsystem of such a machine can be called a memory tiering system.

After commit c221c0b0308f ("device-dax: "Hotplug" persistent memory for use like normal RAM"), PMEM can be used as cost-effective volatile memory in separate NUMA nodes. In a typical memory tiering system, each physical NUMA node contains CPUs, DRAM and PMEM. The CPUs and the DRAM are put in one logical node, while the PMEM is put in another (fake) logical node.

To optimize overall system performance, the hot pages should be placed in the DRAM node. To do that, we need to identify the hot pages in the PMEM node and migrate them to the DRAM node via NUMA migration.

Autonuma already has a set of mechanisms to identify the pages recently accessed by the CPUs in a node and to migrate those pages to that node. So we can reuse these mechanisms to optimize page placement in the memory tiering system. This has been implemented in this patchset.

On the other hand, the cold pages should be placed in the PMEM node. So we also need to identify the cold pages in the DRAM node and migrate them to the PMEM node.

The following patchset,

  [PATCH 0/4] [RFC] Migrate Pages in lieu of discard
  https://lore.kernel.org/linux-mm/20191016221148.F9CCD155@viggo.jf.intel.com/

implements a mechanism to demote cold DRAM pages to the PMEM node under memory pressure. Based on that, the cold DRAM pages can be demoted to the PMEM node proactively, which frees space on the DRAM node for the hot PMEM pages to be migrated to. This has been implemented in this patchset too.

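The promotion and demotion policy described above can be illustrated with a toy user-space model. This is only a sketch under stated assumptions, not the kernel implementation: the names `Page`, `rebalance`, `dram_capacity` and `hot_threshold` are hypothetical, and a per-window access counter stands in for whatever hotness signal the real autonuma mechanisms provide.

```python
from dataclasses import dataclass

DRAM, PMEM = "DRAM", "PMEM"  # the two logical nodes of the toy model


@dataclass
class Page:
    pid: int
    node: str
    accesses: int = 0  # hypothetical hotness signal for the current scan window


def rebalance(pages, dram_capacity, hot_threshold):
    """Run one scan window: promote hot PMEM pages, demoting cold DRAM pages
    only when DRAM has no free slot. Returns (promoted_ids, demoted_ids)."""
    dram_cold_first = sorted(
        (p for p in pages if p.node == DRAM), key=lambda p: p.accesses
    )
    pmem_hot_first = sorted(
        (p for p in pages if p.node == PMEM and p.accesses >= hot_threshold),
        key=lambda p: p.accesses,
        reverse=True,
    )
    free = dram_capacity - len(dram_cold_first)
    promoted, demoted = [], []
    for hot in pmem_hot_first:
        if free <= 0:
            # Proactive demotion: make room by moving the coldest DRAM page out.
            if not dram_cold_first or dram_cold_first[0].accesses >= hot_threshold:
                break  # everything left in DRAM is hot, stop promoting
            victim = dram_cold_first.pop(0)
            victim.node = PMEM
            demoted.append(victim.pid)
            free += 1
        hot.node = DRAM
        promoted.append(hot.pid)
        free -= 1
    for p in pages:
        p.accesses = 0  # start a fresh scan window
    return promoted, demoted


if __name__ == "__main__":
    pages = [Page(0, DRAM, 9), Page(1, DRAM, 0), Page(2, PMEM, 5), Page(3, PMEM, 1)]
    print(rebalance(pages, dram_capacity=2, hot_threshold=3))
    print([(p.pid, p.node) for p in pages])
```

In this model a cold DRAM page is demoted only when a hot PMEM page needs its slot, which mirrors the idea that demotion frees DRAM space for promotion. A real policy would also have to rate-limit migrations and handle pages that are hot on both sides.
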
The patchset is based on the following not-yet-merged patchset (after being rebased to v5.5):

  [PATCH 0/4] [RFC] Migrate Pages in lieu of discard
  https://lore.kernel.org/linux-mm/20191016221148.F9CCD155@viggo.jf.intel.com/

This is part of a larger patch set. If you want to apply these patches or play with them, I'd suggest using the tree below:

  https://github.com/hying-caritas/linux/commits/autonuma-r2

With all of the above optimizations, the score of the pmbench memory accessing benchmark, with an 80:20 read/write ratio and a normal access address distribution, improves by 116% on a 2-socket Intel server with Optane DC Persistent Memory.

Changelog:

v2:

- Addressed comments for V1.
- Rebased on v5.5.

Best Regards,
Huang, Ying