From patchwork Sat Sep 1 11:28:18 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Fengguang Wu
X-Patchwork-Id: 10584935
Message-Id: <20180901112818.126790961@intel.com>
User-Agent: quilt/0.63-1
Date: Sat, 01 Sep 2018 19:28:18 +0800
From: Fengguang Wu
To: Andrew Morton
cc: Linux Memory Management List
cc: kvm@vger.kernel.org
cc: Peng DongX
cc: Liu Jingqi
cc: Dong Eddie
CC: Dave Hansen
cc: Huang Ying
CC: Brendan Gregg
Cc: Fengguang Wu, LKML
Subject: [RFC][PATCH 0/5] introduce /proc/PID/idle_bitmap

This new /proc/PID/idle_bitmap interface aims to complement the current
global /sys/kernel/mm/page_idle/bitmap, in order to enable efficient
user space driven migrations.

The pros and cons will be discussed in the changelog of
"[PATCH] proc: introduce /proc/PID/idle_bitmap".

The driving force is to improve efficiency by 10+ times, so that
hot/cold page tracking can be done at regular intervals in user space
w/o too much overhead. That makes it possible for a user space daemon
to do regular page migration between NUMA nodes of different speeds.

Note it's not about NUMA migration between local and remote nodes -- we
already have NUMA balancing for that. This interface and the user space
migration daemon target NUMA nodes made of different media -- i.e.
DIMM and NVDIMM(*) -- with larger performance gaps. The basic policy
will be "move hot pages to DIMM; cold pages to NVDIMM". Since NVDIMM
sizes can easily reach several terabytes, working set tracking
efficiency will matter and be challenging.

(*) Here we use persistent memory (PMEM) w/o using its persistence.
Persistence is good to have, however it requires modifying applications.
Upcoming NVDIMM products like Intel Apache Pass (AEP) will be more cost
and energy effective than DRAM, but slower. Merely using it in the form
of a NUMA memory node could immediately benefit many workloads.
For example, warm but not hot apps, workloads with a sharp hot/cold page
distribution (good for migration), or workloads that rely more on memory
size than on latency and bandwidth and do more reads than writes.

This is an early RFC version to collect feedback. It's complete enough
to demo the basic ideas and performance, however not usable yet.

Regards,
Fengguang
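
To make the "move hot pages to DIMM; cold pages to NVDIMM" policy
concrete, here is a minimal user space sketch. It assumes the per-process
bitmap uses the same encoding as the existing global
/sys/kernel/mm/page_idle/bitmap (an array of native-endian 8-byte words,
one bit per page, set bit = idle); this cover letter does not specify the
layout of /proc/PID/idle_bitmap, so treat that as an assumption. The
function names and the first_page parameter are hypothetical and only
serve the illustration.

```python
import struct

WORD_BITS = 64  # assumed: 8-byte words in native byte order, one bit per page


def split_hot_cold(bitmap: bytes, first_page: int = 0):
    """Decode an idle bitmap into (hot, cold) lists of page indices.

    A set bit means the page stayed idle since it was last marked, so it
    is treated as cold; a clear bit is treated as hot. This is a
    simplification: a real daemon would also need to skip pages that
    cannot be tracked at all.
    """
    if len(bitmap) % 8:
        raise ValueError("bitmap length must be a multiple of 8 bytes")
    hot, cold = [], []
    for word_no, (word,) in enumerate(struct.iter_unpack("=Q", bitmap)):
        base = first_page + word_no * WORD_BITS
        for bit in range(WORD_BITS):
            # Page i lives in bit i % 64 of word i / 64.
            (cold if (word >> bit) & 1 else hot).append(base + bit)
    return hot, cold


def plan_migration(bitmap: bytes, first_page: int = 0):
    """Apply the basic policy: hot pages to DIMM, cold pages to NVDIMM."""
    hot, cold = split_hot_cold(bitmap, first_page)
    return {"DIMM": hot, "NVDIMM": cold}
```

A daemon along these lines would read the bitmap at regular intervals,
call plan_migration() on the bytes it read, and hand the resulting page
lists to whatever migration mechanism it uses (for instance the
move_pages(2) system call); that last step is left out of the sketch.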