From patchwork Wed Aug 7 17:15:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joel Fernandes X-Patchwork-Id: 11082497 From: "Joel Fernandes (Google)" To: linux-kernel@vger.kernel.org Cc: "Joel Fernandes (Google)" , Alexey Dobriyan , Andrew Morton , Borislav Petkov , Brendan Gregg , Catalin Marinas ,
Christian Hansen , dancol@google.com, fmayer@google.com, "H. Peter Anvin" , Ingo Molnar , joelaf@google.com, Jonathan Corbet , Kees Cook , kernel-team@android.com, linux-api@vger.kernel.org, linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Michal Hocko , Mike Rapoport , minchan@kernel.org, namhyung@google.com, paulmck@linux.ibm.com, Robin Murphy , Roman Gushchin , Stephen Rothwell , surenb@google.com, Thomas Gleixner , tkjos@google.com, Vladimir Davydov , Vlastimil Babka , Will Deacon Subject: [PATCH v5 1/6] mm/page_idle: Add per-pid idle page tracking using virtual index Date: Wed, 7 Aug 2019 13:15:54 -0400 Message-Id: <20190807171559.182301-1-joel@joelfernandes.org>

The page_idle tracking feature currently requires looking up the pagemap for a process and then interacting with /sys/kernel/mm/page_idle. On Android devices, an unprivileged process cannot look up the PFN through pagemap: the lookup requires CAP_SYS_ADMIN, and without it the PFN reads as 0.

This patch adds support for interacting with page_idle tracking directly at the PID level by introducing a /proc/<pid>/page_idle file. It follows exactly the same semantics as the global /sys/kernel/mm/page_idle, but no longer needs a PFN lookup through pagemap, since the interface is indexed by virtual frame number, and it does not require CAP_SYS_ADMIN.

In Android, we use this for the heap profiler (heapprofd), which profiles and pinpoints code paths that allocate memory and leave it idle for long periods of time.
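
For illustration only, here is a minimal userspace sketch of how a file offset into /proc/<pid>/page_idle could be derived from a virtual address. It assumes the per-pid file uses the same layout as the global bitmap (one bit per frame, accessed in 8-byte chunks). The helper names idle_chunk_offset() and idle_chunk_bit() are hypothetical and are not part of this patch.

```c
#include <stdint.h>

/* Mirrors BITMAP_CHUNK_SIZE (sizeof(u64)) and BITMAP_CHUNK_BITS in mm/page_idle.c. */
#define CHUNK_BYTES 8u
#define CHUNK_BITS  (CHUNK_BYTES * 8u)

/*
 * Hypothetical helper: byte offset of the 8-byte chunk that covers the
 * virtual frame containing vaddr. The result is always a multiple of 8.
 */
static uint64_t idle_chunk_offset(uint64_t vaddr, uint64_t page_size)
{
	uint64_t vfn = vaddr / page_size;	/* virtual frame number */

	return (vfn / CHUNK_BITS) * CHUNK_BYTES;
}

/* Hypothetical helper: bit index of that frame inside its 64-bit chunk. */
static unsigned int idle_chunk_bit(uint64_t vaddr, uint64_t page_size)
{
	return (unsigned int)((vaddr / page_size) % CHUNK_BITS);
}
```

A reader would then read 8 bytes at the computed offset and test the computed bit. Both the offset and the length must be multiples of 8, which is what page_idle_get_frames() enforces with -EINVAL.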
This method solves the security issue with userspace learning the PFN, and while at it is also shown to yield better results than the pagemap lookup, the theory being that the window where the address space can change is reduced by eliminating the intermediate pagemap look up stage. In virtual address indexing, the process's mmap_sem is held for the duration of the access. Signed-off-by: Joel Fernandes (Google) --- v3->v4: Minor fixups (Minchan) Add swap pte handling (Konstantin, Minchan) v2->v3: Fixed a bug where I was doing a kfree that is not needed due to not needing to do GFP_ATOMIC allocations. v1->v2: Mark swap ptes as idle (Minchan) Avoid need for GFP_ATOMIC (Andrew) Get rid of idle_page_list lock by moving list to stack Internal review -> v1: Fixes from Suren. Corrections to change log, docs (Florian, Sandeep) fs/proc/base.c | 3 + fs/proc/internal.h | 1 + fs/proc/task_mmu.c | 42 +++++ include/linux/page_idle.h | 4 + mm/page_idle.c | 337 +++++++++++++++++++++++++++++++++----- 5 files changed, 342 insertions(+), 45 deletions(-) diff --git a/fs/proc/base.c b/fs/proc/base.c index ebea9501afb8..fd2f74bd4e35 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -3039,6 +3039,9 @@ static const struct pid_entry tgid_base_stuff[] = { REG("smaps", S_IRUGO, proc_pid_smaps_operations), REG("smaps_rollup", S_IRUGO, proc_pid_smaps_rollup_operations), REG("pagemap", S_IRUSR, proc_pagemap_operations), +#ifdef CONFIG_IDLE_PAGE_TRACKING + REG("page_idle", S_IRUSR|S_IWUSR, proc_page_idle_operations), +#endif #endif #ifdef CONFIG_SECURITY DIR("attr", S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations), diff --git a/fs/proc/internal.h b/fs/proc/internal.h index cd0c8d5ce9a1..bc9371880c63 100644 --- a/fs/proc/internal.h +++ b/fs/proc/internal.h @@ -293,6 +293,7 @@ extern const struct file_operations proc_pid_smaps_operations; extern const struct file_operations proc_pid_smaps_rollup_operations; extern const struct file_operations proc_clear_refs_operations; 
extern const struct file_operations proc_pagemap_operations; +extern const struct file_operations proc_page_idle_operations; extern unsigned long task_vsize(struct mm_struct *); extern unsigned long task_statm(struct mm_struct *, diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 582c5e680176..192ffc4e24d7 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -1650,6 +1650,48 @@ const struct file_operations proc_pagemap_operations = { .open = pagemap_open, .release = pagemap_release, }; + +#ifdef CONFIG_IDLE_PAGE_TRACKING +static ssize_t proc_page_idle_read(struct file *file, char __user *buf, + size_t count, loff_t *ppos) +{ + return page_idle_proc_read(file, buf, count, ppos); +} + +static ssize_t proc_page_idle_write(struct file *file, const char __user *buf, + size_t count, loff_t *ppos) +{ + return page_idle_proc_write(file, (char __user *)buf, count, ppos); +} + +static int proc_page_idle_open(struct inode *inode, struct file *file) +{ + struct mm_struct *mm; + + mm = proc_mem_open(inode, PTRACE_MODE_READ); + if (IS_ERR(mm)) + return PTR_ERR(mm); + file->private_data = mm; + return 0; +} + +static int proc_page_idle_release(struct inode *inode, struct file *file) +{ + struct mm_struct *mm = file->private_data; + + mmdrop(mm); + return 0; +} + +const struct file_operations proc_page_idle_operations = { + .llseek = mem_lseek, /* borrow this */ + .read = proc_page_idle_read, + .write = proc_page_idle_write, + .open = proc_page_idle_open, + .release = proc_page_idle_release, +}; +#endif /* CONFIG_IDLE_PAGE_TRACKING */ + #endif /* CONFIG_PROC_PAGE_MONITOR */ #ifdef CONFIG_NUMA diff --git a/include/linux/page_idle.h b/include/linux/page_idle.h index 1e894d34bdce..a765a6d14e1a 100644 --- a/include/linux/page_idle.h +++ b/include/linux/page_idle.h @@ -106,6 +106,10 @@ static inline void clear_page_idle(struct page *page) } #endif /* CONFIG_64BIT */ +ssize_t page_idle_proc_write(struct file *file, + char __user *buf, size_t count, loff_t *ppos); +ssize_t 
page_idle_proc_read(struct file *file, + char __user *buf, size_t count, loff_t *ppos); #else /* !CONFIG_IDLE_PAGE_TRACKING */ static inline bool page_is_young(struct page *page) diff --git a/mm/page_idle.c b/mm/page_idle.c index 295512465065..9de4f4c67a8c 100644 --- a/mm/page_idle.c +++ b/mm/page_idle.c @@ -5,17 +5,22 @@ #include #include #include -#include -#include -#include #include +#include #include #include +#include +#include +#include +#include +#include #define BITMAP_CHUNK_SIZE sizeof(u64) #define BITMAP_CHUNK_BITS (BITMAP_CHUNK_SIZE * BITS_PER_BYTE) /* + * Get a reference to a page for idle tracking purposes, with additional checks. + * * Idle page tracking only considers user memory pages, for other types of * pages the idle flag is always unset and an attempt to set it is silently * ignored. @@ -25,18 +30,13 @@ * page tracking. With such an indicator of user pages we can skip isolated * pages, but since there are not usually many of them, it will hardly affect * the overall result. - * - * This function tries to get a user memory page by pfn as described above. */ -static struct page *page_idle_get_page(unsigned long pfn) +static struct page *page_idle_get_page(struct page *page_in) { struct page *page; pg_data_t *pgdat; - if (!pfn_valid(pfn)) - return NULL; - - page = pfn_to_page(pfn); + page = page_in; if (!page || !PageLRU(page) || !get_page_unless_zero(page)) return NULL; @@ -51,6 +51,18 @@ static struct page *page_idle_get_page(unsigned long pfn) return page; } +/* + * This function tries to get a user memory page by pfn as described above. 
+ */ +static struct page *page_idle_get_page_pfn(unsigned long pfn) +{ + + if (!pfn_valid(pfn)) + return NULL; + + return page_idle_get_page(pfn_to_page(pfn)); +} + static bool page_idle_clear_pte_refs_one(struct page *page, struct vm_area_struct *vma, unsigned long addr, void *arg) @@ -118,6 +130,47 @@ static void page_idle_clear_pte_refs(struct page *page) unlock_page(page); } +/* Helper to get the start and end frame given a pos and count */ +static int page_idle_get_frames(loff_t pos, size_t count, struct mm_struct *mm, + unsigned long *start, unsigned long *end) +{ + unsigned long max_frame; + + /* If an mm is not given, assume we want physical frames */ + max_frame = mm ? (mm->task_size >> PAGE_SHIFT) : max_pfn; + + if (pos % BITMAP_CHUNK_SIZE || count % BITMAP_CHUNK_SIZE) + return -EINVAL; + + *start = pos * BITS_PER_BYTE; + if (*start >= max_frame) + return -ENXIO; + + *end = *start + count * BITS_PER_BYTE; + if (*end > max_frame) + *end = max_frame; + return 0; +} + +static bool page_idle_pte_check(struct page *page) +{ + if (!page) + return false; + + if (page_is_idle(page)) { + /* + * The page might have been referenced via a + * pte, in which case it is not idle. Clear + * refs and recheck. 
+ */ + page_idle_clear_pte_refs(page); + if (page_is_idle(page)) + return true; + } + + return false; +} + static ssize_t page_idle_bitmap_read(struct file *file, struct kobject *kobj, struct bin_attribute *attr, char *buf, loff_t pos, size_t count) @@ -125,35 +178,21 @@ static ssize_t page_idle_bitmap_read(struct file *file, struct kobject *kobj, u64 *out = (u64 *)buf; struct page *page; unsigned long pfn, end_pfn; - int bit; + int bit, ret; - if (pos % BITMAP_CHUNK_SIZE || count % BITMAP_CHUNK_SIZE) - return -EINVAL; - - pfn = pos * BITS_PER_BYTE; - if (pfn >= max_pfn) - return 0; - - end_pfn = pfn + count * BITS_PER_BYTE; - if (end_pfn > max_pfn) - end_pfn = max_pfn; + ret = page_idle_get_frames(pos, count, NULL, &pfn, &end_pfn); + if (ret == -ENXIO) + return 0; /* Reads beyond max_pfn do nothing */ + else if (ret) + return ret; for (; pfn < end_pfn; pfn++) { bit = pfn % BITMAP_CHUNK_BITS; if (!bit) *out = 0ULL; - page = page_idle_get_page(pfn); - if (page) { - if (page_is_idle(page)) { - /* - * The page might have been referenced via a - * pte, in which case it is not idle. Clear - * refs and recheck. 
- */ - page_idle_clear_pte_refs(page); - if (page_is_idle(page)) - *out |= 1ULL << bit; - } + page = page_idle_get_page_pfn(pfn); + if (page && page_idle_pte_check(page)) { + *out |= 1ULL << bit; put_page(page); } if (bit == BITMAP_CHUNK_BITS - 1) @@ -170,23 +209,16 @@ static ssize_t page_idle_bitmap_write(struct file *file, struct kobject *kobj, const u64 *in = (u64 *)buf; struct page *page; unsigned long pfn, end_pfn; - int bit; + int bit, ret; - if (pos % BITMAP_CHUNK_SIZE || count % BITMAP_CHUNK_SIZE) - return -EINVAL; - - pfn = pos * BITS_PER_BYTE; - if (pfn >= max_pfn) - return -ENXIO; - - end_pfn = pfn + count * BITS_PER_BYTE; - if (end_pfn > max_pfn) - end_pfn = max_pfn; + ret = page_idle_get_frames(pos, count, NULL, &pfn, &end_pfn); + if (ret) + return ret; for (; pfn < end_pfn; pfn++) { bit = pfn % BITMAP_CHUNK_BITS; if ((*in >> bit) & 1) { - page = page_idle_get_page(pfn); + page = page_idle_get_page_pfn(pfn); if (page) { page_idle_clear_pte_refs(page); set_page_idle(page); @@ -224,6 +256,221 @@ struct page_ext_operations page_idle_ops = { }; #endif +/* page_idle tracking for /proc//page_idle */ + +struct page_node { + struct page *page; + unsigned long addr; + struct list_head list; +}; + +struct page_idle_proc_priv { + unsigned long start_addr; + char *buffer; + int write; + + /* Pre-allocate and provide nodes to pte_page_idle_proc_add() */ + struct page_node *page_nodes; + int cur_page_node; + struct list_head *idle_page_list; +}; + +/* + * Add page to list to be set as idle later. 
+ */ +static void pte_page_idle_proc_add(struct page *page, + unsigned long addr, struct mm_walk *walk) +{ + struct page *page_get = NULL; + struct page_node *pn; + int bit; + unsigned long frames; + struct page_idle_proc_priv *priv = walk->private; + u64 *chunk = (u64 *)priv->buffer; + + if (priv->write) { + VM_BUG_ON(!page); + + /* Find whether this page was asked to be marked */ + frames = (addr - priv->start_addr) >> PAGE_SHIFT; + bit = frames % BITMAP_CHUNK_BITS; + chunk = &chunk[frames / BITMAP_CHUNK_BITS]; + if (((*chunk >> bit) & 1) == 0) + return; + } + + if (page) { + page_get = page_idle_get_page(page); + if (!page_get) + return; + } + + /* + * For all other pages, add it to a list since we have to walk rmap, + * which acquires ptlock, and we cannot walk rmap right now. + */ + pn = &(priv->page_nodes[priv->cur_page_node++]); + pn->page = page_get; + pn->addr = addr; + list_add(&pn->list, priv->idle_page_list); +} + +static int pte_page_idle_proc_range(pmd_t *pmd, unsigned long addr, + unsigned long end, + struct mm_walk *walk) +{ + pte_t *pte; + spinlock_t *ptl; + struct page *page; + struct vm_area_struct *vma = walk->vma; + + ptl = pmd_trans_huge_lock(pmd, vma); + if (ptl) { + if (pmd_present(*pmd)) { + page = follow_trans_huge_pmd(vma, addr, pmd, + FOLL_DUMP|FOLL_WRITE); + if (!IS_ERR_OR_NULL(page)) + pte_page_idle_proc_add(page, addr, walk); + } + spin_unlock(ptl); + return 0; + } + + if (pmd_trans_unstable(pmd)) + return 0; + + pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl); + for (; addr != end; pte++, addr += PAGE_SIZE) { + if (!pte_present(*pte)) + continue; + + page = vm_normal_page(vma, addr, *pte); + if (page) + pte_page_idle_proc_add(page, addr, walk); + } + + pte_unmap_unlock(pte - 1, ptl); + return 0; +} + +ssize_t page_idle_proc_generic(struct file *file, char __user *ubuff, + size_t count, loff_t *pos, int write) +{ + int ret; + char *buffer; + u64 *out; + unsigned long start_addr, end_addr, start_frame, end_frame; + struct 
mm_struct *mm = file->private_data; + struct mm_walk walk = { .pmd_entry = pte_page_idle_proc_range, }; + struct page_node *cur; + struct page_idle_proc_priv priv; + bool walk_error = false; + LIST_HEAD(idle_page_list); + + if (!mm || !mmget_not_zero(mm)) + return -EINVAL; + + if (count > PAGE_SIZE) + count = PAGE_SIZE; + + buffer = kzalloc(PAGE_SIZE, GFP_KERNEL); + if (!buffer) { + ret = -ENOMEM; + goto out_mmput; + } + out = (u64 *)buffer; + + if (write && copy_from_user(buffer, ubuff, count)) { + ret = -EFAULT; + goto out; + } + + ret = page_idle_get_frames(*pos, count, mm, &start_frame, &end_frame); + if (ret) + goto out; + + start_addr = (start_frame << PAGE_SHIFT); + end_addr = (end_frame << PAGE_SHIFT); + priv.buffer = buffer; + priv.start_addr = start_addr; + priv.write = write; + + priv.idle_page_list = &idle_page_list; + priv.cur_page_node = 0; + priv.page_nodes = kzalloc(sizeof(struct page_node) * + (end_frame - start_frame), GFP_KERNEL); + if (!priv.page_nodes) { + ret = -ENOMEM; + goto out; + } + + walk.private = &priv; + walk.mm = mm; + + down_read(&mm->mmap_sem); + + /* + * idle_page_list is needed because walk_page_vma() holds ptlock which + * deadlocks with page_idle_clear_pte_refs(). So we have to collect all + * pages first, and then call page_idle_clear_pte_refs(). 
+ */ + ret = walk_page_range(start_addr, end_addr, &walk); + if (ret) + walk_error = true; + + list_for_each_entry(cur, &idle_page_list, list) { + int bit, index; + unsigned long off; + struct page *page = cur->page; + + if (unlikely(walk_error)) + goto remove_page; + + if (write) { + if (page) { + page_idle_clear_pte_refs(page); + set_page_idle(page); + } + } else { + if (page_idle_pte_check(page)) { + off = ((cur->addr) >> PAGE_SHIFT) - start_frame; + bit = off % BITMAP_CHUNK_BITS; + index = off / BITMAP_CHUNK_BITS; + out[index] |= 1ULL << bit; + } + } +remove_page: + if (page) + put_page(page); + } + + if (!write && !walk_error) + ret = copy_to_user(ubuff, buffer, count) ? -EFAULT : 0; + + up_read(&mm->mmap_sem); + kfree(priv.page_nodes); +out: + kfree(buffer); +out_mmput: + mmput(mm); + if (!ret) + ret = count; + return ret; + +} + +ssize_t page_idle_proc_read(struct file *file, char __user *ubuff, + size_t count, loff_t *pos) +{ + return page_idle_proc_generic(file, ubuff, count, pos, 0); +} + +ssize_t page_idle_proc_write(struct file *file, char __user *ubuff, + size_t count, loff_t *pos) +{ + return page_idle_proc_generic(file, ubuff, count, pos, 1); +} + static int __init page_idle_init(void) { int err;

From patchwork Wed Aug 7 17:15:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joel Fernandes X-Patchwork-Id: 11082499 From: "Joel Fernandes (Google)" To: linux-kernel@vger.kernel.org Cc: "Joel Fernandes (Google)" , Minchan Kim , Alexey Dobriyan , Andrew Morton , Borislav Petkov , Brendan Gregg , Catalin Marinas , Christian Hansen , dancol@google.com, fmayer@google.com, "H.
Peter Anvin" , Ingo Molnar , joelaf@google.com, Jonathan Corbet , Kees Cook , kernel-team@android.com, linux-api@vger.kernel.org, linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Michal Hocko , Mike Rapoport , namhyung@google.com, paulmck@linux.ibm.com, Robin Murphy , Roman Gushchin , Stephen Rothwell , surenb@google.com, Thomas Gleixner , tkjos@google.com, Vladimir Davydov , Vlastimil Babka , Will Deacon Subject: [PATCH v5 2/6] mm/page_idle: Add support for handling swapped PG_Idle pages Date: Wed, 7 Aug 2019 13:15:55 -0400 Message-Id: <20190807171559.182301-2-joel@joelfernandes.org> In-Reply-To: <20190807171559.182301-1-joel@joelfernandes.org> References: <20190807171559.182301-1-joel@joelfernandes.org>

Idle page tracking currently does not work well in the following scenario:

1. Mark page A idle while it is present.
2. Run the workload.
3. The workload does not touch page A.
4. Sudden memory pressure causes page A to be swapped out.
5. Read back the state of page A: it appears to have been accessed (the pte is unmapped, so the idle bit is not set in the output), which is incorrect.

To fix this, we store the idle information in a new idle bit of the swap PTE when anonymous pages are swapped out.

In the future, madvise extensions will also allow a system process manager (like Android's ActivityManager) to swap out pages of a process that it knows will be cold. For an external process, such as a heap profiler doing idle tracking on another process, this would interfere with idle page tracking in the same way as the steps above.
Suggested-by: Minchan Kim Signed-off-by: Joel Fernandes (Google) --- arch/Kconfig | 3 +++ include/asm-generic/pgtable.h | 6 ++++++ mm/page_idle.c | 26 ++++++++++++++++++++++++-- mm/rmap.c | 2 ++ 4 files changed, 35 insertions(+), 2 deletions(-) diff --git a/arch/Kconfig b/arch/Kconfig index a7b57dd42c26..3aa121ce824e 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -575,6 +575,9 @@ config ARCH_WANT_HUGE_PMD_SHARE config HAVE_ARCH_SOFT_DIRTY bool +config HAVE_ARCH_PTE_SWP_PGIDLE + bool + config HAVE_MOD_ARCH_SPECIFIC bool help diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index 75d9d68a6de7..6d51d0a355a7 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -712,6 +712,12 @@ static inline void ptep_modify_prot_commit(struct vm_area_struct *vma, #define arch_start_context_switch(prev) do {} while (0) #endif +#ifndef CONFIG_HAVE_ARCH_PTE_SWP_PGIDLE +static inline pte_t pte_swp_mkpage_idle(pte_t pte) { return pte; } +static inline int pte_swp_page_idle(pte_t pte) { return 0; } +static inline pte_t pte_swp_clear_mkpage_idle(pte_t pte) { return pte; } +#endif + #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) diff --git a/mm/page_idle.c b/mm/page_idle.c index 9de4f4c67a8c..2766d4ab348c 100644 --- a/mm/page_idle.c +++ b/mm/page_idle.c @@ -276,7 +276,7 @@ struct page_idle_proc_priv { }; /* - * Add page to list to be set as idle later. + * Set a page as idle or add it to a list to be set as idle later. 
*/ static void pte_page_idle_proc_add(struct page *page, unsigned long addr, struct mm_walk *walk) @@ -303,6 +303,13 @@ static void pte_page_idle_proc_add(struct page *page, page_get = page_idle_get_page(page); if (!page_get) return; + } else { + /* For swapped pages, set output bit as idle */ + frames = (addr - priv->start_addr) >> PAGE_SHIFT; + bit = frames % BITMAP_CHUNK_BITS; + chunk = &chunk[frames / BITMAP_CHUNK_BITS]; + *chunk |= (1ULL << bit); + return; } /* @@ -323,6 +330,7 @@ static int pte_page_idle_proc_range(pmd_t *pmd, unsigned long addr, spinlock_t *ptl; struct page *page; struct vm_area_struct *vma = walk->vma; + struct page_idle_proc_priv *priv = walk->private; ptl = pmd_trans_huge_lock(pmd, vma); if (ptl) { @@ -341,6 +349,19 @@ static int pte_page_idle_proc_range(pmd_t *pmd, unsigned long addr, pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl); for (; addr != end; pte++, addr += PAGE_SIZE) { + /* For swap_pte handling, we use an idle bit in the swap pte. */ + if (is_swap_pte(*pte)) { + if (priv->write) { + set_pte_at(walk->mm, addr, pte, + pte_swp_mkpage_idle(*pte)); + } else { + /* If swap pte has idle bit set, report it as idle */ + if (pte_swp_page_idle(*pte)) + pte_page_idle_proc_add(NULL, addr, walk); + } + continue; + } + if (!pte_present(*pte)) continue; @@ -432,7 +453,8 @@ ssize_t page_idle_proc_generic(struct file *file, char __user *ubuff, set_page_idle(page); } } else { - if (page_idle_pte_check(page)) { + /* If page is NULL, it was swapped out */ + if (!page || page_idle_pte_check(page)) { off = ((cur->addr) >> PAGE_SHIFT) - start_frame; bit = off % BITMAP_CHUNK_BITS; index = off / BITMAP_CHUNK_BITS; diff --git a/mm/rmap.c b/mm/rmap.c index e5dfe2ae6b0d..4bd618aab402 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1629,6 +1629,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, swp_pte = swp_entry_to_pte(entry); if (pte_soft_dirty(pteval)) swp_pte = pte_swp_mksoft_dirty(swp_pte); + if (page_is_idle(page)) +
swp_pte = pte_swp_mkpage_idle(swp_pte); set_pte_at(mm, address, pvmw.pte, swp_pte); /* Invalidate as we cleared the pte */ mmu_notifier_invalidate_range(mm, address,

From patchwork Wed Aug 7 17:15:56 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joel Fernandes
X-Patchwork-Id: 11082501
From: "Joel Fernandes (Google)"
To: linux-kernel@vger.kernel.org
Cc: "Joel Fernandes (Google)" , Alexey Dobriyan , Andrew Morton , Borislav Petkov , Brendan Gregg , Catalin Marinas , Christian Hansen , dancol@google.com, fmayer@google.com, "H. Peter Anvin" , Ingo Molnar , joelaf@google.com, Jonathan Corbet , Kees Cook , kernel-team@android.com, linux-api@vger.kernel.org, linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Michal Hocko , Mike Rapoport , minchan@kernel.org, namhyung@google.com, paulmck@linux.ibm.com, Robin Murphy , Roman Gushchin , Stephen Rothwell , surenb@google.com, Thomas Gleixner , tkjos@google.com, Vladimir Davydov , Vlastimil Babka , Will Deacon
Subject: [PATCH v5 3/6] [RFC] x86: Add support for idle bit in swap PTE
Date: Wed, 7 Aug 2019 13:15:56 -0400
Message-Id: <20190807171559.182301-3-joel@joelfernandes.org>
X-Mailer: git-send-email 2.22.0.770.g0f2c4a37fd-goog
In-Reply-To: <20190807171559.182301-1-joel@joelfernandes.org>
References: <20190807171559.182301-1-joel@joelfernandes.org>

This bit will be used by the idle page tracking code to tell whether a page that was swapped out was idle before it got swapped out. Without this PTE bit, we lose that information, since the page frame gets unmapped and the page gets freed.

Bits 2-6 are unused in the swap PTE (see the comment in arch/x86/include/asm/pgtable_64.h). Bit 2 corresponds to _PAGE_USER, so reuse it as the idle bit in swap PTEs.
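As a rough illustration of what the three helpers in this patch do, here is a user-space sketch that models a swap PTE as a plain 64-bit value. The model_* names and the struct wrapper are made up for this example; the kernel versions operate on a real pte_t through pte_set_flags(), pte_flags() and pte_clear_flags().

```c
#include <stdint.h>

/* Hypothetical user-space stand-in for pte_t; the kernel type differs. */
typedef struct { uint64_t val; } model_pte_t;

/* Bit 2 is _PAGE_USER on x86; the patch reuses it as the swap-PTE idle bit. */
#define MODEL_SWP_PGIDLE (UINT64_C(1) << 2)

/* Return a copy of the swap PTE with the idle bit set. */
static inline model_pte_t model_swp_mkpage_idle(model_pte_t pte)
{
	pte.val |= MODEL_SWP_PGIDLE;
	return pte;
}

/* Return 1 if the idle bit is set in the swap PTE, 0 otherwise. */
static inline int model_swp_page_idle(model_pte_t pte)
{
	return (pte.val & MODEL_SWP_PGIDLE) != 0;
}

/* Return a copy of the swap PTE with the idle bit cleared. */
static inline model_pte_t model_swp_clear_page_idle(model_pte_t pte)
{
	pte.val &= ~MODEL_SWP_PGIDLE;
	return pte;
}
```

The helpers only touch the one flag bit, so the rest of the value (standing in for the swap type and offset) is left untouched.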
Signed-off-by: Joel Fernandes (Google) --- arch/x86/Kconfig | 1 + arch/x86/include/asm/pgtable.h | 15 +++++++++++++++ arch/x86/include/asm/pgtable_types.h | 6 ++++++ 3 files changed, 22 insertions(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 222855cc0158..728f22370f17 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -139,6 +139,7 @@ config X86 select HAVE_ARCH_MMAP_RND_COMPAT_BITS if MMU && COMPAT select HAVE_ARCH_COMPAT_MMAP_BASES if MMU && COMPAT select HAVE_ARCH_PREL32_RELOCATIONS + select HAVE_ARCH_PTE_SWP_PGIDLE select HAVE_ARCH_SECCOMP_FILTER select HAVE_ARCH_THREAD_STRUCT_WHITELIST select HAVE_ARCH_STACKLEAK diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 0bc530c4eb13..ef3e662cee4a 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1371,6 +1371,21 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) #endif #endif +static inline pte_t pte_swp_mkpage_idle(pte_t pte) +{ + return pte_set_flags(pte, _PAGE_SWP_PGIDLE); +} + +static inline int pte_swp_page_idle(pte_t pte) +{ + return pte_flags(pte) & _PAGE_SWP_PGIDLE; +} + +static inline pte_t pte_swp_clear_mkpage_idle(pte_t pte) +{ + return pte_clear_flags(pte, _PAGE_SWP_PGIDLE); +} + #define PKRU_AD_BIT 0x1 #define PKRU_WD_BIT 0x2 #define PKRU_BITS_PER_PKEY 2 diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index b5e49e6bac63..6739cba4c900 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -100,6 +100,12 @@ #define _PAGE_SWP_SOFT_DIRTY (_AT(pteval_t, 0)) #endif +#ifdef CONFIG_IDLE_PAGE_TRACKING +#define _PAGE_SWP_PGIDLE _PAGE_USER +#else +#define _PAGE_SWP_PGIDLE (_AT(pteval_t, 0)) +#endif + #if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE) #define _PAGE_NX (_AT(pteval_t, 1) << _PAGE_BIT_NX) #define _PAGE_DEVMAP (_AT(u64, 1) << _PAGE_BIT_DEVMAP) From patchwork Wed Aug 7 17:15:57 2019 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joel Fernandes
X-Patchwork-Id: 11082503
From: "Joel Fernandes (Google)"
To: linux-kernel@vger.kernel.org
Cc: "Joel Fernandes (Google)" , Robin Murphy , Alexey Dobriyan , Andrew Morton , Borislav Petkov , Brendan Gregg , Catalin Marinas , Christian Hansen , dancol@google.com, fmayer@google.com, "H. Peter Anvin" , Ingo Molnar , joelaf@google.com, Jonathan Corbet , Kees Cook , kernel-team@android.com, linux-api@vger.kernel.org, linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Michal Hocko , Mike Rapoport , minchan@kernel.org, namhyung@google.com, paulmck@linux.ibm.com, Roman Gushchin , Stephen Rothwell , surenb@google.com, Thomas Gleixner , tkjos@google.com, Vladimir Davydov , Vlastimil Babka , Will Deacon
Subject: [PATCH v5 4/6] [RFC] arm64: Add support for idle bit in swap PTE
Date: Wed, 7 Aug 2019 13:15:57 -0400
Message-Id: <20190807171559.182301-4-joel@joelfernandes.org>
X-Mailer: git-send-email 2.22.0.770.g0f2c4a37fd-goog
In-Reply-To: <20190807171559.182301-1-joel@joelfernandes.org>
References: <20190807171559.182301-1-joel@joelfernandes.org>

This bit will be used by the idle page tracking code to tell whether a page that was swapped out was idle before it got swapped out. Without this PTE bit, we lose that information, since the page frame gets unmapped.

This patch reuses the PTE_DEVMAP bit, since idle page tracking only works on user pages in the LRU. Device pages should not constitute those, so the bit should be unused and safe to use.
Cc: Robin Murphy Signed-off-by: Joel Fernandes (Google) --- arch/arm64/Kconfig | 1 + arch/arm64/include/asm/pgtable-prot.h | 1 + arch/arm64/include/asm/pgtable.h | 15 +++++++++++++++ 3 files changed, 17 insertions(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 3adcec05b1f6..9d1412c693d7 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -128,6 +128,7 @@ config ARM64 select HAVE_ARCH_MMAP_RND_BITS select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_ARCH_PREL32_RELOCATIONS + select HAVE_ARCH_PTE_SWP_PGIDLE select HAVE_ARCH_SECCOMP_FILTER select HAVE_ARCH_STACKLEAK select HAVE_ARCH_THREAD_STRUCT_WHITELIST diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h index 92d2e9f28f28..917b15c5d63a 100644 --- a/arch/arm64/include/asm/pgtable-prot.h +++ b/arch/arm64/include/asm/pgtable-prot.h @@ -18,6 +18,7 @@ #define PTE_SPECIAL (_AT(pteval_t, 1) << 56) #define PTE_DEVMAP (_AT(pteval_t, 1) << 57) #define PTE_PROT_NONE (_AT(pteval_t, 1) << 58) /* only when !PTE_VALID */ +#define PTE_SWP_PGIDLE PTE_DEVMAP /* for idle page tracking during swapout */ #ifndef __ASSEMBLY__ diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 3f5461f7b560..558f5ebd81ba 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -212,6 +212,21 @@ static inline pte_t pte_mkdevmap(pte_t pte) return set_pte_bit(pte, __pgprot(PTE_DEVMAP)); } +static inline int pte_swp_page_idle(pte_t pte) +{ + return !!(pte_val(pte) & PTE_SWP_PGIDLE); +} + +static inline pte_t pte_swp_mkpage_idle(pte_t pte) +{ + return set_pte_bit(pte, __pgprot(PTE_SWP_PGIDLE)); +} + +static inline pte_t pte_swp_clear_page_idle(pte_t pte) +{ + return clear_pte_bit(pte, __pgprot(PTE_SWP_PGIDLE)); +} + static inline void set_pte(pte_t *ptep, pte_t pte) { WRITE_ONCE(*ptep, pte); From patchwork Wed Aug 7 17:15:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joel 
Fernandes
X-Patchwork-Id: 11082509
From: "Joel Fernandes (Google)"
To: linux-kernel@vger.kernel.org
Cc: "Joel Fernandes (Google)" , Alexey Dobriyan , Andrew Morton , Borislav Petkov , Brendan Gregg , Catalin Marinas , Christian Hansen , dancol@google.com, fmayer@google.com, "H. Peter Anvin" , Ingo Molnar , joelaf@google.com, Jonathan Corbet , Kees Cook , kernel-team@android.com, linux-api@vger.kernel.org, linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Michal Hocko , Mike Rapoport , minchan@kernel.org, namhyung@google.com, paulmck@linux.ibm.com, Robin Murphy , Roman Gushchin , Stephen Rothwell , surenb@google.com, Thomas Gleixner , tkjos@google.com, Vladimir Davydov , Vlastimil Babka , Will Deacon
Subject: [PATCH v5 5/6] page_idle: Drain all LRU pagevec before idle tracking
Date: Wed, 7 Aug 2019 13:15:58 -0400
Message-Id: <20190807171559.182301-5-joel@joelfernandes.org>
X-Mailer: git-send-email 2.22.0.770.g0f2c4a37fd-goog
In-Reply-To: <20190807171559.182301-1-joel@joelfernandes.org>
References: <20190807171559.182301-1-joel@joelfernandes.org>

During idle page tracking, we sometimes see that faulted anon pages sit in a pagevec and have not been drained to the LRU. Idle page tracking only considers pages on the LRU, and I was able to find multiple issues caused by this.

One issue makes idle tracking look completely broken. In my testing, a page that is marked as idle always shows up as "accessed", because it was never actually marked as idle (the pagevec holding it had not been drained).

The other issue shows up as a failure during swapping (support for which this series adds), with the following sequence:
 1. Allocate some pages
 2. Write to them
 3. Mark them as idle <--- fails
 4. Introduce some memory pressure to induce swapping.
 5. Check the swap bit I introduced in this series. <--- fails to set idle bit in swap PTE.

To fix this, this patch drains every CPU's pagevec before starting idle tracking.
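For anyone who wants to reproduce steps 1-3 of the sequence above from user space, the index arithmetic can be sketched as follows. This is only a sketch under stated assumptions: it targets the existing global /sys/kernel/mm/page_idle/bitmap file (indexed by PFN) and /proc/<pid>/pagemap, using the layouts described in the kernel's idle page tracking and pagemap documentation, not the per-process interface added earlier in this series. All helper names are hypothetical.

```c
#include <stdint.h>

/* Each 8-byte word of the page_idle bitmap covers 64 consecutive PFNs. */
#define IDLE_CHUNK_BITS 64

/* Byte offset, within the bitmap file, of the word holding this PFN's bit. */
static inline uint64_t idle_bitmap_offset(uint64_t pfn)
{
	return pfn / IDLE_CHUNK_BITS * sizeof(uint64_t);
}

/* Word with only this PFN's bit set; note the 64-bit constant in the shift. */
static inline uint64_t idle_bitmap_mask(uint64_t pfn)
{
	return UINT64_C(1) << (pfn % IDLE_CHUNK_BITS);
}

/* pagemap entry: bit 63 = page present in RAM. */
static inline int pagemap_present(uint64_t entry)
{
	return (int)((entry >> 63) & 1);
}

/* pagemap entry: bit 62 = page is swapped out. */
static inline int pagemap_swapped(uint64_t entry)
{
	return (int)((entry >> 62) & 1);
}

/*
 * pagemap entry: bits 0-54 hold the PFN when the page is present.
 * The kernel reports the PFN as 0 to callers without CAP_SYS_ADMIN.
 */
static inline uint64_t pagemap_pfn(uint64_t entry)
{
	return entry & ((UINT64_C(1) << 55) - 1);
}
```

To mark a page idle, the word from idle_bitmap_mask() would be written with an 8-byte pwrite() at idle_bitmap_offset(); reading the same word back later and testing the same mask tells whether the page is still idle.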
Signed-off-by: Joel Fernandes (Google) --- mm/page_idle.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/mm/page_idle.c b/mm/page_idle.c index 2766d4ab348c..26440a497609 100644 --- a/mm/page_idle.c +++ b/mm/page_idle.c @@ -180,6 +180,13 @@ static ssize_t page_idle_bitmap_read(struct file *file, struct kobject *kobj, unsigned long pfn, end_pfn; int bit, ret; + /* + * Idle page tracking currently works only on LRU pages, so drain + * them. This can cause slowness, but in the future we could + * remove this operation if we are tracking non-LRU pages too. + */ + lru_add_drain_all(); + ret = page_idle_get_frames(pos, count, NULL, &pfn, &end_pfn); if (ret == -ENXIO) return 0; /* Reads beyond max_pfn do nothing */ @@ -211,6 +218,13 @@ static ssize_t page_idle_bitmap_write(struct file *file, struct kobject *kobj, unsigned long pfn, end_pfn; int bit, ret; + /* + * Idle page tracking currently works only on LRU pages, so drain + * them. This can cause slowness, but in the future we could + * remove this operation if we are tracking non-LRU pages too. + */ + lru_add_drain_all(); + ret = page_idle_get_frames(pos, count, NULL, &pfn, &end_pfn); if (ret) return ret; @@ -428,6 +442,13 @@ ssize_t page_idle_proc_generic(struct file *file, char __user *ubuff, walk.private = &priv; walk.mm = mm; + /* + * Idle page tracking currently works only on LRU pages, so drain + * them. This can cause slowness, but in the future we could + * remove this operation if we are tracking non-LRU pages too. 
+ */ + lru_add_drain_all(); + down_read(&mm->mmap_sem); /*

From patchwork Wed Aug 7 17:15:59 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joel Fernandes
X-Patchwork-Id: 11082511
From: "Joel Fernandes (Google)"
To: linux-kernel@vger.kernel.org
Cc: "Joel Fernandes (Google)" , Mike
Rapoport , Sandeep Patil , Alexey Dobriyan , Andrew Morton , Borislav Petkov , Brendan Gregg , Catalin Marinas , Christian Hansen , dancol@google.com, fmayer@google.com, "H. Peter Anvin" , Ingo Molnar , joelaf@google.com, Jonathan Corbet , Kees Cook , kernel-team@android.com, linux-api@vger.kernel.org, linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Michal Hocko , minchan@kernel.org, namhyung@google.com, paulmck@linux.ibm.com, Robin Murphy , Roman Gushchin , Stephen Rothwell , surenb@google.com, Thomas Gleixner , tkjos@google.com, Vladimir Davydov , Vlastimil Babka , Will Deacon Subject: [PATCH v5 6/6] doc: Update documentation for page_idle virtual address indexing Date: Wed, 7 Aug 2019 13:15:59 -0400 Message-Id: <20190807171559.182301-6-joel@joelfernandes.org> X-Mailer: git-send-email 2.22.0.770.g0f2c4a37fd-goog In-Reply-To: <20190807171559.182301-1-joel@joelfernandes.org> References: <20190807171559.182301-1-joel@joelfernandes.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP This patch updates the documentation with the new page_idle tracking feature which uses virtual address indexing. Reviewed-by: Mike Rapoport Reviewed-by: Sandeep Patil Signed-off-by: Joel Fernandes (Google) --- .../admin-guide/mm/idle_page_tracking.rst | 43 ++++++++++++++++--- 1 file changed, 36 insertions(+), 7 deletions(-) diff --git a/Documentation/admin-guide/mm/idle_page_tracking.rst b/Documentation/admin-guide/mm/idle_page_tracking.rst index df9394fb39c2..9eef32000f5e 100644 --- a/Documentation/admin-guide/mm/idle_page_tracking.rst +++ b/Documentation/admin-guide/mm/idle_page_tracking.rst @@ -19,10 +19,14 @@ It is enabled by CONFIG_IDLE_PAGE_TRACKING=y. User API ======== +There are 2 ways to access the idle page tracking API. 
One uses physical
+address indexing, another uses a simpler virtual address indexing scheme.

-The idle page tracking API is located at ``/sys/kernel/mm/page_idle``.
-Currently, it consists of the only read-write file,
-``/sys/kernel/mm/page_idle/bitmap``.
+Physical address indexing
+-------------------------
+The idle page tracking API for physical address indexing using page frame
+numbers (PFN) is located at ``/sys/kernel/mm/page_idle``. Currently, it
+consists of the only read-write file, ``/sys/kernel/mm/page_idle/bitmap``.

 The file implements a bitmap where each bit corresponds to a memory page. The
 bitmap is represented by an array of 8-byte integers, and the page at PFN #i is
@@ -74,6 +78,31 @@ See :ref:`Documentation/admin-guide/mm/pagemap.rst <pagemap>` for more
 information about ``/proc/pid/pagemap``, ``/proc/kpageflags``, and
 ``/proc/kpagecgroup``.

+Virtual address indexing
+------------------------
+The idle page tracking API for virtual address indexing using virtual frame
+numbers (VFN) for a process ``<pid>`` is located at ``/proc/<pid>/page_idle``.
+It is a bitmap that follows the same semantics as
+``/sys/kernel/mm/page_idle/bitmap`` except that it uses virtual instead of
+physical frame numbers.
+
+This idle page tracking API does not deal with PFN so it does not require prior
+lookups of ``pagemap``. This is an advantage on some systems where looking up
+PFN is considered a security issue. Also in some cases, this interface could
+be slightly more reliable to use than physical address indexing, since in
+physical address indexing, address space changes can occur between reading the
+``pagemap`` and reading the ``bitmap``, while in virtual address indexing, the
+process's ``mmap_sem`` is held for the duration of the access.
+
+To estimate the amount of pages that are not used by a workload one should:
+
+ 1. Mark all the workload's pages as idle by setting corresponding bits in
+    ``/proc/<pid>/page_idle``.
+
+ 2. Wait until the workload accesses its working set.
+
+ 3. Read ``/proc/<pid>/page_idle`` and count the number of bits set.
+
 .. _impl_details:

 Implementation Details
@@ -99,10 +128,10 @@ When a dirty page is written to swap or disk as a result of memory reclaim or
 exceeding the dirty memory limit, it is not marked referenced.

 The idle memory tracking feature adds a new page flag, the Idle flag. This flag
-is set manually, by writing to ``/sys/kernel/mm/page_idle/bitmap`` (see the
-:ref:`User API <user_api>`
-section), and cleared automatically whenever a page is referenced as defined
-above.
+is set manually, by writing to ``/sys/kernel/mm/page_idle/bitmap`` for physical
+addressing or by writing to ``/proc/<pid>/page_idle`` for virtual
+addressing (see the :ref:`User API <user_api>` section), and cleared
+automatically whenever a page is referenced as defined above.

 When a page is marked idle, the Accessed bit must be cleared in all PTEs it is
 mapped to, otherwise we will not be able to detect accesses to the page coming