From patchwork Fri Dec 30 21:52:51 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13084552
Date: Fri, 30 Dec 2022 14:52:51 -0700
Message-Id: <20221230215252.2628425-1-yuzhao@google.com>
X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog
Subject: [PATCH mm-unstable v2 1/2] mm: add vma_has_recency()
From: Yu Zhao
To: Andrew Morton
Cc: Alexander Viro, Andrea Righi, Johannes Weiner, Michael Larabel,
 linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@google.com, Yu Zhao

This patch adds vma_has_recency() to indicate whether a VMA may
exhibit temporal locality that the LRU algorithm relies on.

This function returns false for VMAs marked by VM_SEQ_READ or
VM_RAND_READ. While the former flag indicates linear access, i.e., a
special case of spatial locality, both flags indicate a lack of
temporal locality, i.e., the reuse of an area within a relatively
small duration.

"Recency" is chosen over "locality" to avoid confusion between
temporal and spatial localities.

Before this patch, the active/inactive LRU only ignored the accessed
bit from VMAs marked by VM_SEQ_READ. After this patch, the
active/inactive LRU and MGLRU share the same logic: they both ignore
the accessed bit if vma_has_recency() returns false.

For the active/inactive LRU, the following fio test showed a [6, 8]%
increase in IOPS when randomly accessing mapped files under memory
pressure.

  kb=$(awk '/MemTotal/ { print $2 }' /proc/meminfo)
  kb=$((kb - 8*1024*1024))

  modprobe brd rd_nr=1 rd_size=$kb
  dd if=/dev/zero of=/dev/ram0 bs=1M
  mkfs.ext4 /dev/ram0

  mount /dev/ram0 /mnt/
  swapoff -a

  fio --name=test --directory=/mnt/ --ioengine=mmap --numjobs=8 \
      --size=8G --rw=randrw --time_based --runtime=10m \
      --group_reporting

The discussion that led to this patch is here [1]. Additional test
results are available in that thread.

[1] https://lore.kernel.org/r/Y31s%2FK8T85jh05wH@google.com/

Signed-off-by: Yu Zhao
---
 include/linux/mm_inline.h |  8 ++++++++
 mm/memory.c               |  7 +++----
 mm/rmap.c                 | 42 +++++++++++++++++----------------------
 mm/vmscan.c               |  5 ++++-
 4 files changed, 33 insertions(+), 29 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index d1c1f211a86f..fe5b8449e14a 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -595,4 +595,12 @@ pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
 #endif
 }
 
+static inline bool vma_has_recency(struct vm_area_struct *vma)
+{
+	if (vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ))
+		return false;
+
+	return true;
+}
+
 #endif
diff --git a/mm/memory.c b/mm/memory.c
index 4000e9f017e0..ee72badad847 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1402,8 +1402,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 						force_flush = 1;
 					}
 				}
-				if (pte_young(ptent) &&
-				    likely(!(vma->vm_flags & VM_SEQ_READ)))
+				if (pte_young(ptent) && likely(vma_has_recency(vma)))
 					mark_page_accessed(page);
 			}
 			rss[mm_counter(page)]--;
@@ -5148,8 +5147,8 @@ static inline void mm_account_fault(struct pt_regs *regs,
 #ifdef CONFIG_LRU_GEN
 static void lru_gen_enter_fault(struct vm_area_struct *vma)
 {
-	/* the LRU algorithm doesn't apply to sequential or random reads */
-	current->in_lru_fault = !(vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ));
+	/* the LRU algorithm only applies to accesses with recency */
+	current->in_lru_fault = vma_has_recency(vma);
 }
 
 static void lru_gen_exit_fault(void)
diff --git a/mm/rmap.c b/mm/rmap.c
index 8a24b90d9531..9abffdd63a6a 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -823,25 +823,14 @@ static bool folio_referenced_one(struct folio *folio,
 		}
 
 		if (pvmw.pte) {
-			if (lru_gen_enabled() && pte_young(*pvmw.pte) &&
-			    !(vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ))) {
+			if (lru_gen_enabled() && pte_young(*pvmw.pte)) {
 				lru_gen_look_around(&pvmw);
 				referenced++;
 			}
 
 			if (ptep_clear_flush_young_notify(vma, address,
-						pvmw.pte)) {
-				/*
-				 * Don't treat a reference through
-				 * a sequentially read mapping as such.
-				 * If the folio has been used in another mapping,
-				 * we will catch it; if this other mapping is
-				 * already gone, the unmap path will have set
-				 * the referenced flag or activated the folio.
-				 */
-				if (likely(!(vma->vm_flags & VM_SEQ_READ)))
-					referenced++;
-			}
+						pvmw.pte))
+				referenced++;
 		} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
 			if (pmdp_clear_flush_young_notify(vma, address,
 						pvmw.pmd))
@@ -875,7 +864,20 @@ static bool invalid_folio_referenced_vma(struct vm_area_struct *vma, void *arg)
 	struct folio_referenced_arg *pra = arg;
 	struct mem_cgroup *memcg = pra->memcg;
 
-	if (!mm_match_cgroup(vma->vm_mm, memcg))
+	/*
+	 * Ignore references from this mapping if it has no recency. If the
+	 * folio has been used in another mapping, we will catch it; if this
+	 * other mapping is already gone, the unmap path will have set the
+	 * referenced flag or activated the folio in zap_pte_range().
+	 */
+	if (!vma_has_recency(vma))
+		return true;
+
+	/*
+	 * If we are reclaiming on behalf of a cgroup, skip counting on behalf
+	 * of references from different cgroups.
+	 */
+	if (memcg && !mm_match_cgroup(vma->vm_mm, memcg))
 		return true;
 
 	return false;
@@ -906,6 +908,7 @@ int folio_referenced(struct folio *folio, int is_locked,
 		.arg = (void *)&pra,
 		.anon_lock = folio_lock_anon_vma_read,
 		.try_lock = true,
+		.invalid_vma = invalid_folio_referenced_vma,
 	};
 
 	*vm_flags = 0;
@@ -921,15 +924,6 @@ int folio_referenced(struct folio *folio, int is_locked,
 			return 1;
 	}
 
-	/*
-	 * If we are reclaiming on behalf of a cgroup, skip
-	 * counting on behalf of references from different
-	 * cgroups
-	 */
-	if (memcg) {
-		rwc.invalid_vma = invalid_folio_referenced_vma;
-	}
-
 	rmap_walk(folio, &rwc);
 	*vm_flags = pra.vm_flags;
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6929402db149..cdf96aec39dc 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3794,7 +3794,10 @@ static int should_skip_vma(unsigned long start, unsigned long end, struct mm_wal
 	if (is_vm_hugetlb_page(vma))
 		return true;
 
-	if (vma->vm_flags & (VM_LOCKED | VM_SPECIAL | VM_SEQ_READ | VM_RAND_READ))
+	if (!vma_has_recency(vma))
+		return true;
+
+	if (vma->vm_flags & (VM_LOCKED | VM_SPECIAL))
 		return true;
 
 	if (vma == get_gate_vma(vma->vm_mm))

From patchwork Fri Dec 30 21:52:52 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13084553
Date: Fri, 30 Dec 2022 14:52:52 -0700
In-Reply-To: <20221230215252.2628425-1-yuzhao@google.com>
Message-Id: <20221230215252.2628425-2-yuzhao@google.com>
References: <20221230215252.2628425-1-yuzhao@google.com>
X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog
Subject: [PATCH mm-unstable v2 2/2] mm: support POSIX_FADV_NOREUSE
From: Yu Zhao
To: Andrew Morton
Cc: Alexander Viro, Andrea Righi, Johannes Weiner, Michael Larabel,
 linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@google.com, Yu Zhao

This patch adds POSIX_FADV_NOREUSE to vma_has_recency() so that the
LRU algorithm can ignore access to mapped files marked by this flag.

The advantages of POSIX_FADV_NOREUSE are:
1. Unlike MADV_SEQUENTIAL and MADV_RANDOM, it does not alter the
   default readahead behavior.
2. Unlike MADV_SEQUENTIAL and MADV_RANDOM, it does not split VMAs and
   therefore does not take mmap_lock.
3. Unlike MADV_COLD, setting it has a negligible cost, regardless of
   how many pages it affects.

Its limitations are:
1. Like POSIX_FADV_RANDOM and POSIX_FADV_SEQUENTIAL, it currently does
   not support range. IOW, its scope is the entire file.
2. It currently does not ignore access through file descriptors.
   Specifically, for the active/inactive LRU, given a file page shared
   by two users and one of them having set POSIX_FADV_NOREUSE on the
   file, this page will be activated upon the second user accessing
   it. This corner case can be covered by checking POSIX_FADV_NOREUSE
   before calling folio_mark_accessed() on the read path. But it is
   considered not worth the effort.

There have been a few attempts to support POSIX_FADV_NOREUSE, e.g.,
[1]. This time the goal is to fill a niche: a few desktop
applications, e.g., large file transferring and video
encoding/decoding, want fast file streaming with mmap() rather than
direct IO. Among those applications, an SVT-AV1 regression was
reported when running with MGLRU [2]. The following test can reproduce
that regression.

  kb=$(awk '/MemTotal/ { print $2 }' /proc/meminfo)
  kb=$((kb - 8*1024*1024))

  modprobe brd rd_nr=1 rd_size=$kb
  dd if=/dev/zero of=/dev/ram0 bs=1M
  mkfs.ext4 /dev/ram0

  mount /dev/ram0 /mnt/
  swapoff -a

  fallocate -l 8G /mnt/swapfile
  mkswap /mnt/swapfile
  swapon /mnt/swapfile

  wget http://ultravideo.cs.tut.fi/video/Bosphorus_3840x2160_120fps_420_8bit_YUV_Y4M.7z
  7z e -o/mnt/ Bosphorus_3840x2160_120fps_420_8bit_YUV_Y4M.7z
  SvtAv1EncApp --preset 12 -w 3840 -h 2160 \
               -i /mnt/Bosphorus_3840x2160.y4m

For MGLRU, the following change showed a [9-11]% increase in FPS,
which makes it on par with the active/inactive LRU.

  patch Source/App/EncApp/EbAppMain.c <<EOF
  31a32
  > #include <fcntl.h>
  35d35
  < #include <fcntl.h> /* _O_BINARY */
  117a118
  > posix_fadvise(config->mmap.fd, 0, 0, POSIX_FADV_NOREUSE);
  EOF

[1] https://lore.kernel.org/r/1308923350-7932-1-git-send-email-andrea@betterlinux.com/
[2] https://openbenchmarking.org/result/2209259-PTS-MGLRU8GB57

Signed-off-by: Yu Zhao
---
 include/linux/fs.h        | 2 ++
 include/linux/mm_inline.h | 3 +++
 mm/fadvise.c              | 5 ++++-
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 066555ad1bf8..5660ed0edf1a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -166,6 +166,8 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
 /* File supports DIRECT IO */
 #define	FMODE_CAN_ODIRECT	((__force fmode_t)0x400000)
 
+#define	FMODE_NOREUSE		((__force fmode_t)0x800000)
+
 /* File was opened by fanotify and shouldn't generate fanotify events */
 #define FMODE_NONOTIFY		((__force fmode_t)0x4000000)
 
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index fe5b8449e14a..064f92c78bfa 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -600,6 +600,9 @@ static inline bool vma_has_recency(struct vm_area_struct *vma)
 	if (vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ))
 		return false;
 
+	if (vma->vm_file && (vma->vm_file->f_mode & FMODE_NOREUSE))
+		return false;
+
 	return true;
 }
 
diff --git a/mm/fadvise.c b/mm/fadvise.c
index bf04fec87f35..fb7c5f43fd2a 100644
--- a/mm/fadvise.c
+++ b/mm/fadvise.c
@@ -80,7 +80,7 @@ int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
 	case POSIX_FADV_NORMAL:
 		file->f_ra.ra_pages = bdi->ra_pages;
 		spin_lock(&file->f_lock);
-		file->f_mode &= ~FMODE_RANDOM;
+		file->f_mode &= ~(FMODE_RANDOM | FMODE_NOREUSE);
 		spin_unlock(&file->f_lock);
 		break;
 	case POSIX_FADV_RANDOM:
@@ -107,6 +107,9 @@ int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
 		force_page_cache_readahead(mapping, file, start_index, nrpages);
 		break;
 	case POSIX_FADV_NOREUSE:
+		spin_lock(&file->f_lock);
+		file->f_mode |= FMODE_NOREUSE;
+		spin_unlock(&file->f_lock);
 		break;
 	case POSIX_FADV_DONTNEED:
 		__filemap_fdatawrite_range(mapping, offset, endbyte,
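
For reference, a minimal userspace sketch of the usage pattern that the
SvtAv1EncApp change in the commit message relies on: advise
POSIX_FADV_NOREUSE on a descriptor, then stream the file through mmap().
The function name stream_once() and the byte-sum workload are
illustrative assumptions, not part of this series. Since the check is
on vma->vm_file->f_mode, the advice presumably has to be given on the
same open file that backs the mapping; offset and len are passed as 0
because the flag applies to the entire file.

```c
#define _DEFAULT_SOURCE 1
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: read every byte of `path` exactly once through a
 * shared file mapping and return the sum of the bytes, or -1 on error.
 */
int64_t stream_once(const char *path)
{
	assert(path != NULL);

	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	struct stat st;
	if (fstat(fd, &st) != 0 || st.st_size <= 0) {
		close(fd);
		return -1;
	}

	/*
	 * Mark the open file as "no reuse" before mapping it.
	 * posix_fadvise() returns an error number directly (not -1/errno).
	 */
	if (posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE) != 0) {
		close(fd);
		return -1;
	}

	size_t len = (size_t)st.st_size;
	const unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* Single streaming pass: each page is touched once and not revisited. */
	int64_t sum = 0;
	for (size_t i = 0; i < len; i++)
		sum += p[i];

	munmap((void *)p, len);
	close(fd);
	return sum;
}
```

A caller would invoke stream_once("/mnt/input.y4m") (path hypothetical)
once per file. On kernels without this patch the posix_fadvise() call
still succeeds, because POSIX_FADV_NOREUSE was already accepted there
as a no-op.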