From patchwork Sun Apr 12 15:04:49 2020
X-Patchwork-Submitter: Andrea Righi
X-Patchwork-Id: 11484599
Date: Sun, 12 Apr 2020 17:04:49 +0200
From: Andrea Righi
To: Andrew Morton
Cc: Huang Ying, Minchan Kim, Anchal Agarwal, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH] mm: swap: introduce fixed-size readahead policy
Message-ID: <20200412150449.GA740985@xps-13>

Introduce a new fixed-size swap-in readahead policy that can be
selected at run-time.

The global swap-in readahead policy takes into account the previous
access patterns, using a scaling heuristic to determine the optimal
readahead chunk size dynamically. This works pretty well in most cases,
but like any heuristic there are specific cases where this approach is
not ideal, for example the swapoff scenario: during swapoff we simply
want to load back into memory all the swapped-out pages, and for this
specific use case a fixed-size readahead is more efficient.

This patch introduces a new sysfs interface
(/sys/kernel/mm/swap/swap_ra_policy) that can be set as follows:

 - 0: current scaling swap-in readahead policy (default)
 - 1: fixed-size readahead policy (size is determined by
      vm.page-cluster)

The specific use case this patch addresses is improving swapoff
performance when a VM has been hibernated and resumed, and all memory
needs to be forced back to RAM by disabling swap (see the test case
below). But that is not the only case where a fixed-size readahead
shows its benefits: more generally, the fixed-size policy can be
beneficial whenever a large part of the swapped-out pages needs to be
loaded back into memory as fast as possible.
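For example, this is how the new knob can be driven at run-time (a
minimal sketch, assuming this patch is applied; the sysfs path and the
policy values are the ones introduced below, and the 64KB figure in the
comment assumes 4KB pages):

  # inspect the current policy (0 = scaling, 1 = fixed-size)
  cat /sys/kernel/mm/swap/swap_ra_policy

  # switch to the fixed-size policy; every swap-in then reads
  # 2^page-cluster pages, e.g. page-cluster=4 -> 16 pages (64KB)
  echo 1 > /sys/kernel/mm/swap/swap_ra_policy
  sysctl vm.page-cluster=4

  # ... run the swap-heavy workload (e.g. swapoff) ...

  # restore the default scaling policy afterwards
  echo 0 > /sys/kernel/mm/swap/swap_ra_policy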
Testing environment
===================

 - Host:
   CPU: 1.8GHz Intel Core i7-8565U (quad-core, 8MB cache)
   HDD: PC401 NVMe SK hynix 512GB
   MEM: 16GB

 - Guest (kvm):
   8GB of RAM
   virtio block driver
   16GB swap file on ext4 (/swapfile)

Test case
=========

 - allocate 85% of memory
 - `systemctl hibernate` to force all the pages to be swapped-out to
   the swap file
 - resume the system
 - measure the time that swapoff takes to complete:
   # /usr/bin/time swapoff /swapfile

(A scripted version of these steps is sketched after the patch.)

Result
======

                  default     fixed-size
                  readahead   readahead
                  ---------   ----------
 page-cluster=1    26.77s      21.25s
 page-cluster=2    28.29s      12.66s
 page-cluster=3    22.09s       8.77s
 page-cluster=4    21.50s       7.60s
 page-cluster=5    25.35s       7.75s
 page-cluster=6    23.19s       8.32s
 page-cluster=7    22.25s       9.40s
 page-cluster=8    22.09s       8.93s

The fixed-size readahead should not be the default: on a regular live
system the default scaling readahead policy simply works better. But
there are special cases, such as swapoff, where it is really useful to
be able to select this other policy (and possibly to add more policies
in the future).

Signed-off-by: Andrea Righi
---
 .../ABI/testing/sysfs-kernel-mm-swap | 13 +++++
 include/linux/mm.h                   |  7 +++
 mm/swap.c                            |  3 ++
 mm/swap_state.c                      | 49 ++++++++++++++++++-
 4 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-swap b/Documentation/ABI/testing/sysfs-kernel-mm-swap
index 94672016c268..c432f0edb20a 100644
--- a/Documentation/ABI/testing/sysfs-kernel-mm-swap
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-swap
@@ -14,3 +14,16 @@ Description:	Enable/disable VMA based swap readahead.
 		still used for tmpfs etc. other users. If set to false, the
 		global swap readahead algorithm will be used for all
 		swappable pages.
+
+What:		/sys/kernel/mm/swap/swap_ra_policy
+Date:		Apr 2020
+Contact:	Linux memory management mailing list
+Description:	Select the global swap readahead policy.
+
+		At the moment the following policies are available:
+		- 0 (scaling): default kernel heuristic that dynamically
+		  adjusts the swap-in readahead size based on previous hit
+		  ratio and access pattern
+
+		- 1 (fixed): swap-in readahead is constant and it is
+		  determined only by sysctl's vm.page-cluster
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5a323422d783..1cc1a8ff588a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -74,6 +74,13 @@ static inline void totalram_pages_add(long count)
 extern void * high_memory;
 extern int page_cluster;
 
+/* Supported swap-in readahead policies */
+enum {
+	SWAP_READAHEAD_SCALING = 0,
+	SWAP_READAHEAD_FIXED,
+};
+extern int swap_readahead_policy;
+
 #ifdef CONFIG_SYSCTL
 extern int sysctl_legacy_va_layout;
 #else
diff --git a/mm/swap.c b/mm/swap.c
index bf9a79fed62d..15e02923052d 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -44,6 +44,9 @@
 /* How many pages do we try to swap or page in/out together? */
 int page_cluster;
 
+/* Select page swap-in readahead policy */
+int swap_readahead_policy __read_mostly;
+
 static DEFINE_PER_CPU(struct pagevec, lru_add_pvec);
 static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
 static DEFINE_PER_CPU(struct pagevec, lru_deactivate_file_pvecs);
diff --git a/mm/swap_state.c b/mm/swap_state.c
index ebed37bbf7a3..cb6e80a4599a 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -498,7 +498,7 @@ static unsigned int __swapin_nr_pages(unsigned long prev_offset,
 	return pages;
 }
 
-static unsigned long swapin_nr_pages(unsigned long offset)
+static unsigned long swapin_nr_pages_scaling(unsigned long offset)
 {
 	static unsigned long prev_offset;
 	unsigned int hits, pages, max_pages;
@@ -518,6 +518,22 @@ static unsigned long swapin_nr_pages(unsigned long offset)
 	return pages;
 }
 
+static unsigned long swapin_nr_pages(unsigned long offset)
+{
+	unsigned long pages;
+
+	switch (swap_readahead_policy) {
+	case SWAP_READAHEAD_FIXED:
+		pages = 1 << READ_ONCE(page_cluster);
+		break;
+	default:
+		pages = swapin_nr_pages_scaling(offset);
+		break;
+	}
+
+	return pages;
+}
+
 /**
  * swap_cluster_readahead - swap in pages in hope we need them soon
  * @entry: swap entry of this memory
@@ -809,8 +825,39 @@ static struct kobj_attribute vma_ra_enabled_attr =
 	__ATTR(vma_ra_enabled, 0644, vma_ra_enabled_show,
 	       vma_ra_enabled_store);
 
+static ssize_t swap_ra_policy_show(struct kobject *kobj,
+				   struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", swap_readahead_policy);
+}
+
+static ssize_t swap_ra_policy_store(struct kobject *kobj,
+				    struct kobj_attribute *attr,
+				    const char *buf, size_t count)
+{
+	unsigned long val;
+
+	if (kstrtoul(buf, 10, &val))
+		return -EINVAL;
+
+	switch (val) {
+	case SWAP_READAHEAD_SCALING:
+	case SWAP_READAHEAD_FIXED:
+		swap_readahead_policy = val;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return count;
+}
+static struct kobj_attribute swap_ra_policy_attr =
+	__ATTR(swap_ra_policy, 0644, swap_ra_policy_show,
+	       swap_ra_policy_store);
+
 static struct attribute *swap_attrs[] = {
 	&vma_ra_enabled_attr.attr,
+	&swap_ra_policy_attr.attr,
 	NULL,
 };
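
As mentioned above, a rough scripted version of the test case (a sketch
only: the memory-hog step is left abstract, any tool that allocates
~85% of RAM works; /swapfile matches the guest setup described above,
and swap_ra_policy only exists with this patch applied):

  # 1) allocate ~85% of the guest memory (e.g. with a memory hog such
  #    as stress-ng), then hibernate so the pages are written to the
  #    swap file, and resume the system
  systemctl hibernate

  # 2) after resume, pick a policy and measure swapoff
  echo 1 > /sys/kernel/mm/swap/swap_ra_policy  # 1 = fixed, 0 = scaling
  sysctl vm.page-cluster=4                     # fixed chunk: 2^4 = 16 pages
  /usr/bin/time swapoff /swapfile

  # repeating the measurement with different settings requires pushing
  # the pages back to swap first (swapon + hibernate/resume again)
  swapon /swapfile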