From patchwork Thu Mar 2 23:27:58 2023
X-Patchwork-Submitter: Luis Chamberlain
X-Patchwork-Id: 13158046
From: Luis Chamberlain
To: hughd@google.com, akpm@linux-foundation.org, willy@infradead.org,
    brauner@kernel.org
Cc: linux-mm@kvack.org, p.raghav@samsung.com, da.gomez@samsung.com,
    a.manzanares@samsung.com, dave@stgolabs.net, yosryahmed@google.com,
    keescook@chromium.org, mcgrof@kernel.org, patches@lists.linux.dev,
    linux-kernel@vger.kernel.org
Subject: [PATCH 6/6] shmem: add support to ignore swap
Date: Thu, 2 Mar 2023 15:27:58 -0800
Message-Id: <20230302232758.888157-7-mcgrof@kernel.org>
In-Reply-To: <20230302232758.888157-1-mcgrof@kernel.org>
References: <20230302232758.888157-1-mcgrof@kernel.org>

When experimenting with shmem, having the option to avoid swap becomes
a useful mechanism. One of the *raves* about brd over shmem is that you
can avoid swap, but that's not really a good reason to use brd if we
can instead use shmem. Using brd has its own good reasons to exist, but
the fact that tmpfs doesn't let you opt out of swap is not a great
reason to avoid it when we can easily add support for that.

I don't add support for reconfiguring incompatible options, but if we
really wanted to we could add support for that later.

To avoid swap we use mapping_set_unevictable() upon inode creation, and
put a WARN_ON_ONCE() stop-gap in shmem_writepage() for reclaim.
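For illustration only (not part of this patch), below is a minimal
userspace sketch of how the new option can be requested through the
classic mount(2) interface; the mount point and size used here are
arbitrary examples:

/* Sketch: mount a tmpfs instance with the new "noswap" option.
 * Assumes a kernel carrying this patch; the target path and size
 * are hypothetical. Build with: cc -o noswap-mount noswap-mount.c
 * and run as root.
 */
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
        /* The option string is handled by the tmpfs fs_context parser;
         * "noswap" ends up setting ctx->noswap / SHMEM_SEEN_NOSWAP. */
        if (mount("tmpfs", "/mnt/noswap-tmp", "tmpfs", 0, "size=64m,noswap")) {
                perror("mount");
                return 1;
        }
        printf("mounted; see /proc/mounts for the noswap option\n");
        return 0;
}

The same thing can of course be done with:

  mount -t tmpfs -o noswap,size=64m tmpfs /mnt/noswap-tmp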
Acked-by: Christian Brauner
Signed-off-by: Luis Chamberlain
---
 Documentation/filesystems/tmpfs.rst  |  9 ++++++---
 Documentation/mm/unevictable-lru.rst |  2 ++
 include/linux/shmem_fs.h             |  1 +
 mm/shmem.c                           | 28 +++++++++++++++++++++++++++-
 4 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/Documentation/filesystems/tmpfs.rst b/Documentation/filesystems/tmpfs.rst
index e77ebdacadd0..551b621f34d9 100644
--- a/Documentation/filesystems/tmpfs.rst
+++ b/Documentation/filesystems/tmpfs.rst
@@ -13,7 +13,8 @@ everything stored therein is lost.
 
 tmpfs puts everything into the kernel internal caches and grows and
 shrinks to accommodate the files it contains and is able to swap
-unneeded pages out to swap space.
+unneeded pages out to swap space, if swap was enabled for the tmpfs
+filesystem.
 
 tmpfs extends ramfs with a few userspace configurable options listed and
 explained further below, some of which can be reconfigured dynamically on the
@@ -33,8 +34,8 @@ configured in size at initialization and you cannot dynamically resize them.
 
 Contrary to brd ramdisks, tmpfs has its own filesystem, it does not rely on the
 block layer at all.
 
-Since tmpfs lives completely in the page cache and on swap, all tmpfs
-pages will be shown as "Shmem" in /proc/meminfo and "Shared" in
+Since tmpfs lives completely in the page cache and optionally on swap,
+all tmpfs pages will be shown as "Shmem" in /proc/meminfo and "Shared" in
 free(1). Notice that these counters also include shared memory (shmem,
 see ipcs(1)). The most reliable way to get the count is using df(1) and du(1).
@@ -83,6 +84,8 @@ nr_inodes  The maximum number of inodes for this instance. The default
            is half of the number of your physical RAM pages, or (on a
            machine with highmem) the number of lowmem RAM pages,
            whichever is the lower.
+noswap     Disables swap. Remounts must respect the original settings.
+           By default swap is enabled.
 =========  ============================================================
 
 These parameters accept a suffix k, m or g for kilo, mega and giga and
diff --git a/Documentation/mm/unevictable-lru.rst b/Documentation/mm/unevictable-lru.rst
index 92ac5dca420c..3cdcbb6e00a0 100644
--- a/Documentation/mm/unevictable-lru.rst
+++ b/Documentation/mm/unevictable-lru.rst
@@ -42,6 +42,8 @@ The unevictable list addresses the following classes of unevictable pages:
 
  * Those owned by ramfs.
 
+ * Those owned by tmpfs with the noswap option.
+
  * Those mapped into SHM_LOCK'd shared memory regions.
 
  * Those mapped into VM_LOCKED [mlock()ed] VMAs.
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 103d1000a5a2..21989d4f8cbe 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -45,6 +45,7 @@ struct shmem_sb_info {
 	kuid_t uid;		    /* Mount uid for root directory */
 	kgid_t gid;		    /* Mount gid for root directory */
 	bool full_inums;	    /* If i_ino should be uint or ino_t */
+	bool noswap;		    /* ignores VM reclaim / swap requests */
 	ino_t next_ino;		    /* The next per-sb inode number to use */
 	ino_t __percpu *ino_batch;  /* The next per-cpu inode number to use */
 	struct mempolicy *mpol;	    /* default memory policy for mappings */
diff --git a/mm/shmem.c b/mm/shmem.c
index 6006dbb7dbcb..cd36cb3d974c 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -119,10 +119,12 @@ struct shmem_options {
 	bool full_inums;
 	int huge;
 	int seen;
+	bool noswap;
 #define SHMEM_SEEN_BLOCKS 1
 #define SHMEM_SEEN_INODES 2
 #define SHMEM_SEEN_HUGE 4
 #define SHMEM_SEEN_INUMS 8
+#define SHMEM_SEEN_NOSWAP 16
 };
 
 #ifdef CONFIG_TMPFS
@@ -1337,6 +1339,7 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
 	struct address_space *mapping = folio->mapping;
 	struct inode *inode = mapping->host;
 	struct shmem_inode_info *info = SHMEM_I(inode);
+	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
 	swp_entry_t swap;
 	pgoff_t index;
 
@@ -1352,7 +1355,7 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
 		goto redirty;
 	}
 
-	if (WARN_ON_ONCE(info->flags & VM_LOCKED))
+	if (WARN_ON_ONCE((info->flags & VM_LOCKED) || sbinfo->noswap))
 		goto redirty;
 
 	if (!total_swap_pages)
@@ -2489,6 +2492,8 @@ static struct inode *shmem_get_inode(struct mnt_idmap *idmap, struct super_block
 		shmem_set_inode_flags(inode, info->fsflags);
 		INIT_LIST_HEAD(&info->shrinklist);
 		INIT_LIST_HEAD(&info->swaplist);
+		if (sbinfo->noswap)
+			mapping_set_unevictable(inode->i_mapping);
 		simple_xattrs_init(&info->xattrs);
 		cache_no_acl(inode);
 		mapping_set_large_folios(inode->i_mapping);
@@ -3576,6 +3581,7 @@ enum shmem_param {
 	Opt_uid,
 	Opt_inode32,
 	Opt_inode64,
+	Opt_noswap,
 };
 
 static const struct constant_table shmem_param_enums_huge[] = {
@@ -3597,6 +3603,7 @@ const struct fs_parameter_spec shmem_fs_parameters[] = {
 	fsparam_u32   ("uid",		Opt_uid),
 	fsparam_flag  ("inode32",	Opt_inode32),
 	fsparam_flag  ("inode64",	Opt_inode64),
+	fsparam_flag  ("noswap",	Opt_noswap),
 	{}
 };
 
@@ -3680,6 +3687,10 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
 		ctx->full_inums = true;
 		ctx->seen |= SHMEM_SEEN_INUMS;
 		break;
+	case Opt_noswap:
+		ctx->noswap = true;
+		ctx->seen |= SHMEM_SEEN_NOSWAP;
+		break;
 	}
 
 	return 0;
@@ -3778,6 +3789,14 @@ static int shmem_reconfigure(struct fs_context *fc)
 			err = "Current inum too high to switch to 32-bit inums";
 			goto out;
 		}
+	if ((ctx->seen & SHMEM_SEEN_NOSWAP) && ctx->noswap && !sbinfo->noswap) {
+		err = "Cannot disable swap on remount";
+		goto out;
+	}
+	if (!(ctx->seen & SHMEM_SEEN_NOSWAP) && !ctx->noswap && sbinfo->noswap) {
+		err = "Cannot enable swap on remount if it was disabled on first mount";
+		goto out;
+	}
 
 	if (ctx->seen & SHMEM_SEEN_HUGE)
 		sbinfo->huge = ctx->huge;
@@ -3798,6 +3817,10 @@ static int shmem_reconfigure(struct fs_context *fc)
 		sbinfo->mpol = ctx->mpol;	/* transfers initial ref */
 		ctx->mpol = NULL;
 	}
+
+	if (ctx->noswap)
+		sbinfo->noswap = true;
+
 	raw_spin_unlock(&sbinfo->stat_lock);
 	mpol_put(mpol);
 	return 0;
@@ -3852,6 +3875,8 @@ static int shmem_show_options(struct seq_file *seq, struct dentry *root)
 		seq_printf(seq, ",huge=%s", shmem_format_huge(sbinfo->huge));
 #endif
 	shmem_show_mpol(seq, sbinfo->mpol);
+	if (sbinfo->noswap)
+		seq_printf(seq, ",noswap");
 	return 0;
 }
 
@@ -3895,6 +3920,7 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc)
 			ctx->inodes = shmem_default_max_inodes();
 		if (!(ctx->seen & SHMEM_SEEN_INUMS))
 			ctx->full_inums = IS_ENABLED(CONFIG_TMPFS_INODE64);
+		sbinfo->noswap = ctx->noswap;
 	} else {
 		sb->s_flags |= SB_NOUSER;
 	}
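As a usage note (again illustrative, not part of the patch), the
shmem_show_options() hunk above means a noswap mount advertises itself
in /proc/mounts, so userspace can verify the setting. A minimal sketch,
assuming the hypothetical mount point from the earlier example:

/* Sketch: check whether a tmpfs mount was created with "noswap" by
 * scanning /proc/mounts. The mount point below is hypothetical.
 */
#include <mntent.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
        const char *target = "/mnt/noswap-tmp";
        struct mntent *m;
        FILE *f = setmntent("/proc/mounts", "r");

        if (!f) {
                perror("setmntent");
                return 1;
        }
        while ((m = getmntent(f)) != NULL) {
                if (strcmp(m->mnt_dir, target) == 0) {
                        printf("%s: noswap %s\n", target,
                               hasmntopt(m, "noswap") ? "set" : "not set");
                        break;
                }
        }
        endmntent(f);
        return 0;
}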