From patchwork Mon Mar 4 08:13:47 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13580152
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org, ryan.roberts@arm.com
Cc: chengming.zhou@linux.dev, chrisl@kernel.org, david@redhat.com,
 hannes@cmpxchg.org, kasong@tencent.com,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 mhocko@suse.com, nphamcs@gmail.com, shy828301@gmail.com,
 steven.price@arm.com, surenb@google.com, wangkefeng.wang@huawei.com,
 willy@infradead.org, xiang@kernel.org, ying.huang@intel.com,
 yosryahmed@google.com, yuzhao@google.com, Barry Song, Hugh Dickins,
 Minchan Kim, SeongJae Park
Subject: [RFC PATCH v3 4/5] mm: swap: introduce swapcache_prepare_nr and
 swapcache_clear_nr for large folios swap-in
Date: Mon, 4 Mar 2024 21:13:47 +1300
Message-Id: <20240304081348.197341-5-21cnbao@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240304081348.197341-1-21cnbao@gmail.com>
References: <20240304081348.197341-1-21cnbao@gmail.com>

From: Barry Song

Commit 13ddaf26be32 ("mm/swap: fix race when skipping swapcache") handles
only a single swap entry. To support large folio swap-in, we need to handle
multiple swap entries at once.

Introduce swapcache_prepare_nr() and swapcache_clear_nr(), which operate on
nr contiguous swap entries starting at the given entry. __swap_duplicate()
and swapcache_clear() become wrappers that pass nr = 1.
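
For illustration only, the all-or-nothing behaviour that a batched prepare
needs can be modelled in user space. This is a minimal sketch, not the kernel
code: model_prepare_nr(), model_clear_nr(), MODEL_HAS_CACHE and MODEL_MAX_NR
are hypothetical names made up for the example, the byte array stands in for
swap_map, and locking and swap count continuation are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical constants for this sketch; they only mimic the kernel's roles. */
#define MODEL_HAS_CACHE 0x40	/* plays the role of SWAP_HAS_CACHE */
#define MODEL_MAX_NR    512	/* upper bound on one batch */

/*
 * Claim the cache bit on nr contiguous entries, or on none of them.
 * Returns 0 on success, -EEXIST if any entry is already claimed,
 * -ENOENT if any entry has no users, -EINVAL for a bad nr.
 */
int model_prepare_nr(unsigned char *map, size_t off, int nr)
{
	int i;

	if (nr <= 0 || nr > MODEL_MAX_NR)
		return -EINVAL;

	/* Phase 1: validate every entry without modifying anything. */
	for (i = 0; i < nr; i++) {
		unsigned char v = map[off + i];

		if (v & MODEL_HAS_CACHE)
			return -EEXIST;	/* someone else added cache */
		if (!v)
			return -ENOENT;	/* no users remaining */
	}

	/* Phase 2: all entries passed, so commit the whole range. */
	for (i = 0; i < nr; i++)
		map[off + i] |= MODEL_HAS_CACHE;

	return 0;
}

/* Drop the cache bit on a range previously claimed by model_prepare_nr(). */
void model_clear_nr(unsigned char *map, size_t off, int nr)
{
	int i;

	assert(nr > 0 && nr <= MODEL_MAX_NR);

	/* The index advances on every iteration, one step per entry. */
	for (i = 0; i < nr; i++)
		map[off + i] &= (unsigned char)~MODEL_HAS_CACHE;
}
```

The point of the two phases is that a failure on entry i leaves entries
0..i-1 untouched, so the caller never has to undo a partial claim.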

Cc: Kairui Song
Cc: "Huang, Ying"
Cc: David Hildenbrand
Cc: Chris Li
Cc: Hugh Dickins
Cc: Johannes Weiner
Cc: Matthew Wilcox (Oracle)
Cc: Michal Hocko
Cc: Minchan Kim
Cc: Yosry Ahmed
Cc: Yu Zhao
Cc: SeongJae Park
Signed-off-by: Barry Song
---
 include/linux/swap.h |   1 +
 mm/swap.h            |   1 +
 mm/swapfile.c        | 118 ++++++++++++++++++++++++++-----------------
 3 files changed, 74 insertions(+), 46 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index d6ab27929458..22105f0fe2d4 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -480,6 +480,7 @@ extern int add_swap_count_continuation(swp_entry_t, gfp_t);
 extern void swap_shmem_alloc(swp_entry_t);
 extern int swap_duplicate(swp_entry_t);
 extern int swapcache_prepare(swp_entry_t);
+extern int swapcache_prepare_nr(swp_entry_t entry, int nr);
 extern void swap_free(swp_entry_t);
 extern void swap_nr_free(swp_entry_t entry, int nr_pages);
 extern void swapcache_free_entries(swp_entry_t *entries, int n);
diff --git a/mm/swap.h b/mm/swap.h
index fc2f6ade7f80..1cec991efcda 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -42,6 +42,7 @@ void delete_from_swap_cache(struct folio *folio);
 void clear_shadow_from_swap_cache(int type, unsigned long begin,
 				  unsigned long end);
 void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry);
+void swapcache_clear_nr(struct swap_info_struct *si, swp_entry_t entry, int nr);
 struct folio *swap_cache_get_folio(swp_entry_t entry,
 		struct vm_area_struct *vma, unsigned long addr);
 struct folio *filemap_get_incore_folio(struct address_space *mapping,
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 244106998a69..bae1b8165b11 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3309,7 +3309,7 @@ void si_swapinfo(struct sysinfo *val)
 }
 
 /*
- * Verify that a swap entry is valid and increment its swap map count.
+ * Verify that nr swap entries are valid and increment their swap map count.
  *
  * Returns error code in following case.
  * - success -> 0
@@ -3319,66 +3319,76 @@ void si_swapinfo(struct sysinfo *val)
  * - swap-cache reference is requested but the entry is not used. -> ENOENT
  * - swap-mapped reference requested but needs continued swap count. -> ENOMEM
  */
-static int __swap_duplicate(swp_entry_t entry, unsigned char usage)
+static int __swap_duplicate_nr(swp_entry_t entry, int nr, unsigned char usage)
 {
 	struct swap_info_struct *p;
 	struct swap_cluster_info *ci;
 	unsigned long offset;
-	unsigned char count;
-	unsigned char has_cache;
-	int err;
+	unsigned char count[SWAPFILE_CLUSTER];
+	unsigned char has_cache[SWAPFILE_CLUSTER];
+	int err, i;
 
 	p = swp_swap_info(entry);
 
 	offset = swp_offset(entry);
 	ci = lock_cluster_or_swap_info(p, offset);
 
-	count = p->swap_map[offset];
-
-	/*
-	 * swapin_readahead() doesn't check if a swap entry is valid, so the
-	 * swap entry could be SWAP_MAP_BAD. Check here with lock held.
-	 */
-	if (unlikely(swap_count(count) == SWAP_MAP_BAD)) {
-		err = -ENOENT;
-		goto unlock_out;
-	}
-
-	has_cache = count & SWAP_HAS_CACHE;
-	count &= ~SWAP_HAS_CACHE;
-	err = 0;
-
-	if (usage == SWAP_HAS_CACHE) {
+	for (i = 0; i < nr; i++) {
+		count[i] = p->swap_map[offset + i];
 
-		/* set SWAP_HAS_CACHE if there is no cache and entry is used */
-		if (!has_cache && count)
-			has_cache = SWAP_HAS_CACHE;
-		else if (has_cache)		/* someone else added cache */
-			err = -EEXIST;
-		else				/* no users remaining */
+		/*
+		 * swapin_readahead() doesn't check if a swap entry is valid, so the
+		 * swap entry could be SWAP_MAP_BAD. Check here with lock held.
+		 */
+		if (unlikely(swap_count(count[i]) == SWAP_MAP_BAD)) {
 			err = -ENOENT;
+			goto unlock_out;
+		}
 
-	} else if (count || has_cache) {
-
-		if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX)
-			count += usage;
-		else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX)
-			err = -EINVAL;
-		else if (swap_count_continued(p, offset, count))
-			count = COUNT_CONTINUED;
-		else
-			err = -ENOMEM;
-	} else
-		err = -ENOENT;			/* unused swap entry */
+		has_cache[i] = count[i] & SWAP_HAS_CACHE;
+		count[i] &= ~SWAP_HAS_CACHE;
+		err = 0;
+
+		if (usage == SWAP_HAS_CACHE) {
+
+			/* set SWAP_HAS_CACHE if there is no cache and entry is used */
+			if (!has_cache[i] && count[i])
+				has_cache[i] = SWAP_HAS_CACHE;
+			else if (has_cache[i])		/* someone else added cache */
+				err = -EEXIST;
+			else				/* no users remaining */
+				err = -ENOENT;
+		} else if (count[i] || has_cache[i]) {
+
+			if ((count[i] & ~COUNT_CONTINUED) < SWAP_MAP_MAX)
+				count[i] += usage;
+			else if ((count[i] & ~COUNT_CONTINUED) > SWAP_MAP_MAX)
+				err = -EINVAL;
+			else if (swap_count_continued(p, offset + i, count[i]))
+				count[i] = COUNT_CONTINUED;
+			else
+				err = -ENOMEM;
+		} else
+			err = -ENOENT;			/* unused swap entry */
 
-	if (!err)
-		WRITE_ONCE(p->swap_map[offset], count | has_cache);
+		if (err)
+			break;
+	}
 
+	if (!err) {
+		for (i = 0; i < nr; i++)
+			WRITE_ONCE(p->swap_map[offset + i], count[i] | has_cache[i]);
+	}
 unlock_out:
 	unlock_cluster_or_swap_info(p, ci);
 	return err;
 }
 
+static int __swap_duplicate(swp_entry_t entry, unsigned char usage)
+{
+	return __swap_duplicate_nr(entry, 1, usage);
+}
+
 /*
  * Help swapoff by noting that swap entry belongs to shmem/tmpfs
  * (in which case its reference count is never incremented).
@@ -3417,17 +3427,33 @@ int swapcache_prepare(swp_entry_t entry)
 	return __swap_duplicate(entry, SWAP_HAS_CACHE);
 }
 
-void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+int swapcache_prepare_nr(swp_entry_t entry, int nr)
+{
+	return __swap_duplicate_nr(entry, nr, SWAP_HAS_CACHE);
+}
+
+void swapcache_clear_nr(struct swap_info_struct *si, swp_entry_t entry, int nr)
 {
 	struct swap_cluster_info *ci;
 	unsigned long offset = swp_offset(entry);
-	unsigned char usage;
+	unsigned char usage[SWAPFILE_CLUSTER];
+	int i;
 
 	ci = lock_cluster_or_swap_info(si, offset);
-	usage = __swap_entry_free_locked(si, offset, SWAP_HAS_CACHE);
+	for (i = 0; i < nr; i++)
+		usage[i] = __swap_entry_free_locked(si, offset + i, SWAP_HAS_CACHE);
 	unlock_cluster_or_swap_info(si, ci);
-	if (!usage)
-		free_swap_slot(entry);
+	for (i = 0; i < nr; i++) {
+		if (!usage[i])
+			free_swap_slot(entry);
+		/* advance to the next entry even if this slot was not freed */
+		entry.val++;
+	}
+}
+
+void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+{
+	swapcache_clear_nr(si, entry, 1);
 }
 
 struct swap_info_struct *swp_swap_info(swp_entry_t entry)