From patchwork Mon Mar 4 08:13:43 2024
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13580148
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org, ryan.roberts@arm.com
Cc: chengming.zhou@linux.dev, chrisl@kernel.org, david@redhat.com,
 hannes@cmpxchg.org, kasong@tencent.com,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 mhocko@suse.com, nphamcs@gmail.com, shy828301@gmail.com,
 steven.price@arm.com, surenb@google.com, wangkefeng.wang@huawei.com,
 willy@infradead.org, xiang@kernel.org, ying.huang@intel.com,
 yosryahmed@google.com, yuzhao@google.com, Barry Song
Subject: [RFC PATCH v3 0/5] mm: support large folios swap-in
Date: Mon, 4 Mar 2024 21:13:43 +1300
Message-Id: <20240304081348.197341-1-21cnbao@gmail.com>

From: Barry Song

-v3:
 * avoid overwriting err in __swap_duplicate_nr, pointed out by Yosry,
   thanks!
 * fix the issue that a folio is charged twice in do_swap_page by
   separating alloc_anon_folio and alloc_swap_folio, as they now differ
   in many ways:
   * memcg charging
   * clearing the allocated folio or not

-v2:
 https://lore.kernel.org/linux-mm/20240229003753.134193-1-21cnbao@gmail.com/
 * lots of code cleanup according to Chris's comments, thanks!
 * collect Chris's ack tags, thanks!
 * address David's comment on moving to use folio_add_new_anon_rmap for
   !folio_test_anon in do_swap_page, thanks!
 * remove the MADV_PAGEOUT patch from this series, as Ryan will
   integrate it into the swap-out series
 * apply Kairui's work "mm/swap: fix race when skipping swapcache" to
   large folios swap-in as well
 * fix corrupted data (zero-filled data) in two races, one with zswap
   and one where some entries are in the swapcache while others are
   not, by checking SWAP_HAS_CACHE while swapping in a large folio

-v1:
 https://lore.kernel.org/all/20240118111036.72641-1-21cnbao@gmail.com/#t

On an embedded system like Android, more than half of anon memory is
actually in swap devices such as zRAM. For example, when an app is
switched to the background, most of its memory might be swapped out.

Now that we have mTHP features, if we don't support large folios
swap-in, then once those large folios are swapped out we immediately
lose the performance gain we can get through large folios and hardware
optimizations such as CONT-PTE.

In theory, we don't need to rely on Ryan's swap-out patchset[1]. That is
to say, even if some memory consisted of normal pages before swap-out,
we could still swap it in as large folios. But this might require I/O to
happen at random places in swap devices. So we limit large folios
swap-in to those areas which were large folios before swap-out, i.e.
areas whose swap entries are also contiguous in the swap device.

On the other hand, in OPPO's product, we've deployed anon large folios
on millions of phones[2]. We enhanced zsmalloc and zRAM to compress and
decompress large folios as a whole, which helps improve the compression
ratio and significantly decreases CPU consumption. In zsmalloc and zRAM
we can save large objects whose original size is, for example, 64KiB
(related patches are coming). So it is also a good choice for us to
support swapping in large folios for those compressed large objects, as
a large folio can be decompressed all together.

Note I am moving my previous "arm64: mm: swap: support THP_SWAP on
hardware with MTE" to this series as it might help review.
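To make the contiguity restriction above concrete, here is a minimal
userspace sketch. It is not code from this series. It assumes a
hypothetical helper, swap_range_is_contiguous(), which receives the swap
offsets recorded in nr consecutive PTEs and accepts the range only when
the offsets are consecutive and the first one is aligned to nr. The
alignment requirement and all names are assumptions made for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Illustrative sketch only; hypothetical helper, not a kernel API.
 * swap_offsets[i] stands for the swap slot offset stored in the i-th
 * PTE of a candidate range of nr PTEs.
 */
static bool swap_range_is_contiguous(const unsigned long *swap_offsets,
				     size_t nr)
{
	size_t i;

	/* Assumption: a large folio covers a power-of-two number of pages. */
	if (nr == 0 || (nr & (nr - 1)) != 0)
		return false;

	/* Assumption: the first slot must be aligned to the folio size. */
	if (swap_offsets[0] % nr != 0)
		return false;

	/* Every following slot must directly follow the previous one. */
	for (i = 1; i < nr; i++) {
		if (swap_offsets[i] != swap_offsets[0] + i)
			return false;
	}

	return true;
}
```

A range that fails this check would simply fall back to swapping in
page by page, which avoids the scattered I/O mentioned above.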

[1] [PATCH v3 0/4] Swap-out small-sized THP without splitting
    https://lore.kernel.org/linux-mm/20231025144546.577640-1-ryan.roberts@arm.com/
[2] OnePlusOSS / android_kernel_oneplus_sm8550
    https://github.com/OnePlusOSS/android_kernel_oneplus_sm8550/tree/oneplus/sm8550_u_14.0.0_oneplus11

Barry Song (2):
  arm64: mm: swap: support THP_SWAP on hardware with MTE
  mm: swap: introduce swapcache_prepare_nr and swapcache_clear_nr for
    large folios swap-in

Chuanhua Han (3):
  mm: swap: introduce swap_nr_free() for batched swap_free()
  mm: swap: make should_try_to_free_swap() support large-folio
  mm: support large folios swapin as a whole

 arch/arm64/include/asm/pgtable.h |  19 +--
 arch/arm64/mm/mteswap.c          |  43 ++++++
 include/linux/huge_mm.h          |  12 --
 include/linux/pgtable.h          |   2 +-
 include/linux/swap.h             |   7 +
 mm/memory.c                      | 252 ++++++++++++++++++++++++++-----
 mm/page_io.c                     |   2 +-
 mm/swap.h                        |   1 +
 mm/swap_slots.c                  |   2 +-
 mm/swapfile.c                    | 153 +++++++++++++------
 10 files changed, 376 insertions(+), 117 deletions(-)