From patchwork Thu Feb 29 00:37:48 2024
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13576144
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org,
    ryan.roberts@arm.com, chrisl@kernel.org
Cc: 21cnbao@gmail.com, linux-kernel@vger.kernel.org, mhocko@suse.com,
    shy828301@gmail.com, steven.price@arm.com, surenb@google.com,
    wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org,
    ying.huang@intel.com, yuzhao@google.com, kasong@tencent.com,
    yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev,
    hannes@cmpxchg.org, linux-arm-kernel@lists.infradead.org, Barry Song
Subject: [PATCH RFC v2 0/5] mm: support large folios swap-in
Date: Thu, 29 Feb 2024 13:37:48 +1300
Message-Id: <20240229003753.134193-1-21cnbao@gmail.com>

From: Barry Song

-v2:
 * lots of code cleanup according to Chris's comments, thanks!
 * collect Chris's ack tags, thanks!
 * address David's comment on moving to use folio_add_new_anon_rmap
   for !folio_test_anon in do_swap_page, thanks!
 * remove the MADV_PAGEOUT patch from this series as Ryan will integrate
   it into the swap-out series
 * apply Kairui's work of "mm/swap: fix race when skipping swapcache" to
   large folios swap-in as well
 * fix corrupted data (zero-filled data) in two races, one with zswap and
   one where some entries are in the swapcache while others are not, by
   checking SWAP_HAS_CACHE while swapping in a large folio

-v1:
 https://lore.kernel.org/all/20240118111036.72641-1-21cnbao@gmail.com/#t

On an embedded system like Android, more than half of anon memory is
actually in swap devices such as zRAM. For example, when an app is
switched to the background, most of its memory might be swapped out.

Now that we have the mTHP feature, if we don't support large folios
swap-in, then once those large folios are swapped out we immediately
lose the performance gain we can get through large folios and hardware
optimizations such as CONT-PTE.

In theory, we don't need to rely on Ryan's swap-out patchset[1]. That is
to say, even if some memory consisted of normal pages before swap-out,
we could still swap it in as large folios. But this might require I/O
to happen at random places in the swap device. So we limit large folios
swap-in to those areas which were large folios before swap-out, i.e.
whose swap slots are also contiguous in the swap device.

On the other hand, in OPPO's products, we've deployed anon large folios
on millions of phones[2]. We enhanced zsmalloc and zRAM to compress and
decompress large folios as a whole, which helps improve the compression
ratio and decrease CPU consumption significantly. In zsmalloc and zRAM
we can save large objects whose original size is 64KiB, for example
(related patches are coming). So it is also a good choice for us to
support swapping in large folios for those compressed large objects, as
a large folio can be decompressed all together.

Note I am moving my previous "arm64: mm: swap: support THP_SWAP on
hardware with MTE" to this series as it might help review.
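
To illustrate the restriction above (swap in as a large folio only when
the swap slots are contiguous, and fall back when any slot already has
SWAP_HAS_CACHE set), here is a minimal userspace sketch. It is not the
code from this series: can_swapin_as_large_folio(), the ptes[] array and
the flat swap_map[] model are hypothetical names for illustration only,
and the natural-alignment check is an assumption. Only the SWAP_HAS_CACHE
value mirrors include/linux/swap.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same value as the kernel's SWAP_HAS_CACHE flag in include/linux/swap.h. */
#define SWAP_HAS_CACHE 0x40

/*
 * Hypothetical userspace model, not kernel code:
 *   ptes[i]       swap offset recorded in the i-th PTE of the candidate range
 *   swap_map[off] per-slot count/flag byte for swap offset 'off'
 * Returns true if the nr slots could be swapped in as one large folio.
 */
static bool can_swapin_as_large_folio(const unsigned long *ptes, size_t nr,
                                      const unsigned char *swap_map)
{
    assert(nr > 0);

    /* Assumption: the first slot must be aligned to the folio size. */
    if (ptes[0] % nr != 0)
        return false;

    for (size_t i = 0; i < nr; i++) {
        /* Slots must be consecutive, as a large-folio swap-out leaves them. */
        if (ptes[i] != ptes[0] + i)
            return false;
        /* A slot that is already in the swapcache forces the fallback. */
        if (swap_map[ptes[i]] & SWAP_HAS_CACHE)
            return false;
    }
    return true;
}
```

With four PTEs holding offsets 8, 9, 10, 11 and no slot cached, the
function returns true; a gap in the offsets, an unaligned start, or a
single slot with SWAP_HAS_CACHE set makes it return false, in which case
a caller would presumably fall back to swapping in small pages.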

[1] [PATCH v3 0/4] Swap-out small-sized THP without splitting
https://lore.kernel.org/linux-mm/20231025144546.577640-1-ryan.roberts@arm.com/
[2] OnePlusOSS / android_kernel_oneplus_sm8550
https://github.com/OnePlusOSS/android_kernel_oneplus_sm8550/tree/oneplus/sm8550_u_14.0.0_oneplus11

Barry Song (2):
  arm64: mm: swap: support THP_SWAP on hardware with MTE
  mm: swap: introduce swapcache_prepare_nr and swapcache_clear_nr for
    large folios swap-in

Chuanhua Han (3):
  mm: swap: introduce swap_nr_free() for batched swap_free()
  mm: swap: make should_try_to_free_swap() support large-folio
  mm: support large folios swapin as a whole

 arch/arm64/include/asm/pgtable.h |  19 +--
 arch/arm64/mm/mteswap.c          |  43 +++++++
 include/linux/huge_mm.h          |  12 --
 include/linux/pgtable.h          |   2 +-
 include/linux/swap.h             |   7 ++
 mm/memory.c                      | 193 +++++++++++++++++++++++++------
 mm/page_io.c                     |   2 +-
 mm/swap.h                        |   1 +
 mm/swap_slots.c                  |   2 +-
 mm/swapfile.c                    | 152 ++++++++++++++++--------
 10 files changed, 319 insertions(+), 114 deletions(-)