Message ID | 20250214093015.51024-1-21cnbao@gmail.com (mailing list archive) |
---|---|
Headers | From: Barry Song <21cnbao@gmail.com>; To: akpm@linux-foundation.org, linux-mm@kvack.org; Cc: 21cnbao@gmail.com, baolin.wang@linux.alibaba.com, chrisl@kernel.org, david@redhat.com, ioworker0@gmail.com, kasong@tencent.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, lorenzo.stoakes@oracle.com, ryan.roberts@arm.com, v-songbaohua@oppo.com, x86@kernel.org, ying.huang@intel.com, zhengtangquan@oppo.com; Subject: [PATCH v4 0/4] mm: batched unmap lazyfree large folios during reclamation; Date: Fri, 14 Feb 2025 22:30:11 +1300 |
Series | mm: batched unmap lazyfree large folios during reclamation |
From: Barry Song <v-songbaohua@oppo.com>

Commit 735ecdfaf4e8 ("mm/vmscan: avoid split lazyfree THP during
shrink_folio_list()") prevents the splitting of MADV_FREE'd THP in
madvise.c. However, those folios are still added to the deferred_split
list in try_to_unmap_one() because we are unmapping PTEs and removing
rmap entries one by one.

Firstly, this has rendered the following counter somewhat confusing:

/sys/kernel/mm/transparent_hugepage/hugepages-size/stats/split_deferred

The split_deferred counter was originally designed to track operations
such as partial unmap or madvise of large folios. In practice, however,
most split_deferred cases arise from memory reclamation of aligned
lazyfree mTHPs, as observed by Tangquan. This discrepancy has made the
split_deferred counter highly misleading.

Secondly, this approach is slow because it requires iterating through
each PTE and removing the rmap one by one for a large folio. In fact,
all PTEs of a pte-mapped large folio should be unmapped at once, and
the entire folio should be removed from the rmap as a whole.

Thirdly, it also increases the risk of a race condition where lazyfree
folios are incorrectly set back to swapbacked, as a speculative
folio_get may occur in the shrinker's callback.

deferred_split_scan() might call folio_try_get(folio), since we have
added the folio to the split_deferred list while removing the rmap for
the 1st subpage. While we are scanning the 2nd to nr_pages PTEs of this
folio in try_to_unmap_one(), the entire mTHP could be transitioned back
to swap-backed because the reference count has been incremented, which
can make "ref_count == 1 + map_count" within try_to_unmap_one() false.

	/*
	 * The only page refs must be one from isolation
	 * plus the rmap(s) (dropped by discard:).
	 */
	if (ref_count == 1 + map_count &&
	    (!folio_test_dirty(folio) ||
	     ...
	     (vma->vm_flags & VM_DROPPABLE))) {
		dec_mm_counter(mm, MM_ANONPAGES);
		goto discard;
	}

This patchset resolves the issue by marking only genuinely dirty folios
as swap-backed, as suggested by David, and by transitioning to batched
unmapping of entire folios in try_to_unmap_one(). Consequently, the
deferred_split count drops to zero, and memory reclamation performance
improves significantly: reclaiming 64KiB lazyfree large folios is now
2.5x faster (the specific data is in the changelog of patch 3/4).

By the way, while the patchset is primarily aimed at PTE-mapped large
folios, Baolin and Lance also found that try_to_unmap_one() handles
lazyfree redirtied PMD-mapped large folios inefficiently: it splits the
PMD into PTEs and iterates over them. This patchset removes the
unnecessary splitting, enabling us to skip redirtied PMD-mapped large
folios 3.5x faster during memory reclamation (the specific data can be
found in the changelog of patch 4/4).

-v4:
 * collect reviewed-by of Kefeng, Baolin, Lance, thanks!
 * rebase on top of David's "mm: fixes for device-exclusive entries
   (hmm)" patchset v2:
   https://lore.kernel.org/all/20250210193801.781278-1-david@redhat.com/

-v3:
 https://lore.kernel.org/all/20250115033808.40641-1-21cnbao@gmail.com/
 * collect reviewed-by and acked-by of Baolin, David, Lance and Will,
   thanks!
 * refine pmd-mapped THP lazyfree code per Baolin and Lance.
 * refine tlbbatch deferred flushing range support code per David.

-v2:
 https://lore.kernel.org/linux-mm/20250113033901.68951-1-21cnbao@gmail.com/
 * describe backgrounds and problems more clearly in the cover letter
   per Lorenzo Stoakes;
 * also handle redirtied pmd-mapped large folios per Baolin and Lance;
 * handle some corner cases such as HWPOISON, pte_unused;
 * fix riscv and x86 build issues.

-v1:
 https://lore.kernel.org/linux-mm/20250106031711.82855-1-21cnbao@gmail.com/

Barry Song (4):
  mm: Set folio swapbacked iff folios are dirty in try_to_unmap_one
  mm: Support tlbbatch flush for a range of PTEs
  mm: Support batched unmap for lazyfree large folios during reclamation
  mm: Avoid splitting pmd for lazyfree pmd-mapped THP in try_to_unmap

 arch/arm64/include/asm/tlbflush.h |  23 +++--
 arch/arm64/mm/contpte.c           |   2 +-
 arch/riscv/include/asm/tlbflush.h |   3 +-
 arch/riscv/mm/tlbflush.c          |   3 +-
 arch/x86/include/asm/tlbflush.h   |   3 +-
 mm/huge_memory.c                  |  24 ++++--
 mm/rmap.c                         | 136 ++++++++++++++++++------------
 7 files changed, 115 insertions(+), 79 deletions(-)
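
The refcount race described in the cover letter can be illustrated with a
small userspace model. This is only a sketch under simplifying assumptions,
not kernel code: struct folio_model, struct unmap_result and all function
names below are hypothetical, the counters are plain integers with no
locking, and each rmap removal is assumed to drop exactly one reference.
The model contrasts a per-PTE walk, where a speculative reference can
arrive between two PTEs, with a batched unmap that evaluates
"ref_count == 1 + map_count" once for the whole folio.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the counters consulted by the lazyfree check. */
struct folio_model {
	int ref_count;	/* one isolation ref plus one ref per mapping */
	int map_count;	/* PTEs still mapping the folio */
	bool dirty;	/* folio was redirtied after MADV_FREE */
};

struct unmap_result {
	int checks;		/* times the discard condition was evaluated */
	int ptes_discarded;	/* PTEs unmapped and dropped */
	bool set_swapbacked;	/* folio flipped back to swap-backed */
};

/* An isolated, clean lazyfree folio mapped by nr_pages PTEs. */
struct folio_model isolated_lazyfree_folio(int nr_pages)
{
	assert(nr_pages > 0);
	struct folio_model f = {
		.ref_count = 1 + nr_pages,
		.map_count = nr_pages,
		.dirty = false,
	};
	return f;
}

/* Simplified form of the condition quoted in the cover letter. */
bool can_discard_lazyfree(const struct folio_model *f, bool vma_droppable)
{
	return f->ref_count == 1 + f->map_count &&
	       (!f->dirty || vma_droppable);
}

/*
 * Old behaviour (modelled): re-check per PTE. spec_get_before is the PTE
 * index before which a speculative reference is taken, or -1 for none.
 */
struct unmap_result unmap_per_pte(struct folio_model *f, int spec_get_before)
{
	struct unmap_result r = { .checks = 0, .ptes_discarded = 0,
				  .set_swapbacked = false };
	int nr = f->map_count;

	for (int i = 0; i < nr; i++) {
		if (i == spec_get_before)
			f->ref_count++;	/* stands in for folio_try_get() */
		r.checks++;
		if (!can_discard_lazyfree(f, false)) {
			r.set_swapbacked = true;
			break;
		}
		f->map_count--;
		f->ref_count--;
		r.ptes_discarded++;
	}
	return r;
}

/* Batched behaviour (modelled): one check, all PTEs dropped together. */
struct unmap_result unmap_batched(struct folio_model *f)
{
	struct unmap_result r = { .checks = 1, .ptes_discarded = 0,
				  .set_swapbacked = false };

	if (!can_discard_lazyfree(f, false)) {
		/* Only a genuinely dirty folio becomes swap-backed. */
		r.set_swapbacked = f->dirty;
		return r;
	}
	r.ptes_discarded = f->map_count;
	f->ref_count -= f->map_count;
	f->map_count = 0;
	return r;
}
```

In this model, a 16-page folio (64KiB with 4KiB pages) that receives a
speculative reference before its second PTE is handled loses one PTE in
unmap_per_pte() and is then flipped back to swap-backed, whereas
unmap_batched() evaluates the condition once and discards all 16 PTEs.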