From patchwork Tue Jan 2 17:53:29 2024
X-Patchwork-Submitter: Kairui Song
X-Patchwork-Id: 13509225
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, Chris Li, "Huang, Ying", Hugh Dickins, Johannes Weiner,
    Matthew Wilcox, Michal Hocko, Yosry Ahmed, David Hildenbrand,
    linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v2 0/9] swapin refactor for optimization and unified readahead
Date: Wed, 3 Jan 2024 01:53:29 +0800
Message-ID: <20240102175338.62012-1-ryncsn@gmail.com>

From: Kairui Song

This series is rebased on latest mm-stable to avoid conflicts.

This series tries to unify and clean up the swapin path, introduce minor
optimizations, and make both shmem and swapoff use the SWP_SYNCHRONOUS_IO
flag to skip readahead and the swapcache for better performance.

1. Some benchmarks for dropping readahead and swapcache for shmem with ZRAM:

- Single file sequential read:

  perf stat --repeat 20 dd if=/tmpfs/test of=/dev/null bs=1M count=8192
  (/tmpfs/test is a zero filled file, using brd as swap, 4G memcg limit)
  Before: 22.248 +- 0.549
  After:  22.021 +- 0.684 (-1.1%)

- Random read stress test:

  fio -name=tmpfs --numjobs=16 --directory=/tmpfs \
  --size=256m --ioengine=mmap --rw=randread --random_distribution=random \
  --time_based --ramp_time=1m --runtime=5m --group_reporting
  (using brd as swap, 2G memcg limit)

  Before: 1818MiB/s
  After:  1888MiB/s (+3.85%)

- Zipf biased random read stress test:

  fio -name=tmpfs --numjobs=16 --directory=/tmpfs \
  --size=256m --ioengine=mmap --rw=randread --random_distribution=zipf:1.2 \
  --time_based --ramp_time=1m --runtime=5m --group_reporting
  (using brd as swap, 2G memcg limit)

  Before: 31.1GiB/s
  After:  32.3GiB/s (+3.86%)

Previously, shmem always used cluster readahead. It doesn't help much even
for a single sequential read, and for random stress tests the performance
is better without it. In reality, due to memory and swap fragmentation,
cluster readahead is less helpful for ZRAM.

2. A micro benchmark which uses madvise to swap out 10G of zero-filled data
   to ZRAM and then reads it back in shows a performance gain for the
   swapin path:

   Before: 11143285 us
   After:  10692644 us (+4.1%)

3. Swap off a 10G ZRAM:

   Before:
   time swapoff /dev/zram0
   real    0m12.337s
   user    0m0.001s
   sys     0m12.329s

   After:
   time swapoff /dev/zram0
   real    0m9.728s
   user    0m0.001s
   sys     0m9.719s

This also cleans up the path to apply a per swap device readahead policy
for all swapin paths.

V1: https://lkml.org/lkml/2023/11/19/296
Update from V1:
  - Rebased on mm-unstable.
  - Remove behaviour changing patches, will submit in a separate series
    later.
  - Code style, naming and comments updates.
  - Thanks to Chris Li for very detailed and helpful review of V1. Thanks
    to Matthew Wilcox and Huang Ying for helpful suggestions.

Kairui Song (9):
  mm/swapfile.c: add back some comment
  mm/swap: move no readahead swapin code to a stand-alone helper
  mm/swap: avoid doing extra unlock error checks for direct swapin
  mm/swap: always account swapped in page into current memcg
  mm/swap: introduce swapin_entry for unified readahead policy
  mm/swap: also handle swapcache lookup in swapin_entry
  mm/swap: avoid a duplicated swap cache lookup for SWP_SYNCHRONOUS_IO
  mm/swap: introduce a helper for swapin without vmfault
  swap, shmem: use new swapin helper and skip readahead conditionally

 mm/memory.c     |  74 +++++++-------------------
 mm/shmem.c      |  67 +++++++++++------------
 mm/swap.h       |  39 ++++++++++----
 mm/swap_state.c | 138 +++++++++++++++++++++++++++++++++++++++++-------
 mm/swapfile.c   |  32 +++++------
 5 files changed, 218 insertions(+), 132 deletions(-)
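
For reference, the madvise micro benchmark described in item 2 can be
approximated in user space roughly as follows. This is only a minimal
sketch and not the program used to produce the numbers in this letter:
the names swapin_bench() and now_us() are hypothetical, and it assumes
Linux 5.4 or later (for MADV_PAGEOUT) with a swap device such as ZRAM
already enabled. Without swap, the madvise call may succeed without
paging anything out, so the timing is only meaningful when swap is
actually in use.

```c
/* Hypothetical user-space sketch; not the benchmark used in this series. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21	/* Linux >= 5.4, value from the uapi headers */
#endif

/* Monotonic clock reading in microseconds. */
long long now_us(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000000LL + ts.tv_nsec / 1000;
}

/*
 * Zero-fill `bytes` of anonymous memory, ask the kernel to reclaim it,
 * then touch one byte per page and time only that read-back phase.
 *
 * Returns 0 on success, -1 if the mapping could not be set up.
 * *paged_out is 1 only if madvise(MADV_PAGEOUT) reported success.
 */
int swapin_bench(size_t bytes, long long *elapsed_us, unsigned long *sum,
		 int *paged_out)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *buf;
	unsigned long acc = 0;
	long long start;
	size_t off;

	if (page <= 0)
		return -1;

	buf = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Writing zeros still faults every page in, so there is something to swap out. */
	memset(buf, 0, bytes);
	*paged_out = madvise(buf, bytes, MADV_PAGEOUT) == 0;

	/* Read-back phase: each first touch of a swapped-out page is a swapin fault. */
	start = now_us();
	for (off = 0; off < bytes; off += (size_t)page)
		acc += ((volatile unsigned char *)buf)[off];
	*elapsed_us = now_us() - start;
	*sum = acc;

	munmap(buf, bytes);
	return 0;
}
```

Calling swapin_bench((size_t)10 << 30, ...) inside a memcg with ZRAM
configured as swap would come close to the 10G case above. Only the
read-back loop is timed, so the result reflects swapin cost and not the
cost of the swap out.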