From: Ryan Roberts
To: Andrew Morton, Chris Li, Kairui Song, "Huang, Ying", Kalesh Singh,
    Barry Song, Hugh Dickins, David Hildenbrand
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v1 0/5] Alternative mTHP swap allocator improvements
Date: Wed, 19 Jun 2024 00:26:40 +0100
Message-ID: <20240618232648.4090299-1-ryan.roberts@arm.com>
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13703156

Hi All,

Chris has been doing great work at [1] to clean up my mess in the mTHP swap
entry allocator. But Barry posted a test program and results at [2] showing that
even with Chris's changes, there are still some fallbacks (around 5% - 25% in
some cases). I was interested in why that might be and ended up putting this PoC
patch set together to try to get a better understanding. This series ends up
achieving 0% fallback, even with small folios ("-s") enabled. I haven't done
much testing beyond that (yet) but thought it was worth posting on the strength
of that result alone.

At a high level this works in a similar way to Chris's series; it marks a
cluster as being for a particular order and if a new cluster cannot be allocated
then it scans through the existing non-full clusters. But it does it by scanning
through the clusters rather than assembling them into a list. Cluster flags are
used to mark clusters that have been scanned and are known not to have enough
contiguous space, so the efficiency should be similar in practice.

Because it's not based around a linked list, there is less churn and I'm
wondering if this is perhaps easier to review and potentially even get into
v6.10-rcX to fix up what's already there, rather than having to wait until v6.11
for Chris's series? I know Chris has a larger roadmap of improvements, so at
best I see this as a tactical fix that will ultimately be superseded by Chris's
work.

There are a few differences to note vs Chris's series:

- order-0 fallback scanning is still allowed in any cluster; the argument in the
  past was that swap should always use all the swap space, so I've left this
  mechanism in. It is only a fallback though; first the new per-order scanner is
  invoked, even for order-0, so if there are free slots in clusters already
  assigned for order-0, then the allocation will go there.

- CPUs can steal slots from other CPUs' current clusters; those clusters remain
  scannable while they are current for a CPU and are only made unscannable when
  no more CPUs are scanning that particular cluster.

- I'm preferring to allocate a free cluster ahead of per-order scanning, since,
  as I understand it, the original intent of a per-cpu current cluster was to
  get pages for an application adjacent in the swap to speed up IO.

I'd be keen to hear if you think we could get something like this into v6.10 to
fix the mess - I'm willing to work quickly to address comments and do more
testing. If not, then this is probably just a distraction and we should
concentrate on Chris's series.

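To make the scheme described above easier to picture, here is a rough userspace
sketch. It is not the code in this series or in mm/swapfile.c: the sizes, the
struct layout and the toy_swap_*() names are all hypothetical, and there is no
locking or per-cpu state. It only shows the ordering: take a free cluster first,
then scan clusters already tagged with the same order (skipping ones flagged as
having no space), then, for order-0 only, fall back to any free slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_CLUSTERS   4   /* hypothetical sizes, kept small for illustration */
#define CLUSTER_SLOTS 8
#define ORDER_NONE    (-1)

struct cluster {
	int  order;               /* order this cluster serves, ORDER_NONE if free */
	bool no_space;            /* scanned: no aligned free run for its order */
	bool used[CLUSTER_SLOTS];
};

static struct cluster clusters[NR_CLUSTERS];

void toy_swap_init(void)
{
	memset(clusters, 0, sizeof(clusters));
	for (int i = 0; i < NR_CLUSTERS; i++)
		clusters[i].order = ORDER_NONE;
}

/* Return the first naturally aligned run of (1 << order) free slots, or -1. */
static int find_run(const struct cluster *c, int order)
{
	int nr = 1 << order;

	for (int base = 0; base + nr <= CLUSTER_SLOTS; base += nr) {
		int i = 0;

		while (i < nr && !c->used[base + i])
			i++;
		if (i == nr)
			return base;
	}
	return -1;
}

/* Mark the run as used and return its global slot offset. */
static int take_run(int ci, int base, int order)
{
	for (int i = 0; i < (1 << order); i++)
		clusters[ci].used[base + i] = true;
	return ci * CLUSTER_SLOTS + base;
}

/* Returns a global slot offset, or -1 if the caller must fall back. */
int toy_swap_alloc(int order)
{
	assert(order >= 0 && (1 << order) <= CLUSTER_SLOTS);

	/* 1. Prefer a wholly free cluster and tag it with this order. */
	for (int ci = 0; ci < NR_CLUSTERS; ci++) {
		if (clusters[ci].order == ORDER_NONE) {
			clusters[ci].order = order;
			return take_run(ci, 0, order);
		}
	}
	/* 2. Scan same-order clusters not yet flagged as lacking space. */
	for (int ci = 0; ci < NR_CLUSTERS; ci++) {
		struct cluster *c = &clusters[ci];
		int base;

		if (c->order != order || c->no_space)
			continue;
		base = find_run(c, order);
		if (base >= 0)
			return take_run(ci, base, order);
		c->no_space = true;   /* skip this cluster on later scans */
	}
	/* 3. Order-0 only: fall back to any free slot in any cluster. */
	for (int ci = 0; order == 0 && ci < NR_CLUSTERS; ci++) {
		int base = find_run(&clusters[ci], 0);

		if (base >= 0)
			return take_run(ci, base, 0);
	}
	return -1;
}

/* Free a run. Clearing no_space on free is an assumption of this sketch. */
void toy_swap_free(int offset, int order)
{
	struct cluster *c = &clusters[offset / CLUSTER_SLOTS];

	for (int i = 0; i < (1 << order); i++)
		c->used[offset % CLUSTER_SLOTS + i] = false;
	c->no_space = false;
	for (int i = 0; i < CLUSTER_SLOTS; i++)
		if (c->used[i])
			return;
	c->order = ORDER_NONE;    /* wholly free again */
}
```

With these toy sizes, four order-2 allocations each claim a fresh cluster before
any scanning happens; only once no free cluster is left does the per-order scan
start filling the second half of each cluster, flagging the ones it finds full
so they are skipped next time.
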
This applies on top of v6.10-rc4.

[1] https://lore.kernel.org/linux-mm/20240614-swap-allocator-v2-0-2a513b4a7f2f@kernel.org/
[2] https://lore.kernel.org/linux-mm/20240615084714.37499-1-21cnbao@gmail.com/

Thanks,
Ryan

Ryan Roberts (5):
  mm: swap: Simplify end-of-cluster calculation
  mm: swap: Change SWAP_NEXT_INVALID to highest value
  mm: swap: Track allocation order for clusters
  mm: swap: Scan for free swap entries in allocated clusters
  mm: swap: Optimize per-order cluster scanning

 include/linux/swap.h |  18 +++--
 mm/swapfile.c        | 164 ++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 157 insertions(+), 25 deletions(-)

---
2.43.0