From patchwork Wed Jul 12 15:54:21 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Gowans, James" <jgowans@amazon.com>
X-Patchwork-Id: 13310598
From: James Gowans <jgowans@amazon.com>
To: linux-mm@kvack.org
Cc: James Gowans, Jan H. Schönherr, Andrew Morton, Vlastimil Babka,
    Baolin Wang, Mel Gorman, Matthew Wilcox, Johannes Weiner,
    Kefeng Wang, Minghao Chi
Subject: [RFC] mm: compaction: suitable_migration_target checks for higher order buddies
Date: Wed, 12 Jul 2023 17:54:21 +0200
Message-ID: <20230712155421.875491-1-jgowans@amazon.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
Huge page compaction finds free target pages to which source pages can be
migrated. A huge page sized and aligned block is considered a valid source
of target pages if it passes the suitable_migration_target() test.

One of the things which suitable_migration_target() does is ensure that the
entire block isn't currently free. It would be counterproductive to use an
already fully free huge page sized block as a migration target, because
taking pages from that free block would decrease the number of available
huge pages in the system.

suitable_migration_target() attempts to ensure that the supplied block is
not currently a fully free block by checking the PageBuddy flag on the
starting page of the huge page sized and aligned block. This approach is
flawed: the buddy allocator can and does maintain buddies at a larger order
than huge page size. For example, on a typical x86 system pageblock_order
corresponds to 2 MiB, but the buddy allocator's MAX_ORDER corresponds to
4 MiB. Because of this, a pageblock_order sized block may be free because
it is part of a larger order buddy, yet the pageblock_order sized block
won't itself be on the buddy list; only the larger order block will be.
Just checking the PageBuddy flag on the pageblock_order block is hence
insufficient: the block will appear not to be free, and compaction will try
to use it as a source of migration target pages.

Enhance suitable_migration_target() to cater for this case by scanning up
the buddy orders, from the current pageblock_order page to MAX_ORDER,
checking whether any of the larger blocks have the PageBuddy flag set.

In practice, incorrectly considering a page block a suitable migration
target doesn't actually cause the block to be broken down.
That block is passed to isolate_freepages_block(), which scans it for any
pages currently on the buddy list. The assumption is that buddy pages will
be found because the entire block is not free. In the case described above,
no buddy pages will actually be found, because the higher order block is
the one that is free; the scan is simply unnecessary work. As such, the
user visible effect of this change is only (in theory [1]) slightly faster
huge page compaction, from avoiding scans of entirely free blocks. Even if
the effect is negligible, the change better conveys what the function is
attempting to do: check whether this page block is entirely free or not.

[1] I have not actually measured whether the difference is noticeable.

Notes and caveats for this RFC:

- If the supplied struct page is already the "left most" page in a
  MAX_ORDER block, the page will be checked multiple times unnecessarily:
  iterating up the orders just zeroes bits which are already zero. Not sure
  if we want to get fancier here and detect this by finding the starting
  order.

- The PFN bit masking is somewhat yucky. We could use helper functions, but
  the ones I know of rely on knowing the existing order of the supplied
  struct page, which this function is currently oblivious to.

- Is this change even worth it? My contention is "yes", or else this
  function wouldn't bother checking the PageBuddy flag today - it clearly
  wants to avoid unnecessary scans. Either let's do the job properly or
  delete the check, rather than do half the job.

Suggested-by: Jan H. Schönherr
Signed-off-by: James Gowans
Cc: Andrew Morton
Cc: Vlastimil Babka
Cc: Baolin Wang
Cc: Mel Gorman
Cc: Matthew Wilcox
Cc: Johannes Weiner
Cc: Kefeng Wang
Cc: Minghao Chi
---
 mm/compaction.c | 39 ++++++++++++++++++++++++++++++++-------
 1 file changed, 32 insertions(+), 7 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 9641e2131901..fb0e37d99364 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1342,15 +1342,40 @@ static bool suitable_migration_source(struct compact_control *cc,
 static bool suitable_migration_target(struct compact_control *cc,
 							struct page *page)
 {
-	/* If the page is a large free page, then disallow migration */
-	if (PageBuddy(page)) {
+	unsigned int higher_order;
+
+	/*
+	 * If the supplied page is part of a pageblock_order or larger free
+	 * block, it is not a suitable migration target block. Detect this
+	 * case by starting at the pageblock_order aligned page and scanning
+	 * upwards to the MAX_ORDER aligned page, checking whether any of the
+	 * struct pages are in the buddy list for the order of the larger
+	 * block. Disallow migration if so.
+	 */
+	for (higher_order = pageblock_order; higher_order <= MAX_ORDER; ++higher_order) {
+		struct page *higher_order_page;
+		unsigned long higher_order_pfn = page_to_pfn(page);
+
 		/*
-		 * We are checking page_order without zone->lock taken. But
-		 * the only small danger is that we skip a potentially suitable
-		 * pageblock, so it's not worth to check order for valid range.
+		 * This is legal provided that struct pages are always
+		 * initialised at at least MAX_ORDER alignment.
 		 */
-		if (buddy_order_unsafe(page) >= pageblock_order)
-			return false;
+		higher_order_pfn &= ~((1UL << higher_order) - 1);
+		higher_order_page = pfn_to_page(higher_order_pfn);
+		if (PageBuddy(higher_order_page)) {
+			/*
+			 * We are checking buddy order without zone->lock
+			 * taken. But the only small danger is that we skip a
+			 * potentially suitable pageblock, so it's not worth
+			 * checking that the order is in a valid range.
+			 */
+			if (buddy_order_unsafe(higher_order_page) >= higher_order)
+				return false;
+			/*
+			 * This is a buddy but not a sufficiently large buddy.
+			 * There will never be a larger one above this.
+			 */
+			else
+				break;
+		}
+	}
 
 	if (cc->ignore_block_suitable)