From patchwork Fri Apr 28 20:54:38 2023
X-Patchwork-Submitter: Doug Anderson
X-Patchwork-Id: 13226851
From: Douglas Anderson
To: Andrew Morton, Mel Gorman, Vlastimil Babka, Ying, Alexander Viro, Christian Brauner
Cc: Hillf Danton, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Gao Xiang, Matthew Wilcox, Yu Zhao, Douglas Anderson
Subject: [PATCH v3] migrate_pages: Avoid blocking for IO in MIGRATE_SYNC_LIGHT
Date: Fri, 28 Apr 2023 13:54:38 -0700
Message-ID: <20230428135414.v3.1.Ia86ccac02a303154a0b8bc60567e7a95d34c96d3@changeid>
The MIGRATE_SYNC_LIGHT mode is intended to block for things that will finish quickly but not for things that will
take a long time. Exactly how long is too long is not well defined, but waits of tens of milliseconds are likely non-ideal.

When putting a Chromebook under memory pressure (opening over 90 tabs on a 4GB machine) it was fairly easy to see delays of > 100 ms waiting for some locks in the kcompactd code path. While the laptop wasn't amazingly usable in this state, it was still limping along and this state isn't something artificial. Sometimes we simply end up with a lot of memory pressure.

Putting the same Chromebook under memory pressure while it was running Android apps (though not stressing them) showed a much worse result (NOTE: this was on an older kernel, but the codepaths here are similar). Android apps on ChromeOS currently run from a 128K-block, zlib-compressed, loopback-mounted squashfs disk. If we get a page fault from something backed by the squashfs filesystem, we could end up holding a folio lock while reading enough from disk to decompress 128K (and then decompressing it using the somewhat slow zlib algorithm). That reading goes through the ext4 subsystem (because it's a loopback mount) before eventually ending up in the block subsystem. This extra jaunt adds extra overhead. Without much work I could see cases where we ended up blocked on a folio lock for over a second. With more extreme memory pressure I could see up to 25 seconds.

We considered adding a timeout in the case of MIGRATE_SYNC_LIGHT for the two locks that were seen to be slow [1], and that generated much discussion. After discussion, it was decided that we should avoid waiting for the two locks during MIGRATE_SYNC_LIGHT if they were being held for IO. We'll continue with the unbounded wait for the more full SYNC modes.

With this change, I couldn't see any slow waits on these locks with my previous testcases.
NOTE: The reason I started digging into this originally isn't because some benchmark had gone awry, but because we've received in-the-field crash reports where we have a hung task waiting on the page lock (which is the equivalent code path on old kernels). While the root cause of those crashes is likely unrelated and won't be fixed by this patch, analyzing those crash reports did point out that these very long waits seemed like something good to fix. With this patch we should no longer hang waiting on these locks, but presumably the system will still be in bad shape and hang somewhere else.

[1] https://lore.kernel.org/r/20230421151135.v2.1.I2b71e11264c5c214bc59744b9e13e4c353bc5714@changeid

Suggested-by: Matthew Wilcox
Cc: Mel Gorman
Cc: Hillf Danton
Cc: Gao Xiang
Signed-off-by: Douglas Anderson
Reviewed-by: Matthew Wilcox (Oracle)
Acked-by: Mel Gorman
---
Most of the actual code in this patch came from emails written by Matthew Wilcox; I just cleaned the code up to get it to compile. I'm happy to set authorship to him if he would like, but for now I've credited him with Suggested-by.

This patch has changed pretty significantly between versions, so adding a link to previous versions to help anyone needing to find the history:
v1 - https://lore.kernel.org/r/20230413182313.RFC.1.Ia86ccac02a303154a0b8bc60567e7a95d34c96d3@changeid
v2 - https://lore.kernel.org/r/20230421221249.1616168-1-dianders@chromium.org/

Changes in v3:
- Combine patches for buffers and folios.
- Use buffer_uptodate() and folio_test_uptodate() instead of timeout.

Changes in v2:
- Keep unbounded delay in "SYNC", delay with a timeout in "SYNC_LIGHT".
- Also add a timeout for locking of buffers.
 mm/migrate.c | 49 ++++++++++++++++++++++++++-----------------------
 1 file changed, 26 insertions(+), 23 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index db3f154446af..4a384eb32917 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -698,37 +698,32 @@ static bool buffer_migrate_lock_buffers(struct buffer_head *head,
 							enum migrate_mode mode)
 {
 	struct buffer_head *bh = head;
+	struct buffer_head *failed_bh;
 
-	/* Simple case, sync compaction */
-	if (mode != MIGRATE_ASYNC) {
-		do {
-			lock_buffer(bh);
-			bh = bh->b_this_page;
-
-		} while (bh != head);
-
-		return true;
-	}
-
-	/* async case, we cannot block on lock_buffer so use trylock_buffer */
 	do {
 		if (!trylock_buffer(bh)) {
-			/*
-			 * We failed to lock the buffer and cannot stall in
-			 * async migration. Release the taken locks
-			 */
-			struct buffer_head *failed_bh = bh;
-			bh = head;
-			while (bh != failed_bh) {
-				unlock_buffer(bh);
-				bh = bh->b_this_page;
-			}
-			return false;
+			if (mode == MIGRATE_ASYNC)
+				goto unlock;
+			if (mode == MIGRATE_SYNC_LIGHT && !buffer_uptodate(bh))
+				goto unlock;
+			lock_buffer(bh);
 		}
 		bh = bh->b_this_page;
 	} while (bh != head);
+
 	return true;
+
+unlock:
+	/* We failed to lock the buffer and cannot stall. */
+	failed_bh = bh;
+	bh = head;
+	while (bh != failed_bh) {
+		unlock_buffer(bh);
+		bh = bh->b_this_page;
+	}
+
+	return false;
 }
 
 static int __buffer_migrate_folio(struct address_space *mapping,
@@ -1162,6 +1157,14 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
 		if (current->flags & PF_MEMALLOC)
 			goto out;
 
+		/*
+		 * In "light" mode, we can wait for transient locks (eg
+		 * inserting a page into the page table), but it's not
+		 * worth waiting for I/O.
+		 */
+		if (mode == MIGRATE_SYNC_LIGHT && !folio_test_uptodate(src))
+			goto out;
+
 		folio_lock(src);
 	}
 	locked = true;