From patchwork Fri Apr 21 22:12:47 2023
X-Patchwork-Submitter: Doug Anderson
X-Patchwork-Id: 13220744
From: Douglas Anderson <dianders@chromium.org>
To: Andrew Morton, Mel Gorman, Vlastimil Babka, Ying, Alexander Viro,
	Christian Brauner
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao,
	linux-fsdevel@vger.kernel.org, Matthew Wilcox, Douglas Anderson
Subject: [PATCH v2 3/4] migrate_pages: Don't wait forever locking pages in
	MIGRATE_SYNC_LIGHT
Date: Fri, 21 Apr 2023 15:12:47 -0700
Message-ID: <20230421151135.v2.3.Ia86ccac02a303154a0b8bc60567e7a95d34c96d3@changeid>
In-Reply-To: <20230421221249.1616168-1-dianders@chromium.org>
References: <20230421221249.1616168-1-dianders@chromium.org>
MIME-Version: 1.0
The MIGRATE_SYNC_LIGHT mode is intended to block for things that will
finish quickly but not for things that will take a long time. Exactly
how long is too long is not well defined, but waits of tens of
milliseconds are likely non-ideal.
Waiting on the folio lock in isolate_movable_page() is something that
usually is pretty quick, but is not officially bounded. Nothing stops
another process from holding a folio lock while doing an expensive
operation. Having an unbounded wait like this is not within the design
goals of MIGRATE_SYNC_LIGHT.

When putting a Chromebook under memory pressure (opening over 90 tabs
on a 4GB machine) it was fairly easy to see delays of > 100 ms waiting
for the lock. While the laptop wasn't amazingly usable in this state,
it was still limping along and this state isn't something artificial.
Sometimes we simply end up with a lot of memory pressure.

Putting the same Chromebook under memory pressure while it was running
Android apps (though not stressing them) showed a much worse result
(NOTE: this was on an older kernel but the codepaths here are
similar). Android apps on ChromeOS currently run from a 128K-block,
zlib-compressed, loopback-mounted squashfs disk. If we get a page
fault from something backed by the squashfs filesystem we could end up
holding a folio lock while reading enough from disk to decompress 128K
(and then decompressing it using the somewhat slow zlib algorithm).
That reading goes through the ext4 subsystem (because it's a loopback
mount) before eventually ending up in the block subsystem. This extra
jaunt adds extra overhead. Without much work I could see cases where
we ended up blocked on a folio lock for over a second. With more
extreme memory pressure I could see up to 25 seconds.

Let's bound the amount of time we can wait for the folio lock. The
SYNC_LIGHT migration mode can already handle failure for things that
are slow, so adding this timeout is fairly straightforward. With this
timeout, it can be seen that kcompactd can move on to more productive
tasks if it's taking a long time to acquire a lock.
NOTE: The reason I started digging into this isn't because some
benchmark had gone awry, but because we've received in-the-field crash
reports where we have a hung task waiting on the page lock (which is
the equivalent code path on old kernels). While the root cause of
those crashes is likely unrelated and won't be fixed by this patch,
analyzing those crash reports did point out this unbounded wait and it
seemed like something good to fix.

ALSO NOTE: the timeout mechanism used here uses "jiffies" and we also
will retry up to 7 times. That doesn't give us much accuracy in
specifying the timeout. On 1000 Hz machines we'll end up timing out in
7-14 ms. On 100 Hz machines we'll end up in 70-140 ms. Given that we
don't have a strong definition of how long "too long" is, this is
probably OK.

Suggested-by: Mel Gorman
Signed-off-by: Douglas Anderson
---
Changes in v2:
- Keep unbounded delay in "SYNC", delay with a timeout in "SYNC_LIGHT"

 mm/migrate.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index db3f154446af..60982df71a93 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -58,6 +58,23 @@
 
 #include "internal.h"
 
+/* Returns the schedule timeout for a non-async mode */
+static long timeout_for_mode(enum migrate_mode mode)
+{
+	/*
+	 * We'll always return 1 jiffy as the timeout. Since all places using
+	 * this timeout are in a retry loop this means that the maximum time
+	 * we might block is actually NR_MAX_MIGRATE_SYNC_RETRY jiffies.
+	 * If a jiffy is 1 ms that's 7 ms, though with the accuracy of the
+	 * timeouts it often ends up more like 14 ms; if a jiffy is 10 ms
+	 * that's 70-140 ms.
+	 */
+	if (mode == MIGRATE_SYNC_LIGHT)
+		return 1;
+
+	return MAX_SCHEDULE_TIMEOUT;
+}
+
 bool isolate_movable_page(struct page *page, isolate_mode_t mode)
 {
 	struct folio *folio = folio_get_nontail_page(page);
@@ -1162,7 +1179,8 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
 		if (current->flags & PF_MEMALLOC)
 			goto out;
 
-		folio_lock(src);
+		if (folio_lock_timeout(src, timeout_for_mode(mode)))
+			goto out;
 	}
 	locked = true;