From patchwork Wed Apr 5 16:18:53 2023
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13202196
From: Qi Zheng
To: akpm@linux-foundation.org, willy@infradead.org, lstoakes@gmail.com
Cc: mgorman@suse.de, vbabka@suse.cz, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Qi Zheng
Subject: [PATCH v2 1/2] mm: swap: fix performance regression on sparsetruncate-tiny
Date: Thu, 6 Apr 2023 00:18:53 +0800
Message-Id: <20230405161854.6931-1-zhengqi.arch@bytedance.com>

The ->percpu_pvec_drained was originally introduced by commit
d9ed0d08b6c6 ("mm: only drain per-cpu pagevecs once per pagevec
usage") to drain per-cpu pagevecs only once per pagevec usage.
However, as part of converting the swap code to be more folio-based,
commit c2bc16817aa0 ("mm/swap: add folio_batch_move_lru()") broke this
logic: ->percpu_pvec_drained is now reset to false, which means per-cpu
pagevecs will be drained multiple times per pagevec usage.

In theory, there should be no functional changes when converting code
to be more folio-based. We should call folio_batch_reinit() in
folio_batch_move_lru() instead of folio_batch_init().

To verify that we still need ->percpu_pvec_drained, I ran
mmtests/sparsetruncate-tiny and got the following data:

                             baseline                   with
                            baseline/                 patch/
Min        Time      326.00 (   0.00%)      328.00 (  -0.61%)
1st-qrtle  Time      334.00 (   0.00%)      336.00 (  -0.60%)
2nd-qrtle  Time      338.00 (   0.00%)      341.00 (  -0.89%)
3rd-qrtle  Time      343.00 (   0.00%)      347.00 (  -1.17%)
Max-1      Time      326.00 (   0.00%)      328.00 (  -0.61%)
Max-5      Time      327.00 (   0.00%)      330.00 (  -0.92%)
Max-10     Time      328.00 (   0.00%)      331.00 (  -0.91%)
Max-90     Time      350.00 (   0.00%)      357.00 (  -2.00%)
Max-95     Time      395.00 (   0.00%)      390.00 (   1.27%)
Max-99     Time      508.00 (   0.00%)      434.00 (  14.57%)
Max        Time      547.00 (   0.00%)      476.00 (  12.98%)
Amean      Time      344.61 (   0.00%)      345.56 *  -0.28%*
Stddev     Time       30.34 (   0.00%)       19.51 (  35.69%)
CoeffVar   Time        8.81 (   0.00%)        5.65 (  35.87%)
BAmean-99  Time      342.38 (   0.00%)      344.27 (  -0.55%)
BAmean-95  Time      338.58 (   0.00%)      341.87 (  -0.97%)
BAmean-90  Time      336.89 (   0.00%)      340.26 (  -1.00%)
BAmean-75  Time      335.18 (   0.00%)      338.40 (  -0.96%)
BAmean-50  Time      332.54 (   0.00%)      335.42 (  -0.87%)
BAmean-25  Time      329.30 (   0.00%)      332.00 (  -0.82%)

These results are similar to those reported when ->percpu_pvec_drained
was introduced, so we still need it. Let's call folio_batch_reinit() in
folio_batch_move_lru() to restore the original logic.
Fixes: c2bc16817aa0 ("mm/swap: add folio_batch_move_lru()")
Signed-off-by: Qi Zheng
Reviewed-by: Matthew Wilcox (Oracle)
Acked-by: Mel Gorman
---
Changelog in v1 to v2:
 - revise commit message and add test data

 mm/swap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/swap.c b/mm/swap.c
index 57cb01b042f6..423199ee8478 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -222,7 +222,7 @@ static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
 	if (lruvec)
 		unlock_page_lruvec_irqrestore(lruvec, flags);
 	folios_put(fbatch->folios, folio_batch_count(fbatch));
-	folio_batch_init(fbatch);
+	folio_batch_reinit(fbatch);
 }
 
 static void folio_batch_add_and_move(struct folio_batch *fbatch,